EDB Failover manager (EFM) continues to gain popularity among Postgres DBAs. EDB customers are using this tool to build a highly available EDB Postgres platform.

EFM’s primarily role is to monitor Postgres Clusters, notify the DBA of any failed clusters and automatically promote the standby cluster to function as a master.

Here are some of the high-level features of EFM:

  1. Automatic failover
  2. Switchover to a standby (s) close to master in terms of transaction xlog
  3. Set priority of standby (s)
  4. Customize notifications/alerts
  5. Transfer VIP to the new master with failover (provided, the master has been configured with VIP)

Utility “efm” is used by the DBAs to gain control of these actions. However, the need to connect to EFM cluster servers/nodes using ssh continues to not optimize the DBA experience.

Keeping this pain point in mind, we have developed “efm_sql_command” extension for the EFM 2.1.x utility command using the inherent Postgres interface. DBAs no longer need to ssh to EFM node to allow/disallow new/old nodes for EFM clusters. Additionally, through the Postgres interface itself, EFM can also be integrated with other monitoring systems like EDB Postgres Enterprise Manager, Nagios, etc.

efm_sql_command extension provides following functions:

  • efm_sql_command.efm_allow_node(‘ip address’);

This functions takes IP address as an argument and notifies EFM agents to allow the specific IP address to be part of a EFM cluster. It returns status of EFM command in 0/1. 0 means successful and 1 means failed.DBAs now can use SQL interface to notify EFM for allowing the new standby (which they are still building) without ssh to one of the EFM nodes.

  • efm_sql_command.efm_disallow_node(‘ip address’);

Similar to efm_allow_node, this function notifies to EFM agents to disallow the IP address from EFM cluster. Function report the status of the command in 0 and 1, where 0 means successfully notified and 1 means failed.

  • efm_sql_command.efm_failover();

Function notifies the EFM cluster agents to perform manual failover of master to closest standby. The function returns 0/1 status, where 0 means successfully notified EFM cluster agents and 1 means failed to notify agents.

  • efm_sql_command.efm_resume_monitoring();

If an agent is not monitoring the local EDB Postgres database, this function makes EFM agent start monitoring the local EDB Postgres database.

  • efm_sql_command.efm_set_priority(‘ip address’, ‘priority’);

Set failover priority for standby and return the status of the command in 0/1, where 0 means success and 1 means failed.

  • efm_sql_command.efm_switchover();

This function performs the switchover to closest standby of master and reconfigures the master as a new standby.

  • efm_sql_command.efm_local_properties

This a view using which DBA can view the efm properties.

  • efm_sql_command.efm_nodes_details:

This view provides the details of each node in EFM cluster. A user can use this for other the purpose. For example checking the status of standby(s), How far are the standbys from a master? etc.

  • efm_sql_command.efm_cluster_status

This function takes following arguments as text:
‘text’: To print the status of EFM cluster in TEXT
‘json’: To print the status of EFM cluster in JSON format.

This extension also gives following GUC, which DBAs can set at cluster/database/user level

ALTER SYSTEM SET efm.cluster_name TO 'clustername';
ALTER DATABASE  SET efm.cluster_name TO 'clustername';
ALTER USER  SET efm.cluster_name TO 'clustername';

Following are some snapshots of efm_sql_command’s functions

CREATE EXTENSION efm_sql_command;
edb=# select efm_extension.efm_cluster_status('text');
INFO: efm command is available
efm_cluster_status
------------------------------------------------------------------------
Cluster Status: efm
VIP:
 
Agent Type Address Agent DB Info
--------------------------------------------------------------
Idle 172.17.0.2 UP UNKNOWN
 
Allowed node host list:
172.17.0.2
 
Membership coordinator: 172.17.0.2
 
Standby priority host list:
(List is empty.)
 
Promote Status:
 
 
Idle Node Status (idle nodes ignored in XLog location comparisons):
 
Address XLog Loc Info
--------------------------------------------------------------
172.17.0.2 0/35BFC10 DB is not in recovery.
(23 rows)
edb=# select jsonb_pretty(efm_extension.efm_cluster_status('json')::jsonb);
INFO:  efm command is available 
                         jsonb_pretty                         
--------------------------------------------------------------
 {                                                           +
     "VIP": "",                                              +
     "nodes": {                                              +
         "172.17.0.2": {                                     +
             "db": "UNKNOWN",                                +
             "info": " ",                                    +
             "type": "Idle",                                 +
             "xlog": "0/35BFC10",                            +
             "agent": "UP",                                  +
             "xloginfo": "DB is not in recovery."            +
         }                                                   +
     },                                                      +
     "messages": [                                           +
         "Did not find XLog location for any non-idle nodes."+
     ],                                                      +
     "allowednodes": [                                       +
         "(List",                                            +
         "is",                                               +
         "empty.)"                                           +
     ],                                                      +
     "minimumstandbys": 0,                                   +
     "failoverpriority": [                                   +
     ],                                                      +
     "membershipcoordinator": "172.17.0.2"                   +
 }
(1 row)
edb=# select efm_extension.efm_allow_node('172.17.0.2');
INFO: efm command is available
efm_allow_node
----------------
0
(1 row)
edb=# select efm_extension.efm_disallow_node('172.17.0.2');
INFO: efm command is available
efm_disallow_node
-------------------
0
(1 row)
edb=# select * from efm_extension.efm_nodes_details ;
INFO: efm command is available
node_ip | property | value
------------+----------+--------------------------
172.17.0.2 | db | "UNKNOWN"
172.17.0.2 | info | " "
172.17.0.2 | type | "Idle"
172.17.0.2 | xlog | "0/35BC388"
172.17.0.2 | agent | "UP"
172.17.0.2 | xloginfo | "DB is not in recovery."
(6 rows)
edb=# select * from efm_extension.efm_local_properties ;
           name            |              value              
---------------------------+----------------------------------
 efm.license               |
 db.user                   | efm
 db.password.encrypted     | 074b627bf50168881d246c5dd32fd8d0
 db.port                   | 5444
 db.database               | edb
 db.service.owner          | enterprisedb
 db.service.name           | edb-as-9.6
 db.bin                    | /usr/edb/as9.6/bin
 db.recovery.conf.dir      | /pgdata
 jdbc.ssl                  | false
 jdbc.ssl.mode             | verify-ca
 user.email                | vibhor.kumar@enterprisedb.com
 script.notification       |
 bind.address              | 172.17.0.2:5430
 admin.port                | 5431
 is.witness                | false
 local.period              | 10
 local.timeout             | 60
 local.timeout.final       | 10
 remote.timeout            | 10
 node.timeout              | 50
 pingServerIp              | 8.8.8.8
 pingServerCommand         | /bin/ping -q -c3 -w5
 auto.allow.hosts          | true
 db.reuse.connection.count | 0
 auto.failover             | true
 auto.reconfigure          | true
 promotable                | true
 minimum.standbys          | 0
 recovery.check.period     | 2
 auto.resume.period        | 0
 virtualIp                 |
 virtualIp.interface       |
 virtualIp.netmask         |
 script.fence              |
 script.post.promotion     |
 script.resumed            |
 script.db.failure         |
 script.master.isolated    |
 sudo.command              | sudo
 sudo.user.command         | sudo -u %u
 jgroups.loglevel          | INFO
 efm.loglevel              | INFO
 jvm.options               | -Xmx32m
(44 rows)

I look forward to your comments on this topic. Click here to request a full demo of EFM.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s