Transitioning Synchronous Standby to Asynchronous in EDB Postgres

 

 

In the world of solution architecture, where we as solution architects develop configurations to meet specific customer needs, we consider a great many interesting scenarios. One such scenario, which my team at EnterpriseDB (EDB) encountered recently, bears exploring further in this blog.

For a business-critical application, transaction durability is crucial. We therefore recently helped a customer design an architecture for near-zero data loss.

One of our customers wanted to use two synchronous standbys within the same data center, with a high-speed network between the master and the standbys for data transfer. The database administrator also wanted read consistency from the replicas.

In Postgres 9.6, we have synchronous_commit = remote_apply, which helps provide read consistency across replicas for load balancing.
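For reference, the relevant settings on the master can be sketched as follows (a minimal illustration for Postgres 9.6; standby1 and standby2 are placeholder standby names, and a configuration reload is required for the change to synchronous_standby_names to take effect):

-- Require commits to wait until both named standbys have applied the WAL
ALTER SYSTEM SET synchronous_commit = 'remote_apply';
ALTER SYSTEM SET synchronous_standby_names = '2(standby1, standby2)';
SELECT pg_reload_conf();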

In addition to these requirements, the customer wanted to layer another capability into the architecture: if one of the synchronous standby replicas goes down, write transactions should not stop or hang. Transactions on the master should go into a “waiting state” only if all of the synchronous standbys are down.

EDB Postgres Advanced Server 9.6 comes with all of the building blocks out of the box to implement this kind of logic and handle the scenario.

To achieve this capability in EDB Postgres, the following steps can be used:

1  Create the pg_background extension. The following is a link you can use to compile the pg_background module:

https://github.com/vibhorkum/pg_background
Below is a snapshot:

[root@ca420ebfb299 pg_background]# make install
/bin/mkdir -p '/usr/edb/as9.6/lib'
/bin/mkdir -p '/usr/edb/as9.6/share/extension'
/bin/mkdir -p '/usr/edb/as9.6/share/extension'
/bin/install -c -m 755  pg_background.so '/usr/edb/as9.6/lib/pg_background.so'
/bin/install -c -m 644 .//pg_background.control '/usr/edb/as9.6/share/extension/'
/bin/install -c -m 644 .//pg_background--1.0.sql  '/usr/edb/as9.6/share/extension/'
[root@ca420ebfb299 pg_background]# psql
psql.bin (9.6.2.7)
Type "help" for help.

edb=# create extension pg_background;
CREATE EXTENSION
edb=#

2  Create the dbms_scheduler and pgagent extensions in EDB Postgres as shown below:

edb=# CREATE EXTENSION dbms_scheduler ;
CREATE EXTENSION
edb=# create extension pgagent;
CREATE EXTENSION

3  Now, you can create a procedure which does the following:

  • Identify all synchronous standbys and check the status of each named synchronous standby in pg_stat_replication.
  • If a named synchronous standby doesn’t exist in pg_stat_replication, change the synchronous_standby_names parameter in such a way that it doesn’t lose the names of the synchronous standbys, but demotes the missing standby from synchronous to asynchronous. For that, it’s recommended to use the following string for the synchronous_standby_names parameter:
2(standby1, standby2…)
  • After demoting the synchronous standby to asynchronous, send an e-mail to your DBA group to notify them about the demotion so they can take the necessary steps. In EDB Postgres, we have the UTL_SMTP package, which can be used for sending e-mails. The following is an example of such a procedure:
CREATE OR REPLACE PROCEDURE send_mail (
    p_sender        VARCHAR2,
    p_recipient     VARCHAR2,
    p_subj          VARCHAR2,
    p_msg           VARCHAR2,
    p_mailhost      VARCHAR2
)
IS
    v_conn          UTL_SMTP.CONNECTION;
    v_crlf          CONSTANT VARCHAR2(2) := CHR(13) || CHR(10);
    v_port          CONSTANT PLS_INTEGER := 25;
BEGIN
    v_conn := UTL_SMTP.OPEN_CONNECTION(p_mailhost,v_port);
    UTL_SMTP.HELO(v_conn,p_mailhost);
    UTL_SMTP.MAIL(v_conn,p_sender);
    UTL_SMTP.RCPT(v_conn,p_recipient);
    UTL_SMTP.DATA(v_conn, SUBSTR(
        'Date: ' || TO_CHAR(SYSDATE,
        'Dy, DD Mon YYYY HH24:MI:SS') || v_crlf
        || 'From: ' || p_sender || v_crlf
        || 'To: ' || p_recipient || v_crlf
        || 'Subject: ' || p_subj || v_crlf
        || p_msg
        , 1, 32767));
    UTL_SMTP.QUIT(v_conn);
END;

For more information on the UTL_SMTP package in EDB Postgres, click here.

  • If none of the standbys are available, then maintain the setting of synchronous_standby_names as given below:

synchronous_standby_names = 1(standby1, standby2)

The above setting covers the scenario where writes should stop, or remain in a waiting state, if all standbys are down.

  • If replication slots are being used, check the lag for the replication slots and reset them as needed, so that pg_xlog is not overloaded.

Note: If you are using synchronous standbys with replication slots, it’s recommended to use the same names for the synchronous standbys and their replication slots.
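Before looking at the full procedure linked in the next step, the core demotion idea can be sketched in plain SQL. This is only a simplified illustration; standby1 and standby2 are placeholder names, and the actual procedure also handles e-mail notification and replication slots. Note that ALTER SYSTEM cannot be run from inside a function or transaction block, which is one reason the full procedure relies on the pg_background extension created in step 1:

-- 1. Find named synchronous standbys that are no longer streaming:
SELECT s.name
FROM unnest(ARRAY['standby1','standby2']) AS s(name)
WHERE NOT EXISTS (SELECT 1 FROM pg_stat_replication r
                  WHERE r.application_name = s.name);

-- 2. Keep the standby names but relax the required number of synchronous
--    standbys, then reload the configuration:
ALTER SYSTEM SET synchronous_standby_names = '1(standby1, standby2)';
SELECT pg_reload_conf();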

4  The following is a link to a procedure which covers all of the points mentioned in step 3.

https://github.com/vibhorkum/procedure_transition_sync_async

5  After creating the procedure, the user can schedule a job as given below:

EXEC sys.DBMS_SCHEDULER.create_program (
      'synchronous_to_asynchrnous',
      'STORED_PROCEDURE',
      '"public"."transition_sync_async"',
      3,true,  
     'Job which can transition synchronous to asynchronous');  

EXEC
  DBMS_SCHEDULER.DEFINE_PROGRAM_ARGUMENT(
    program_name         => 'synchronous_to_asynchrnous',
    argument_position    => 1,
    argument_name        => 'allowed_slots_lag',
    argument_type        => 'NUMERIC',
    default_value        => '1073741824');
EXEC
  DBMS_SCHEDULER.DEFINE_PROGRAM_ARGUMENT(
    program_name         => 'synchronous_to_asynchrnous',
    argument_position    => 2,
    argument_name        => 'p_sender',
    argument_type        => 'TEXT',
    default_value        => 'vibhor.aim@gmail.com');

EXEC
  DBMS_SCHEDULER.DEFINE_PROGRAM_ARGUMENT(
    program_name         => 'synchronous_to_asynchrnous',
    argument_position    => 3,
    argument_name        => 'p_recipient',
    argument_type        => 'TEXT',
    default_value        => 'vibhor.aim@gmail.com');

EXEC
  DBMS_SCHEDULER.CREATE_SCHEDULE (
    schedule_name    => 'every_minute',  
    start_date       => now, 
    repeat_interval  => 'FREQ=MINUTELY;',  
    comments         => 'This schedule executes every minute');  

EXEC    
   DBMS_SCHEDULER.create_job (
   job_name       => 'job_synchronous_to_asynchrnous',
   program_name    => 'synchronous_to_asynchrnous',
   schedule_name   => 'every_minute',
   comments        => 'Job which can transition synchronous to asynchronous');

The following is a snapshot of a manual execution of the job:

edb=# exec DBMS_SCHEDULER.RUN_JOB('job_synchronous_to_asynchrnous', TRUE);
INFO: synchronos_standby_names => 2(standby1,standby2)
INFO: standby_name => {2,standby1,standby2}
INFO: old standby count => 2
INFO: synchronous_standby_count => 2

EDB-SPL Procedure successfully completed

For more information on the DBMS_SCHEDULER package in EDB Postgres, click here.

Monitoring Amazon PostgreSQL RDS with PEM version 6.0

The EDB Postgres Enterprise Manager is an enterprise management tool designed to tune, manage, and monitor a large number of Postgres deployments on servers on-premises or spread out across geographical boundaries. It is the cornerstone of the Management Suite of tools that is integrated into the EDB Postgres Platform from EnterpriseDB® (EDB) and it emerged to solve a great many challenges database administrators faced daily. (The blog, How to Be the Master of Your Database Domain, explored EDB Postgres Enterprise Manager in greater detail.)

EDB customers use EDB Postgres Enterprise Manager for the following key features:

  1. Management of multiple EDB Postgres Advanced Server deployments
  2. Capacity planning
  3. Performance monitoring
  4. Alert Management
  5. Auditing violation
  6. Database Analysis
  7. Postgres Expert

Recently one of EDB’s customers acquired a company that was hosting its data on Amazon PostgreSQL RDS. The customer wanted to leverage EDB Postgres Enterprise Manager to monitor the database deployments on PostgreSQL RDS.

At the time, this request was considered a real possibility. A significant amount of development expertise has gone into EDB Postgres Enterprise Manager to create a sophisticated, enterprise-class tool for managing vast deployments of Postgres. As for remote monitoring, EDB Postgres Enterprise Manager had the capacity to monitor remote databases using an agent on a designated server. EDB Postgres Enterprise Manager also provides capabilities for users, such as database administrators, to create their own custom probes, custom dashboards, and custom alerts. Combined, all of these capabilities mean EDB Postgres Enterprise Manager ultimately provides the kinds of control and flexibility DBAs needed for managing databases that are hosted in different environments.

EDB worked closely with the customer to develop a process for utilizing their subscription for the EDB Postgres Enterprise Manager, which is part of the EDB Postgres Platform, with PostgreSQL databases on Amazon RDS. The steps were formalized into a repeatable process so others could follow suit and enjoy the benefits of EDB Postgres Enterprise Manager in their cloud deployments.

The following are the steps for using EDB Postgres Enterprise Manager with Amazon PostgreSQL RDS:

Amazon PostgreSQL RDS comes with an rds_superuser role, which is not the same as a superuser in PostgreSQL. The rds_superuser role has a number of limitations, and they affect the monitoring capabilities.

To use EDB Postgres Enterprise Manager with PostgreSQL RDS, follow these steps:

  1. Use the following function to modify the EDB Postgres Enterprise Manager probes so that they can leverage user-defined views on some catalog tables:
CREATE OR REPLACE FUNCTION strip_pg_catalog_from_probe()
RETURNS boolean
LANGUAGE plpgsql
AS
$function$
 DECLARE
   rec RECORD;
 BEGIN
   CREATE TABLE pem.probe_code_backup AS SELECT id, probe_code FROM pem.probe WHERE id IN (1, 2, 3, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 25, 29, 36, 39, 47, 52, 53, 54);
   FOR rec IN SELECT id, probe_code FROM pem.probe WHERE id IN (1, 2, 3, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 25, 29, 36, 39, 47, 52, 53, 54)
   LOOP
      UPDATE pem.probe SET probe_code=replace(rec.probe_code,'pg_catalog.','') where id=rec.id;
   END LOOP;
   RETURN true;
END;
$function$;
  2. After creating the above function, execute it in the EDB Postgres Enterprise Manager server database as shown below:

psql -h localhost -c "SELECT strip_pg_catalog_from_probe()" pem

  3. Connect to the PostgreSQL RDS database as the rds_superuser user and execute the following SQL in the postgres database and the user’s defined database:
CREATE USER monitor PASSWORD 'password';
GRANT rds_superuser TO monitor;
CREATE SCHEMA rds_pem_views;
SET search_path TO rds_pem_views;
CREATE OR REPLACE VIEW pg_database AS SELECT oid,* FROM pg_catalog.pg_database WHERE datname <> 'rdsadmin';

CREATE OR REPLACE VIEW pg_tablespace AS SELECT oid, * FROM pg_tablespace WHERE spcname <> 'pg_global';

CREATE FUNCTION rds_pem_views.pg_ls_dir(text)
RETURNS TEXT
LANGUAGE sql
AS
$function$
   SELECT ''::TEXT;
$function$;

GRANT ALL ON FUNCTION rds_pem_views.pg_ls_dir(text) TO monitor;
GRANT ALL ON SCHEMA rds_pem_views TO monitor;
GRANT ALL ON  rds_pem_views.pg_database TO monitor;
GRANT ALL ON  rds_pem_views.pg_tablespace TO monitor;

ALTER USER monitor SET search_path TO rds_pem_views,pg_catalog, "$user", public;

The above SQL will create a database user “monitor” with “rds_superuser” privileges.

As mentioned above, rds_superuser in RDS is not the same as a true superuser in PostgreSQL, therefore a schema with a view for pg_database and for pg_tablespace is created.

Views pg_database and pg_tablespace are created because the rds_superuser role has the following restrictions:

  1. The rds_superuser role (or a normal user) in PostgreSQL RDS cannot access the rdsadmin database;
  2. The rds_superuser role (or a normal user) in PostgreSQL RDS cannot calculate the size of the pg_global tablespace;
  3. The rds_superuser role cannot use pg_ls_dir to list the WAL files in the pg_xlog directory.
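As a quick sanity check (a sketch based on the objects created above), connecting as the monitor user should show the restricted views being used instead of the real catalogs, because of the search_path set for that user:

-- Connected as the monitor user:
SHOW search_path;                 -- rds_pem_views, pg_catalog, "$user", public
SELECT datname FROM pg_database;  -- resolves to rds_pem_views.pg_database,
                                  -- so the rdsadmin database is not listed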
  4. After executing the SQL statements mentioned in step 3, we can now use the database user monitor for EDB Postgres Enterprise Manager monitoring.
  5. For remote monitoring of PostgreSQL RDS, use the following link for instructions on adding the server:

https://www.enterprisedb.com/docs/en/6.0.2/pemgetstarted/getting_started_guide.1.14.html#

When you are following the documentation, please remember to click on “remote monitoring” and choose the EDB Postgres Enterprise Manager agent (PEM Agent) for remote monitoring as shown below:

[Screenshot snap1.png: adding the server with remote monitoring via the PEM agent]

In the screenshot above, the bound agent is the pemagent running on the PEM server. Also note that the database is monitor.

After performing the above steps, the EDB Postgres Enterprise Manager server can monitor PostgreSQL RDS, and it can collect all database level statistics. Also, DBAs and other users can configure all database level alerts like autovacuum, etc. EDB Postgres Enterprise Manager provides 63 templates for database level alerts. It also provides flexibility for DBAs and other users to create their own alerts based on custom probes.

Now, some users might have questions or concerns about system-level information from Amazon PostgreSQL RDS, and whether EDB Postgres Enterprise Manager can monitor that data. The answer is yes.

As an example, the following are the steps for configuring EDB Postgres Enterprise Manager to monitor the CPU utilization of PostgreSQL RDS:

  1. Install the Amazon CloudWatch command line tools on the EDB Postgres Enterprise Manager server using the following commands:
wget http://ec2-downloads.s3.amazonaws.com/CloudWatch-2010-08-01.zip
unzip CloudWatch-2010-08-01.zip

For more information on setting the Amazon CloudWatch command line, please use the following link:

http://docs.aws.amazon.com/AmazonCloudWatch/latest/cli/SetupCLI.html

  2. Create an IAM user on the AWS panel, and attach the managed policy “CloudWatchReadOnlyAccess”.

For more information, please refer to the following link:

http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/iam-identity-based-access-control-cw.html

  3. Create an AWS credential file using the following sample:
AWSAccessKeyId=
AWSSecretKey=

For more information on AWS Credential file, please use the following link:

http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html

  4. Change the permissions of the above credential file using the following command:
chmod 600 credential_file
  5. Create the following script to get the CPUUtilization metric from Amazon CloudWatch:
#!/bin/bash
export AWS_CREDENTIAL_FILE=/usr/local/aws/credential/credential-file-path
export AWS_CLOUDWATCH_HOME=/root/CloudWatch-1.0.20.0
export PATH=$PATH:$AWS_CLOUDWATCH_HOME/bin
export JAVA_HOME=/usr/lib/jvm/jre
export AWS_CLOUDWATCH_URL=http://monitoring.us-west-2.amazonaws.com
echo -e "avg_cpu\tminimum_cpu\tmax_cpu"
mon-get-stats CPUUtilization --namespace="AWS/RDS" --dimensions="DBInstanceIdentifier=postgres" --statistics Average,Minimum,Maximum|tail -n 1|awk '{print $3"\t"$4"\t"$5}'
  6. After creating the above script, make it executable as shown below:
chmod +x rds_cpu.sh

Now you are ready to create a custom probe to use this script for monitoring with EDB Postgres Enterprise Manager.

To create a custom probe, please visit the following link:

https://www.enterprisedb.com/docs/en/6.0.2/pemgetstarted/getting_started_guide.1.29.html#pID0E0TBB0HA

Below is a snapshot of a custom probe which I have used:

[Screenshot snap2: custom probe definition]

By using the above probe, users can create custom alerts in EDB Postgres Enterprise Manager for monitoring CPU utilization on Amazon PostgreSQL RDS. They can also use the data to create a custom dashboard for RDS.

The following is a snapshot of RDS CPU utilization from EDB Postgres Enterprise Manager.

[Screenshot snap3: RDS CPU utilization in EDB Postgres Enterprise Manager]

If you want to know more about Custom Alerts and Custom Dashboards, please use the following links:

  1. Creating Custom Alerts templates

https://www.enterprisedb.com/docs/en/6.0.2/pemgetstarted/getting_started_guide.1.30.html#pID0E0E40HA

  2. Creating Custom Alerts

https://www.enterprisedb.com/docs/en/6.0.2/pemgetstarted/getting_started_guide.1.30.html#pID0E0Z60HA

  3. Creating Custom Charts

https://www.enterprisedb.com/docs/en/6.0.2/pemgetstarted/getting_started_guide.1.28.html#

  4. Creating Custom Ops Dashboard

https://www.enterprisedb.com/docs/en/6.0.2/pemgetstarted/getting_started_guide.1.27.html#pID0E0HFB0HA

 

As you can see, the EDB Postgres Enterprise Manager is very flexible and customizable to meet specific needs.

Partition pruning in EDB Postgres 9.5

One of my colleagues who was recently working with a customer presented an interesting case: the customer had a partitioned table, and EDB Postgres was not applying partition pruning to the query. So I thought I would blog about partition pruning, so that EDB Postgres developers and DBAs can benefit.

EDB Postgres supports two types of partition pruning:

Constraint exclusion pruning:

It is a feature introduced in PostgreSQL 8.1. This type of pruning works with the PostgreSQL style of partitioning (table inheritance). With constraint exclusion enabled, the planner examines the constraints of each partition and tries to prove that the partition need not be scanned because it could not contain any rows meeting the query’s WHERE clause. When the planner can prove this, it excludes the partition from the query plan.

However, it has some limitations. The following are the limitations of constraint_exclusion:

a. Constraint exclusion only works when the query’s WHERE clause contains constants (or externally supplied parameters). For example, a comparison against a non-immutable function such as CURRENT_TIMESTAMP cannot be optimized, since the planner cannot know which partition the function value might fall into at run time.
b. Keep the partitioning constraints simple, else the planner may not be able to prove that partitions don’t need to be visited. Use simple equality conditions for list partitioning, or simple range tests for range partitioning, as illustrated in the preceding examples. A good rule of thumb is that partitioning constraints should contain only comparisons of the partitioning column(s) to constants using B-tree-indexable operators.
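Constraint exclusion is controlled by the constraint_exclusion parameter; its default value, partition, applies exclusion only to queries on inheritance children and UNION ALL subqueries, which is usually what you want. A quick way to check and set it:

SHOW constraint_exclusion;
SET constraint_exclusion = partition;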

For verification, the behavior of constraint_exclusion pruning is shown below:
1. Let’s create a PostgreSQL-style partitioned table using the table inheritance feature.

CREATE TABLE measurement (
     city_id        int not null,
     logdate        date not null,
     peaktemp        int,
     unitsales      int
 );
CREATE TABLE measurement_y2004m02 (
     CHECK ( date_part('month'::text, logdate) = 2)
 ) INHERITS (measurement);
CREATE TABLE measurement_y2004m03 (
     CHECK ( date_part('month'::text, logdate) = 3 )
 ) INHERITS (measurement);

  2. Execute a simple query to verify the constraint_exclusion behavior based on the above definition:
 edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE date_part('month'::text, logdate) = 3;
                                    QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (date_part('month'::text, logdate) = '3'::double precision)
         ->  Seq Scan on measurement_y2004m02
               Filter: (date_part('month'::text, logdate) = '3'::double precision)
         ->  Seq Scan on measurement_y2004m03
               Filter: (date_part('month'::text, logdate) = '3'::double precision)
(8 rows)

The above query output shows that EDB Postgres considered all partitions of the table measurement, even though we included the partition column and a constant value in the WHERE clause. This is due to the check constraint, which uses the date_part function. The date_part function is not immutable in Postgres, therefore at planning time EDB Postgres doesn’t know what value it will return. Also, if the user doesn’t include a WHERE clause that matches the check constraint, pruning will not work.
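You can confirm the function’s volatility from the system catalog (provolatile is 'i' for immutable, 's' for stable, and 'v' for volatile); date_part has several overloads, so expect multiple rows:

SELECT proname, provolatile
FROM pg_proc
WHERE proname = 'date_part';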

In Postgres you can make a function immutable by using ALTER FUNCTION command.

In the example below, we will make the date_part function immutable to check whether constraint_exclusion works with an immutable date_part function:

  1. Convert the date_part function to immutable:
edb=# ALTER FUNCTION date_part (text, timestamp without time zone ) immutable;
ALTER FUNCTION
  2. Run the EXPLAIN command to check the behavior of constraint_exclusion using the immutable function:
edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE date_part('month'::text, logdate) = 3;
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (date_part('month'::text, logdate) = '3'::double precision)
         ->  Seq Scan on measurement_y2004m03
               Filter: (date_part('month'::text, logdate) = '3'::double precision)
(6 rows)

As you can see, with the immutable function EDB Postgres was able to perform constraint_exclusion pruning.

What if we change the WHERE clause a little and use the < and = operators in our SQL queries? Below are examples:

edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE logdate < DATE '2004-03-01';
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m02
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m03
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
(8 rows)

edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE logdate = DATE '2004-02-01';
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (logdate = '01-FEB-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m02
               Filter: (logdate = '01-FEB-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m03
               Filter: (logdate = '01-FEB-04 00:00:00'::timestamp without time zone)
(8 rows)

As you can see, when the WHERE clause changes and no longer matches the way the constraints are defined on the partitions, Postgres scans all partitions.

Based on the above, we can conclude that if a user is planning to use the PostgreSQL style of partitioning, they have to be careful about the constraint definitions in order to utilize constraint_exclusion pruning.

Let’s modify the definition of the measurement table and verify the >, <, >=, <= and = operators in the WHERE clause.

CREATE TABLE measurement (
     city_id        int not null,
     logdate        date not null,
     peaktemp        int,
     unitsales      int
 ); 
CREATE TABLE measurement_y2004m02 (
     CHECK ( logdate >= DATE '2004-02-01' AND logdate < DATE '2004-03-01' )
 ) INHERITS (measurement);
CREATE TABLE measurement_y2004m03 (
     CHECK ( logdate >= DATE '2004-03-01' AND logdate < DATE '2004-04-01' )
 ) INHERITS (measurement);

Below is explain plan based on above definition:

edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE logdate < DATE '2004-03-01';
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m02
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
(6 rows)

edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE logdate = DATE '2004-03-01';
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (logdate = '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m03
               Filter: (logdate = '01-MAR-04 00:00:00'::timestamp without time zone)
(6 rows)

The above clearly shows that with the correct constraint definitions, constraint_exclusion pruning works for the >, <, >=, <= and = operators in the WHERE clause.

Fast pruning:

EDB Postgres has had CREATE TABLE PARTITION syntax since version 9.1. The PARTITION syntax in EDB Postgres uses an additional kind of pruning called fast pruning. Fast pruning uses the partition metadata and the query predicates to efficiently reduce the set of partitions to scan. Fast pruning in EDB Postgres happens before query planning. Let’s verify the behavior of fast pruning.
As mentioned, fast pruning works with partitions which the user created using the EDB Postgres CREATE TABLE PARTITION syntax. Let’s modify the above definition of the measurement table to use the CREATE TABLE PARTITION syntax as given below:

CREATE TABLE  measurement (
     city_id        int not null,
     logdate        date not null,
     peaktemp        int,
     unitsales      int
 )
PARTITION BY RANGE(logdate)
(PARTITION y2004m01 VALUES LESS THAN ('2004-02-01'),
 PARTITION y2004m02 VALUES LESS THAN ('2004-03-01'),
 PARTITION y2004m03 VALUES LESS THAN ('2004-04-01')
);
edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE logdate < DATE '2004-03-01';
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m01
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m02
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
(8 rows)

edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE logdate = DATE '2004-03-01';
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (logdate = '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m03
               Filter: (logdate = '01-MAR-04 00:00:00'::timestamp without time zone)
(6 rows)

edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE logdate > DATE '2004-03-01';
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (logdate > '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m03
               Filter: (logdate > '01-MAR-04 00:00:00'::timestamp without time zone)
(6 rows)

For more information on EDB Postgres pruning, please refer to the following link:
https://www.enterprisedb.com/docs/en/9.5/oracompat/Database_Compatibility_for_Oracle_Developers_Guide.1.327.html#

Dynamic RLS implementation in PPAS 9.3

In the course of my work at EnterpriseDB, migrating Oracle databases to EnterpriseDB’s Postgres Plus Advanced Server is a common task. However, now and then we encounter unique situations. While working on a migration project recently, we encountered a new use case for RLS (Row level Security).

The customer had a centralized database where it stored a huge number of transactions. These transactions are performed by different business units located in different parts of the world. Certain types of transactions should not be visible even when they are being queried from within the same company. That is where RLS comes in. With RLS, the specific transactions, or kinds of transactions, that remain visible are mapped back to an attribute in the table.

The customer needed the application to authenticate users and set the context for which records in the database become visible for a specific session.
In its deployment of Oracle, the customer had used the functions/procedure in the Oracle package DBMS_SESSION. In the application, the customer used DBMS_SESSION.SET_CONTEXT to set the context. And for the Row Level Security, the customer was using the DBMS_SESSION.SYS_CONTEXT to implement security around the transactions.

Postgres Plus Advanced Server has a DBMS_SESSION package that is compatible with Oracle. However, it does not currently offer users the capability of setting a user-defined context and implementing RLS based on that context. Given that others may experience situations similar to our customer’s, I wanted to provide the procedures and functions that users could deploy.

SET_CONTEXT procedure

The definition of this procedure is given below:

CREATE OR REPLACE PROCEDURE set_context(namespace TEXT,
                                        attribute TEXT,
                                        val       TEXT)
AS
BEGIN
    EXECUTE IMMEDIATE format('SET %s.%s TO %s',namespace, attribute,val);
END;

Using this procedure, users can set their own context at the session level.
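Under the hood this is simply a session-level custom setting (GUC). A minimal sketch of the equivalent plain SQL, assuming a namespace called context and an attribute called app_predicate:

-- Equivalent to EXEC set_context('CONTEXT','APP_PREDICATE','ddd');
SELECT set_config('context.app_predicate', 'ddd', false);
SELECT current_setting('context.app_predicate');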

The following is a function to help view the context in session, which is set using the above procedure.

CREATE OR REPLACE FUNCTION USYS_CONTEXT(namespace TEXT,
                                        parameter TEXT,
                                        len       BIGINT DEFAULT 8)
RETURN TEXT
AS
  DECLARE
    return_val TEXT;
  BEGIN
    EXECUTE IMMEDIATE format('SHOW %s.%s',namespace,parameter) INTO return_val;
    RETURN substr(return_val,1,len);
    EXCEPTION WHEN others THEN
     RETURN NULL;
END;

The following is an example of how we can implement row level security based on the above procedure and functions:
1. Create a table which will have the attribute context_check to map the context set by the procedure:

CREATE TABLE test_rls(id numeric, col text, context_check text);
INSERT INTO test_rls SELECT id, 'First_check','aaa' FROM generate_series(1,10) foo(id);
INSERT INTO test_rls SELECT id, 'First_check','bbb' FROM generate_series(1,10) foo(id);
INSERT INTO test_rls SELECT id, 'First_check','ddd' FROM generate_series(1,10) foo(id);

2. Now create a function to check the application context. Below is one such function:

CREATE OR REPLACE FUNCTION verify_user_context (
    p_schema       TEXT,
    p_object       TEXT
)
RETURN VARCHAR2
IS
DECLARE
   predicate TEXT;
BEGIN
    predicate := format('context_check = public.usys_context(''%s''::text,''%s''::text, 8)','CONTEXT','APP_PREDICATE');
    RETURN predicate;
END;

3. Now apply the security policy using the policy function, as shown below:

DECLARE
v_object_schema VARCHAR2(30) := 'public';
v_object_name VARCHAR2(30) := 'test_rls';
v_policy_name VARCHAR2(30) := 'secure_data';
v_function_schema VARCHAR2(30) := 'public';
v_policy_function VARCHAR2(30) := 'verify_user_context';
v_statement_types VARCHAR2(30) := 'INSERT,UPDATE,DELETE,SELECT';
v_update_check BOOLEAN := TRUE;
v_enable BOOLEAN := TRUE;
BEGIN
DBMS_RLS.ADD_POLICY(
v_object_schema,
v_object_name,
v_policy_name,
v_function_schema,
v_policy_function,
v_statement_types,
v_update_check,
v_enable
);
END;

Now we are set to test this implementation.

Connect to one session and try the following:
1. Set the context using procedure SET_CONTEXT as given below:

EXEC  SET_CONTEXT('CONTEXT','APP_PREDICATE','ddd');

EDB-SPL Procedure successfully completed

2. Verify in the same session to determine if we have set the Context properly:

 SELECT USYS_CONTEXT('CONTEXT','APP_PREDICATE',2000);
 usys_context
--------------
 ddd
(1 row)

3. Since the context in this session is set to ddd, we should be able to see only the rows corresponding to that context:

beta=# SELECT * FROM test_rls ;
 id |     col     | context_check
----+-------------+---------------
  1 | First_check | ddd
  2 | First_check | ddd
  3 | First_check | ddd
  4 | First_check | ddd
  5 | First_check | ddd
  6 | First_check | ddd
  7 | First_check | ddd
  8 | First_check | ddd
  9 | First_check | ddd
 10 | First_check | ddd
(10 rows)

As you can see, the DBMS_RLS package in Postgres Plus Advanced Server can help implement Row Level Security based on an application context.

Compiling PLV8 with Postgres Plus Advanced Server

PLV8 is a programming language that lets users write stored procedures and triggers in JavaScript and store them in their Postgres database. This allows application programmers to write a lot of their server-side programming in the same language they use to build their web client applications. Fewer languages to learn usually means fewer mistakes and faster time to completion. The extensive language support is one of many reasons why Postgres’ use across the world is increasing lately. The recent addition of document data support with JSON and JSONB data types in PostgreSQL, and in Postgres Plus Advanced Server from EnterpriseDB, is the main reason for the increasing interest in the PL/V8 language extension.

Below are the steps you need to compile PLV8 with Postgres Plus Advanced Server 9.3/9.4.

To get started, here are the prerequisites:
1. A supported version of PostgreSQL or Postgres Plus Advanced Server, such as versions 9.1 and higher.
2. V8 version 3.14.5
3. g++ version 4.5.1

If you want to know more about V8, you can visit the following wiki page:
http://en.wikipedia.org/wiki/V8_(JavaScript_engine)

It’s important to note that when compiling PLV8 with Postgres Plus Advanced Server 9.3 or the upcoming 9.4, you will get the following two types of error messages:

The first error:

[root@localhost plv8js]# make
sed -e 's/^#undef PLV8_VERSION/#define PLV8_VERSION "1.5.0-dev1"/' plv8_config.h.in > plv8_config.h
g++ -Wall -O2 -DV8_USE_UNSAFE_HANDLES  -I. -I./ -I/usr/ppas-9.4/include/server -I/usr/ppas-9.4/include/internal -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include -fPIC -c -o plv8.o plv8.cc
plv8.cc: In function ‘void _PG_init()’:
plv8.cc:226: error: invalid conversion from ‘void (*)(XactEvent, void*)’ to ‘void (*)(XactEvent, void*, bool)’
plv8.cc:226: error:   initializing argument 1 of ‘void RegisterXactCallback(void (*)(XactEvent, void*, bool), void*)’
make: *** [plv8.o] Error 1

The above error message is a result of a different signature of

typedef void (*XactCallback)

in the Advanced Server transaction system.

To fix the above issue, the user can replace the following in plv8.cc:

static void plv8_xact_cb(XactEvent event, void *arg);

With

static void plv8_xact_cb(XactEvent event, void *arg, bool spl_context);

The second error:
After making the above changes, you may get the following error when trying to compile the source code using the “make” command:

[root@localhost plv8js]# make
g++ -Wall -O2 -DV8_USE_UNSAFE_HANDLES  -I. -I./ -I/usr/ppas-9.4/include/server -I/usr/ppas-9.4/include/internal -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include -fPIC -c -o plv8.o plv8.cc
plv8.cc:137: warning: ‘void plv8_xact_cb(XactEvent, void*, bool)’ used but never defined
plv8.cc:232: warning: ‘void plv8_xact_cb(XactEvent, void*)’ defined but not used
g++ -Wall -O2 -DV8_USE_UNSAFE_HANDLES  -I. -I./ -I/usr/ppas-9.4/include/server -I/usr/ppas-9.4/include/internal -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include -fPIC -c -o plv8_type.o plv8_type.cc
g++ -Wall -O2 -DV8_USE_UNSAFE_HANDLES  -I. -I./ -I/usr/ppas-9.4/include/server -I/usr/ppas-9.4/include/internal -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include -fPIC -c -o plv8_func.o plv8_func.cc
In file included from plv8_param.h:11,
                 from plv8_func.cc:9:
/usr/ppas-9.4/include/server/nodes/params.h:77: error: expected ‘,’ or ‘...’ before ‘typeid’
make: *** [plv8_func.o] Error 1

The above is mainly due to the use of typeid in params.h; typeid is a reserved keyword of the C++ compiler.

To fix this issue, replace the following in plv8.h:

extern "C" {
#include "postgres.h"

#include "access/htup.h"
#include "fmgr.h"
#include "mb/pg_wchar.h"
#include "utils/tuplestore.h"
#include "windowapi.h"
}

with

#define typeid __typeid
extern "C" {
#include "postgres.h"

#include "access/htup.h"
#include "fmgr.h"
#include "mb/pg_wchar.h"
#include "utils/tuplestore.h"
#include "windowapi.h"
}
#undef typeid

In plv8_param.h, change the following:

extern "C" {
#include "postgres.h"

/*
 * Variable SPI parameter is since 9.0.  Avoid include files in prior versions,
 * as they contain C++ keywords.
 */
#include "nodes/params.h"
#if PG_VERSION_NUM >= 90000
#include "parser/parse_node.h"
#endif	// PG_VERSION_NUM >= 90000

} // extern "C"

With

#define typeid __typeid
extern "C" {
#include "postgres.h"

/*
 * Variable SPI parameter is since 9.0.  Avoid including files in prior versions,
 * as they contain C++ keywords.
 */
#include "nodes/params.h"
#if PG_VERSION_NUM >= 90000
#include "parser/parse_node.h"
#endif	// PG_VERSION_NUM >= 90000

} // extern "C"
#undef typeid

In plv8_param.cc, replace the following:

extern "C" {

#include "catalog/pg_type.h"
#include "utils/builtins.h"
#include "utils/lsyscache.h"

} // extern "C"

with

#define typeid __typeid
extern "C" {

#include "catalog/pg_type.h"
#include "utils/builtins.h"
#include "utils/lsyscache.h"

} // extern "C"
#undef typeid

After making the above changes, you will be able to compile PLV8 with Advanced Server as shown below:

[root@localhost plv8js]# make
sed -e 's/^#undef PLV8_VERSION/#define PLV8_VERSION "1.5.0-dev1"/' plv8_config.h.in > plv8_config.h
g++ -Wall -O2 -DV8_USE_UNSAFE_HANDLES  -I. -I./ -I/usr/ppas-9.4/include/server -I/usr/ppas-9.4/include/internal -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include -fPIC -c -o plv8.o plv8.cc
g++ -Wall -O2 -DV8_USE_UNSAFE_HANDLES  -I. -I./ -I/usr/ppas-9.4/include/server -I/usr/ppas-9.4/include/internal -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include -fPIC -c -o plv8_type.o plv8_type.cc
g++ -Wall -O2 -DV8_USE_UNSAFE_HANDLES  -I. -I./ -I/usr/ppas-9.4/include/server -I/usr/ppas-9.4/include/internal -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include -fPIC -c -o plv8_func.o plv8_func.cc
g++ -Wall -O2 -DV8_USE_UNSAFE_HANDLES  -I. -I./ -I/usr/ppas-9.4/include/server -I/usr/ppas-9.4/include/internal -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include -fPIC -c -o plv8_param.o plv8_param.cc
echo "extern const unsigned char coffee_script_binary_data[] = {" >coffee-script.cc
(od -txC -v coffee-script.js | \
	sed -e "s/^[0-9]*//" -e s"/ \([0-9a-f][0-9a-f]\)/0x\1,/g" -e"\$d" ) >>coffee-script.cc
echo "0x00};" >>coffee-script.cc
g++ -Wall -O2 -DV8_USE_UNSAFE_HANDLES  -I. -I./ -I/usr/ppas-9.4/include/server -I/usr/ppas-9.4/include/internal -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include -fPIC -c -o coffee-script.o coffee-script.cc
echo "extern const unsigned char livescript_binary_data[] = {" >livescript.cc
(od -txC -v livescript.js | \
	sed -e "s/^[0-9]*//" -e s"/ \([0-9a-f][0-9a-f]\)/0x\1,/g" -e"\$d" ) >>livescript.cc
echo "0x00};" >>livescript.cc
g++ -Wall -O2 -DV8_USE_UNSAFE_HANDLES  -I. -I./ -I/usr/ppas-9.4/include/server -I/usr/ppas-9.4/include/internal -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include -fPIC -c -o livescript.o livescript.cc
g++ -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -I/usr/include/et -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fpic -shared -o plv8.so plv8.o plv8_type.o plv8_func.o plv8_param.o coffee-script.o livescript.o -L/usr/ppas-9.4/lib -L/usr/lib64 -Wl,--as-needed -Wl,-rpath,'/usr/ppas-9.4/lib',--enable-new-dtags  -lv8 
sed -e 's/@PLV8_VERSION@/1.5.0-dev1/g' plv8.control.common | g++ -E -P -DLANG_plv8 - > plv8.control
sed -e 's/@LANG_NAME@/plv8/g' plv8.sql.common | g++ -E -P -I. -I./ -I/usr/ppas-9.4/include/server -I/usr/ppas-9.4/include/internal -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include -DLANG_plv8 - > plv8--1.5.0-dev1.sql
sed -e 's/@PLV8_VERSION@/1.5.0-dev1/g' plv8.control.common | g++ -E -P -DLANG_plcoffee - > plcoffee.control
sed -e 's/@LANG_NAME@/plcoffee/g' plv8.sql.common | g++ -E -P -I. -I./ -I/usr/ppas-9.4/include/server -I/usr/ppas-9.4/include/internal -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include -DLANG_plcoffee - > plcoffee--1.5.0-dev1.sql
sed -e 's/@PLV8_VERSION@/1.5.0-dev1/g' plv8.control.common | g++ -E -P -DLANG_plls - > plls.control
sed -e 's/@LANG_NAME@/plls/g' plv8.sql.common | g++ -E -P -I. -I./ -I/usr/ppas-9.4/include/server -I/usr/ppas-9.4/include/internal -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include -DLANG_plls - > plls--1.5.0-dev1.sql
/bin/mkdir -p '/usr/ppas-9.4/lib'
/bin/mkdir -p '/usr/ppas-9.4/share/extension'
/bin/mkdir -p '/usr/ppas-9.4/share/extension'
/usr/bin/install -c -m 755  plv8.so '/usr/ppas-9.4/lib/plv8.so'
/usr/bin/install -c -m 644 plv8.control '/usr/ppas-9.4/share/extension/'
/usr/bin/install -c -m 644 plv8.control plv8--1.5.0-dev1.sql plcoffee.control plcoffee--1.5.0-dev1.sql plls.control plls--1.5.0-dev1.sql '/usr/ppas-9.4/share/extension/'

After compiling PLV8, you now can install the PLV8 language in Advanced Server using the following command:

beta=# CREATE EXTENSION PLV8;
CREATE EXTENSION
beta=# 

To test your installed PLV8, here is some sample code:

beta=# DO $$ PLV8.elog(NOTICE, 'this', 'is', 'inline', 'code') $$ LANGUAGE PLV8;
NOTICE:  this is inline code
DO

beta=# CREATE TYPE rec AS (i integer, t text);
CREATE TYPE
beta=# CREATE FUNCTION set_of_records() RETURNS SETOF rec AS
beta-# $$
beta$#     // PLV8.return_next() stores records in an internal tuplestore,
beta$#     // and return all of them at the end of function.
beta$#     PLV8.return_next( { "i": 1, "t": "a" } );
beta$#     PLV8.return_next( { "i": 2, "t": "b" } );
beta$# 
beta$#     // You can also return records with an array of JSON.
beta$#     return [ { "i": 3, "t": "c" }, { "i": 4, "t": "d" } ];
beta$# $$
beta-# LANGUAGE PLV8;
CREATE FUNCTION
beta=# SELECT * FROM set_of_records();
 i | t 
---+---
 1 | a
 2 | b
 3 | c
 4 | d
(4 rows)

In case you need a patched version of PLV8, use the following git repository:

 https://github.com/vibhorkum/PLV8_FOR_PPAS.git plv8_ppas

To use this, execute the following command:

 git clone https://github.com/vibhorkum/PLV8_FOR_PPAS.git plv8_ppas
cd plv8_ppas
make
make install

To test the compiled PLV8, you can use the following command:

[root@localhost plv8_ppas]# make installcheck
/usr/ppas-9.4/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress --inputdir=./ --psqldir='/usr/ppas-9.4/bin'    --dbname=contrib_regression init-extension plv8 inline json startup_pre startup varparam json_conv window dialect
(using postmaster on Unix socket, port 5444)
============== dropping database "contrib_regression" ==============
DROP DATABASE
============== creating database "contrib_regression" ==============
CREATE DATABASE
ALTER DATABASE
============== running regression test queries        ==============
test init-extension           ... ok
test plv8                     ... ok
test inline                   ... ok
test json                     ... ok
test startup_pre              ... ok
test startup                  ... ok
test varparam                 ... ok
test json_conv                ... ok
test window                   ... ok
test dialect                  ... ok

======================
 All 10 tests passed. 
======================

Meet BART – A New Tool for Backup And Recovery Management

EnterpriseDB recently launched a new tool for backup and recovery – named simply EDB Backup and Recovery Tool, or BART. This tool makes the DBA’s life easier by simplifying the tasks for managing their Postgres physical backup and recovery tasks, whether they are PostgreSQL or Postgres Plus Advanced Server deployments.

BART has the following advantages over custom scripts for managing backups:

1. It’s stable, and it uses pg_basebackup to take a physical backup. pg_basebackup is well defined and well supported by the PostgreSQL community.

2. It catalogs all of the backups users are taking, which is important in terms of:
    i. Listing the type of backups used
   ii. Listing the status of those backups with server information.

3. BART also provides functionality to restore backups, with all required archived WAL files. So automation around this tool will make DBAs’ lives easier for restore and recovery.

4. BART provides an option to validate your backup by using checksum. This is useful for confirming you took a valid backup and it is not corrupted at disk level.

5. BART provides an option to define your retention policy around the backups you are keeping.

Given all of the above advantages, I decided to give this new tool a try and share some tips. To get started, you need the following prerequisites:

1. BART currently requires a Linux 64 bit platform, CentOS 6.x or RHEL 6.x
2. Need to have password-less, direct SSH access to the target machine where you want to restore backups as well as the database servers you want backed up
3. Install the Postgres Plus Advanced Server or PostgreSQL binaries for pg_basebackup

Yum or rpm

To install this tool, you have two options that I will explore below:

1. Yum command
2. Rpm command.

Using the yum command:

To perform a yum installation, BART users can ask EDB for credentials to the EnterpriseDB yum repository and configure their local yum repository as follows:

echo "[tools]
name=EnterpriseDB Tools
baseurl=http://username:password@yum.enterprisedb.com/tools/redhat/rhel-$releasever-$basearch
enabled=1
gpgcheck=0" > /etc/yum.repos.d/edbtools.repo

After creating the yum repo, the user can execute the following command to install BART:

 yum install edb-bart

If the user doesn’t want to install the EDB Backup and Recovery Tool using the yum command, then the user can download a standalone rpm from EDB’s website using the link below:

http://www.enterprisedb.com/downloads/postgres-postgresql-downloads

and then enter the rpm install command as follows:

rpm -ivh edb-bart-1.0.1-1.rhel6.x86_64.rpm

After installing BART using the above commands, the user can see the binaries in the directory /usr/edb-bart-1.0/bin and a sample BART configuration file in /usr/edb-bart-1.0/etc.

That’s a very easy installation.

For more information on configuring BART Host and Database Host, the following are some documents that will help:
1. pg_basebackup configuration for PostgreSQL:
http://www.postgresql.org/docs/current/static/app-pgbasebackup.html

2. For direct password-less ssh configuration, users can refer to the following link:
http://www.enterprisedb.com/docs/en/1.0/bart/EDB_Backup_and_Recovery_Tool_Guide-17.htm#P1008_76316

After the installation of the BART binaries, the user also has to create a BART configuration file.

The following is a sample configuration file for BART:

[BART]
bart-host= enterprisedb@127.0.0.1
backup_path = /opt/backup
pg_basebackup_path = /usr/ppas-9.4/bin/pg_basebackup
logfile = /tmp/bart.log

[PG]
host = 127.0.0.1
port = 5432
user = postgres
description = "Postgres server"

[PPAS94]
host = 127.0.0.1
port = 5444
user = enterprisedb
description = "PPAS 94 server"

Global Configuration Settings

The content under the [BART] tag comprises the global configuration settings. Under this tag are the following:

1. bart-host: the IP address of the host on which BART is installed. The value for this parameter must be specified in the form: bart_user@bart_host_address, where bart_user is the operating system user account on the BART host that is used to run BART and owns the BART backup catalog directory. bart_host_address is the IP address of the BART host.

2. backup_path: specifies the file system parent directory where all BART database server base backups and archived WAL files are stored. This parameter is required.

3. pg_basebackup_path: specifies the path to the pg_basebackup program of the Postgres database server installed on the BART host.

4. logfile: specifies the path to the BART log file. This parameter is optional. If no path to a log file is specified after logfile =, or if the parameter is commented out, BART does not create a log file.

The remaining part of the configuration file is self-explanatory. The [PG] and [PPAS94] tags describe the servers that the user wants to back up.

Pg_basebackup Settings

After performing the above configuration on the backup server, the user has to configure the following settings on the servers that they want to back up. Below are the settings for enabling backup using pg_basebackup.

The user has to set a few parameters in the postgresql.conf file of the PostgreSQL server to be backed up:

1. Set the wal_level parameter to archive or hot_standby.
2. Set archive_mode to on.
3. Set an appropriate archive_command.
4. Set max_wal_senders to 1 or more, since pg_basebackup uses the replication protocol to copy the data directory.
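A minimal sketch of those parameters in postgresql.conf is shown below (values are illustrative only; the archive_command in particular is a placeholder and should be replaced with the command required by your archiving setup and the BART documentation):

wal_level = hot_standby              # or 'archive' if a hot standby is not needed
archive_mode = on
archive_command = 'cp %p /path/to/archive/%f'   # placeholder command
max_wal_senders = 1                  # pg_basebackup uses the replication protocol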

For more information on each setting please refer to the following:
1. wal_level:
http://www.postgresql.org/docs/9.4/static/runtime-config-wal.html

2. archive_mode and archive_command:
http://www.postgresql.org/docs/9.4/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-ARCHIVING

3. max_wal_senders:
http://www.postgresql.org/docs/9.4/static/runtime-config-replication.html
http://www.enterprisedb.com/docs/en/1.0/bart/EDB_Backup_and_Recovery_Tool_Guide-19.htm#TopOfPage

With the above settings, the user then needs to update the pg_hba.conf file for the replication connection.
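For example, a single illustrative pg_hba.conf entry allowing the BART host to make a replication connection (the user, address, and authentication method are placeholders to adapt to your environment):

host    replication    enterprisedb    192.168.1.10/32    md5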

Note: The above settings are for pg_basebackup to take backups using the replication protocol. If users need more information about pg_basebackup and its settings, please use the links mentioned above.

How BART Works

Now, since we have configured both servers, let’s have a look how BART works.

The following command executes a backup:

 bart -c bart.cfg BACKUP -s ppas94

And below is the output:

[bart@localhost ~]$ bart -c bart.cfg BACKUP -s ppas94

INFO:  creating backup for server 'ppas94'
INFO:  backup identifier: '1413852137762'
6394456/6394456 kB (100%), 1/1 tablespace

INFO:  backup checksum: 7f49ea9653511308710c174f22ec765d
INFO:  backup completed successfully
[bart@localhost ~]$ 

That was an easy way to take a backup. The DBA can also create a job to execute the above command to take backups.

If the user wants to list the backup using BART, the user can use the option SHOW-BACKUPS:

[bart@localhost ~]$ bart -c bart.cfg SHOW-BACKUPS -s ppas94
 Server Name   Backup ID       Backup Time           Backup Size  
                                                                  
 ppas94        1413852137762   2014-10-20 17:43:41   6244.59 MB   

This is useful for knowing what backups a user has available for recovery. The above command gives important information:

1.	Backup ID: It’s a unique ID for the physical backup
2.	Backup Time: Time when backup was taken
3.	Backup Size: Size of backup

This information is useful when a user wants to plan for recovery using backup. This way, the user can also plan for disk size.

Sometimes a user wants to verify their backup state. VERIFY-CHKSUM option is useful in this case:

[bart@localhost ~]$ bart -c bart.cfg VERIFY-CHKSUM -s ppas94 -i 1413852137762
 Server Name   Backup ID       Verify  
                                       
 ppas94        1413852137762   OK      

I have to say, after putting EDB BART through its paces, I think DBAs will enjoy having such a great tool for making Backup Management easy.

In my next post, I will blog about the Recovery process.

Switchover/Switchback in PostgreSQL 9.3

PostgreSQL 9.3 has two key software updates making switchover/switchback easier in High Availability configurations.

First, let’s address the software patches and their descriptions:
1. The first patch was committed by Fujii Masao.
Patch commit# 985bd7d49726c9f178558491d31a570d47340459

With this patch, the walsender process tries to send all outstanding WAL records to the standby in replication when the user shuts down the master.

This means:
a. All WAL records are synced between the two servers after a clean shutdown of the master.
b. After promoting the standby to be the new master, the user can restart the stopped master as a new standby without taking a fresh backup from the new master.

2. The second patch was committed by Heikki Linnakangas in PostgreSQL 9.3.
Patch commit# abfd192b1b5ba5216ac4b1f31dcd553106304b19

Before PostgreSQL version 9.3, streaming replication used to stop replicating if the timeline on the primary didn’t match the standby’s. This generally happens when the user promotes one of the standbys to become the new master. Promoting a standby always results in an increment of the timeline ID, and after that increment, other standbys will refuse to continue replicating.

With this patch in PostgreSQL 9.3, the standby asks the primary for any timeline history files that are missing from the standby when it connects, provided the standby’s recovery.conf file has the following setting:
recovery_target_timeline='latest'

The missing files are sent using a new replication command TIMELINE_HISTORY, and stored in the standby’s pg_xlog directory. Using the timeline history files, the standby can follow the latest timeline present in the primary, just as it can follow new timelines appearing in an archive WAL directory.
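In practice this means the standby’s recovery.conf needs at least the following entries (the host, port, and user below are placeholders):

standby_mode = 'on'
primary_conninfo = 'host=master-host port=5432 user=repuser'
recovery_target_timeline = 'latest'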

Because of the above patches, switchover/switchback can easily be achieved if the user performs the following sequence of steps.
To switch over from the Master to the Standby, use the following steps:
On Master:
1. Use any one of the following stop options for a clean shutdown.

    pg_ctl stop --pgdata=data_directory --mode=fast
    or
    pg_ctl stop --pgdata=data_directory --mode=smart
    

2. Before promoting standby:

Make sure all WAL sent by the Master has been applied on the Standby. Use the following functions to verify it:
    * pg_last_xlog_receive_location()
    * pg_last_xlog_replay_location()
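A minimal check, run on the standby after the master has been shut down cleanly, is to confirm that the received and replayed locations match:

-- On the standby: both values should be identical after a clean shutdown of the master
SELECT pg_last_xlog_receive_location() AS received,
       pg_last_xlog_replay_location()  AS replayed;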

3. Make sure the proper archive location and archive_command setting are accessible to the old Master.

Following is a presentation I delivered at a recent BPUG Meet (Boston PostgreSQL User Group) to discuss switchover/switchback.

 

Monitoring approach for Streaming Replication with Hot Standby in PostgreSQL 9.3

The people using PostgreSQL and the Streaming Replication feature seem to ask many of the same questions:

1. How best to monitor Streaming Replication?

2. What is the best way to do that?

3. Are there alternatives, when monitoring on Standby, to using the pg_stat_replication view on Master?

4. How should I calculate replication lag-time, in seconds, minutes, etc.?

In light of these commonly asked questions, I thought a blog would help. The following are some methods I’ve found to be useful.

Monitoring is critical for large infrastructure deployments where you have Streaming Replication for:

1. Disaster recovery

2. High availability

3. Load balancing, when using Streaming Replication with Hot Standby

PostgreSQL has some building blocks for replication monitoring; the following are some important functions and views which can be used for monitoring the replication:

1. pg_stat_replication view on master/primary server.

This view helps in monitoring the standby on Master. It gives you the following details:

   pid:              Process id of walsender process
   usesysid:         OID of user which is used for Streaming replication.
   usename:          Name of user which is used for Streaming replication
   application_name: Application name connected to master
   client_addr:      Address of standby/streaming replication
   client_hostname:  Hostname of standby.
   client_port:      TCP port number on which standby communicating with WAL sender
   backend_start:    Start time when SR connected to Master.
   state:            Current WAL sender state i.e streaming
   sent_location:    Last transaction location sent to standby.
   write_location:   Last transaction written to disk at standby
   flush_location:   Last transaction flushed to disk at standby.
   replay_location:  Last transaction replayed at standby.
   sync_priority:    Priority of standby server being chosen as synchronous standby
   sync_state:       Sync State of standby (is it async or synchronous).

e.g.:

postgres=# select * from pg_stat_replication ;
-[ RECORD 1 ]----+---------------------------------
pid              | 1114
usesysid         | 16384
usename          | repuser
application_name | walreceiver
client_addr      | 172.17.0.3
client_hostname  |
client_port      | 52444
backend_start    | 15-MAY-14 19:54:05.535695 -04:00
state            | streaming
sent_location    | 0/290044C0
write_location   | 0/290044C0
flush_location   | 0/290044C0
replay_location  | 0/290044C0
sync_priority    | 0
sync_state       | async

2. pg_is_in_recovery() : Function which tells whether standby is still in recovery mode or not.

e.g.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)

3. pg_last_xlog_receive_location: Function which tells the location of the last transaction log streamed to the Standby and written to the standby's disk.

e.g.

postgres=# select pg_last_xlog_receive_location();
 pg_last_xlog_receive_location
-------------------------------
 0/29004560
(1 row)

4. pg_last_xlog_replay_location: Function which tells the location of the last transaction replayed during the recovery process. An example is given below:

postgres=# select pg_last_xlog_replay_location();
 pg_last_xlog_replay_location
------------------------------
 0/29004560
(1 row)

5. pg_last_xact_replay_timestamp: This function tells the time stamp of the last transaction replayed during recovery. Below is an example:

postgres=# select pg_last_xact_replay_timestamp();
  pg_last_xact_replay_timestamp
----------------------------------
 15-MAY-14 20:54:27.635591 -04:00
(1 row)

The above are some important functions/views which are already available in PostgreSQL for monitoring streaming replication.

So, the logical next question is, “What’s the right way to monitor the Hot Standby with Streaming Replication on Standby Server?”

If you have Hot Standby with Streaming Replication, the following are the points you should monitor:

1. Check if your Hot Standby is in recovery mode or not:

For this you can use pg_is_in_recovery() function.

2. Check whether Streaming Replication is working or not.

An easy way of doing this is checking the pg_stat_replication view on the Master/Primary. This view returns information only on the master, and only while Streaming Replication is working.
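A minimal sketch of such a check, run on the master, is given below; one row is returned per connected standby, so zero rows means streaming replication is currently not working:

SELECT application_name, client_addr, state, sync_state
FROM pg_stat_replication;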

3. Check if Streaming Replication is not working and the Hot Standby is recovering from archived WAL files.

For this, the DBA can either use the PostgreSQL log file to monitor it or utilize the following functions provided in PostgreSQL 9.3:

pg_last_xlog_replay_location();
pg_last_xact_replay_timestamp();

4. Check how far the Standby is behind the Master.

There are two ways to monitor lag for Standby.



   i. Lag in bytes: To calculate the lag in bytes, users can use the pg_stat_replication view on the master together with the pg_xlog_location_diff function. Below is an example:

pg_xlog_location_diff(pg_stat_replication.sent_location, pg_stat_replication.replay_location)

which gives the lag in bytes. A complete query based on this is sketched below.
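The following query, run on the master, is a minimal sketch of that calculation (the column alias replay_lag_bytes is illustrative):

SELECT application_name,
       client_addr,
       pg_xlog_location_diff(sent_location, replay_location) AS replay_lag_bytes
FROM pg_stat_replication;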

  ii. Lag in seconds: The following is the SQL most people use to find the lag in seconds:

   SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location()
                 THEN 0
               ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp())
          END AS log_delay;

Including the above in your repertoire can give you good monitoring for PostgreSQL.

In a future post, I will include a script that can be used for monitoring a Hot Standby with PostgreSQL streaming replication.

Postgres Plus Advanced Server 9.3 Features

Postgres Plus Advanced Server 9.3, released on Tuesday for general availability, contains all of the innovation that was part of PostgreSQL 9.3, which was released in September.
These features include the following:

1. Configuration directive ‘include_dir’
2. COPY FREEZE for more efficient bulk loading
3. Custom Background Workers
4. Data Checksums
5. JSON: Additional functionality
6. LATERAL JOIN
7. Parallel pg_dump for faster backups
8. ‘pg_isready’ server monitoring tool
9. Switch to Posix shared memory and mmap()
10. Event Triggers
11. VIEW Features:
                   Materialized Views
                   Recursive View Syntax
                   Updatable Views
12. Writeable Foreign Tables
                   postgres_fdw
13. Replication Improvements
                    Streaming-Only Remastering
                    Fast Failover
                    Architecture-Independent Streaming
                    pg_basebackup conf setup

For Postgres Plus Advanced Server, we integrated additional performance improvements, new packages, and object-oriented features into core PostgreSQL so that our database can address a wider range of enterprise use cases.

Partitioning enhancements to boost performance for INSERT/UPDATE/SELECT were a major development for Postgres Plus. Below are graphs illustrating the performance increases of Postgres Plus Advanced Server 9.3 compared to version 9.2.

Figure: TPS SELECT

Figure: TPS UPDATE

The release features some important functions that can make a developer’s life easier, such as:

1. DBMS_RANDOM package. This package helps users generate random numbers, random strings, and random dates. PostgreSQL provides a random function for numbers, on top of which users previously had to write their own functions for random strings and random dates.

With this package, users can use the built-in functions for those two purposes directly and don't need to write their own wrapper functions. Below is a simple example of using it.

user=# BEGIN
user$#   DBMS_OUTPUT.put_line('Run 1 : seed=0');
user$#   DBMS_RANDOM.seed (val => 0);
user$#   FOR i IN 1 ..5 LOOP
user$#     DBMS_OUTPUT.put_line('i=' || i || ' : value=' || DBMS_RANDOM.value(low => 1, high => 10));
user$#   END LOOP;
user$#
user$#   DBMS_OUTPUT.put_line('Run 2 : seed=0');
user$#   DBMS_RANDOM.seed (val => 0);
user$#   FOR i IN 1 ..5 LOOP
user$#     DBMS_OUTPUT.put_line('i=' || i || ' : value=' || DBMS_RANDOM.value(low => 1, high => 10));
user$#   END LOOP;
user$#
user$# END;
Run 1 : seed=0
i=1 : value=6.756044853478672
i=2 : value=7.806749390438203
i=3 : value=1.1173334638588129
i=4 : value=5.221515491604802
i=5 : value=8.445010517258194
Run 2 : seed=0
i=1 : value=6.756044853478672
i=2 : value=7.806749390438203
i=3 : value=1.1173334638588129
i=4 : value=5.221515491604802
i=5 : value=8.445010517258194

EDB-SPL Procedure successfully completed
user=# BEGIN
user$#   FOR i IN 1 .. 5 LOOP
user$#     DBMS_OUTPUT.put_line('string(''x'',10)= ' || DBMS_RANDOM.string('x',10));
user$#   END LOOP;
user$# END;
string('x',10)= 1TT23XR8X2
string('x',10)= DO5D2KUUVD
string('x',10)= AGNPAXDECT
string('x',10)= 7JC6RMU9KX
string('x',10)= 13BW6JM6KN

EDB-SPL Procedure successfully completed
user=# BEGIN
user$#   FOR i IN 1 .. 5 LOOP
user$#     DBMS_OUTPUT.put_line('date= ' || TRUNC(SYSDATE + DBMS_RANDOM.value(0,366)));
user$#   END LOOP;
user$# END;
date= 20-JUN-14 00:00:00
date= 26-MAY-14 00:00:00
date= 11-JAN-14 00:00:00
date= 27-JUN-14 00:00:00
date= 21-DEC-13 00:00:00

EDB-SPL Procedure successfully completed

2. DBMS_LOCK.sleep: Similar to the pg_sleep function, DBMS_LOCK.sleep suspends the current session for a given number of seconds; it is meant for Oracle users/developers who are more familiar with the Oracle package.
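A minimal sketch, assuming the Oracle-style single-argument form sleep(seconds):

BEGIN
   -- pause the current session for 2 seconds
   DBMS_LOCK.sleep(2);
END;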

3. DBMS_CRYPTO: DBMS_CRYPTO is new in Postgres Plus Advanced Server 9.3. It provides an interface to encrypt and decrypt stored data, and can be used in conjunction with PL/SQL programs running network communications. It provides support for several industry-standard encryption and hashing algorithms. An example is given below.

DECLARE
   input_string       VARCHAR2 (200) :=  'Secret Message';
   output_string      VARCHAR2 (200);
   encrypted_raw      RAW (2000);             -- stores encrypted binary text
   decrypted_raw      RAW (2000);             -- stores decrypted binary text
   num_key_bytes      NUMBER := 256/8;        -- key length 256 bits (32 bytes)
   key_bytes_raw      RAW (32);               -- stores 256-bit encryption key
   encryption_type   INTEGER :=          -- total encryption type
                            DBMS_CRYPTO.ENCRYPT_DES
                          + DBMS_CRYPTO.CHAIN_CBC
                          + DBMS_CRYPTO.PAD_PKCS5;
BEGIN
   DBMS_OUTPUT.PUT_LINE ( 'Original string: ' || input_string);
   key_bytes_raw := DBMS_CRYPTO.RANDOMBYTES (num_key_bytes::INTEGER);
   encrypted_raw := DBMS_CRYPTO.ENCRYPT
      (
         src => convert_to(input_string, 'LATIN1'),
         typ => encryption_type,
         key => key_bytes_raw
      );
    -- The encrypted value "encrypted_raw" can be used here
   decrypted_raw := DBMS_CRYPTO.DECRYPT
      (
         src => encrypted_raw,
         typ => encryption_type,
         key => key_bytes_raw
      );
   output_string := convert_from(decrypted_raw,'LATIN1');

   DBMS_OUTPUT.PUT_LINE ('Decrypted string: ' || output_string);
END;

4. DBMS_SCHEDULER: Postgres Plus 9.2 introduced the DBMS_JOB package, and Postgres Plus 9.3 now adds DBMS_SCHEDULER, which has more control options for jobs and better visibility of scheduled jobs.
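A minimal sketch of scheduling a job, assuming the Oracle-style CREATE_JOB signature; the job name, action, and interval below are illustrative, refresh_reports is a hypothetical procedure, and pgAgent must be running for scheduled jobs to execute:

BEGIN
   DBMS_SCHEDULER.CREATE_JOB (
      job_name        => 'nightly_refresh',
      job_type        => 'PLSQL_BLOCK',
      job_action      => 'BEGIN refresh_reports; END;',   -- hypothetical procedure
      repeat_interval => 'FREQ=HOURLY',
      enabled         => TRUE
   );
END;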
5. UTL_ENCODE: This is another important package for keeping sensitive information in the database. Using this package, users can easily encode and decode their data before storing it. It can be utilized in new applications and gives developers and users flexibility around important encode/decode functions. It is particularly useful for users working with applications that were written for Oracle and migrated onto Postgres Plus Advanced Server.

Let’s see an example:

user=# DECLARE
user-#     v_str VARCHAR2(100);
user$#   BEGIN
user$#      --generate encoded value
user$#     v_str := utl_encode.text_encode('EnterpriseDB','gb18030', UTL_ENCODE.BASE64);
user$#     dbms_output.put_line('Encoded string => '||v_str);
user$#
user$#      --take the encoded value and decode it
user$#    v_str := utl_encode.text_decode(v_str,'gb18030', UTL_ENCODE.BASE64);
user$#    dbms_output.put_line('Decoded string => '||v_str);
user$# END;
Encoded string => RW50ZXJwcmlzZURC
Decoded string => EnterpriseDB

EDB-SPL Procedure successfully completed

6. UTL_HTTP: Advanced Server 9.3 adds a UTL_HTTP package. This package provides functions for HTTP callouts from SQL/SPL. Developers/users can use this package and its associated functions for accessing data on the Internet over HTTP.
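A minimal sketch, assuming the Oracle-style UTL_HTTP.REQUEST function (which returns the beginning of the page body) is available:

SELECT UTL_HTTP.REQUEST('http://www.example.com/') FROM dual;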

7. UTL_URL: This is one of the additions we developed for Postgres Plus 9.3. This package has two functions providing escape and unescape mechanisms for URL characters. The escape function escapes a URL before the URL is used to fetch data from a website. The unescape function unescapes any escaped characters in a URL before the data is fetched from a website. Together with UTL_HTTP, these functions allow users to fetch data from a website directly, without having to write complex code in the application for handling specific data from a website.
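A minimal sketch of escaping a URL that contains a space, assuming the Oracle-style single-argument form of UTL_URL.ESCAPE:

SELECT UTL_URL.ESCAPE('http://www.example.com/monthly report.html') AS escaped_url FROM dual;

The space in the URL is expected to be percent-encoded in the result, after which the escaped URL can be passed to UTL_HTTP.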

8. New in EDB*Loader. In Postgres Plus 9.3, EDB*Loader has more control options for bulk loading.
a. ROWS parameter: prior to version 9.3, EDB*Loader processed an entire data file as a single transaction. With this parameter, users can specify the number of rows after which a COMMIT is executed, giving more control when processing/loading large data files.
b. Error codes: EDB*Loader now supports additional exit/error codes, which help users/developers include proper exit-code checking while using this higher-speed bulk data loader with parallel processing. The exit codes are shown below:

      0: Success
      1: Failure
      2: Warning
      3: Fatal

c. Streaming output files to the client: Prior to version 9.3, users had to check the log file of EDB*Loader on the server side. The log file, bad file, and discard file used to be created on the server side, and for troubleshooting users had to log in to the server to verify these files. Postgres Plus 9.3 now enables these files to be created on the client side, so users are no longer required to log in to the server to verify/check them.
d. New GUC for empty strings. EDB*Loader now has a GUC which users can utilize to control the default behavior of an empty string in their data file. edbldr.empty_string has the following valid values:
1. null: the empty field is treated as a null if the raw field contains no characters or a pair of delimiters with nothing in between.
2. empty_string: the empty field is treated as an empty string if the raw field contains no characters or a pair of delimiters with nothing in between.
3. pgsql: the empty field is treated as a null if the raw field contains no characters, but as an empty string if it contains a pair of delimiters with nothing in between.
The default is pgsql.

9. New REGEXP functions. Postgres Plus 9.3 now has three new REGEXP functions for developers.
a. REGEXP_COUNT: This searches a string for a regular expression, and returns a count of the times that the regular expression occurs. Here’s a simple example:

user=# SELECT REGEXP_COUNT('reinitializing', 'i', 1);
 regexp_count
--------------
            5
(1 row)


b. REGEXP_INSTR: This function searches a string for a POSIX-style regular expression and returns the position within the string where the match was located.

user=# SELECT REGEXP_INSTR('800-555-1212', '[0-9][0-9][0-9]', 1, 1) ;
 regexp_instr
--------------
            1
(1 row)

c. REGEXP_SUBSTR: This function searches a string for a pattern specified by a POSIX compliant regular expression and returns the string that matches the pattern specified in the call to the function.


user=# SELECT REGEXP_SUBSTR('800-555-1212', '[0-9][0-9][0-9]', 1, 2);
 regexp_substr
---------------
 555
(1 row)

10. New exception codes for UTL_FILE package: The UTL_FILE package provides the capability to read from, and write to files on the operating system’s file system. To provide error control capabilities to developers/users, Postgres Plus 9.3 has added some new exception codes, which users can utilize in their PL/SPL code for better exception handling. One simple example is given below:

user=# CREATE DIRECTORY tempdir_fb22954 AS '/tmp/';
CREATE DIRECTORY
user=# SELECT dirname
user-# FROM edb_dir
user-# WHERE dirname='tempdir_fb22954';
     dirname
-----------------
 tempdir_fb22954
(1 row)

user=#
user=#  -- check "utl_file.invalid_operation" exception
user=# DECLARE v_testfile UTL_FILE.FILE_TYPE;
user$#
user$#  v_directory VARCHAR2(50) := 'tempdir_fb22954';
user$#
user$#  v_filename VARCHAR2(50) := 'test_file_exist.txt';
user$#
user$#  BEGIN -- Create file
user$#  v_testfile := UTL_FILE.FOPEN(v_directory,v_filename,'w');
user$#
user$#  UTL_FILE.PUT(v_testfile,'A');
user$#
user$#  UTL_FILE.NEW_LINE(v_testfile);
user$#
user$#  UTL_FILE.FCLOSE(v_testfile);
user$#
user$#  DBMS_OUTPUT.PUT_LINE('Created file: ' || v_filename);
user$#
user$#  -- It should throw an exception because the file is opened for read and we attempt to write to it.
user$#  v_testfile := UTL_FILE.FOPEN(v_directory,v_filename,'r');
user$#
user$#  UTL_FILE.PUT(v_testfile,'B');
user$#
user$#  UTL_FILE.NEW_LINE(v_testfile);
user$#
user$#  UTL_FILE.PUT(v_testfile,'C');
user$#
user$#  UTL_FILE.NEW_LINE(v_testfile);
user$#
user$#  exception
user$#  WHEN utl_file.invalid_operation THEN
user$# RAISE notice 'exception caught utl_file.invalid_operation : SQLERRM: %',
user$#                                                       sqlerrm;
user$# UTL_FILE.FCLOSE(v_testfile);
user$#  WHEN others THEN
user$# RAISE notice 'exception, others : SQLERRM: %',
user$#                         sqlerrm;
user$# UTL_FILE.FCLOSE(v_testfile);
user$#  END;
Created file: test_file_exist.txt
NOTICE:  exception caught utl_file.invalid_operation : SQLERRM: invalid file operation

EDB-SPL Procedure successfully completed

New in PostgreSQL 9.3: Server Side languages

In this series of blog posts about new features coming in PostgreSQL 9.3, I thought I would cover the server-side language improvements in PostgreSQL. Let's see what's coming on the server-side language front.

As a PostgreSQL user, you know that PostgreSQL supports multiple server-side languages. In 9.3, some interesting improvements are coming. Let's look at them.

1. SPI access to the number of rows processed by the COPY command.

This is a new feature introduced in 9.3. Before 9.3, if a user ran a COPY command inside a PL/Perl or PL/Python function, there was no way inside the procedure to find out the number of rows processed by COPY. In 9.3, this limitation no longer exists for procedural languages. PostgreSQL supports many languages; I chose to test this with the most commonly used ones, PL/Perl and PL/Python. Below are some snapshots from pre-9.3 and 9.3.

Let's check with PL/Perl first. The definition of the test table is given below:

   Table "public.test_copy"
 Column |  Type   | Modifiers 
--------+---------+-----------
 id     | numeric | 

Content of data file:

cat /tmp/test.data
1
2
3
4
5

The following is a PL/Perl function which can be used for testing in pre-9.3 and 9.3:

CREATE OR REPLACE FUNCTION test_copy() RETURNS integer
AS $$
    my $rv = spi_exec_query("COPY test_copy FROM '/tmp/test.data'");
    my $status = $rv->{status};
    my $nrows = $rv->{processed};
    return $nrows;
$$ LANGUAGE plperl;

If we execute the COPY command at the psql prompt, we get a message like the one given below:

worktest=# COPY test_copy from '/tmp/test.data';
COPY 5

which shows COPY has processed 5 rows and loaded them into the table.

If we use the above PL/Perl function in pre-9.3, we get the following result:

worktest=# select split_part(version(),' ',2) as version;
 version  
----------
 9.2.4.10
(1 row)

worktest=# select test_copy();
 test_copy 
-----------
         0
(1 row)

which shows the function was not able to get the number of rows processed by the COPY command.
However, if we use the same PL/Perl function in 9.3, we get the following result:

postgres=# select split_part(version(),' ',2) as version;
 version  
----------
 9.3beta2
(1 row)

postgres=# select test_copy();
 test_copy 
-----------
         5
(1 row)

which shows that the PL/Perl function in 9.3 is able to get the number of rows processed.

Similarly, we can use the following PL/Python function to test this new feature:

CREATE FUNCTION result_copy_test(cmd text) RETURNS int
AS $$
plan = plpy.prepare(cmd)
plpy.info(plan.status()) 
result = plpy.execute(plan)
return result.nrows()
$$ LANGUAGE plpythonu;

pre-9.3

worktest=#  SELECT result_copy_test($$ COPY test_copy FROM '/tmp/test.data' $$);
INFO:  True
CONTEXT:  PL/Python function "result_copy_test"
 result_copy_test 
------------------
                0
(1 row)

which shows the function was not able to get the rows processed by the COPY command.

In 9.3

postgres=# SELECT result_copy_test($$ COPY test_copy FROM '/tmp/test.data' $$);
INFO:  True
CONTEXT:  PL/Python function "result_copy_test"
 result_copy_test 
------------------
                5
(1 row)

which shows the function was able to get the rows processed by the COPY command.

2. Allow GET DIAGNOSTICS x = ROW_COUNT to access rows processed by COPY
This is an enhancement in PL/pgSQL: if a user wants to access the rows processed by a COPY command in a PL/pgSQL block, they can now use
GET DIAGNOSTICS variable = ROW_COUNT, which was not possible before 9.3.

To check this with PL/pgSQL, a user can either create a sample function or use an anonymous block to verify this feature. I will be using a DO block of PL/pgSQL.
An example is given below:
PL/pgSQL anonymous block

DO $$
  DECLARE
    r int; 
  BEGIN
     COPY test_copy FROM '/tmp/test.data';
     GET DIAGNOSTICS r = row_count;
     RAISE NOTICE 'processed rows => %',r;
  END;
$$ language plpgsql;

pre-9.3

worktest=# DO $$
worktest$#   DECLARE
worktest$#     r int; 
worktest$#   BEGIN
worktest$#      COPY test_copy FROM '/tmp/test.data';
worktest$#      GET DIAGNOSTICS r = row_count;
worktest$#      RAISE NOTICE 'processed rows => %',r;
worktest$#   END;
worktest$# $$ language plpgsql;
NOTICE:  processed rows => 0
DO

which shows pre-9.3 was not able to get the processed rows.

In 9.3

postgres=# DO $$
postgres$#   DECLARE
postgres$#     r int; 
postgres$#   BEGIN
postgres$#      COPY test_copy FROM '/tmp/test.data';
postgres$#      GET DIAGNOSTICS r = row_count;
postgres$#      RAISE NOTICE 'processed rows => %',r;
postgres$#   END;
postgres$# $$ language plpgsql;
NOTICE:  processed rows => 5
DO
postgres=#

In 9.3, PL/pgSQL is able to get the processed rows. Interesting.

3. Allow use of RETURN with a composite expression in PL/pgSQL.
This is a new addition in PL/pgSQL and is useful for users who are reluctant to declare too many variables and want RETURN in PL/pgSQL to return an expression directly.

This feature was not available in pre-9.3 PL/pgSQL. However, in 9.3, users can use it easily. Below is an example.

create type footype as (x int, y varchar);

create or replace function foo() returns footype as $$
begin
  return (1, 'hello')::footype;
end;
$$ language plpgsql;

Let's try with pre-9.3 first.
When a user tries to create the above function, they get the following error message in pre-9.3:

worktest=# create or replace function foo() returns footype as $$
worktest$# begin
worktest$#   return (1, 'hello')::footype;
worktest$# end;
worktest$# $$ language plpgsql;
ERROR:  RETURN must specify a record or row variable in function returning row
LINE 3:   return (1, 'hello')::footype;

In 9.3, thanks to the addition of this feature, this works fine:

postgres=# create or replace function foo() returns footype as $$
postgres$# begin
postgres$#   return (1, 'hello')::footype;
postgres$# end;
postgres$# $$ language plpgsql;
CREATE FUNCTION
postgres=# 

And the user can see the working output too:

postgres=# select foo();
    foo    
-----------
 (1,hello)
(1 row)

4. New in PL/Python
There are some good improvements coming in PL/Python too. These improvements will be useful for users who write PL/Python functions and want to debug them or print some useful information, which was missing in pre-9.3. Let's look at what's new in PL/Python.

i. Addition of an informative object string in PL/Python
Before 9.3, whenever a user wanted to look at the information carried by a result object handle in PL/Python, it did not give much useful information. In 9.3, it gives useful details that will be helpful for PL/Python users.

Let's look at an example of how this change is useful. The following is a small function which can be used for testing this new addition:

CREATE FUNCTION test_debug_info() RETURNS text AS $$
    try:
        rv=plpy.execute("SELECT datname FROM pg_catalog.pg_database",5)
        plpy.info(rv);
    except plpy.SPIError:
        return "Not working"
    else:
        return "Working good"
$$ LANGUAGE plpythonu;

pre 9.3

worktest=# select test_debug_info() ;
INFO:  <PLyResult object at 0x7f3594b8f270>
CONTEXT:  PL/Python function "test_debug_info"
 test_debug_info 
-----------------
 Working good
(1 row)

Above, you can see that INFO has not given very useful information.

In 9.3

postgres=# select test_debug_info() ;
INFO:  <PLyResult status=5 nrows=3 rows=[{'datname': 'template1'}, {'datname': 'template0'}, {'datname': 'postgres'}]>
CONTEXT:  PL/Python function "test_debug_info"
 test_debug_info 
-----------------
 Working good
(1 row)

Above, you can see that INFO has given useful information about the status, the number of rows, and the rows returned by the query. Very useful if somebody wants to know what the handle contains in PL/Python.

ii. Conversion of OID values to a proper PL/Python numeric type
Before 9.3, the OID database type was treated as a string in PL/Python, which forced developers to convert that string to a numeric type in PL/Python before doing any processing on it, which is extra coding. In 9.3, they no longer have to worry about this.
Below is an example to test this feature.

CREATE FUNCTION test_type_conversion_oid(x oid) RETURNS oid AS $$
   plpy.info(x, type(x))
   return x +1
$$ LANGUAGE plpythonu;

Let's look at pre-9.3:

worktest=# select test_type_conversion_oid(123456);
INFO:  ('123456', <type 'str'>)
CONTEXT:  PL/Python function "test_type_conversion_oid"
ERROR:  TypeError: cannot concatenate 'str' and 'int' objects
CONTEXT:  Traceback (most recent call last):
  PL/Python function "test_type_conversion_oid", line 3, in <module>
    return x +1
PL/Python function "test_type_conversion_oid"

which shows direct numeric operations were not possible with OID in pre-9.3, since OID was treated as a string.

In 9.3

postgres=# select test_type_conversion_oid(123456)
postgres-# ;
INFO:  (123456L, <type 'long'>)
CONTEXT:  PL/Python function "test_type_conversion_oid"
 test_type_conversion_oid 
--------------------------
                   123457
(1 row)

This works, and direct numeric operations with OID are possible. A good addition to PL/Python.

iii. Handle SPI errors raised explicitly (with PL/Python's RAISE) the same as internal SPI errors
This is a new addition to PL/Python. In a PL/Python function body, a user can now raise SPIError exceptions using PL/Python's raise statement, and they are handled the same way as internal SPI errors, which was not the case before 9.3. In addition, if the user sets the sqlstate attribute, PL/Python preserves that change.
Below is an example:

The following is the definition of a function which can be used for testing this addition in pre-9.3 and 9.3:

CREATE OR REPLACE FUNCTION plpy_raise_spiexception() RETURNS void AS $$
   raise plpy.spiexceptions.DivisionByZero()
$$ LANGUAGE plpythonu;

Let's use an anonymous block of PL/pgSQL to test how the raised exception is handled.
Before 9.3:

DO $$
BEGIN
   SELECT plpy_raise_spiexception();
EXCEPTION WHEN division_by_zero THEN
   RAISE NOTICE 'Found Exception';
END
$$ LANGUAGE plpgsql;

ERROR:  spiexceptions.DivisionByZero: 
CONTEXT:  Traceback (most recent call last):
  PL/Python function "plpy_raise_spiexception", line 2, in <module>
    raise plpy.spiexceptions.DivisionByZero()
PL/Python function "plpy_raise_spiexception"
SQL statement "SELECT plpy_raise_spiexception()"
PL/pgSQL function inline_code_block line 3 at SQL statement

As you can see, before 9.3 the exception raised from PL/Python with the raise statement could not be caught by the PL/pgSQL exception handler.

In 9.3

postgres=# DO $$
postgres$# BEGIN
postgres$#    SELECT plpy_raise_spiexception();
postgres$# EXCEPTION WHEN division_by_zero THEN
postgres$#    RAISE NOTICE 'Found Exception';
postgres$# END
postgres$# $$ LANGUAGE plpgsql;
NOTICE:  Found Exception
DO

As you can see, in 9.3 it works great!