Monitoring Amazon PostgreSQL RDS with PEM version 6.0

The EDB Postgres Enterprise Manager is an enterprise management tool designed to tune, manage, and monitor a large number of Postgres deployments on servers on-premises or spread across geographical boundaries. It is the cornerstone of the Management Suite of tools that is integrated into the EDB Postgres Platform from EnterpriseDB® (EDB), and it emerged to solve many of the challenges database administrators face daily. (The blog, How to Be the Master of Your Database Domain, explored EDB Postgres Enterprise Manager in greater detail.)

EDB customers use EDB Postgres Enterprise Manager for the following key features:

  1. Management of multiple EDB Postgres Advanced Server deployments
  2. Capacity planning
  3. Performance monitoring
  4. Alert Management
  5. Auditing violations
  6. Database Analysis
  7. Postgres Expert

Recently one of EDB’s customers acquired a company that was hosting its data on Amazon PostgreSQL RDS. The customer wanted to leverage EDB Postgres Enterprise Manager to monitor the database deployments on PostgreSQL RDS.

At the time, this request seemed like a real possibility. A significant amount of development expertise has gone into EDB Postgres Enterprise Manager to create a sophisticated, enterprise-class tool for managing vast deployments of Postgres. As for remote monitoring, EDB Postgres Enterprise Manager had the capacity to monitor remote databases using an agent on a designated server. EDB Postgres Enterprise Manager also provides capabilities for users, such as database administrators, to create their own custom probes, custom dashboards, and custom alerts. Combined, these capabilities mean EDB Postgres Enterprise Manager provides the kinds of control and flexibility DBAs need for managing databases hosted in different environments.

EDB worked closely with the customer to develop a process for utilizing their subscription for the EDB Postgres Enterprise Manager, which is part of the EDB Postgres Platform, with PostgreSQL databases on Amazon RDS. The steps were formalized into a repeatable process so others could follow suit and enjoy the benefits of EDB Postgres Enterprise Manager in their cloud deployments.

The following are the steps for using EDB Postgres Enterprise Manager with Amazon PostgreSQL RDS:

Amazon PostgreSQL RDS comes with an rds_superuser role, which is not the same as a superuser in PostgreSQL. The rds_superuser role has a lot of limitations, and they affect the monitoring capabilities.

To use EDB Postgres Enterprise Manager with PostgreSQL RDS, follow these steps:

  1. Use the following function to modify the EDB Postgres Enterprise Manager probes so that they can leverage user-defined views on some catalog tables. Stripping the pg_catalog. prefix lets the affected probe queries resolve those names through the monitoring user's search_path instead:
CREATE OR REPLACE FUNCTION strip_pg_catalog_from_probe()
RETURNS boolean
LANGUAGE plpgsql
AS
$function$
 DECLARE
   rec RECORD;
 BEGIN
   CREATE TABLE pem.probe_code_backup AS SELECT id, probe_code FROM pem.probe WHERE id IN (1, 2, 3, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 25, 29, 36, 39, 47, 52, 53, 54);
   FOR rec IN SELECT id, probe_code FROM pem.probe WHERE id IN (1, 2, 3, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 25, 29, 36, 39, 47, 52, 53, 54)
   LOOP
      UPDATE pem.probe SET probe_code=replace(rec.probe_code,'pg_catalog.','') where id=rec.id;
   END LOOP;
   RETURN true;
END;
$function$;
  2. After creating the above function, execute it in the EDB Postgres Enterprise Manager server database as shown below:

psql -h localhost -c "SELECT strip_pg_catalog_from_probe()" pem

  3. Connect to the PostgreSQL RDS database as the rds_superuser-privileged user and execute the following SQL in the postgres database and in the user-defined database:
CREATE USER monitor PASSWORD 'password';
GRANT rds_superuser TO monitor;
CREATE SCHEMA rds_pem_views;
SET search_path TO rds_pem_views;
CREATE OR REPLACE VIEW pg_database AS SELECT oid,* FROM pg_catalog.pg_database WHERE datname <> 'rdsadmin';

CREATE OR REPLACE VIEW pg_tablespace AS SELECT oid, * FROM pg_catalog.pg_tablespace WHERE spcname <> 'pg_global';

CREATE FUNCTION rds_pem_views.pg_ls_dir(text)
RETURNS TEXT
LANGUAGE sql
AS
$function$
   SELECT ''::TEXT;
$function$;

GRANT ALL ON FUNCTION rds_pem_views.pg_ls_dir(text) TO monitor;
GRANT ALL ON SCHEMA rds_pem_views TO monitor;
GRANT ALL ON  rds_pem_views.pg_database TO monitor;
GRANT ALL ON  rds_pem_views.pg_tablespace TO monitor;

ALTER USER monitor SET search_path TO rds_pem_views,pg_catalog, "$user", public;

The above SQL will create a database user “monitor” with “rds_superuser” privileges.

As mentioned above, rds_superuser in RDS is not the same as a true superuser in PostgreSQL; therefore, a schema with replacement views for pg_database and pg_tablespace is created. Because the monitor user's search_path places rds_pem_views ahead of pg_catalog, the unqualified references left behind by strip_pg_catalog_from_probe() resolve to these views.

Views pg_database and pg_tablespace are created because the rds_superuser role has the following restrictions:

  1. Neither rds_superuser nor a normal user in PostgreSQL RDS can access the rdsadmin database;
  2. Neither rds_superuser nor a normal user in PostgreSQL RDS can calculate the size of the pg_global tablespace;
  3. Rds_superuser cannot use pg_ls_dir to list the WAL files in the pg_xlog directory.
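
For example, attempting the restricted operations as the monitor user fails with errors along the following lines (a sketch; the exact messages vary by RDS version):

postgres=> SELECT pg_catalog.pg_ls_dir('pg_xlog');
ERROR:  permission denied for function pg_ls_dir
postgres=> SELECT pg_tablespace_size('pg_global');
ERROR:  permission denied for tablespace pg_global
postgres=> \c rdsadmin
FATAL:  pg_hba.conf rejects connection for database "rdsadmin"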
  4. After executing the SQL statements mentioned in step 3, we can now use the database user monitor for EDB Postgres Enterprise Manager monitoring.
  5. For remote monitoring of PostgreSQL RDS, use the following link to add the server:

https://www.enterprisedb.com/docs/en/6.0.2/pemgetstarted/getting_started_guide.1.14.html#

When you are following the documentation, please remember to click on “remote monitoring” and choose the EDB Postgres Enterprise Manager agent (PEM Agent) for remote monitoring as shown below:

snap1.png

In the above, the Bound Agent is the pemagent running on the PEM server. Also note that the database is monitor.

After performing the above steps, the EDB Postgres Enterprise Manager server can monitor PostgreSQL RDS, and it can collect all database-level statistics. Also, DBAs and other users can configure all database-level alerts, such as autovacuum alerts. EDB Postgres Enterprise Manager provides 63 templates for database-level alerts. It also provides flexibility for DBAs and other users to create their own alerts based on custom probes.

Now, some users might have questions or concerns about system-level information from Amazon PostgreSQL RDS, and whether EDB Postgres Enterprise Manager can monitor that data. The answer is yes.

As an example, the following are the steps for configuring EDB Postgres Enterprise Manager to monitor the CPU utilization of PostgreSQL RDS:

  1. Install the Amazon CloudWatch command line tools on the EDB Postgres Enterprise Manager server using the following commands:
wget http://ec2-downloads.s3.amazonaws.com/CloudWatch-2010-08-01.zip
unzip CloudWatch-2010-08-01.zip

For more information on setting up the Amazon CloudWatch command line, please use the following link:

http://docs.aws.amazon.com/AmazonCloudWatch/latest/cli/SetupCLI.html

  2. Create an IAM user on the AWS panel, and attach the managed policy “CloudWatchReadOnlyAccess”.

For more information, please refer to the following link:

http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/iam-identity-based-access-control-cw.html

  3. Create an AWS credential file using the following sample:
AWSAccessKeyId=
AWSSecretKey=

For more information on the AWS credential file, please use the following link:

http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html

  4. Change the permissions of the above credential file using the following command:
chmod 600 credential_file
  5. Create the following script to get the CPUUtilization information from Amazon CloudWatch:
#!/bin/bash
export AWS_CREDENTIAL_FILE=/usr/local/aws/credential/credential-file-path
export AWS_CLOUDWATCH_HOME=/root/CloudWatch-1.0.20.0
export PATH=$PATH:$AWS_CLOUDWATCH_HOME/bin
export JAVA_HOME=/usr/lib/jvm/jre
export AWS_CLOUDWATCH_URL=http://monitoring.us-west-2.amazonaws.com
echo -e "avg_cpu\tminimum_cpu\tmax_cpu"
mon-get-stats CPUUtilization --namespace="AWS/RDS" --dimensions="DBInstanceIdentifier=postgres" --statistics Average,Minimum,Maximum|tail -n 1|awk '{print $3"\t"$4"\t"$5}'
  6. After creating the above script, make it executable as shown below:
chmod +x rds_cpu.sh
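
A quick test run of the script looks like the following (a sketch; the numbers are illustrative and depend on your instance):

./rds_cpu.sh
avg_cpu  minimum_cpu  max_cpu
7.51     6.23         9.14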

Now you are ready to create a custom probe to use this script for monitoring with EDB Postgres Enterprise Manager.

To create a custom probe, please visit the following link:

https://www.enterprisedb.com/docs/en/6.0.2/pemgetstarted/getting_started_guide.1.29.html#pID0E0TBB0HA

Below is a snapshot of a custom probe which I have used:

snap2

By using the above probe, users can create their own custom alerts in EDB Postgres Enterprise Manager for monitoring CPU utilization on Amazon PostgreSQL RDS. They can also use the data to create a custom dashboard for RDS.

The following is a snapshot of RDS CPU utilization from EDB Postgres Enterprise Manager.

snap3

If you want to know more about Custom Alerts and Custom Dashboards, please use the following links:

  1. Creating Custom Alert templates

https://www.enterprisedb.com/docs/en/6.0.2/pemgetstarted/getting_started_guide.1.30.html#pID0E0E40HA

  2. Creating Custom Alerts

https://www.enterprisedb.com/docs/en/6.0.2/pemgetstarted/getting_started_guide.1.30.html#pID0E0Z60HA

  3. Creating Custom Charts

https://www.enterprisedb.com/docs/en/6.0.2/pemgetstarted/getting_started_guide.1.28.html#

  4. Creating Custom Ops Dashboards

https://www.enterprisedb.com/docs/en/6.0.2/pemgetstarted/getting_started_guide.1.27.html#pID0E0HFB0HA

 

As you can see, the EDB Postgres Enterprise Manager is very flexible and customizable to meet specific needs.

Data Replication from Oracle logical standby to EDB Postgres

At EnterpriseDB (EDB), we are sometimes presented with individual customer use cases or requests that call for the development of innovative solutions to support their deployment of the EDB Postgres Platform. For example, one of our customers wanted to leverage an Oracle Logical Standby for replication to EDB Postgres. And they wanted to know if the EDB Postgres Replication Server supported Oracle logical replication to EDB Postgres.

DBAs and database architects can use the EDB Postgres Replication Server to replicate data from an Oracle Logical Standby to the EDB Postgres Platform. However, the process of setting up the replication requires extra steps. Because this was a very interesting use case, I thought a blog containing the recommended steps for using the EDB Postgres Replication Server with Oracle Logical Standby would provide some guidance for end users. There are instances where this could support strategic replication deployments where EDB Postgres is complementing a legacy solution.

For this blog, we will use the following IP addresses and hostnames:

  1. EDB Postgres server: 172.17.0.4
  2. Oracle Logical Standby server: 172.17.0.3
  3. Oracle primary server: 172.17.0.2

To use EDB Postgres Replication Server 6.0, users should make sure Oracle Logical Standby is in STANDBY guard status. The following are the guard statuses for Oracle Logical Standby:

ALTER DATABASE GUARD modes:

Mode      Detail
ALL       (Default) No users (except SYS) can make changes to logical standby data
STANDBY   Users may modify logical standby data unless it’s being maintained by local LSPn processes (e.g. via Logical Standby)
NONE      Normal security rules are applied; any user with appropriate privileges can modify logical standby data

DBAs can use the following SQL command to check the guard_status in Oracle:

SQL> SELECT guard_status FROM v$database;

GUARD_S
-------
STANDBY

After confirming the guard status in Oracle Logical Standby, create a database user with the following privileges:

CREATE USER pubuser IDENTIFIED BY oracle;
GRANT DBA TO pubuser;
GRANT CREATE ANY TRIGGER TO pubuser;
GRANT SELECT ANY TABLE TO pubuser;
GRANT LOCK ANY TABLE TO pubuser;

Let’s now set up the EDB Postgres Replication Server between Oracle Logical Standby and EDB Postgres. We will be leveraging the EDB Postgres Replication Server command line interface for building publications and subscriptions. If you are not familiar with the EDB Postgres Replication Server command line interface options, please refer to the following link for more information:

https://www.enterprisedb.com/docs/en/6.0/repguide/EDB_Postgres_Replication_Server_Users_Guide.1.59.html#

  1. Add Oracle as the publication database in the EDB Postgres Replication Server. The following command can be used to add the database:
[root@epas95 /]# . /usr/ppas-xdb-6.0/etc/sysconfig/xdbReplicationServer-60.config
[root@epas95 /]# ${JAVA_EXECUTABLE_PATH} -jar /usr/ppas-xdb-6.0/bin/edb-repcli.jar \
>   -addpubdb \
>   -repsvrfile ${PUBCONF} \
>   -dbtype oracle \
>   -dbhost 172.17.0.3 \
>   -dbport 1521 \
>   -dbuser ${ORAUSER} \
>   -dbpassword ${ORA_ENCRYPT_PASSWD}  \
>   -database ORCL
Adding publication database...
Publication database added successfully. Publication database id:1

Please note the publication database id: 1 and -dbhost: 172.17.0.3, which is the Logical Standby IP address.

  2. Execute the following anonymous PL/SQL block in Oracle Logical Standby as SYSDBA:
BEGIN
 FOR s IN (SELECT TRIGGER_NAME FROM ALL_TRIGGERS WHERE OWNER ='PUBUSER')
 LOOP
         IF DBMS_DDL.IS_TRIGGER_FIRE_ONCE('PUBUSER', s.TRIGGER_NAME)
         THEN
                DBMS_DDL.set_trigger_firing_property( 
                trig_owner => 'PUBUSER',  
                trig_name => s.trigger_name,   
                fire_once => FALSE);
        END IF;
 END LOOP;
END;
/


The DBA/user should execute the above PL/SQL block in Oracle Logical Standby to make sure the replication triggers fire when the apply process changes the base tables. By default, in Oracle Logical Standby, triggers do not fire properly in STANDBY guard status.
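
To double-check the firing property at any point, the following is a small sketch using DBMS_OUTPUT (run with SET SERVEROUTPUT ON; it prints nothing until the publication triggers have been created in the later steps):

BEGIN
 FOR s IN (SELECT TRIGGER_NAME FROM ALL_TRIGGERS WHERE OWNER ='PUBUSER')
 LOOP
         IF DBMS_DDL.IS_TRIGGER_FIRE_ONCE('PUBUSER', s.TRIGGER_NAME)
         THEN
                DBMS_OUTPUT.PUT_LINE(s.TRIGGER_NAME || ' is still fire-once');
         ELSE
                DBMS_OUTPUT.PUT_LINE(s.TRIGGER_NAME || ' will fire on apply');
         END IF;
 END LOOP;
END;
/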

  3. Add the subscription database in EDB Postgres Replication Server. The following command can be used for adding the subscription database. In our case, we add the EDB Postgres database:
[root@epas95 /]# ${JAVA_EXECUTABLE_PATH} -jar /usr/ppas-xdb-6.0/bin/edb-repcli.jar \
-addsubdb \
>   -repsvrfile ${SUBCONF} \
>   -dbtype enterprisedb \
>   -dbhost 172.17.0.4 \
>   -dbport 5432 \
>   -dbuser ${PGUSER} \
>   -dbpassword ${PG_ENCRYPT_PASSWD}  \
>   -database orcl
Adding Subscription Database...
Subscription database added successfully. Subscription Database id:1006

Please note the subscription database id: 1006 and -dbhost: 172.17.0.4, which is the EDB Postgres server IP address.

  4. Before creating a publication and subscription, first change the GUARD status to NONE in Oracle Logical Standby as given below:
SQL> ALTER DATABASE GUARD NONE;

Database altered.

The above change is needed because the EDB Postgres Replication Server acquires a LOCK on the tables managed by the logical standby in order to create the publication triggers.

  5. Create the publication for table REPLICATION.REPLICATION_TABLE:
[root@epas95 /]# ${JAVA_EXECUTABLE_PATH} -jar /usr/ppas-xdb-6.0/bin/edb-repcli.jar \
-createpub replication_table \
>            -repsvrfile ${PUBCONF} \
>            -pubdbid 1 \
>            -reptype t \
>            -tables REPLICATION.REPLICATION_TABLE \
>            -repgrouptype s
Creating publication...
Tables:[[REPLICATION.REPLICATION_TABLE, TABLE]]
Filter clause:[]
Publication created.
  6. After creating the publication, please use the following anonymous block to set the EDB Postgres Replication Server trigger property to fire every time there is a change to the base table.
BEGIN
 FOR s IN (SELECT TRIGGER_NAME FROM ALL_TRIGGERS WHERE OWNER ='PUBUSER')
 LOOP
         IF DBMS_DDL.IS_TRIGGER_FIRE_ONCE('PUBUSER', s.TRIGGER_NAME)
         THEN
                DBMS_DDL.set_trigger_firing_property( 
                trig_owner => 'PUBUSER',  
                trig_name => s.trigger_name,   
                fire_once => FALSE);
        END IF;
 END LOOP;
END;
/
  7. Create the subscription for publication replication_table (created in step 5) using the following command:
[root@epas95 /]# ${JAVA_EXECUTABLE_PATH} -jar /usr/ppas-xdb-6.0/bin/edb-repcli.jar \
-createsub replication_table \
>   -subsvrfile ${SUBCONF} \
>   -subdbid 1006 \
>   -pubsvrfile ${PUBCONF} \
>   -pubname replication_table
Creating subscription...
Subscription created successfully
  8. After creating the subscription, re-execute the anonymous block to set the EDB Postgres Replication Server trigger property:
BEGIN
 FOR s IN (SELECT TRIGGER_NAME FROM ALL_TRIGGERS WHERE OWNER ='PUBUSER')
 LOOP
         IF DBMS_DDL.IS_TRIGGER_FIRE_ONCE('PUBUSER', s.TRIGGER_NAME)
         THEN
                DBMS_DDL.set_trigger_firing_property( 
                trig_owner => 'PUBUSER',  
                trig_name => s.trigger_name,   
                fire_once => FALSE);
        END IF;
 END LOOP;
END;
/
  9. After creating the subscription, roll back the GUARD status to STANDBY in Oracle Logical Standby as shown below:
SQL> ALTER DATABASE GUARD STANDBY;

Database altered.

Now, we are ready to replicate data from Oracle Logical Standby to EDB Postgres.

Let’s first take the snapshot of data from Oracle Logical Standby to EDB Postgres using the following command:

[root@epas95 /]# ${JAVA_EXECUTABLE_PATH} -jar /usr/ppas-xdb-6.0/bin/edb-repcli.jar \
-dosnapshot replication_table \
-repsvrfile ${SUBCONF} \
-verboseSnapshotOutput true
Performing snapshot...
Running EnterpriseDB Migration Toolkit (Build 49.1.5) ...
Source database connectivity info...
conn =jdbc:oracle:thin:@172.17.0.3:1521:ORCL
user =pubuser
password=******
Target database connectivity info...
conn =jdbc:edb://172.17.0.4:5432/orcl?loginTimeout=60&connectTimeout=30
user =pubuser
password=******
Connecting with source Oracle database server...
Connected to Oracle, version 'Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options'
Connecting with target EnterpriseDB database server...
Connected to EnterpriseDB, version '9.5.5.10'
Importing redwood schema REPLICATION...
Table List: 'REPLICATION_TABLE'
Loading Table Data in 8 MB batches...
Disabling FK constraints & triggers on replication.replication_table before truncate...
Truncating table REPLICATION_TABLE before data load...
Disabling indexes on replication.replication_table before data load...
Loading Table: REPLICATION_TABLE ...
[REPLICATION_TABLE] Migrated 10 rows.
[REPLICATION_TABLE] Table Data Load Summary: Total Time(s): 0.04 Total Rows: 10
Enabling FK constraints & triggers on replication.replication_table...
Enabling indexes on replication.replication_table after data load...
Performing ANALYZE on EnterpriseDB database...
Data Load Summary: Total Time (sec): 0.191 Total Rows: 10 Total Size(MB): 0.0
Schema REPLICATION imported successfully.
Migration process completed successfully.
Migration logs have been saved to /var/log/xdb-6.0
******************** Migration Summary ********************
Tables: 1 out of 1
Total objects: 1
Successful count: 1
Failed count: 0
Invalid count: 0
*************************************************************
Snapshot taken successfully.

After taking an initial snapshot, let’s modify the replication table from Oracle primary:

18-JAN-17 REPLICATION@ ORCL PRIMARY> UPDATE replication_table SET id=11 WHERE id=10;
Elapsed: 00:00:00.01
18-JAN-17 REPLICATION@ ORCL PRIMARY> COMMIT;
Elapsed: 00:00:00.01
18-JAN-17 REPLICATION@ ORCL PRIMARY> SELECT * FROM replication_table;
1 FIRST
2 SECOND
3 THIRD
4 FOURTH
5 FIFTH
6 SIXTH
7 SEVENTH
8 EIGHTH
9 NINTH
11 TENTH

Elapsed: 00:00:00.00

After making the changes on the Oracle primary, we can execute a SELECT command on Oracle Logical Standby to verify that the data change was replicated by Oracle in logical standby mode.

18-JAN-17 REPLICATION@ ORCL STANDBY> SELECT * FROM replication_table;
1 FIRST
2 SECOND
3 THIRD
4 FOURTH
5 FIFTH
6 SIXTH
7 SEVENTH
8 EIGHTH
9 NINTH
11 TENTH

10 rows selected.

Elapsed: 00:00:00.00
  10. Now perform a manual sync to verify that EDB Postgres Replication Server can replicate the above changes to EDB Postgres.
[root@epas95 /]# ${JAVA_EXECUTABLE_PATH} -jar /usr/ppas-xdb-6.0/bin/edb-repcli.jar  -dosynchronize  replication_table -repsvrfile ${SUBCONF} -repgrouptype s
Performing synchronize...
Synchronize done successfully.
[root@epas95 /]# psql orcl
psql.bin (9.5.5.10)
Type "help" for help.
orcl=# select * from replication.replication_table ;
id |   col
----+---------
1 | FIRST
2 | SECOND
3 | THIRD
4 | FOURTH
5 | FIFTH
6 | SIXTH
7 | SEVENTH
8 | EIGHTH
9 | NINTH
11 | TENTH

(10 rows)

The above snapshot shows that after performing -dosynchronize, the EDB Postgres Replication Server was able to replicate the incremental changes on Oracle Logical Standby to EDB Postgres.

Using pg_rewind in PostgreSQL 9.6

The pg_rewind tool was first introduced in PostgreSQL 9.5. It is best used when a standby has become the master and the old master needs to be reattached to the new master as a standby without restoring it from a new base backup.

The tool examines the timeline histories of the new master (the old standby) and the old master to determine the point where they diverged, and it expects to find Write-Ahead Logs (WAL) in the old master cluster’s pg_xlog directory reaching all the way back to the point of divergence.

For pg_rewind, DBAs need to have the following:

  1. wal_log_hints = on, or a PostgreSQL cluster initialized with data checksums.

DBAs can also have both data checksums and wal_log_hints enabled in their PostgreSQL cluster. (A quick way to verify these settings is shown after the links below.)

For more information on wal_log_hints and enabling data checksum, the following links are useful:

https://www.postgresql.org/docs/9.5/static/runtime-config-wal.html#GUC-WAL-LOG-HINTS

https://www.postgresql.org/docs/9.5/static/app-initdb.html#APP-INITDB-DATA-CHECKSUMS

  2. full_page_writes should be enabled, which it is by default in Postgres. For more information on this parameter, see the following link:

https://www.postgresql.org/docs/9.5/static/runtime-config-wal.html#GUC-FULL-PAGE-WRITES
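
Before attempting a rewind, you can confirm these prerequisites on a running cluster. A quick sketch (wal_log_hints requires a server restart to change, and data checksums can only be chosen at initdb time):

postgres=# SHOW wal_log_hints;
 wal_log_hints
---------------
 on
(1 row)

postgres=# SHOW data_checksums;
 data_checksums
----------------
 off
(1 row)

postgres=# SHOW full_page_writes;
 full_page_writes
------------------
 on
(1 row)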

In Postgres 9.6, Alexander Korotkov introduced a new feature in the pg_rewind tool. Details of the commit are given below:

 commit e50cda78404d6400b1326a996a4fabb144871151
 Author: Teodor Sigaev <teodor@sigaev.ru>
 Date:   Tue Dec 1 18:56:44 2015 +0300
 
      Use pg_rewind when old master timeline was switched
 
      Allow pg_rewind to work when old master timeline was switched. Now
      user can return promoted standby to old master.
  
      old master timeline history becomes a global variable. Index
      in old master timeline history is used in function interfaces instead of
      specifying TLI directly. Thus, SimpleXLogPageRead() can easily start
      reading XLOGs from next timeline when current timeline ends.
  
      Author: Alexander Korotkov
      Review: Michael Paquier

With the new feature, a DBA/user can now reattach a standby that got promoted for any of the following reasons:

  1. User error: by mistake, the user promoted the standby using the pg_ctl promote option or by creating a trigger file.
  2. A failover script had a logic problem, which resulted in promotion of the standby to master.
  3. A user was testing a failover scenario. (In failover testing, the standby needs to be promoted and then brought back to a state where it can be safely attached to the PostgreSQL master.)
  4. A bug in a failover management tool. (A typical scenario: due to a network blip, the failover management tool promoted the standby.)

There could be other possible scenarios where this can be useful.

Let’s first try PostgreSQL 9.5 to see the previous behavior of pg_rewind in a use case where the DBA wants to reattach the standby to the primary server after promoting it.

For this test, we have two PostgreSQL clusters: node1 and node2. Node1 (port 54445) is the master/primary. Node2 (port 54446) is a standby which streams data from master node1 (port 54445).
Below is the streaming replication status:

bash-postgres-9.5 $ psql -p 54445 -x -c "select * from pg_stat_replication;"
-[ RECORD 1 ]----+---------------------------------
pid              | 4482
usesysid         | 10
usename          | enterprisedb
application_name | walreceiver
client_addr      | 127.0.0.1
client_hostname  | 
client_port      | 46780
backend_start    | 21-DEC-16 18:15:08.442887 +00:00
backend_xmin     | 
state            | streaming
sent_location    | 0/8000060
write_location   | 0/8000060
flush_location   | 0/8000060
replay_location  | 0/8000060
sync_priority    | 0
sync_state       | async


bash-postgres-9.5 $  psql -p 54446 -c "SELECT pg_is_in_recovery()"
 pg_is_in_recovery 
-------------------
 t
(1 row)

As you can see, the PostgreSQL instance running on port 54446 is a standby (i.e., it’s in recovery mode) and it’s streaming data from node1 (port 54445).

Now, let’s promote node2 and check the status of replication on node1.

bash-postgres-9.5 $  touch /tmp/test.trigger
bash-postgres-9.5 $  psql -p 54446 -c "SELECT pg_is_in_recovery()"
 pg_is_in_recovery 
-------------------
 f
(1 row)

bash-postgres-9.5 $  psql -p 54445 -x -c "select * from pg_stat_replication;"
(0 rows)

The above snapshot shows that node2 (port 54446) is now out of recovery mode, and the replication status on node1 (port 54445) shows that node2 is no longer streaming data from node1.

Now, let’s try to use pg_rewind to reattach node2 (port 54446) as standby to node1 (port 54445) to verify behavior in version 9.5.

bash-postgres-9.5 $ pg_ctl -D /var/lib/ppas/node2/ stop
bash-postgres-9.5 $  pg_rewind -D /var/lib/ppas/node2  --source-server="port=54445 host=127.0.0.1" 
could not find common ancestor of the source and target cluster's timelines
Failure, exiting

The above messages show that the user cannot reattach the promoted standby to the old master. This is a limitation of pg_rewind in 9.5, and it’s the expected behavior in that version.

We can now try the same scenario with PostgreSQL 9.6 to check the pg_rewind improvement.

bash-postgres-9.6 $ pg_ctl --version
pg_ctl (EnterpriseDB) 9.6.1.4

bash-postgres-9.6$ psql -p 54445 -x -c "select * from pg_stat_replication;"
-[ RECORD 1 ]----+---------------------------------
pid              | 4838
usesysid         | 10
usename          | enterprisedb
application_name | walreceiver
client_addr      | 127.0.0.1
client_hostname  | 
client_port      | 46790
backend_start    | 21-DEC-16 18:35:03.216873 +00:00
backend_xmin     | 
state            | streaming
sent_location    | 0/4000060
write_location   | 0/4000060
flush_location   | 0/4000060
replay_location  | 0/4000060
sync_priority    | 0
sync_state       | async

bash-postgres-9.6$ psql -p 54446 -x -c "select pg_is_in_recovery();"
-[ RECORD 1 ]-----+--
pg_is_in_recovery | t

The above snapshot shows the streaming replication from node1 (port 54445) to node2 (port 54446).

Now, let’s promote node2. Below is a snapshot of promoting node2:

bash-postgres-9.6$ touch /tmp/test
bash-postgres-9.6$ psql -p 54446 -x -c "select pg_is_in_recovery();"
-[ RECORD 1 ]-----+--
pg_is_in_recovery | f


bash-postgres-9.6$ psql -p 54446 -x -c "select * from pg_stat_replication;"
(0 rows)

The above snapshot shows that node2 (port 54446) is out of recovery mode and there is no replication between node1 (port 54445) and node2 (port 54446).

Let’s try to use pg_rewind to re-attach node2 as standby with node1.

bash-postgres-9.6$ pg_ctl -D /var/lib/ppas/node2/ stop
waiting for server to shut down.... done
server stopped
bash-postgres-9.6$ pg_rewind -D /var/lib/ppas/node2 --source-server="port=54445 host=127.0.0.1" 
servers diverged at WAL position 0/4000140 on timeline 1
rewinding from last common checkpoint at 0/4000098 on timeline 1
Done!

The above snapshot shows that we were able to pg_rewind node2 (port 54446) to the last common checkpoint of node1 (port 54445).
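
One operational note: pg_rewind only makes the data directories consistent. Before starting node2, it still needs a recovery.conf pointing at node1 so that it rejoins as a standby (pg_rewind does not create one). A minimal sketch matching this test setup; adjust the user to your environment:

# /var/lib/ppas/node2/recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=127.0.0.1 port=54445 user=enterprisedb'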

We can now start node2 and check the status of replication. Below is a snapshot of that activity:

bash-postgres-9.6$ pg_ctl -D /var/lib/ppas/node2/ start
server starting
-bash-4.1$ 2016-12-21 18:55:15 UTC LOG:  redirecting log output to logging collector process
2016-12-21 18:55:15 UTC HINT:  Future log output will appear in directory "pg_log".

bash-postgres-9.6$ psql -p 54445 -x -c "select * from pg_stat_replication;"
-[ RECORD 1 ]----+---------------------------------
pid              | 5019
usesysid         | 10
usename          | enterprisedb
application_name | walreceiver
client_addr      | 127.0.0.1
client_hostname  | 
client_port      | 46794
backend_start    | 21-DEC-16 18:55:15.635726 +00:00
backend_xmin     | 
state            | streaming
sent_location    | 0/4012BA0
write_location   | 0/4012BA0
flush_location   | 0/4012BA0
replay_location  | 0/4012BA0
sync_priority    | 0
sync_state       | async

bash-postgres-9.6$ psql -p 54446 -x -c "select pg_is_in_recovery();"
-[ RECORD 1 ]-----+--
pg_is_in_recovery | t

Partition pruning in EDB Postgres 9.5

One of my colleagues who was recently working with a customer presented a customer case. According to him, the customer had a partitioned table and EDB Postgres was not applying partition pruning to the query. So, I thought I would blog about partition pruning so that EDB Postgres developers and DBAs can benefit.

EDB Postgres supports two types of partition pruning:

Constraint exclusion pruning:

It is a feature introduced in PostgreSQL 8.1. This type of pruning works with PostgreSQL-style partitioning. With constraint exclusion enabled, the planner examines the constraints of each partition and tries to prove that the partition need not be scanned because it could not contain any rows meeting the query’s WHERE clause. When the planner can prove this, it excludes the partition from the query plan.

However, it has some limitations. The following are the limitations of constraint_exclusion:

a. Constraint exclusion only works when the query’s WHERE clause contains constants (or externally supplied parameters). For example, a comparison against a non-immutable function such as CURRENT_TIMESTAMP cannot be optimized, since the planner cannot know which partition the function value might fall into at run time.
b. Keep the partitioning constraints simple, or else the planner may not be able to prove that partitions don’t need to be visited. Use simple equality conditions for list partitioning, or simple range tests for range partitioning, as illustrated in the examples below. A good rule of thumb is that partitioning constraints should contain only comparisons of the partitioning column(s) to constants using B-tree-indexable operators.
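
Note that this behavior is governed by the constraint_exclusion parameter; its default setting, partition, applies the constraint checks to inheritance child tables and UNION ALL subqueries, which is what the examples below rely on. A quick check:

edb=# SHOW constraint_exclusion;
 constraint_exclusion
----------------------
 partition
(1 row)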

For verification, the following shows the behavior of constraint_exclusion pruning:
1. Let's create a PostgreSQL-style partitioned table using the table inheritance feature.

CREATE TABLE measurement (
     city_id        int not null,
     logdate        date not null,
     peaktemp        int,
     unitsales      int
 );
CREATE TABLE measurement_y2004m02 (
     CHECK ( date_part('month'::text, logdate) = 2)
 ) INHERITS (measurement);
CREATE TABLE measurement_y2004m03 (
     CHECK ( date_part('month'::text, logdate) = 3 )
 ) INHERITS (measurement);

  2. Execute a simple query to verify the constraint_exclusion behavior based on the above definition:
 edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE date_part('month'::text, logdate) = 3;
                                    QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (date_part('month'::text, logdate) = '3'::double precision)
         ->  Seq Scan on measurement_y2004m02
               Filter: (date_part('month'::text, logdate) = '3'::double precision)
         ->  Seq Scan on measurement_y2004m03
               Filter: (date_part('month'::text, logdate) = '3'::double precision)
(8 rows)

The above query output shows that EDB Postgres considered all partitions of the table measurement, even though we had included the partition column and a constant value in the WHERE clause. This is due to the check constraint, which uses the date_part function. The date_part function is not immutable in Postgres, so at planning time EDB Postgres doesn't know what value it will return. And if the user doesn't include a WHERE clause matching the check constraint, pruning will not work.

In Postgres you can make a function immutable by using ALTER FUNCTION command.

In the example below, we will make the date_part function immutable to check whether constraint_exclusion works with an immutable date_part function:

  1. Convert the date_part function to immutable:
edb=# ALTER FUNCTION date_part (text, timestamp without time zone ) immutable;
ALTER FUNCTION
  2. Run the EXPLAIN command to check the behavior of constraint_exclusion with the immutable function:
edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE date_part('month'::text, logdate) = 3;
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (date_part('month'::text, logdate) = '3'::double precision)
         ->  Seq Scan on measurement_y2004m03
               Filter: (date_part('month'::text, logdate) = '3'::double precision)
(6 rows)

As you can see, with the immutable function EDB Postgres was able to perform constraint_exclusion pruning.

What if we change the WHERE clause a little and include the < and = operators in our SQL queries? Below are examples:

edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE logdate < DATE '2004-03-01';
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m02
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m03
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
(8 rows)

edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE logdate = DATE '2004-02-01';
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (logdate = '01-FEB-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m02
               Filter: (logdate = '01-FEB-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m03
               Filter: (logdate = '01-FEB-04 00:00:00'::timestamp without time zone)
(8 rows)

As you can see, when the WHERE clause does not match the way the constraints are defined on the partitions, Postgres will scan all partitions.

Based on the above, we can conclude that if a user is planning to use the Postgres way of partitioning, they have to be careful about the constraint definitions in order to utilize constraint_exclusion pruning.

Let's modify the definition of the measurement table and verify the <, <=, >= and = operators in the WHERE clause.

CREATE TABLE measurement (
     city_id        int not null,
     logdate        date not null,
     peaktemp        int,
     unitsales      int
 ); 
CREATE TABLE measurement_y2004m02 (
     CHECK ( logdate >= DATE '2004-02-01' AND logdate < DATE '2004-03-01' )
 ) INHERITS (measurement);
CREATE TABLE measurement_y2004m03 (
     CHECK ( logdate >= DATE '2004-03-01' AND logdate < DATE '2004-04-01' )
 ) INHERITS (measurement);

Below is the explain plan based on the above definition:

edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE logdate < DATE '2004-03-01';
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m02
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
(6 rows)

edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE logdate = DATE '2004-03-01';
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (logdate = '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m03
               Filter: (logdate = '01-MAR-04 00:00:00'::timestamp without time zone)
(6 rows)

The above clearly shows that with the correct constraint definitions, constraint_exclusion pruning works for the >, <, >=, <= and = operators in the WHERE clause.

Fast pruning:

EDB Postgres has had CREATE TABLE PARTITION syntax since version 9.1. The PARTITION syntax in EDB Postgres uses one more kind of pruning, called fast pruning. Fast pruning uses the partition metadata and query predicates to efficiently reduce the set of partitions to scan. Fast pruning in EDB Postgres happens before query planning. Let's verify the behavior of fast pruning.
As mentioned, fast pruning works with partitions created using the EDB Postgres CREATE TABLE PARTITION syntax. Let's modify the above definition of the measurement table to use that syntax, as given below:

CREATE TABLE  measurement (
     city_id        int not null,
     logdate        date not null,
     peaktemp        int,
     unitsales      int
 )
PARTITION BY RANGE(logdate)
(PARTITION y2004m01 VALUES LESS THAN ('2004-02-01'),
 PARTITION y2004m02 VALUES LESS THAN ('2004-03-01'),
 PARTITION y2004m03 VALUES LESS THAN ('2004-04-01')
);
edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE logdate < DATE '2004-03-01';
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m01
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m02
               Filter: (logdate < '01-MAR-04 00:00:00'::timestamp without time zone)
(8 rows)

edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE logdate = DATE '2004-03-01';
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (logdate = '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m03
               Filter: (logdate = '01-MAR-04 00:00:00'::timestamp without time zone)
(6 rows)

edb=# EXPLAIN (costs off) SELECT count(*) FROM measurement WHERE logdate > DATE '2004-03-01';
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Aggregate
   ->  Append
         ->  Seq Scan on measurement
               Filter: (logdate > '01-MAR-04 00:00:00'::timestamp without time zone)
         ->  Seq Scan on measurement_y2004m03
               Filter: (logdate > '01-MAR-04 00:00:00'::timestamp without time zone)
(6 rows)

For more information on EDB Postgres pruning, please refer to the following link:
https://www.enterprisedb.com/docs/en/9.5/oracompat/Database_Compatibility_for_Oracle_Developers_Guide.1.327.html#

Tip: PPAS 9.4 and Global Temporary Tables

Customers who have moved/migrated their databases from Oracle to PPAS frequently ask for Global Temporary Tables in PPAS.

Currently, PPAS doesn't support Global Temporary Tables. However, there is a way users can achieve this functionality in PPAS.

Before we continue with the implementation, let's first understand the characteristics of a Global Temporary Table. The following are the important characteristics:
1. A Global Temporary Table gives a predefined structure for storing data.
2. It's an unlogged table, which means any activity on the table will not be logged.
3. The data in a global temporary table is private, such that data inserted by a session can only be accessed by that session.

Based on the above characteristics of a Global Temporary Table (aka GTT), we can achieve similar behavior by using the following method:
1. Create an UNLOGGED TABLE in PPAS, whose activity won't be logged.
2. Create a row-level security policy so that each session can see only its own rows (based on its backend PID).
3. Create a process that cleans up the data in the GTT for PIDs that are no longer active in the database.

Let's see how we can implement this in Advanced Server.

1. Create an UNLOGGED table with all columns required and extra column of Pid.

CREATE UNLOGGED TABLE test_global_temporary_table(id numeric, col text, pid bigint default pg_backend_pid());

2. Create a function to restrict the visibility of data.

CREATE OR REPLACE FUNCTION verify_pid_context (
    p_schema       TEXT,
    p_object       TEXT
)
RETURN VARCHAR2
IS
DECLARE
   predicate TEXT;
BEGIN
    IF ( current_setting('is_superuser') = 'on')
    THEN
      predicate := 'true';
    ELSE
      predicate := format('pid = %s',pg_backend_pid());
    END IF;
    RETURN predicate;
END;

3. Apply the security policy based on above function.

DECLARE
  v_object_schema VARCHAR2(30)   := 'public';
  v_object_name VARCHAR2(30)     := 'test_global_temporary_table';
  v_policy_name VARCHAR2(30)     := 'secure_by_pid';
  v_function_schema VARCHAR2(30) := 'public';
  v_policy_function VARCHAR2(30) := 'verify_pid_context';
  v_statement_types VARCHAR2(30) := 'INSERT,UPDATE,DELETE,SELECT';
  v_update_check BOOLEAN         := TRUE;
  v_enable BOOLEAN               := TRUE;
BEGIN
  DBMS_RLS.ADD_POLICY( v_object_schema,
                       v_object_name,
                       v_policy_name,
                       v_function_schema,
                       v_policy_function,
                       v_statement_types,
                       v_update_check,
                       v_enable
                     );
END;

4. Create an updatable view which hides the pid column. All sessions will use this view as the GTT.

   CREATE OR REPLACE VIEW test_global_temporary AS SELECT id, col FROM test_global_temporary_table;

5. Create a backend job which can clean up the table based on stale/old sessions.
For the job, the user/developer can do the following:
a. Use a superuser and execute a DELETE command on the base table:
DELETE FROM test_global_temporary_table WHERE pid NOT IN (SELECT pid FROM pg_stat_activity);
b. To schedule the above DELETE command, the user can use one of the following:
i. crontab
ii. Or the PPAS DBMS_SCHEDULER package (a sketch is given below).
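
For the DBMS_SCHEDULER route, the following is a minimal sketch, assuming the package is configured in your PPAS installation (it relies on pgAgent); the job name and the five-minute interval are arbitrary choices:

BEGIN
  DBMS_SCHEDULER.CREATE_JOB (
    job_name        => 'gtt_cleanup_job',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'DELETE FROM test_global_temporary_table WHERE pid NOT IN (SELECT pid FROM pg_stat_activity);',
    repeat_interval => 'FREQ=MINUTELY;INTERVAL=5',
    enabled         => TRUE
  );
END;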

6. GRANT privileges to the database user who will access the Global Temporary Table.

    GRANT ALL on test_global_temporary TO testdbuser;
    GRANT ALL on test_global_temporary_table To testdbuser;

Now, let's try the above implementation of the Global Temporary Table.

Open two sessions as a normal user (testdbuser), as given below:

[vibhorkumar@localhost ~]$ psql -U testdbuser edb
psql.bin (9.4.4.9)
Type "help" for help.

edb=> 
edb=> select pg_backend_pid();
 pg_backend_pid 
----------------
          32722
(1 row)


edb=> select pg_backend_pid();
 pg_backend_pid 
----------------
          32729
(1 row)

Now from both session insert some records:
From first session:

edb=> INSERT INTO test_global_temporary VALUES(1,'FROM pid 32722');
INSERT 0 1
edb=> INSERT INTO test_global_temporary VALUES(2,'FROM pid 32722');
INSERT 0 1
edb=> INSERT INTO test_global_temporary VALUES(3,'FROM pid 32722');
INSERT 0 1

From Second session:

edb=> INSERT INTO test_global_temporary VALUES(1,'FROM pid 32729');
INSERT 0 1
edb=> INSERT INTO test_global_temporary VALUES(2,'FROM pid 32729');
INSERT 0 1
edb=> INSERT INTO test_global_temporary VALUES(3,'FROM pid 32729');
INSERT 0 1

From First Session:

edb=> SELECT * FROM test_global_temporary;
 id |      col       
----+----------------
  1 | FROM pid 32722
  2 | FROM pid 32722
  3 | FROM pid 32722
(3 rows)

From Second Session:

edb=> SELECT * FROM test_global_temporary;
 id |      col       
----+----------------
  1 | FROM pid 32729
  2 | FROM pid 32729
  3 | FROM pid 32729
(3 rows)

This shows that an unlogged table, with the right RLS policy and a backend cleanup job, can be a potential solution for Global Temporary Tables.

Postgres and Transparent Data Encryption (TDE)

Security has always been a great concern for enterprises. Especially if you have crucial information stored in the database, you would always prefer to have high security around it. Over the years, technologies have evolved and provided better solutions around it.

If you have very sensitive information, you try to keep that information encrypted so that, in case somebody gains access to the system without authorization, they cannot view it.

For managing sensitive information, enterprises use multiple methods:

  1. Encrypting specific information.

If you are a PPAS user, you can use the DBMS_CRYPTO package, which provides a way of encrypting sensitive information in databases.

For more information, please refer to the following link:

http://www.enterprisedb.com/docs/en/9.4/oracompat/Database_Compatibility_for_Oracle_Developers_Guide.1.178.html#

For PostgreSQL, users can use the pgcrypto module.
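
For the DBMS_CRYPTO route in PPAS, the following is a minimal sketch of encrypting a single value (the hard-coded 16-byte key is a placeholder for illustration only; production systems need proper key management):

DECLARE
  l_key    RAW(16) := UTL_RAW.CAST_TO_RAW('0123456789abcdef');  -- placeholder key
  l_cipher RAW(2000);
BEGIN
  -- AES in CBC mode with PKCS5 padding
  l_cipher := DBMS_CRYPTO.ENCRYPT(
                src => UTL_RAW.CAST_TO_RAW('sensitive data'),
                typ => DBMS_CRYPTO.AES_CBC_PKCS5,
                key => l_key);
  DBMS_OUTPUT.PUT_LINE(RAWTOHEX(l_cipher));
END;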

  2. Transparent Data Encryption (TDE) is another method, employed by both Microsoft and Oracle, to encrypt database files. TDE offers encryption at the file level. This method solves the problem of protecting data at rest, i.e., encrypting databases both on the hard drive and consequently on backup media. Enterprises typically employ TDE to solve compliance issues such as PCI DSS.

Postgres Plus currently doesn’t have built-in TDE; however, enterprises looking for encryption at the database file level can use one of the following methods for protecting data at rest:

  1. Full Disk Encryption:

Full disk or partition encryption is one of the best ways of protecting your data. This method not only protects each file but also protects the temporary storage that may contain parts of these files. Full disk encryption protects all of your files, so you do not have to worry about selecting what you want to protect and possibly missing a file.

RHEL (Red Hat) supports Linux Unified Key Setup-on-disk-format (or LUKS). LUKS bulk-encrypts hard drive partitions.

For more information on LUKS, please refer to the following link:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Security_Guide/chap-Security_Guide-Encryption.html#sect-Security_Guide-LUKS_Disk_Encryption
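
For those who choose the LUKS route, the following is a minimal sketch of preparing an encrypted partition for a Postgres data directory (it assumes a spare device /dev/sdb1; luksFormat destroys everything on it):

# one-time setup: encrypt the partition and put a filesystem on it
cryptsetup luksFormat /dev/sdb1
cryptsetup luksOpen /dev/sdb1 pgcrypt
mkfs.ext4 /dev/mapper/pgcrypt

# mount the decrypted mapping where the Postgres data directory will live
mount /dev/mapper/pgcrypt /var/lib/ppas/9.4/data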

  2. File system-level encryption:

File system-level encryption is often called file/directory encryption. In this method, individual files or directories are encrypted by the file system itself.

There are stackable cryptographic file systems available which users can utilize in their environments.

File system-level encryption gives the following advantages:

  1. Flexible file-based key management, so that each file can be, and usually is, encrypted with a separate encryption key.
  2. Individual management of encrypted files, e.g., incremental backups of the individual changed files even in encrypted form, rather than backup of the entire encrypted volume.
  3. Access control can be enforced through the use of public-key cryptography, and the fact that cryptographic keys are only held in memory while the file that is decrypted by them is held open.

A stackable cryptographic file system can be used with Postgres for Transparent Data Encryption.

In this blog, I will discuss using mount -t ecryptfs, as it requires less setup overhead (LUKS requires a new disk to be configured and formatted before storing data on it; mount -t ecryptfs works with existing directories and data).

If enterprises want to give DBAs control over TDE, they can define a few sudo rules that let DBAs execute the encryption commands.

Following is a method, which they can use:

  • Ask the system admin to create sudo rules that allow the DBA to set up encryption for the Postgres Plus data directory. One common way to do this is using the “mount -t ecryptfs” command in Linux operating systems.
  • If the user needs to encrypt the /ppas94/data directory, they can use the following command:
sudo mount -t ecryptfs /ppas94/data /ppas94/data

More information can be found in the documentation from RHEL:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/ch-efs.html

Users can also specify the encryption key type (passphrase, openssl), cipher (aes, des3_ede, …), key byte size, and other options with the above command.

An example is given below:

# mount -t ecryptfs /home /home -o ecryptfs_unlink_sigs \
  ecryptfs_key_bytes=16 ecryptfs_cipher=aes ecryptfs_sig=c7fed37c0a341e19


CentOS 7 and RHEL 7 by default don't come with ecryptfs; therefore, users can also use the encfs command.

For more information on encfs, please refer to the following link:

https://wiki.archlinux.org/index.php/EncFS

Following are the steps to use encfs to encrypt the data directory.

  1. Create a data directory using the following commands, as the enterprisedb user:
   mkdir /var/lib/ppas/9.4/encrypted_data
   chmod 700 /var/lib/ppas/9.4/encrypted_data
  2. Use the following encfs command to encrypt the data directory:
         encfs /var/lib/ppas/9.4/encrypted_data/ /var/lib/ppas/9.4/data

A snapshot of the above command is given below:

encfs /var/lib/ppas/9.4/encrypted_data /var/lib/ppas/9.4/data
The directory "/var/lib/ppas/9.4/data" does not exist. Should it be created? (y,n) y
Creating new encrypted volume.
Please choose from one of the following options:
 enter "x" for expert configuration mode,
 enter "p" for pre-configured paranoia mode,
 anything else, or an empty line will select standard mode.
?> p

Paranoia configuration selected.

Configuration finished.  The filesystem to be created has
the following properties:
Filesystem cipher: "ssl/aes", version 3:0:2
Filename encoding: "nameio/block", version 3:0:1
Key Size: 256 bits
Block Size: 1024 bytes, including 8 byte MAC header
Each file contains 8 byte header with unique IV data.
Filenames encoded using IV chaining mode.
File data IV is chained to filename IV.
File holes passed through to ciphertext.

-------------------------- WARNING --------------------------
The external initialization-vector chaining option has been
enabled.  This option disables the use of hard links on the
filesystem. Without hard links, some programs may not work.
The programs 'mutt' and 'procmail' are known to fail.  For
more information, please see the encfs mailing list.
If you would like to choose another configuration setting,
please press CTRL-C now to abort and start over.

Now you will need to enter a password for your filesystem.
You will need to remember this password, as there is absolutely
no recovery mechanism.  However, the password can be changed
later using encfsctl.

New Encfs Password: 
Verify Encfs Password: 

  3. After encrypting the data directory, users also need to modify the postgresql-<version> service script to include the proper command for supplying the mount password. For that, they can either use sshpass or write their own program which passes the password when mounting the directory.

As you can see, achieving Transparent Data Encryption with Postgres is very easy.

Dynamic RLS implementation in PPAS 9.3

In the course of my work at EnterpriseDB, migrating Oracle databases to EnterpriseDB’s Postgres Plus Advanced Server is a common task. However, now and then we encounter unique situations. While working on a migration project recently, we encountered a new use case for RLS (Row level Security).

The customer had a centralized database where it stored a huge number of transactions. These transactions are performed by different business units located in different parts of the world. There are certain types of transactions that should not be visible even if they are being queried by the same company. That is where RLS comes in. With RLS, the specific transactions, or kinds of transactions, that remain visible are mapped back to an attribute in the table.

The customer needed the application to authenticate users and set the context for which records in the database become visible for a specific session.
In its deployment of Oracle, the customer had used the functions/procedure in the Oracle package DBMS_SESSION. In the application, the customer used DBMS_SESSION.SET_CONTEXT to set the context. And for the Row Level Security, the customer was using the DBMS_SESSION.SYS_CONTEXT to implement security around the transactions.

Postgres Plus Advanced Server has a DBMS_SESSION package that is compatible with Oracle. However, it does not currently offer users the capability of setting a user-defined context and implementing RLS based on that context. Given that others may experience situations similar to our customer's, I wanted to provide the procedures and functions that users could deploy.

SET_CONTEXT procedure

The definition of this procedure is given below:

CREATE OR REPLACE PROCEDURE set_context(namespace TEXT,
                                        attribute TEXT,
                                        val       TEXT)
AS
BEGIN
    EXECUTE IMMEDIATE format('SET %s.%s TO %s',namespace, attribute,val);
END;

Using this procedure, users can set their own context at session level.

The following is a function to help view the context in a session, which is set using the above procedure.

CREATE OR REPLACE FUNCTION USYS_CONTEXT(namespace TEXT,
                                        parameter TEXT,
                                        len       BIGINT DEFAULT 8)
RETURN TEXT
AS
  DECLARE
    return_val TEXT;
  BEGIN
    EXECUTE IMMEDIATE format('SHOW %s.%s',namespace,parameter) INTO return_val;
    RETURN substr(return_val,1,len);
    EXCEPTION WHEN others THEN
     RETURN NULL;
END;

The following is an example of how we can implement row level security based on the above procedure and functions:
1. Create a table which has the attribute context_check to map the context set by the procedure:

CREATE TABLE test_rls(id numeric, col text, context_check text);
INSERT INTO test_rls SELECT id, 'First_check','aaa' FROM generate_series(1,10) foo(id);
INSERT INTO test_rls SELECT id, 'First_check','bbb' FROM generate_series(1,10) foo(id);
INSERT INTO test_rls SELECT id, 'First_check','ddd' FROM generate_series(1,10) foo(id);

2. Now create a function to check the application context. Below is one such function:

CREATE OR REPLACE FUNCTION verify_user_context (
    p_schema       TEXT,
    p_object       TEXT
)
RETURN VARCHAR2
IS
DECLARE
   predicate TEXT;
BEGIN
    predicate := format('context_check = public.usys_context(''%s''::text,''%s''::text, 8)','CONTEXT','APP_PREDICATE');
    RETURN predicate;
END;

3. Now apply the security policy using the policy function, as shown below:

DECLARE
v_object_schema VARCHAR2(30) := 'public';
v_object_name VARCHAR2(30) := 'test_rls';
v_policy_name VARCHAR2(30) := 'secure_data';
v_function_schema VARCHAR2(30) := 'public';
v_policy_function VARCHAR2(30) := 'verify_user_context';
v_statement_types VARCHAR2(30) := 'INSERT,UPDATE,DELETE,SELECT';
v_update_check BOOLEAN := TRUE;
v_enable BOOLEAN := TRUE;
BEGIN
DBMS_RLS.ADD_POLICY(
v_object_schema,
v_object_name,
v_policy_name,
v_function_schema,
v_policy_function,
v_statement_types,
v_update_check,
v_enable
);
END;

Now we are set to test this implementation.

Connect to one session and try the following:
1. Set the context using procedure SET_CONTEXT as given below:

EXEC  SET_CONTEXT('CONTEXT','APP_PREDICATE','ddd');

EDB-SPL Procedure successfully completed

2. Verify in the same session that we have set the context properly:

 SELECT USYS_CONTEXT('CONTEXT','APP_PREDICATE',2000);
 usys_context
--------------
 ddd
(1 row)

3. Since we have the context set to ddd in this session, we should be able to see only the rows corresponding to that context:

beta=# SELECT * FROM test_rls ;
 id |     col     | context_check
----+-------------+---------------
  1 | First_check | ddd
  2 | First_check | ddd
  3 | First_check | ddd
  4 | First_check | ddd
  5 | First_check | ddd
  6 | First_check | ddd
  7 | First_check | ddd
  8 | First_check | ddd
  9 | First_check | ddd
 10 | First_check | ddd
(10 rows)
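
As a sanity check: in a fresh non-superuser session that has not called SET_CONTEXT, USYS_CONTEXT returns NULL (the underlying SHOW raises an error and the exception handler returns NULL), so the predicate context_check = NULL is never true and no rows are visible. A sketch of what you would observe:

beta=# SELECT * FROM test_rls ;
 id | col | context_check
----+-----+---------------
(0 rows)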

As you can see, the DBMS_RLS package in Postgres Plus Advanced Server can help in implementing Row Level Security based on Application Context.