Some UNIX Utilities for performance monitoring

vmstat command:
vmstat is a command which can be used to display system statistics. The syntax is given below:

vmstat <# seconds> <# of times>  

Example

vmstat 1 10  

The above command displays the system statistics every second, 10 times.

Sample Output of the command:

vmstat  
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----  
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa  
 1  0      0 950888  57536 512868    0    0   559    45  479  671  9  2 73 15 

Let's understand the column definitions:
The procs column has the following subcolumns:
r: count (processes waiting for run time)
b: count (processes in uninterruptible sleep)
w: count (processes which are swapped out and runnable) (on RHEL)

If processes keep appearing in the b or w columns, the DBA/system admin has to check the system.
The memory column has the following subcolumns:
swpd: amount of swap space currently in use (kB)
free: amount of idle memory (kB)
buff: memory used as buffers (kB)
cache: memory used as page cache (kB)

Swap information:
si: memory swapped in from disk (kB/s)
so: memory swapped out to disk (kB/s)

I/O information:
bi: blocks received from a block device, i.e. reads (blocks/s)
bo: blocks sent to a block device, i.e. writes (blocks/s)

System information:
in: interrupts per second, including the clock
cs: context switches per second

CPU information (as percentages of total CPU time):
us: user time
sy: system time
id: idle time
wa: time spent waiting for I/O

If id is consistently zero, then the CPU is overburdened. Also, if si and so are high, then there is a lot of swapping happening in the system.

To find more information on the CPU, the sar command can be used; it reports CPU utilization.

Syntax: sar <#seconds> <# of times>  

The first row of sar output gives system information, including the kernel version, OS information, machine name, etc.
The real information comes in five columns. The first column (Time) is the timestamp at which the sample was collected.
The second column (%user) gives the load generated by user processes.
The third column (%system) gives the load generated by system processes, including daemons.
The fourth column (%iowait) gives the time spent waiting for I/O (i.e. hardware).
The fifth column (%idle) gives the idle time; in other words, the time the system was waiting for work to start.
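For example, the following collects five samples at two-second intervals (the output shown is illustrative; the exact columns vary across sysstat versions):

sar 2 5  
Linux 2.6.28-19-generic (vibhore-laptop)  10/31/2010  
  
06:10:01 PM     %user   %system   %iowait     %idle  
06:10:03 PM     12.53      3.01      8.42     76.04  
06:10:05 PM     10.20      2.55      6.10     81.15  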

If the idle percentage stays at or below 20 for a long time, the system will be performing slowly and the system admin needs to look into it. Before reaching any conclusion, the user must watch the output of the sar command for at least one day. If the system consistently shows <=20% idle, then the other processes and the load need to be analyzed.

To find information on the I/O of the system, the user can use the iostat command.

Syntax: iostat <# seconds> <# of times> [disk#]  

The iostat command gives the I/O information for all the disks.

The first column is the device (i.e. hard disk or partition).
The second column (tps) gives the transfers per second, where a transfer is an I/O request to the device.
The third column (Blk_read/s) gives the blocks read per second.
The fourth column (Blk_wrtn/s) gives the blocks written per second.
The fifth column (Blk_read) gives the total number of blocks read from the device.
The sixth column (Blk_wrtn) gives the total number of blocks written.

Sample Output is given below:

Linux 2.6.28-19-generic (vibhore-laptop)  10/31/2010  _i686_ (2 CPU)  
  
avg-cpu:  %user   %nice %system %iowait  %steal   %idle  
          22.58    0.83    5.56   43.94    0.00   27.50  
  
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn  
sda             138.18      6067.27       446.48     809920      59600  
sda1            137.70      6052.41       446.48     807936      59600  
sda2              0.02         0.04         0.00          6          0  
sda5              0.31        12.42         0.00       1658          0  

Some New Security Features in PG9.0

1. New Grant and Revoke in PG9.0

In previous versions of PG (7.x, 8.x), DBAs and users missed having a GRANT/REVOKE command which could set permissions on all the tables inside a schema at once. Now they don't have to.

From PG9.0, the user can execute a single GRANT or REVOKE command to change privileges on all the tables in a schema:

GRANT SELECT ON ALL TABLES IN SCHEMA test TO test_user;  

Here is the output of a query which shows that the above command has granted SELECT privileges on all the tables in schema test:

postgres=# select * from information_schema.table_privileges where grantee = 'test_user';  
 grantor  |  grantee  | table_catalog | table_schema | table_name | privilege_type | is_grantable | with_hierarchy   
----------+-----------+---------------+--------------+------------+----------------+--------------+----------------  
 postgres | test_user | postgres      | test         | test       | SELECT         | NO           | NO  
 postgres | test_user | postgres      | test         | test2      | SELECT         | NO           | NO  

Similarly, the user can execute a single REVOKE command to revoke a privilege on all the tables in a schema:

REVOKE SELECT ON ALL TABLES IN SCHEMA test FROM test_user;  
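To confirm, the same information_schema query can be rerun; it should now return zero rows for test_user (a quick check; connection options here are illustrative):

psql -p 5432 -U postgres -c "select count(*) from information_schema.table_privileges where grantee = 'test_user';"  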

2. Assign Default Privileges to a Role

In PG9.0, managing role privileges is now easier.
PG9.0 supports ALTER DEFAULT PRIVILEGES, as given below:

postgres=# alter default privileges for role newrole GRANT SELECT ON TABLES to public;  
ALTER DEFAULT PRIVILEGES  
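The stored defaults can be inspected through the pg_default_acl catalog (a sketch; the output formatting will vary):

psql -p 5432 -U postgres -c "select r.rolname, d.defaclobjtype, d.defaclacl from pg_default_acl d join pg_roles r on r.oid = d.defaclrole;"  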

3. Now the user can enforce a check on the strength of passwords. The passwordcheck module ships with PG9.0 by default.

To enable this module, the user has to add the following to the postgresql.conf file and restart the PG instance:

shared_preload_libraries = '$libdir/passwordcheck'  

A working example is given below:

postgres=# alter user test_user password 'test_user';  
ERROR:  password must not contain user name  
postgres=# alter user test_user password 'test123';  
ERROR:  password is too short  
postgres=# alter user test_user password 'test45678';  
ALTER ROLE  

This feature also has some limitations. It does not work properly if somebody passes an encrypted password, so it is not recommended when pre-encrypted passwords are already being passed to the DB (i.e. when such a security scheme is already implemented).

Understanding Real Application Cluster Processes

Along with all the processes of a stand-alone Oracle Database, Real Application Clusters provides some additional processes which play an important role. These processes are:

1. LMON: Global Enqueue Service Monitor
Responsibilities Assigned:

This process is responsible for monitoring the entire cluster's global enqueues and resources. It also manages instance failures and process failures and the associated recovery for the Global Cache Service and Global Enqueue Service.
The main responsibility of this process is recovery of global resources. The services provided by this process are also known as Cluster Group Services.

View: So, we have a watcher and recovery agent for global resources.

2. LMDx: Global Enqueue Service Daemon (Lock Agent)
Responsibility Assigned:

This process controls access to global enqueues and resources. It also handles deadlock detection and remote enqueue requests (i.e. requests from other instances).

View: Hmm, we have a recorder agent and resource information keeper.

3. LMSx: Global Cache Service Processes
Responsibility Assigned:

This process works as the message handler. RAC spawns up to 10 LMSx processes. They handle acquisition interrupts and blocking interrupt requests from remote instances, maintain consistency of the read versions of blocks, and control the flow of messages. The overall responsibilities, in bullets, are given below:
a. Manage resource requests and cross-instance calls for shared resources
b. List all invalid lock elements and validate locks during recovery
c. Handle deadlock detection and lock conversion timeouts

View: So, we have a clerk which keeps records of locks and requests for global resources.

4. LCKx: Broadcaster
Responsibility Assigned:

This process mainly manages requests and broadcasts information across the instances. Its workload is automatically balanced/shared when there are multiple LMSx processes.

View: This is a Broadcaster. I understand we require it.

5. DIAG: Diagnosability Daemon
Responsibility assigned:

It monitors the health of instances and keeps data about instance and process failures. The naming convention followed for this process is ora_diag_<SID>, e.g. ora_diag_NYDB1.

View: Hmm, an auditor and nursing-assistant process.
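On a running RAC node, these background processes can be listed at the OS level; each name carries the instance SID as a suffix (the SID NYDB1 below is just an example):

ps -ef | grep -E 'ora_(lmon|lmd|lms|lck|diag)_NYDB1' | grep -v grep  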

Now let's understand some important services and who is responsible for them.

1. Global Cache Service:
Responsibilities that come under it:

a. Track the location and status of data blocks, i.e. the mode, role, and access privileges of the various instances.
b. Transfer blocks between instances and perform the appropriate lock mode conversions.

Who is responsible for it: LMSx and LMDx

2. Global Enqueue Service:
Responsibilities that come under it:

a. Manage and track the status of the enqueuing mechanism, which includes all non-Cache-Fusion intra-instance operations.
b. Perform concurrency control on dictionary cache locks, library cache locks, and transactions.

Who is responsible: Almost all the RAC-specific processes are responsible for it.

So, the above are the important processes/managers which we get in RAC for Oracle database management, providing the global services of GCS and GES.

Making Slony source compatible with EnterpriseDB Advanced Server

Since the edb-replication that ships with Advanced Server is built tightly against one particular version, users are sometimes not able to replicate data between two different versions of Advanced Server.

For replicating data between two Advanced Server versions, it is important to have the same version of Slony on the source and target databases.

I did some research on the Slony source code and tried to make it compatible with Advanced Server.

If the user plainly compiles the Slony source code against an Advanced Server database, then the user will start getting version-related errors while configuring replication.

The reason for such messages is that Slony is not able to parse the Advanced Server version string, and therefore it will not continue further with replication.
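The root cause is the version banner: SELECT version() on community PostgreSQL returns a string beginning with "PostgreSQL", while Advanced Server returns one beginning with "EnterpriseDB" and carrying a four-part version number, which Slony's parser does not match. For example (connection options and the exact banner text are illustrative):

psql -p 5444 -U enterprisedb -c "SELECT version();" edb  
 EnterpriseDB 8.4.5.16 on i686-pc-linux-gnu, compiled by GCC ...  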

Going through the Slony source code, I found that Slony uses dbutils.c (one copy under src/slon and one under src/slonik) for finding the version of PostgreSQL and checking the compatibility of the PostgreSQL version with Slony.

The following are the main programs which use dbutils.c:
1. slon
2. slonik

If the user makes the following changes in the Slony source code, the user will be able to compile it against different Advanced Server versions for replication.
The changes required are given below:

File: src/slon/dbutils.c   
  
static int db_get_version(PGconn *conn)  
{  
 PGresult    *res;  
 SlonDString query;  
 char     versionstr[7];  
 int     version=0;  
 int     major=0;  
 int     minor=0;  
 int     patch=0;  
 dstring_init(&query);  
 slon_mkquery(&query, "SELECT version();");  
 res = PQexec(conn, dstring_data(&query));  
  
 if ( !res || PQresultStatus(res) != PGRES_TUPLES_OK )  
        {  
  PQclear(res);  
  return -1;  
 }  
 if (sscanf(PQgetvalue(res, 0, 0), "PostgreSQL %d.%d.%d", &major, &minor, &patch) < 2)  
 {  
  PQclear(res);  
  return -1;  
 }  
 PQclear(res);  
 snprintf(versionstr, 7, "%.2d%.2d%.2d", major, minor, patch);  
 version=atoi(versionstr);  
 dstring_free(&query);  
 return version;  
}  

Replace the above function with the following function:

static int db_get_version(PGconn *conn)  
{  
 PGresult    *res;  
 SlonDString query;  
 char     versionstr[7];  
 int     version=0;  
 int     major=0;  
 int     minor=0;  
 int     patch=0;  
 int     dummy=0;  
 dstring_init(&query);  
 slon_mkquery(&query, "SELECT version();");  
 res = PQexec(conn, dstring_data(&query));  
  
 if ( !res || PQresultStatus(res) != PGRES_TUPLES_OK )  
        {  
  PQclear(res);  
  return -1;  
 }  
 if (sscanf(PQgetvalue(res, 0, 0), "EnterpriseDB %d.%d.%d.%d", &major, &minor, &patch, &dummy) < 2)  
 {  
  PQclear(res);  
  return -1;  
 }  
 PQclear(res);  
 snprintf(versionstr, 7, "%.2d%.2d%.2d", major, minor, patch);  
 version=atoi(versionstr);  
 dstring_free(&query);  
 return version;  
}  
  
File: src/slonik/dbutils.c  
  
int db_get_version(SlonikStmt * stmt, SlonikAdmInfo * adminfo)  
{  
 PGresult   *res;  
 SlonDString query;  
 char        versionstr[7];  
 int     major=0;  
 int     minor=0;  
 int         patch=0;  
 int         version=0;  
  
 if (db_begin_xact(stmt, adminfo) < 0)  
  return -1;  
  
 dstring_init(&query);  
 slon_mkquery(&query, "select version();");  
 res = db_exec_select(stmt, adminfo, &query);  
 dstring_free(&query);  
  
 if (res == NULL)  
  return -1;  
  
 if (sscanf(PQgetvalue(res, 0, 0), "PostgreSQL %d.%d.%d", &major, &minor, &patch) < 2)  
 {  
  fprintf(stderr, "%s:%d: failed to parse %s for DB version\n",  
    stmt->stmt_filename, stmt->stmt_lno,  
    PQgetvalue(res, 0, 0));  
  PQclear(res);  
  return -1;  
 }  
 PQclear(res);  
 snprintf(versionstr, 7, "%.2d%.2d%.2d", major, minor, patch);  
 version=atoi(versionstr);  
 return version;  
}  

Replace the above function with the following function:

int db_get_version(SlonikStmt * stmt, SlonikAdmInfo * adminfo)  
{  
 PGresult   *res;  
 SlonDString query;  
 char        versionstr[7];  
 int     major=0;  
 int     minor=0;  
 int         patch=0;  
 int         version=0;  
 int      dummy=0;  
  
 if (db_begin_xact(stmt, adminfo) < 0)  
  return -1;  
  
 dstring_init(&query);  
 slon_mkquery(&query, "select version();");  
 res = db_exec_select(stmt, adminfo, &query);  
 dstring_free(&query);  
  
 if (res == NULL)  
  return -1;  
  
 if (sscanf(PQgetvalue(res, 0, 0), "EnterpriseDB %d.%d.%d.%d", &major, &minor, &patch, &dummy) < 2)  
 {  
  fprintf(stderr, "%s:%d: failed to parse %s for DB version\n",  
    stmt->stmt_filename, stmt->stmt_lno,  
    PQgetvalue(res, 0, 0));  
  PQclear(res);  
  return -1;  
 }  
 PQclear(res);  
 snprintf(versionstr, 7, "%.2d%.2d%.2d", major, minor, patch);  
 version=atoi(versionstr);  
 return version;  
}  

Now, if you have made the above changes, you will be able to compile Slony against Advanced Server and use it. Please note: I have tested the above changes with Slony version 1.2.21, replicating data between databases of versions "EnterpriseDB 8.3.0.112" and "EnterpriseDB 8.4.5.16".

Interesting !! 🙂

pg_hotbackup utility for Backup of PG

Users always look for a backup utility which gives some good options and can be used with any PG instance to take backups.

I thought along the same lines and created a pg_hotbackup script. The pg_hotbackup utility is a server-side utility which takes the backup on the server side and keeps the backups in a backup directory.

The options I have included are the following:
Compulsory options:

-H <bin directory> (bin directory location; the utility will use the psql command of this PG instance)  
-b <backup directory> (directory where the user wants to keep the backups)  
-p <port> (port number of the PG instance)  
-U <username> (username)  
-P <password> (password)  

Some other options:

-a: Archive Only option [1|0]  
-r: Retention policy [in days]  
-l: List backups  
-n: Backup file name  
-v: Validate Only [1|0]  
-R: Retention Only [1|0]  
\?: Help  

So, I have all the options with me.

Now, let's understand what we need:
1. We need a catalog file in which the utility can keep backup information and validate that information against new backups and the retention policy.
Since the same binaries are used for a given instance, I have preferred to keep the catalog file in the binary location.
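Each catalog entry is a single semicolon-delimited record of the form filename;label;start time;end time;type;status (the format can be inferred from the functions below), for example:

hotbackup_5432_19-10-2010.tar.gz;HOTBACKUP;19/10/2010 06:11:41;19/10/2010 06:12:07;Full Backup;Successful  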

2. A function which can update the catalog file with each archive/hot backup.
The following code helps achieve this objective:

updatecatalog()  
 {  
 BCKLIST=`ls -1 $BIN/.$port* 2>/dev/null`  
 if [ $? -ne 0 ];then  
               echo "No Catalog Found! Please check the Home Directory and Backup Directory"  
               exit 1  
 fi  
 # /tmp/list is produced by the retention function and holds the expired backup files  
 if [ ! -f /tmp/list ];then  
  echo "Nothing to Update"  
  exit  
 fi  
 # rewrite the catalog in one pass, marking every file named in /tmp/list as Expired  
 awk -F';' 'NR==FNR {expired[$1]=1; next} {st=($1 in expired)?"Expired":$6; printf "%s;%s;%s;%s;%s;%s\n",$1,$2,$3,$4,$5,st}' /tmp/list $BCKLIST >/tmp/newcatlog  
 if [ -f /tmp/newcatlog ];then  
  cat /tmp/newcatlog >$BCKLIST  
  rm /tmp/newcatlog  
 fi  
 rm /tmp/list 2>/dev/null  
 echo "Catalog update is successful!"  
 }  
3. A function which can validate the backups on every run. The validation function should check the backup information in the catalog, i.e. verify that the backups listed are actually available in the defined destination.
The following is such a validation function:

validate_backups()  
 {  
 BCKLIST=`ls -1 $BIN/.$port* 2>/dev/null`  
 if [ $? -ne 0 ];then  
  echo "No catalog available for the mentioned bin and cluster"  
  exit 0  
 fi  
 # drop catalog entries whose backup file no longer exists in the backup directory  
 for i in `cat $BCKLIST|awk -F';' '{print $1}'`  
 do  
    ls -1 $BCKDIR/$i >/dev/null 2>&1  
    if [ $? -ne 0 ];then  
  cat $BCKLIST|grep -v $i >/tmp/valcatalog.log  
  cat /tmp/valcatalog.log >$BCKLIST  
    fi  
 done  
 rm /tmp/valcatalog.log 2>/dev/null  
 echo "All backups have been validated!"  
 }  

4. Along with this, we also need a function which can implement a retention policy on the backups, i.e. keep only as many days of backups as the user wants:

backup_retention()  
 {  
if [ -z $RETENTION ];then  
 echo "No retention days..."  
 echo "Skipping the retention.."  
 return 0  
else  
 # record the expired files (basenames, to match the catalog) and then remove them  
 find $BCKDIR -mtime +$RETENTION -type f -exec basename {} \; >/tmp/list  
 find $BCKDIR -mtime +$RETENTION -type f -exec rm {} \;  
fi  
echo "Updating Catalog.."  
validate_backups  
updatecatalog  
}  

Along with retention, the above function will also update the catalog information.

5. Now, the functions which are used for the hot backup itself, i.e. start and stop backup functions:

 startbackup()   
   {  
     QUERY="select pg_start_backup('HOTBACKUP `date +"%d-%m-%Y"`');"  
     export PGPASSWORD="$PASSWD"  
      $BIN/psql -t -p $port  -U $USER -c "$QUERY" template1  
   }  
  
 stopbackup()  
   {  
 QUERY="select pg_stop_backup();"  
 export PGPASSWORD="$PASSWD"  
 $BIN/psql -t -p $port -U $USER -c "$QUERY" template1  
  }  
  

6. One backup function which can be used for the backup itself; here is that function:

 backupdata()  
    {  
 if [ -z $FILENAME ];then  
   NEWFILENAME=hotbackup_`echo $port`_`date +"%d-%m-%Y"`.tar.gz  
   LABEL=HOTBACKUP  
 else  
   NEWFILENAME=`echo $FILENAME`_hotbackup.tar.gz  
   LABEL=$FILENAME  
 fi  
     if [ -f /tmp/.backup.lock ];then   
 echo "One backup is already running.."  
 exit 1  
    fi  
      LISTFILE=".`echo $port`_`echo $BCKDIR|sed 's/\//-/g'`"  
      TYPE="Full Backup"  
      STARTTIME=`date +"%d/%m/%Y %H:%M:%S"`  
      touch /tmp/.backup.lock  
      startbackup  
      export PGPASSWORD=$PASSWD  
      DATADIR=`$BIN/psql -t -U $USER -p $port -c "show data_directory;"|sed '/^$/d'`  
      cmd="tar -zchvf $BCKDIR/$NEWFILENAME $DATADIR"  
     if [ -d $BCKDIR ];then  
         :  
     else  
        mkdir -p $BCKDIR  
     fi  
       $cmd >/tmp/Hot_backup_`date +"%d-%m-%Y"`.log 2>&1  
       if [ $? -ne 0 ];then  
  echo "Backup Failed.. Please /tmp/Hot_backup_`date +"%d-%m-%Y"`.log"  
  stopbackup  
  ENDTIME=`date +"%d/%m/%Y %H:%M:%S"`  
  STATUS="Failed!"  
     echo "$NEWFILENAME;$LABEL;$STARTTIME;$ENDTIME;$TYPE;$STATUS">>$BIN/$LISTFILE   
  rm /tmp/.backup.lock  
         exit 1  
       else   
  stopbackup  
  rm /tmp/.backup.lock  
  ENDTIME=`date +"%d/%m/%Y %H:%M:%S"`  
  STATUS="Successful"  
  echo "$NEWFILENAME;$LABEL;$STARTTIME;$ENDTIME;$TYPE;$STATUS">>$BIN/$LISTFILE  
  echo "Data Directory of port $port has been backed up successfully!"  
       fi  
    }   

The above function also updates the catalog file, as a shell scripter can figure out from the function definition.

7. Along with this, we also need an archive backup function. Below is an archive function which updates the status in the catalog and also takes the backup:

backuparchive()   
{  
touch /tmp/.archive.lock  
TYPE="Archive Backup"  
STARTTIME=`date +"%d/%m/%Y %H:%M:%S"`  
       LISTFILE=".`echo $port`_`echo $BCKDIR|sed 's/\//-/g'`"  
if [ -z $FILENAME ];then  
  NEWFILENAME=archivebackup_`echo $port`_`date +"%d-%m-%Y-%H:%M"`.tar.gz  
  LABEL="Archive"  
else  
  NEWFILENAME=`echo $FILENAME`_archive.tar.gz  
  LABEL=$FILENAME  
fi  
archiveloc=`$BIN/psql -t -U $USER -p $port -c "show archive_command"|sed '/^$/d'|awk  '{k=split($0,arr," "); for(i=0;i<=k;i++){ if (match(arr[i],"%f")) print arr[i];}}'|xargs echo|awk '{print $1}'|sed 's/%f//g'`  
       cmd="tar -zchvf $BCKDIR/$NEWFILENAME $archiveloc/*"   
       $cmd >/tmp/backup_archive_`date +"%d-%m-%Y"`.log 2>&1  
if [ $? -ne 0 ];then  
 echo "Backup Failed! Please check the log file /tmp/backup_archive_`date +"%d-%m-%Y"`.log"  
 rm /tmp/.archive.lock  
 ENDTIME=`date +"%d/%m/%Y %H:%M:%S"`  
 STATUS="Failed!"  
        echo "$NEWFILENAME;$LABEL;$STARTTIME;$ENDTIME;$TYPE;$STATUS">>$BIN/$LISTFILE   
 exit 1  
fi  
echo "Archive Backup is successful"  
ENDTIME=`date +"%d/%m/%Y %H:%M:%S"`  
STATUS="Successful"  
echo "$NEWFILENAME;$LABEL;$STARTTIME;$ENDTIME;$TYPE;$STATUS">>$BIN/$LISTFILE  
rm /tmp/.archive.lock  
}  

So, I have all the necessary functions. The remaining one is for when the user wants to view the status of the backups and list them. For that purpose, the following is the list backups function:

listbackups()  
 {  
   
 BCKLIST=`ls -1 $BIN/.$port* 2>/dev/null`   
 if [ $? -ne 0 ];then  
    echo "No Catalog Found! Please check the Home Directory and Backup Directory"  
  exit 1  
 fi  
   
 BCKDIR=`echo $BCKLIST|awk -F"/" '{print $NF}'|cut -d"." -f2|awk -F"_" '{print $2}'|sed 's/-/\//g'`  
 DATADIR=`$BIN/psql -t -U $USER -p $port -c "show data_directory;"|sed '/^$/d'` >/dev/null  
 if [ $? -ne 0 ];then  
  echo "Unable to Connect to Database for Verification. Exiting"  
  exit 1  
 fi  
 echo ""|awk 'BEGIN{printf "\t\t%-20s\n","LISTING BACKUPS"}END{printf "\t\t%-20s\n\n","---------------"}'  
 echo "PORT: $port"  
 echo "DATADIR: $DATADIR"  
 echo "LOCATION: $BCKDIR"  
 echo ""  
        awk -F';' 'BEGIN{ printf "%-20s\t%-20s\t%-20s\t%-20s\t%-20s \n","Filename","START DATE","END DATE", "TYPE", "STATUS"}{printf "%-20s\t%-20s\t%-20s\t%-20s\t%-20s\n",$2,$3,$4,$5,$6}' $BCKLIST    
 }  

Now I have all the required functions. Using the above functions I can write a command, pg_hotbackup, which can be used for base backups of the data directory, archive log backups, implementing a retention policy, and listing all the backups.
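A minimal driver tying these functions together might look like the sketch below (the option handling here is a hypothetical reconstruction; the real utility would add more validation):

#!/bin/sh  
# Hypothetical driver sketch for pg_hotbackup; parses the options described  
# above, then dispatches to the functions defined earlier.  
while getopts "H:b:p:U:P:a:r:n:lv:R:" opt; do  
 case $opt in  
  H) BIN=$OPTARG ;;          # bin directory of the PG instance  
  b) BCKDIR=$OPTARG ;;       # backup directory  
  p) port=$OPTARG ;;         # port of the PG instance  
  U) USER=$OPTARG ;;         # database user  
  P) PASSWD=$OPTARG ;;       # password  
  a) ARCHIVEONLY=$OPTARG ;;  # 1 = archive backup only  
  r) RETENTION=$OPTARG ;;    # retention in days  
  n) FILENAME=$OPTARG ;;     # backup file name/label  
  l) LISTONLY=1 ;;           # list the backups and exit  
  v) VALIDATEONLY=$OPTARG ;; # 1 = validate the catalog and exit  
  R) RETENTONLY=$OPTARG ;;   # 1 = apply retention and exit  
  \?) echo "usage: pg_hotbackup -H bindir -b backupdir -p port -U user -P passwd [-a 1|0] [-r days] [-n name] [-l] [-v 1|0] [-R 1|0]"; exit 1 ;;  
 esac  
done  
[ "$LISTONLY" = "1" ] && { listbackups; exit 0; }  
[ "$VALIDATEONLY" = "1" ] && { validate_backups; exit 0; }  
[ "$RETENTONLY" = "1" ] && { backup_retention; exit 0; }  
if [ "$ARCHIVEONLY" = "1" ];then  
 backuparchive  
else  
 backupdata  
 backuparchive  
fi  
backup_retention  
listbackups  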

A few snapshots are given below:

Hotbackup:  
./pg_hotbackup -H $PGHOME/bin -b /Users/postgres/BackupDir -U postgres -p 5432 -P postgres  
0/46000020
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
0/460000D8
Data Directory of port 5432 has been backed up successfully!
Archive Backup is successful
No retention days...
Skipping the retention..
  LISTING BACKUPS       
  ---------------       
  
PORT: 5432  
DATADIR:  /Library/PostgreSQL/9.0/data  
LOCATION: /Users/postgres/BackupDir  
  
Filename             START DATE           END DATE             TYPE                 STATUS                 
HOTBACKUP            19/10/2010 06:11:41  19/10/2010 06:12:07  Full Backup          Successful            
Archive              19/10/2010 06:12:07  19/10/2010 06:12:29  Archive Backup       Successful            
HOTBACKUP            22/10/2010 02:04:40  22/10/2010 02:05:16  Full Backup          Successful            
Archive              22/10/2010 02:05:16  22/10/2010 02:05:44  Archive Backup       Successful     
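To run it unattended, a nightly cron entry along these lines could be used (the script path and credentials here are illustrative):

0 2 * * * /usr/local/bin/pg_hotbackup -H /Library/PostgreSQL/9.0/bin -b /Users/postgres/BackupDir -p 5432 -U postgres -P postgres -r 7 >/tmp/pg_hotbackup_cron.log 2>&1  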

ROLLUP Analytical function in PostgreSQL.

Currently, there is no version of PG which supports ROLLUP.

However, people look for this analytical function's features.

A ROLLUP query's result can be achieved using a UNION of queries.
First, let's understand what ROLLUP does:

If an SQL query has col1, col2, col3, aggregate(col4), then ROLLUP

processing works something like this:
1. Show the aggregate of col4 grouped by col1, col2, col3.
2. Then ROLLUP computes the subtotal, showing the aggregate based on col1, col2.
3. Then it shows the aggregate/subtotal by col1 alone.
4. And at the end, the total/sum.

In short, it creates progressively higher-level subtotals, moving from right to left through the list of grouping columns. Finally, it creates a grand total.

In PG, this can be achieved by writing subqueries and UNIONing them; UNION ALL should be used so that genuinely duplicate result rows are preserved.

So, if the ROLLUP query is something like the one given below:

select col1,col2,col3,agg(col4) from relation group by rollup(col1,col2,col3)  

then in PG the above can be written as:

select col1,col2,col3,agg(col4) from relation group by col1,col2,col3  
UNION ALL  
select col1,col2,NULL,agg(col4) from relation group by col1,col2  
UNION ALL  
select col1,NULL,NULL,agg(col4) from relation group by col1  
UNION ALL  
select NULL,NULL,NULL,agg(col4) from relation;  

Following is an example:
In Oracle:

select manager_id,job_id,sum(salary) from hr.employees group by rollup(manager_id,job_id);  

In PG:

select manager_id,job_id,sum(salary) from hr.employees group by manager_id,job_id  
UNION ALL  
select manager_id,NULL,sum(salary) from hr.employees group by manager_id  
UNION ALL  
select NULL,NULL,sum(salary) from hr.employees;  

PG_REORG Utility for VACUUM FULL online

pg_reorg is a utility made by NTT for reorganizing a table's structure.

The concept is simple: if you have all the required pointers and data in the same pages, then accessing them is much faster. This is what pg_reorg provides to the user.

The following are some of the options which pg_reorg provides.

-o [--order-by] columns

This option makes pg_reorg organize the table data by the mentioned columns. Behind the scenes, pg_reorg creates a new table using CTAS, with a SELECT query that includes an ORDER BY clause on the columns given with -o.

-n [--no-order] tablename

When this option is used, pg_reorg does an online VACUUM FULL. How does it manage that? A simple concept: create a new table using CTAS, and create a trigger on the current table to track DML. Once the new table is created, replay the tracked DML on it. It works well. This option is only for tables which have a primary key.

By default pg_reorg does a CLUSTER of the table, and it follows the same concept, i.e. CLUSTER without locking the table.

After performing any of these operations, pg_reorg runs ANALYZE on the table.
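Typical invocations might look like the following sketch, based on the option list above (the database, table, and column names are hypothetical):

# online VACUUM FULL of one table (requires a primary key)  
pg_reorg -n mytable mydb  
  
# reorganize the table data ordered by the given column(s)  
pg_reorg -o mycolumn mydb  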

The following is some of the elog information which it emits at the backend (at DEBUG2 level):

elog(DEBUG2, "---- reorg_one_table ----");  
elog(DEBUG2, "target_name    : %s", table->target_name);  
elog(DEBUG2, "target_oid     : %u", table->target_oid);  
elog(DEBUG2, "target_toast   : %u", table->target_toast);  
elog(DEBUG2, "target_tidx    : %u", table->target_tidx);  
elog(DEBUG2, "pkid           : %u", table->pkid);  
elog(DEBUG2, "ckid           : %u", table->ckid);  
elog(DEBUG2, "create_pktype  : %s", table->create_pktype);  
elog(DEBUG2, "create_log     : %s", table->create_log);  
elog(DEBUG2, "create_trigger : %s", table->create_trigger);  
elog(DEBUG2, "create_table   : %s", table->create_table);  
elog(DEBUG2, "delete_log     : %s", table->delete_log);  
elog(DEBUG2, "lock_table     : %s", table->lock_table);  
elog(DEBUG2, "sql_peek       : %s", table->sql_peek);  
elog(DEBUG2, "sql_insert     : %s", table->sql_insert);  
elog(DEBUG2, "sql_delete     : %s", table->sql_delete);  
elog(DEBUG2, "sql_update     : %s", table->sql_update);  
elog(DEBUG2, "sql_pop        : %s", table->sql_pop);  
Interesting Hunh.

Some comparison between clusterdb and pg_reorg is given at the link below:
http://reorg.projects.postgresql.org/index.html

With all of this, the DBA has to take care of a few things:
while pg_reorg is working on a table, the DBA should not let anyone execute DDL changes or create indexes on the table currently being reorganized.