Migrate an on-premises PostgreSQL cluster to Google Cloud

This article explains how to migrate an on-premises PostgreSQL cluster to Google Cloud. The approach uses PgBouncer as a connection pooler to minimize application downtime, and shows how to set up tools for monitoring the results. This article is intended for PostgreSQL administrators and sysadmins working in a Linux environment.

The following diagram shows a PgBouncer instance placed in front of an on-premises master.

Architecture of a PgBouncer instance placed in front of an on-premises master.

With connection pooling, clients are routed to an alternative node when the failover to Google Cloud happens, without having to redeploy application configurations or make application-level changes.
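For example, a minimal pgbouncer.ini might look like the following sketch; the host address, database name, and pool settings are illustrative assumptions. Failing over to a new master only requires changing the host value and reloading PgBouncer, with no application changes:

```ini
; /etc/pgbouncer/pgbouncer.ini -- minimal sketch (values are illustrative)
[databases]
; clients connect to PgBouncer; PgBouncer forwards to the current master
appdb = host=10.0.0.5 port=5432 dbname=appdb

[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type   = md5
auth_file   = /etc/pgbouncer/userlist.txt
pool_mode   = transaction
```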

The following diagram illustrates the migration.

Architecture of migration to Google Cloud.

To perform the migration, you shut down the current master and then promote the subordinate Google Cloud replica to master. PgBouncer reroutes traffic to the new master node on Google Cloud.


Costs

This tutorial uses billable components of Google Cloud, including Compute Engine and Cloud Storage.

You can use the pricing calculator to generate a cost estimate based on your projected usage.


Objectives

  • Set up PostgreSQL in Google Cloud.
  • Set up replication on Compute Engine.
  • Seed your data to a new instance.
  • Set up the PostgreSQL cluster on Google Cloud.
  • Switch over to the new server.
  • Implement monitoring.

Before you begin

This tutorial uses gcloud and gsutil commands, which you can run from a Cloud Shell instance launched from the Cloud Console. If you want to use gcloud and gsutil on your local workstation, install the Cloud SDK. The tutorial shows you how to run commands in Cloud Shell; if you use the Cloud SDK on your workstation, adjust the instructions accordingly.

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.

  4. Enable the Compute Engine API.

    Enable the API

Setting up your environment

To begin, you perform the following tasks:

  • Set up PostgreSQL in Google Cloud.
  • Set up replication on Compute Engine.

Then, you seed the data to a new instance before you start replicating from the master.

Set up PostgreSQL in Google Cloud

Follow the steps in How to Set Up PostgreSQL for High Availability and Replication with Hot Standby to set up PostgreSQL on Google Cloud. You configure PostgreSQL on an Ubuntu virtual machine instance on Compute Engine.

Set up replication on Compute Engine

Follow the steps in How to Set Up PostgreSQL on Google Compute Engine to configure PostgreSQL to run in Hot Standby mode on Compute Engine. You'll use two Compute Engine instances. One instance will run the primary PostgreSQL server and the other instance will run the standby server.

Although a master and a subordinate instance use different configuration properties, keep the property files identical on both instances to allow for seamless failovers.

A subordinate instance in a PostgreSQL cluster is denoted by the presence of the recovery.conf file.

In most cases, it's important to separate the data directory for the database from the boot disk. This example stores database files under a mount at /database.
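For example, a hypothetical /etc/fstab entry that keeps the database files on a separate persistent disk mounted at /database (the device path and filesystem type match the commands used later in this tutorial, but verify them for your own instances):

```
# /etc/fstab -- keep database files off the boot disk
/dev/sdb1  /database  ext4  defaults,nofail  0  2
```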

Add the following settings to the postgresql.conf file on the master instance to set up replication:

wal_level = 'hot_standby'
archive_mode = on
archive_command = 'test ! -f /postgresql/archivedir/%f && cp %p /postgresql/archivedir/%f'
max_wal_senders = 3
listen_addresses = '*'
wal_keep_segments = 8

Modify the postgresql.conf file on the replica to enable Hot Standby mode:

hot_standby = on

Add the remaining settings to the replica's recovery.conf file (standby_mode and primary_conninfo belong in recovery.conf rather than postgresql.conf):

standby_mode     = 'on'
primary_conninfo = 'host=${PRIMARY_IP} port=5432 user=repuser'
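For the replica to connect as repuser, the master's pg_hba.conf also needs an entry that allows replication connections; a sketch, assuming the replica's address is ${REPLICA_IP}:

```
# TYPE  DATABASE     USER     ADDRESS            METHOD
host    replication  repuser  ${REPLICA_IP}/32   md5
```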

You can offset some of the load on the master by sending read requests to the replica.

Seed the data

Because the master database contains a capped transaction log, most PostgreSQL migrations require the data to be seeded to a new instance before you can start replicating the master. You can seed the data in one of the following ways:

  • Dump a single database into a script or archive file using pg_dump.
  • Take a binary copy of a running database cluster using pg_basebackup.
  • Copy the data folder to the replica using rsync.
  • Restore a previous backup to the replica.

Of these options, we recommend restoring a previous backup to the replica. This approach avoids a performance hit while large volumes of data are being transferred, and the current cluster can continue to operate as normal.

After the initial seeding of the database, you can use the rsync command to feed changes to the replica that have occurred since the backup; the command syncs the data directories between the two instances. This step is important if the backup has fallen too far behind the master to catch up through normal replication.
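The rsync step can be sketched as follows. This demo syncs two local directories so it is runnable anywhere; in practice the source is $PGDATA on the master, the destination is the data directory on the replica (over SSH, for example rsync -a $PGDATA/ replica:/database/main/), and the server must be stopped or in backup mode while you sync:

```shell
# Demonstrate the rsync flags on two local directories.
src=$(mktemp -d) && dst=$(mktemp -d)
echo "data" > "$src/base_file"

# -a preserves permissions, ownership, and timestamps; --delete removes
# files on the destination that no longer exist on the source.
rsync -a --delete "$src/" "$dst/"

ls "$dst"            # the file now exists on the destination
rm -rf "$src" "$dst"
```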

Setting up the PostgreSQL cluster on Google Cloud

You can create the PostgreSQL cluster using cascade replication. First, migrate the database, as shown in the following diagram.

Architecture of your current environment migrated to Google Cloud.

Migrate the database

  1. Take a full backup from the running master server (the label argument can be any string):

    echo "select pg_start_backup('label',true);" | sudo su - postgres -c psql
    sudo tar cvfz postgresql_AAAAMMDD.tar.gz $PGDATA

    $PGDATA is the main data directory for PostgreSQL.

  2. Create a bucket in your Google Cloud project named gs://pg-repo/.

  3. Transfer the backup to the bucket you just created:

    master$ gsutil cp postgresql_AAAAMMDD.tar.gz gs://pg-repo/
  4. Transfer the backup file to the Google Cloud master:

    new_master$ gsutil cp gs://pg-repo/postgresql_AAAAMMDD.tar.gz .
  5. Restore the backup file into the Google Cloud master:

    new_master$ (cd / ; tar xvf postgresql_AAAAMMDD.tar.gz)
  6. Create a recovery.conf file in the $PGDATA directory that includes the following:

    standby_mode     = 'on'
    primary_conninfo = 'port=5432 host=${running_master_ip} user=${replication_user} application_name=cloud_master'
    trigger_file     = '/tmp/failover.postgresql.5432'
  7. Start the PostgreSQL service:

    sudo service postgresql start
  8. Wait until the Google Cloud master server syncs with the running master. The log shows entries like the following:

    tail -f /var/log/postgresql/postgresql*log
    2018-09-22 17:59:54 UTC LOG:  consistent recovery state reached at 0/230000F0
    2018-09-22 17:59:54 UTC LOG:  database system is ready to accept read only connections

    Additionally, you can query pg_stat_replication on the running master to determine whether the new subordinate (called cloud_master) is connected:

    postgres=# \x
    Expanded display is on.
    postgres=# select * from pg_stat_replication where application_name='cloud_master';
    -[ RECORD 1 ]----+------------------------------
    pid              | 16940
    usesysid         | 16402
    usename          | repmgr
    application_name | cloud_master

Create a subordinate database

  1. Shut down the database and the server:

    sudo service postgresql stop
    sudo shutdown -h now
  2. To verify that the instance has stopped, run the following command:

    gcloud compute instances describe master-instance-name | grep status

    The output shows the status of the instance as TERMINATED:

    status: TERMINATED

    Next, create snapshots of the data disk to help create new subordinates.

  3. In the Cloud Console, go to the Snapshots page.

    Go to the Snapshots page

  4. From the PostgreSQL disk, create a new snapshot.

  5. Start the Google Cloud master server.

  6. Go to the VM instances page, click master-instance-name, and then click Start.

    The PostgreSQL service starts automatically.

  7. To check that the service is running, run the following command:

    ps ax | grep postgres

    The result should be something like the following:

     1398 ?     S   0:00 /usr/lib/postgresql/9.3/bin/postgres -D /var/lib/postgresql/9.3/main -c config_file=/etc/postgresql/9.3/main/postgresql.conf
     1454 ?     Ss  0:00 postgres: checkpointer process
     1455 ?     Ss  0:00 postgres: writer process
     1456 ?     Ss  0:00 postgres: wal writer process
     1457 ?     Ss  0:00 postgres: stats collector process
  8. In the Cloud Console, go to the VM instances page, and then click Create instance.

  9. For the boot disk, choose Ubuntu 14.04.

  10. Click Management, disks, networking, SSH keys, and then add a new disk based on the snapshot that you created earlier.

  11. Start the new server and mount the disk:

    sudo mkdir /database && sudo mount /dev/sdb1 /database
  12. Install PostgreSQL:

    sudo apt-get install postgresql && sudo service postgresql stop
  13. Configure the data directory and replication values. Copy the postgresql.conf and pg_hba.conf files from the Google Cloud master, and edit the recovery.conf file to include the following:

    standby_mode = 'on'
    primary_conninfo = 'port=5432 host=${cloud_master_ip} user=${replication_user} application_name=cloud_slave_${identifier}'
    recovery_target_timeline = 'latest'
  14. Start the PostgreSQL service with the new config file, pointing to the Google Cloud master:

    sudo service postgresql restart
  15. Verify that the service is running:

    ps ax | grep postgres
  16. Check the Google Cloud master server with this query:

    postgres=# \x
    Expanded display is on.
    postgres=# select * from pg_stat_replication where application_name like 'cloud_slave%';
    -[ RECORD 1 ]----+-----------------------------
    pid              | 2466
    usesysid         | 16402
    usename          | repmgr
    application_name | cloud_slave_1
  17. Repeat these steps to create additional subordinates.

Switching over to the new server

  1. Change the config file in PgBouncer to point to the new Google Cloud master server.
  2. In the PgBouncer instance, shut down PgBouncer, promote the new master using the failover.postgresql.5432 trigger file, and then restart PgBouncer:

    service pgbouncer stop ; ssh ${cloud_master_ip} 'touch /tmp/failover.postgresql.5432' ; service pgbouncer start
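The trigger-file mechanism can be illustrated with a runnable sketch: the standby polls for the path named by trigger_file in recovery.conf and promotes itself when the file appears, which is why a single touch over SSH is enough to fail over. The paths here are illustrative stand-ins:

```shell
# Simulate a standby watching for a trigger file (PostgreSQL does this
# internally when trigger_file is set in recovery.conf).
TRIGGER=$(mktemp -u)   # stands in for /tmp/failover.postgresql.5432
STATE=$(mktemp)
echo "standby" > "$STATE"

# Background "standby": poll until the trigger file exists, then promote.
( while [ ! -f "$TRIGGER" ]; do sleep 0.1; done
  echo "master" > "$STATE" ) &

touch "$TRIGGER"       # equivalent to the touch in the switchover command
wait                   # the watcher notices the file and promotes
cat "$STATE"           # prints: master
rm -f "$TRIGGER" "$STATE"
```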

Setting up monitoring

Google has partnered with Blue Medora to provide PostgreSQL metrics. You can collect metrics and logs from your existing PostgreSQL instance and from your new Google Cloud instance.

For more information about Blue Medora's PostgreSQL monitoring support, visit their Getting started page.

Sample queries

Check all concurrent connections on the server:

 select * from pg_stat_activity;

(Master) Check the replication status:

select * from pg_stat_replication;

(Master) Check to see the lag in applying data on the replica:

select pg_xlog_location_diff(write_location, replay_location) from pg_stat_replication;

(Master) Check the byte lag in replication:

select client_hostname, client_addr, pg_xlog_location_diff(pg_stat_replication.sent_location,
  pg_stat_replication.replay_location) AS byte_lag from pg_stat_replication;
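The locations that pg_xlog_location_diff compares are WAL positions (LSNs) of the form X/Y, where X and Y are hexadecimal. On PostgreSQL 9.3 and later, an LSN X/Y corresponds to byte position X·2^32 + Y, so the byte lag is effectively 64-bit subtraction; a shell sketch of the same arithmetic (the sample LSNs are made up):

```shell
# Convert an LSN of the form X/Y to an absolute byte position
# (X * 2^32 + Y, the convention on PostgreSQL 9.3+).
lsn_to_bytes() {
  local hi=${1%%/*} lo=${1##*/}
  echo $(( (0x$hi << 32) + 0x$lo ))
}

# Byte lag between a sent and a replayed location (illustrative values).
sent=0/230000F0
replayed=0/23000000
echo $(( $(lsn_to_bytes $sent) - $(lsn_to_bytes $replayed) ))   # prints: 240
```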

(Subordinate) Check if the database is a replica or not:

select pg_is_in_recovery();

(Subordinate) Check the last data received from the master:

select pg_last_xlog_receive_location();

(Subordinate) Check the data last applied from the master:

select pg_last_xlog_replay_location();

(Subordinate) Check replication delay in seconds:

select now() - pg_last_xact_replay_timestamp();
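The seconds-of-lag query can be wrapped into a simple alerting check. In this sketch the lag value is passed in as an argument so the script is runnable without a server; in practice you would obtain it on the replica with something like psql -tAc "select coalesce(extract(epoch from now() - pg_last_xact_replay_timestamp()), 0);". The threshold and message format are assumptions:

```shell
# Exit nonzero when replication lag (in seconds) exceeds a threshold,
# so the check can run from cron or a monitoring agent.
check_lag() {
  local lag_seconds=$1 threshold=${2:-300}
  if [ "${lag_seconds%.*}" -ge "$threshold" ]; then
    echo "CRITICAL: replica is ${lag_seconds}s behind the master"
    return 1
  fi
  echo "OK: replica is ${lag_seconds}s behind the master"
}

check_lag 12 300              # prints: OK: replica is 12s behind the master
check_lag 901.4 300 || true   # prints: CRITICAL: replica is 901.4s behind the master
```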

Cleaning up

Delete the project

  1. In the Cloud Console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Next steps

  • Explore reference architectures, diagrams, tutorials, and best practices about Google Cloud. Take a look at our Cloud Architecture Center.
  • Learn how to use Google Cloud products to build end-to-end solutions.