Configure Cloud SQL and the external server for replication

Stay organized with collections Save and categorize content based on your preferences.

This page describes how to configure your external server for replication to Cloud SQL, create a source representation instance on Cloud SQL, and replicate the data to Cloud SQL. You need to go through all the steps on this page before proceeding to the replication steps.

An alternative to the steps described on this page is the Database Migration Service, which offers continuous replication or one-time database migration from an external server to Cloud SQL.

Before you begin

Terminology

  • External server. The PostgreSQL server external to Cloud SQL that you want to replicate data from. It's also referred to as the source database or the external database server. It can be another Cloud SQL instance or any other database server, such as on-premises, Amazon Relational Database Service (RDS), and so on.

  • Source representation instance. A mock of a Cloud SQL instance that represents the external server to the Cloud SQL replica. It's visible in the Google Cloud console and appears like a regular Cloud SQL instance, but it doesn't contain data, require configuration or maintenance, or affect billing.

  • Cloud SQL replica. The Cloud SQL instance that replicates from the external server. Also known as the external primary read replica.

  • Replication user account. The PostgreSQL user account on the external server with sufficient permissions to allow replication between the external server and the Cloud SQL replica.

  • Managed import. The process of importing data directly from the external server to the Cloud SQL replica. In this situation, Cloud SQL connects to the external server using the replication user account and runs the data dump directly on the external server to import data to the Cloud SQL replica.

Set up a Google Cloud project

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.

  4. Enable the Cloud SQL Admin API.

    Enable the API

  5. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  6. Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.

  7. Enable the Cloud SQL Admin API.

    Enable the API

  8. Make sure you have the Cloud SQL Admin, Storage Admin, and Compute Viewer roles on your user account.

    Go to the IAM page

Install the Google Cloud SDK

To configure replication, install Google Cloud SDK for your external server. You might want to install the SDK on your external server unless it's already installed elsewhere.

Set up the external server for replication

Cloud SQL supports continuous migrations from source databases to Cloud SQL destination databases.

Supported source databases for PostgreSQL include:

  • Self-managed (on premises or on any cloud VM that you fully control) PostgreSQL 9.4, 9.5, 9.6, 10, 11, 12, 13, and 14
  • Amazon RDS 9.6.10+, 10.5+, 11.1+, 12, 13, and 14
  • Amazon Aurora 10.11+, 11.6+, 12.4+, and 13.3+
  • Cloud SQL 9.6, 10, 11, 12, 13, and 14

Configuring your source requires configuring both the source instance and underlying source databases.

Configure your source instance

To configure your source instance, follow these steps:

  1. If your source instance does not include the postgres database, create it.
  2. Install the pglogical package on the source instance.
  3. Set the following parameters, as needed.

    If the source PostgreSQL instance is Amazon RDS, include these parameters in a new parameter group and attach the parameter group to the instance.

    • If the source is Cloud SQL, set the cloudsql.logical_decoding and cloudsql.enable_pglogical flags to on.

      To enable flags in Cloud SQL, see Configuring database flags.

    • Set shared_preload_libraries to include pglogical by using the following command:

      ALTER SYSTEM SET shared_preload_libraries = 'pglogical';
            
    • Set wal_level to logical by using the following command:

          ALTER SYSTEM SET wal_level = 'logical';
          

      If the source PostgreSQL instance is Amazon RDS, to enable WAL logs at the logical level, set the rds.logical_replication parameter to 1.

    • Set wal_sender_timeout to 0 by using the following command:

      ALTER SYSTEM SET wal_sender_timeout = 0;

      The value 0 disables the timeout mechanism that's used to terminate inactive replication connections.

    • Set max_replication_slots to the maximum number of replication slots that the source instance can support. Use the following command, after replacing # with the number:

      ALTER SYSTEM SET max_replication_slots = #;

      Cloud SQL requires one slot for each database that's migrated. Specify at least the number of subscriptions expected to connect, with some reserves for table synchronization.

      For example, if the source instance has 5 databases and 2 migration jobs are created for the source, then the number of replication slots must be at least 5 * 2 = 10, in addition to the number of replication slots that you already use.

    • Set max_wal_senders to at least the same as max_replication_slots, in addition to the number of senders already used on your instance. Use the following command, replacing # with the total number of WAL sender processes running simultaneously:

      ALTER SYSTEM SET max_wal_senders = #;

      For example, if the max_replication_slots parameter is set to 10, and you're already using 2 senders, then the number of WAL sender processes running at the same time would be 10 + 2 = 12.

    • Set max_worker_processes to at least the number of databases in the source instance, in addition to the number of worker processes already used on your instance. Use the following command, after replacing # with the total number:

      ALTER SYSTEM SET max_worker_processes = #;
  4. The parameters that you're setting in this step apply to a PostgreSQL database server that's running. You can also make these changes persistent by including them in the postgresql.conf file.

  5. If the Cloud SQL replica will use a private IP, configure the external server's firewall to allow the internal IP range allocated for the private services access of the VPC network of the replica.
  6. To apply the configuration changes, restart the source instance.

Enable replication delay monitoring for PostgreSQL versions preceding 9.6

If you're migrating from a PostgreSQL version lower than 9.6, then the replication delay metric isn't available by default. You can use one of three alternatives to track this metric and ensure minimal downtime when you promote the database:

  • Option 1: Enable the Cloud SQL external server to track the replication delay by granting access to a specific query. Using a user with the SUPERUSER privilege, perform the following:

    1. Define the following function to allow the external server to query for the replication delay.

      CREATE OR REPLACE FUNCTION pg_stat_replication_user()
      RETURNS TABLE (
      pid               integer                  ,
      usesysid          oid                      ,
      username          name                    ,
      application_name  text                     ,
      client_addr       inet                     ,
      client_hostname   text                     ,
      client_port       integer                  ,
      backend_start     timestamp with time zone ,
      backend_xmin      xid                      ,
      state             text                     ,
      sent_location     pg_lsn                   ,
      write_location    pg_lsn                   ,
      flush_location    pg_lsn                   ,
      replay_location   pg_lsn                   ,
      sync_priority     integer                  ,
      sync_state        text
      )
      LANGUAGE SQL
      SECURITY DEFINER
      AS $$
       SELECT *
       FROM pg_catalog.pg_stat_replication;
      $$;
      
    2. Grant the EXECUTE permission to the user by running the following commands:

      1. REVOKE EXECUTE ON FUNCTION pg_stat_replication_user() FROM public;
      2. GRANT EXECUTE ON FUNCTION pg_stat_replication_user() to {replication_user};
  • Option 2: Grant the SUPERUSER privilege directly to the user used to connect to the source instance. This allows the external server to read the replication delay directly.

  • Option 3: Track the replication delay independently by using the following query:

        SELECT current_timestamp, application_name,
        pg_xlog_location_diff(pg_current_xlog_location(), pg_stat_replication.sent_location) AS sent_location_lag,
        pg_xlog_location_diff(pg_current_xlog_location(), pg_stat_replication.write_location) AS write_location_lag,
        pg_xlog_location_diff(pg_current_xlog_location(), pg_stat_replication.flush_location) AS flush_location_lag,
        pg_xlog_location_diff(pg_current_xlog_location(), pg_stat_replication.replay_location) AS replay_location_lag
        FROM pg_stat_replication
        WHERE application_name like 'cloudsql%';
      

    In this option, Cloud SQL doesn't reflect the replication delay metric in the graphs or API responses.

Configure your source databases

The Cloud SQL external server migrates all databases under your source instance other than the following:

  • For on-premise sources: template databases template0 and template1
  • For Amazon RDS sources: template0, template1, and rdsadmin
  • For Cloud SQL sources: template databases template0 and template1

Do the following on each database in your source instance that isn't included in the preceding list:

  1. To install the pglogical extension, run the following command on every database on your source instance:

    CREATE EXTENSION IF NOT EXISTS pglogical

    For tables that don't have primary keys, Cloud SQL supports the migration of the initial snapshot and INSERT statements during the change data capture (CDC) phase. Migrate UPDATE and DELETE statements manually.

  2. Connect to the instance and run the following commands to set privileges for the user on each of the migrated databases, as well as the default postgres database.

    The user that you use to connect to the source instance is configured as the user in the Connection Profiles page. You can create a new user or reuse an existing one.

    • On all schemas on each database to migrate, except for the information schema and schemas starting with pg_, run the following command:

      GRANT USAGE on SCHEMA schema to username;
      GRANT SELECT on ALL SEQUENCES in SCHEMA schema to username;
      GRANT SELECT on ALL TABLES in SCHEMA schema to username;
    • On each database to migrate, run the following command:

      GRANT USAGE on SCHEMA pglogical to PUBLIC;
    • To get replication information from source databases,on all databases, run the following command:

      GRANT SELECT on ALL TABLES in SCHEMA pglogical to username;
    • If your source is Amazon RDS, run the following command:

      GRANT rds_replication to username;

      Otherwise, run the following command:

      ALTER USER username with REPLICATION;

Set up a source representation instance

The source representation instance references the external server. It contains only the request data from the external server. Create the request data and use it in a curl command that creates the source representation instance in Cloud SQL.

Create the request data

The request data contains basic information about your external server in JSON format. The request data can be configured for a Cloud SQL replica on a public or private network and should contain this information:

source.json

    {
      "name": "SOURCE_NAME",
      "region": "REGION",
      "databaseVersion": "DATABASE_VERSION",
      "onPremisesConfiguration": {
        "hostPort": "SOURCE_HOST",
        "username": "USERNAME",
        "password": "PASSWORD",
        "caCertificate": "SOURCE_CERT",
        "clientCertificate": "CLIENT_CERT",
        "clientKey": "CLIENT_KEY"
      }
    }

managed import example

// example of source.json for external server that
// - initiates replication from a Cloud SQL managed import
// - doesn't use SSL/TSL

{
  "name": "cloudsql-source-instance",
  "region": "us-central1",
  "databaseVersion": "POSTGRES_9_6",
  "onPremisesConfiguration": {
    "hostPort": "192.0.2.0:3306",
    "username": "replicationUser",
    "password": "486#@%*@"
  }
}
Property Description
SOURCE_NAME The name of the source representation instance to create.
REGION The region where you want the source representation instance to reside.
DATABASE_VERSION The database version running on your external server. The options are POSTGRES_9_6, POSTGRES_10, POSTGRES_11, POSTGRES_12, POSTGRES_13, or POSTGRES_14.
SOURCE HOST The IPv4 address and port for the external server, or the the DNS address for the external server. If the external server is hosted on Cloud SQL, the port is 5432.
USERNAME The replication user account on the external server.
PASSWORD The password for the replication user account.
BUCKET The name of the bucket that contains the dump file. Include only if you're setting up replication with a dump file that exists in a Cloud Storage bucket.
DUMP_FILE A file in the bucket that contains the data from the external server.
CLIENT_CA_CERT The CA certificate on the external server. Include only if SSL/TLS is used on the external server.
CLIENT_CERT The client certificate on the external server. Required only for server-client authentication. Include only if SSL/TLS is used on the external server.
CLIENT_KEY The private key file for the client certificate on the external server. Required only for server-client authentication. Include only if SSL/TLS is used on the external server.

Create a source representation instance

Before you start this step, create a JSON file that contains your source request data.

Then, to create the source representation instance in Cloud SQL, open a terminal and run the following commands :

curl

    gcloud auth login
    ACCESS_TOKEN="$(gcloud auth print-access-token)"
    curl --header "Authorization: Bearer ${ACCESS_TOKEN}" \
         --header 'Content-Type: application/json' \
         --data @JSON_PATH \
         -X POST \
         https://sqladmin.googleapis.com/sql/v1beta4/projects/PROJECT-ID/instances

example

gcloud auth login
ACCESS_TOKEN="$(gcloud auth print-access-token)"
curl --header "Authorization: Bearer ${ACCESS_TOKEN}" \
     --header 'Content-Type: application/json' \
     --data @./source.json \
     -X POST \
     https://sqladmin.googleapis.com/sql/v1beta4/projects/MyProject/instances
Property Description
PROJECT_ID The ID for your project in Google Cloud.
JSON_PATH The path to the JSON file that contains the request data for the external server.

Set up a Cloud SQL replica

The Cloud SQL replica eventually contains the data from the external server. In this step, you create the request data and use it in a curl command that creates the Cloud SQL replica in Cloud SQL.

Create the request data

The request data contains basic information about your external server and Cloud SQL replica in JSON format. The request data can be configured for a Cloud SQL replica on a public or private network and should contain this information:

replica.json

    {
        "settings": {
            "tier": "TIER",
            "dataDiskSizeGb": "DISK_SIZE",
            "ipConfiguration": {
                "privateNetwork": "projects/PROJECT_ID/global/networks/NETWORK_NAME"
             },
             "availabilityType": "AVAILABILITY_TYPE"
        },
        "masterInstanceName": "SOURCE_REPRESENTATION_INSTANCE_NAME",
        "region": "SOURCE_REGION",
        "databaseVersion": "DATABASE_VERSION",
        "name": "REPLICA_NAME"
    }

example

    {
        "settings": {
            "tier": "db-custom-4-15360",
            "dataDiskSizeGb": "100"
        },
        "masterInstanceName": "source-instance",
        "region": "us-central1",
        "databaseVersion": "POSTGRES_14",
        "name": "replica-instance"
    }
Property Description
TIER The type of machine to host your replica instance. If you do not know which machine type to use, start with db-custom-2-7680. You can change its size and other supported values later if needed.
DISK_SIZE The storage size for the Cloud SQL replica, in GB.
PROJECT_ID The ID of your project in Google Cloud. Include ipConfiguration only if the Cloud SQL replica is on a private network.
NETWORK_NAME The name of the private network to use with the Cloud SQL replica.
AVAILABILITY_TYPE The availability type of the Cloud SQL replica. By default, the value is ZONAL. To make the replica HA, set the value to REGIONAL. To learn about the allowed values, see SqlAvailabilityType.
After you create an external server HA replica, you cannot change it to a non-HA replica. This is true conversely as well. You cannot change an external server non-HA replica to an HA replica.
Manual failover might lead to unrecoverable migration if attempted when the instance is still loading the initial data, or to temporary downtime if the instance is already replicating from the source. Check the replication status.
SOURCE_REPRESENTATION_INSTANCE_NAME The name of the source representation instance.
SOURCE_REGION The region assigned to the source representation instance.
DATABASE_VERSION The database version to use with the Cloud SQL replica. Your choices are POSTGRES_9_6, POSTGRES_10, POSTGRES_11, POSTGRES_12, POSTGRES_13, or POSTGRES_14. This can match the database version running on your external server or it can be at most one version higher.
REPLICA_NAME The name of the Cloud SQL replica to create.

Create the Cloud SQL replica

Before you start this step, create a JSON file that contains your replica request data. Then, to create a Cloud SQL replica, open a Cloud Shell terminal and run these commands:

curl

    gcloud auth login
    ACCESS_TOKEN="$(gcloud auth print-access-token)"
    curl --header "Authorization: Bearer ${ACCESS_TOKEN}" \
         --header 'Content-Type: application/json' \
         --data @JSON_PATH \
         -X POST \
         https://sqladmin.googleapis.com/sql/v1beta4/projects/PROJECT-ID/instances

example

    gcloud auth login
    ACCESS_TOKEN="$(gcloud auth print-access-token)"
    curl --header "Authorization: Bearer ${ACCESS_TOKEN}" \
         --header 'Content-Type: application/json' \
         --data @./replica.json \
         -X POST \
         https://sqladmin.googleapis.com/sql/v1beta4/projects/MyProject/instances
Property Description
PROJECT_ID The ID of your project in Google Cloud, which must be the same as that of the source instance.
JSON_PATH The path to the JSON file that contains the request data for the Cloud SQL replica.

Verify your setup

To ensure your instances were set up correctly, go to the Cloud SQL Instances page.

You should see your source representation instance and the Cloud SQL replica, in a listing similar to the following:

Instance ID Type Public IP
(-) source-representation-instance Database external primary 10.68.48.3:5432
     replica-instance Database read replica 34.66.48.59

Add users to the Cloud SQL replica

You cannot import database user accounts from the external server, but you can create them on a Cloud SQL replica. Do this before you replicate from the external server.

Get the Cloud SQL replica's outgoing IP address

You can use the outgoing IP address of the Cloud SQL replica to create a secure connection between the external server and the Cloud SQL replica. You won't be charged for this IP address.

Console

To get the outgoing IP address for the replica, do the following:

  1. In the Google Cloud console, go to the Cloud SQL Instances page.

    Go to Cloud SQL Instances

  2. Next to the Cloud SQL replica's public IP address, hold the pointer over the More info tooltip and retrieve the outgoing IP address.

    Note that the outgoing IP address is not the IP address displayed in the main listing for the replica in the Google Cloud console.

gcloud

To get the outgoing IP address for the replica, run the following command:

gcloud sql instances describe REPLICA_NAME --format="default(ipAddresses)"
Property Description
REPLICA_NAME The name of the Cloud SQL replica whose outgoing public IP address you want to retrieve.

Allow incoming connections on the external server

The Cloud SQL replica needs to connect to the external server for replication to succeed. You must configure the network firewall for your external server to accept connections from the Cloud SQL replica's outgoing IP address if the following conditions apply:

  • The external server is behind a firewall or some other network restriction.
  • Your Cloud SQL replica is using a public IP.

To connect to the Cloud SQL replica, you use the replica's primary IP address. This IP address is displayed in the Google Cloud console.

Update the source representation instance to allow replication to the Cloud SQL replica

After you set up the source representation instance for the Cloud SQL replica, you might need to update the source representation instance. For example, these scenarios require an update to your configurations:

  • The host, port, or IP of the external server changes.
  • You want to use a different PostgreSQL replication user.
  • The password of the PostgreSQL replication user changes.
  • The SSL certificates used to securely connect to the external server change.

Seed the Cloud SQL replica

For the initial loading of data from the external server into the Cloud SQL replica, use a managed import. It uses a service that extracts data from the external server and imports it into the Cloud SQL instance directly. For more information, see Using a managed import to set up replication from external databases.

Monitor replication

When the Cloud SQL replica finishes the initial data load, it connects to the external server and applies all updates that were made after the export operation. Confirm your replication status.

It's important to check the replication status before promoting the replica to a standalone instance. If the replication process isn't successfully completed, a promoted replica doesn't have all the changes from your external server.

If replication delay is not trending toward 0, take steps to address it. You might want to check these metrics: /postgresql/external_sync/initial_sync_complete, postgresql/external_sync/max_replica_byte_lag, and database/replication/state. View the list of Cloud SQL metrics.

Once the Cloud SQL replica has caught up with the external server and there's no replication delay on the Cloud SQL replica, connect to your database. Run the appropriate database commands to make sure that the contents are as expected when compared with the external server. Retain your external server until the necessary validations are done.

Promote the replica

Promote your replica by following these steps:

  1. Promote the replica to a primary instance.
  2. Add read replicas to your instance.
  3. Optional: Configure your instance for high availability (HA). To prevent additional downtime, you can enable HA while setting up a replica by setting AVAILABILITY_TYPE to REGIONAL.

What's next