Use a managed import to set up replication from external databases

This page describes how to set up and use a managed import for data when replicating from an external server to Cloud SQL.

You must complete all the steps on this page. When finished, you can administer and monitor the source representation instance the same way as you would any other Cloud SQL instance.

Before you begin

Before you begin, complete these steps:

  1. Configure the external server.

  2. Create the source representation instance.

  3. Set up the Cloud SQL replica.

Verify your replication settings

After your setup is complete, ensure that the Cloud SQL replica can replicate from the external server.

The following external sync settings must be correct:

  • Connectivity between the Cloud SQL replica and external server
  • Replication user privileges
  • Version compatibility
  • The Cloud SQL replica is not already replicating

To verify these settings, open a Cloud Shell terminal and enter the following commands:

curl

gcloud auth login
ACCESS_TOKEN="$(gcloud auth print-access-token)"
curl --header "Authorization: Bearer ${ACCESS_TOKEN}" \
     --header 'Content-Type: application/json' \
     --data '{
         "syncMode": "SYNC_MODE"
       }' \
     -X POST \
     https://sqladmin.googleapis.com/sql/v1beta4/projects/PROJECT_ID/instances/REPLICA_INSTANCE/verifyExternalSyncSettings

example

gcloud auth login
ACCESS_TOKEN="$(gcloud auth print-access-token)"
curl --header "Authorization: Bearer ${ACCESS_TOKEN}" \
     --header 'Content-Type: application/json' \
     --data '{
         "syncMode": "ONLINE"
       }' \
     -X POST \
     https://sqladmin.googleapis.com/sql/v1beta4/projects/myproject/instances/myreplica/verifyExternalSyncSettings

These calls return a list of type sql#externalSyncSettingErrorList.

If the list is empty, there are no errors. A response without errors looks like this:

{
  "kind": "sql#externalSyncSettingErrorList"
}

Property            Description
SYNC_MODE           Ensures that you can keep the Cloud SQL replica and the external server in sync after replication is set up. Sync modes include EXTERNAL_SYNC_MODE_UNSPECIFIED, ONLINE, and OFFLINE.
PROJECT_ID          The ID of your project in Google Cloud.
REPLICA_INSTANCE    The ID of your Cloud SQL replica.
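
The empty-list check described above can be scripted. The following is a minimal sketch, not an official tool: `check_sync_response` is a hypothetical helper name, and it assumes that any problems are reported under an `errors` key in the JSON response, while a clean response contains only the `kind` field.

```shell
# Hypothetical helper (not part of gcloud or the Cloud SQL Admin API).
# Takes a verifyExternalSyncSettings JSON response as its first argument.
# Assumption: problems are reported under an "errors" key; a clean
# response contains only the "kind" field.
check_sync_response() {
  response="$1"
  if printf '%s' "$response" | grep -q '"errors"'; then
    echo "errors found"
    return 1
  fi
  echo "no errors"
}
```

To use it, capture the output of the curl command in a variable and pass that variable to `check_sync_response`. A non-zero exit status signals that the response should be inspected by hand before you continue.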

Start replication on the external server

After you have verified that you can replicate from the external server, you are ready to perform the replication. Expect the replica to import about 25-50 GB per hour.
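
To plan a maintenance window, the stated import rate can be turned into a rough duration estimate. The sketch below is illustrative only: `estimate_import_hours` is a hypothetical helper, it takes the database size in whole gigabytes, and it assumes the 25-50 GB per hour range holds for your workload.

```shell
# Hypothetical helper: estimates import duration from the 25-50 GB/hour range.
# $1: database size in whole gigabytes. Prints "BEST-WORST" in hours.
estimate_import_hours() {
  size_gb="$1"
  fast_rate=50   # GB per hour, upper end of the stated range
  slow_rate=25   # GB per hour, lower end of the stated range
  # Integer ceiling division: (a + b - 1) / b
  best=$(( (size_gb + fast_rate - 1) / fast_rate ))
  worst=$(( (size_gb + slow_rate - 1) / slow_rate ))
  echo "${best}-${worst}"
}

estimate_import_hours 500   # prints 10-20
```

For a 500 GB database, this gives an estimate of 10 to 20 hours.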

curl

gcloud auth login
ACCESS_TOKEN="$(gcloud auth print-access-token)"
curl --header "Authorization: Bearer ${ACCESS_TOKEN}" \
     --header 'Content-Type: application/json' \
     --data '{
         "syncMode": "SYNC_MODE",
         "skipVerification": "SKIP_VERIFICATION"
       }' \
     -X POST \
     https://sqladmin.googleapis.com/sql/v1beta4/projects/PROJECT_ID/instances/REPLICA_INSTANCE/startExternalSync

example

gcloud auth login
ACCESS_TOKEN="$(gcloud auth print-access-token)"
curl --header "Authorization: Bearer ${ACCESS_TOKEN}" \
     --header 'Content-Type: application/json' \
     --data '{
         "syncMode": "ONLINE"
       }' \
     -X POST \
     https://sqladmin.googleapis.com/sql/v1beta4/projects/myproject/instances/myreplica/startExternalSync

Property            Description
SYNC_MODE           Verifies that you can keep the Cloud SQL replica and the external server in sync after replication is set up.
SKIP_VERIFICATION   A boolean (true or false) that controls whether to skip the built-in verification step before syncing your data. Skipping is recommended only if you have already verified your replication settings.
PROJECT_ID          The ID of your project in Google Cloud.
REPLICA_INSTANCE    The ID of your Cloud SQL replica.
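
Because the request body mixes a string field and a boolean field, it can help to generate it rather than edit it by hand. The following sketch uses a hypothetical helper, `build_sync_body`, and assumes that `skipVerification` is sent as an unquoted JSON boolean and that only ONLINE and OFFLINE are useful sync modes for this call.

```shell
# Hypothetical helper: prints a startExternalSync request body.
# $1: sync mode (ONLINE or OFFLINE)
# $2: skip verification (true or false)
# Assumption: skipVerification is an unquoted JSON boolean.
build_sync_body() {
  mode="$1"
  skip="$2"
  case "$mode" in
    ONLINE|OFFLINE) ;;
    *) echo "invalid sync mode: $mode" >&2; return 1 ;;
  esac
  case "$skip" in
    true|false) ;;
    *) echo "skipVerification must be true or false" >&2; return 1 ;;
  esac
  printf '{"syncMode": "%s", "skipVerification": %s}\n' "$mode" "$skip"
}

build_sync_body ONLINE false
```

The printed body can be passed to the curl command with `--data "$(build_sync_body ONLINE false)"`. Rejecting unknown values locally avoids sending a request that the API would refuse.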

Monitor the migration

Once you start replication from the external server, you need to monitor replication. To learn more, see Monitoring replication. You can then complete your migration.

Troubleshoot

Consider the following troubleshooting options:

Issue: Read replica did not start replicating on creation.
Troubleshooting: There's probably a more specific error in the log files. Inspect the logs in Cloud Logging to find the actual error.

Issue: Unable to create read replica - invalidFlagValue error.
Troubleshooting: One of the flags in the request is invalid. It could be a flag you provided explicitly or one that was set to a default value.

First, check that the value of the max_connections flag is greater than or equal to the value on the primary.

If the max_connections flag is set appropriately, inspect the logs in Cloud Logging to find the actual error.

Issue: Unable to create read replica - unknown error.
Troubleshooting: There's probably a more specific error in the log files. Inspect the logs in Cloud Logging to find the actual error.

If the error is: set Service Networking service account as servicenetworking.serviceAgent role on consumer project, then disable and re-enable the Service Networking API. This action creates the service account necessary to continue with the process.

Issue: Disk is full.
Troubleshooting: The primary instance disk can fill up during replica creation. Edit the primary instance to upgrade it to a larger disk size.

Issue: The replica instance is using too much memory.
Troubleshooting: The replica uses temporary memory to cache often-requested read operations, which can lead it to use more memory than the primary instance.

Restart the replica instance to reclaim the temporary memory space.

Issue: Replication stopped.
Troubleshooting: The maximum storage limit was reached and automatic storage increase isn't enabled.

Edit the instance to enable automatic storage increase.

Issue: Replication lag is consistently high.
Troubleshooting: The write load is too high for the replica to handle. Replication lag occurs when the SQL thread on a replica is unable to keep up with the IO thread. Some kinds of queries or workloads can cause temporary or permanent high replication lag for a given schema. Typical causes of replication lag include:

  • Slow queries on the replica. Find and fix them.
  • Tables without a unique or primary key. Every update on such a table causes a full table scan on the replica.
  • Queries like DELETE ... WHERE field < 50000000. With row-based replication, these cause replication lag because a huge number of updates pile up on the replica.

Some possible solutions include:

  • Edit the instance to increase the size of the replica.
  • Reduce the load on the database.
  • Index the tables.
  • Identify and fix slow write queries.
  • Recreate the replica.

Issue: Errors when rebuilding indexes in PostgreSQL 9.6.
Troubleshooting: You get an error from PostgreSQL informing you that you need to rebuild a particular index. This can be done only on the primary instance. If you create a new replica instance, you soon get the same error again. Hash indexes are not propagated to replicas in PostgreSQL versions below 10.

If you must use hash indexes, upgrade to PostgreSQL 10+. Otherwise, if you also want to use replicas, don't use hash indexes in PostgreSQL 9.6.

Issue: Replica creation fails with timeout.
Troubleshooting: Long-running uncommitted transactions on the primary instance can cause read replica creation to fail.

Recreate the replica after stopping all running queries.
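
One of the replication-lag causes listed above, tables with no unique or primary key, can be checked with a catalog query. The sketch below assumes a MySQL external server and uses a hypothetical helper, `print_missing_key_query`, that only prints the query. Running it against your server with your own client and credentials is left to you.

```shell
# Hypothetical helper: prints a MySQL query that lists base tables that have
# neither a primary key nor a unique constraint.
# Assumption: the external server is MySQL and exposes information_schema.
print_missing_key_query() {
  cat <<'SQL'
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
  ON c.table_schema = t.table_schema
 AND c.table_name = t.table_name
 AND c.constraint_type IN ('PRIMARY KEY', 'UNIQUE')
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN
      ('mysql', 'information_schema', 'performance_schema', 'sys')
  AND c.constraint_name IS NULL;
SQL
}
```

The output can be piped to a MySQL client that is already configured to reach the external server. Any table the query returns is a candidate for adding a key before you retry replication.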

Review your replication logs

When you verify your replication settings, logs are produced.

You can view these logs by following these steps:

  1. Go to the Logs Viewer in the Google Cloud console.

  2. Select the Cloud SQL replica from the Instance dropdown.
  3. Select the replication-setup.log log file.
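
The same log can likely be read from the command line. The sketch below builds a Cloud Logging filter with a hypothetical helper, `replication_log_filter`. It assumes that the log name is `cloudsql.googleapis.com/replication-setup.log` and that the `database_id` resource label has the form PROJECT_ID:INSTANCE_ID. Confirm both in the console before relying on it.

```shell
# Hypothetical helper: prints a Cloud Logging filter for one replica's
# replication-setup.log.
# Assumptions: the log name is cloudsql.googleapis.com/replication-setup.log
# and the database_id label has the form PROJECT_ID:INSTANCE_ID.
# $1: project ID, $2: Cloud SQL replica instance ID
replication_log_filter() {
  project_id="$1"
  replica_instance="$2"
  # %%2F prints a literal %2F, the URL-encoded slash used in log names.
  printf 'resource.type="cloudsql_database" AND resource.labels.database_id="%s:%s" AND logName="projects/%s/logs/cloudsql.googleapis.com%%2Freplication-setup.log"\n' \
    "$project_id" "$replica_instance" "$project_id"
}

# Example (requires an authenticated gcloud session):
# gcloud logging read "$(replication_log_filter myproject myreplica)" --limit=20
```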

If the Cloud SQL replica is unable to connect to the external server, confirm the following:

  • Any firewall on the external server is configured to allow connections from the Cloud SQL replica's outgoing IP address.
  • Your SSL/TLS configuration is correct.
  • Your replication user, host, and password are correct.

What's next