This page describes read replica operations. These operations include disabling and enabling replication. Additionally, this page describes how to:
- Promote a replica to a stand-alone instance
- Configure parallel replication
For more information about working with read replicas, see Replication in Cloud SQL.
Disabling replication
By default, a replica starts with replication enabled. However, you can disable replication, for example, to debug or analyze the state of an instance. When you are ready, you explicitly re-enable replication. Disabling or re-enabling replication restarts the replica.
Disabling replication does not stop the replica instance; it becomes a read-only instance that is no longer replicating from its primary instance. You continue to be charged for the instance. You can re-enable replication on the disabled replica, delete the replica, or promote the replica to a stand-alone instance. You cannot stop the replica.
To disable replication:
Console
- Go to the Cloud SQL Instances page in the Google Cloud Console.
- Open a replica instance by clicking its name.
- Click Disable replication in the button bar.
- Click OK.
gcloud
gcloud sql instances patch [REPLICA_NAME] --no-enable-database-replication
REST v1beta4
To execute this cURL command at a command line prompt, you acquire an access token by using the gcloud auth print-access-token command. You can also use the APIs Explorer on the Instances:patch page to send the REST API request.
Before using any of the request data below, make the following replacements:
- project-id: The project ID
- replica-name: The name of the replica instance
HTTP method and URL:
PATCH https://www.googleapis.com/sql/v1beta4/projects/project-id/instances/replica-name
Request JSON body:
{ "settings": { "databaseReplicationEnabled": false } }
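As a rough illustration of the request described above, the following shell sketch wraps it in a function. The function names (patch_url, replication_body, set_replication) are hypothetical helpers, not part of gcloud or the Cloud SQL API, and the project and replica names in the example are placeholders. It assumes curl and an authenticated gcloud are installed, and it sends the flag as a JSON boolean.

```shell
# Hypothetical helper: builds the instances.patch URL from the placeholders.
patch_url() {
  # $1 = project-id, $2 = replica-name
  printf 'https://www.googleapis.com/sql/v1beta4/projects/%s/instances/%s' "$1" "$2"
}

# Hypothetical helper: builds the request body; $1 must be true or false.
replication_body() {
  case "$1" in
    true|false) printf '{"settings": {"databaseReplicationEnabled": %s}}' "$1" ;;
    *) echo "expected true or false, got: $1" >&2; return 1 ;;
  esac
}

# Sends the PATCH request with an access token obtained from gcloud.
set_replication() {
  # $1 = project-id, $2 = replica-name, $3 = true|false
  body=$(replication_body "$3") || return 1
  curl -X PATCH \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d "$body" \
    "$(patch_url "$1" "$2")"
}

# Example with placeholder names:
#   set_replication my-project my-replica false
```

Passing true instead of false sends the same request with replication re-enabled.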
Enabling replication
If a replica has not been replicating for a long time, it will take longer for it to catch up to the primary instance. In this case, delete the replica and create a new one.
To enable replication:
Console
- Go to the Cloud SQL Instances page in the Google Cloud Console.
- Select a replica instance by clicking its name.
- Click Enable replication in the button bar.
- Click OK.
gcloud
gcloud sql instances patch [REPLICA_NAME] --enable-database-replication
REST v1beta4
To execute this cURL command at a command line prompt, you acquire an access token by using the gcloud auth print-access-token command. You can also use the APIs Explorer on the Instances:patch page to send the REST API request.
Before using any of the request data below, make the following replacements:
- project-id: The project ID
- replica-name: The name of the replica instance
HTTP method and URL:
PATCH https://www.googleapis.com/sql/v1beta4/projects/project-id/instances/replica-name
Request JSON body:
{ "settings": { "databaseReplicationEnabled": true } }
Promoting a replica
Promoting a read replica stops replication and converts the instance to a standalone Cloud SQL primary instance with read and write capabilities. This can't be undone.
Before promoting a read replica, if the primary is still available and serving clients, stop all writes to the primary instance and check the replication status of the replica (follow the instructions in the psql Client tab). Verify that the replica is replicating, and then wait until the replication lag reported by the replay_lag metric is 0. Otherwise, the newly promoted instance may be missing some of the transactions that were committed to the primary instance.
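The wait described above could be scripted along the following lines. This is a sketch under assumptions, not a prescribed procedure: the function names and the connection string are hypothetical, the query uses PostgreSQL 10+ function and column names, and it measures lag as the byte distance between the primary's current WAL position and the replica's replayed position instead of reading replay_lag directly.

```shell
# Run on the primary: the largest number of WAL bytes that any connected
# replica has not yet replayed (PostgreSQL 10+ names; an assumption).
LAG_SQL='select max(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) from pg_stat_replication;'

replay_lag_bytes() {
  # $1 = connection string for the primary (placeholder)
  psql -At -c "$LAG_SQL" "$1"
}

wait_until_caught_up() {
  # $1 = primary connection string, $2 = seconds between polls (default 5)
  while :; do
    lag=$(replay_lag_bytes "$1") || return 1
    if [ -z "$lag" ]; then
      # An empty result means no replica reported a replay position.
      echo "no replica is reporting a replay position" >&2
      return 1
    fi
    if [ "$lag" = "0" ]; then
      return 0
    fi
    echo "replica is $lag bytes behind; waiting" >&2
    sleep "${2:-5}"
  done
}

# Example with placeholder values, after writes to the primary have stopped:
#   wait_until_caught_up "host=10.0.0.5 user=postgres dbname=postgres" 5 \
#     && gcloud sql instances promote-replica my-replica
```

The loop only reaches 0 once writes to the primary have stopped, which is why the check comes after that step.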
To promote a replica to a standalone instance:
Console
- Go to the Cloud SQL Instances page in the Google Cloud Console.
- Select a replica instance by clicking its name.
- Click Promote replica in the button bar.
- Click OK.
gcloud
gcloud sql instances promote-replica [REPLICA_NAME]
REST v1beta4
To execute this cURL command at a command line prompt, you acquire an access token by using the gcloud auth print-access-token command. You can also use the APIs Explorer on the Instances:promoteReplica page to send the REST API request.
Before using any of the request data below, make the following replacements:
- project-id: The project ID
- replica-name: The name of the replica instance
HTTP method and URL:
POST https://www.googleapis.com/sql/v1beta4/projects/project-id/instances/replica-name/promoteReplica
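For illustration, the request above could be sent from a shell roughly as follows. The helper names promote_url and promote_replica_rest are hypothetical, the project and replica names are placeholders, and the sketch assumes curl and an authenticated gcloud are available. The request carries an empty body.

```shell
# Hypothetical helper: builds the promoteReplica URL from the placeholders.
promote_url() {
  # $1 = project-id, $2 = replica-name
  printf 'https://www.googleapis.com/sql/v1beta4/projects/%s/instances/%s/promoteReplica' "$1" "$2"
}

# Sends the POST request with an access token obtained from gcloud.
promote_replica_rest() {
  # $1 = project-id, $2 = replica-name
  curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d "" \
    "$(promote_url "$1" "$2")"
}

# Example with placeholder names (promotion cannot be undone):
#   promote_replica_rest my-project my-replica
```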
Confirm that the promoted instance is configured correctly. In particular, enable automated backups and consider configuring the instance for high availability if needed.
Checking replication status
When you view a replica instance using the Google Cloud Console or log in to the instance using an administration client, you get details about replication, including status and metrics. When you use the gcloud command-line tool, you get a brief summary of the replication configuration.
The following metrics are available for replica instances. (Learn more about additional metrics available for all instances, including non-replica instances.)
Metric | Description |
---|---|
Replication State ( cloudsql.googleapis.com ) | Indicates whether replication is actively streaming logs from the primary to the replica. For more information, see The Statistics Collector and System Administration Functions in the PostgreSQL Reference Manual. |
Lag Bytes ( cloudsql.googleapis.com ) | Reports the number of bytes by which the read replica lags the primary. Four time series are produced for each replica, showing the number of bytes in the primary's write-ahead log that have not yet been… |
Max Lag Bytes ( cloudsql.googleapis.com ) | For a replica of an external primary, reports the maximum replication lag (in bytes) over all databases that are being replicated to this instance. For each database, this is defined as the number of bytes in the primary's write-ahead log that have not been confirmed to be received by the replica. This metric is computed by sending a query to the primary. |
To check replication status:
Console
Cloud SQL reports the Replication State metric on the default Cloud SQL monitoring dashboard.
To view other metrics for in-region and cross-region replicas, and replicas of external servers, create a custom dashboard and add the metrics you wish to monitor to it:
- Go to the Monitoring page.
- Select the Dashboards tab.
- Click + CREATE DASHBOARD on the button bar at the top of the page.
- Give the dashboard a name and click OK.
- Click ADD CHART in the upper right-hand corner of the page.
- For Resource Type select Cloud SQL Database.
- Do any of the following:
  - To monitor the replication state metric: in the Select a metric field, type Replication state. Then add a filter for state = "Running". The chart shows 1 if replication is running and 0 otherwise.
  - To monitor the replication lag, in bytes, for a read replica: in the Select a metric field, type Lag Bytes. Then add a filter on replica_lag_type = "replay_location". The chart shows the number of bytes associated with transactions that have been committed on the primary but have not yet been replayed on the replica.
  - To monitor the replication lag, in bytes, for a replica of an external primary: in the Select a metric field, type Max Lag Bytes. The chart shows the number of bytes associated with transactions that have been committed on the primary but have not yet been confirmed received by the replica.
gcloud
For a replica instance, check the replication status with:
gcloud sql instances describe [REPLICA_NAME]
In the output, look for the properties databaseReplicationEnabled and masterInstanceName.
For a primary instance, check if there are replicas with:
gcloud sql instances describe [PRIMARY_INSTANCE_NAME]
In the output, look for the property replicaNames.
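To pull out only those properties instead of scanning the full describe output, a sketch like the following uses gcloud's global --format flag. The function names are hypothetical, and the field path settings.databaseReplicationEnabled is an assumption about where that property sits in the instance resource.

```shell
# Prints the replication flag and the primary's name for a replica.
replica_status() {
  # $1 = replica instance name
  gcloud sql instances describe "$1" \
    --format="value(settings.databaseReplicationEnabled,masterInstanceName)"
}

# Prints the names of the replicas attached to a primary instance.
primary_replicas() {
  # $1 = primary instance name
  gcloud sql instances describe "$1" --format="value(replicaNames)"
}

# Examples with placeholder names:
#   replica_status my-replica
#   primary_replicas my-primary
```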
psql Client
Some replication status metrics are produced by the primary and some are produced by the replica. For the following steps, connect to the replica or primary instance (as directed below) with a PostgreSQL client.
For more information, see Connection options for external applications.
- To check the replica's status from the primary instance:
select * from pg_stat_replication;
Look for the following metrics in the output of the command:
  - client_addr: The IP address of the replica instance.
  - state: The current state of the WAL sender process that serves the replica. The value is streaming when replication is running.
  - replay_lag: In PostgreSQL 10 and later, the time elapsed between flushing recent WAL on the primary and receiving notification that the replica has replayed it. This is a time interval, not a byte count. The value is 0 or a small interval when the replica is keeping up.
- To check the replica's status from the replica instance:
select * from pg_stat_wal_receiver;
Look for the following metrics in the output of the command:
  - sender_host: The IP address of the primary instance.
  - status: The activity status of the WAL receiver process. The value is streaming when replication is running.
  - last_msg_send_time and last_msg_receipt_time: The difference between these two timestamps is the lag time.
To check whether replication has been paused:
select pg_is_wal_replay_paused();
The value is t if replication is paused and f otherwise.
To check whether there are transactions that have been received from the primary but not yet applied:
-- For PostgreSQL 9.6:
select pg_catalog.pg_last_xlog_receive_location(), pg_catalog.pg_last_xlog_replay_location();
-- For PostgreSQL 10 and above:
select pg_catalog.pg_last_wal_receive_lsn(), pg_catalog.pg_last_wal_replay_lsn();
If the two values are equal, then the replica has processed all of the transactions it has received from the primary.
For more details about the output from these commands, see the PostgreSQL documentation on The Statistics Collector.
Troubleshooting
Click the links in the table for details:
For this problem... | The issue might be... | Try this... |
---|---|---|
Read replica did not start replicating on creation. | There could be many root causes. | Check the logs to find more information. |
Unable to create read replica - unknown error. | There could be many root causes. | Check the logs to find more information. |
Disk is full. | The primary instance disk size can become full during replica creation. | Upgrade the primary instance to a larger disk size. |
Replica instance is using too much memory. | Replicas can cache often-requested read operations. | Restart the replica instance to reclaim the temporary memory space. |
Replication stopped. | Max storage space was reached and automatic storage increase is not enabled. | Enable automatic storage increase. |
Replication lag is consistently high. | There can be many different root causes. | Here are a few things to try. |
Read replica did not start replicating on creation
The read replica did not start replicating on creation.
The issue might be
There is probably a more specific error in the log files.
Things to try
Inspect the logs in Cloud Logging to find the actual error.
Unable to create read replica - unknown error
Unable to create read replica - unknown error.
The issue might be
There is probably a more specific error in the log files.
Things to try
Inspect the logs in Cloud Logging to find the actual error.
If the error is: set Service Networking service account as servicenetworking.serviceAgent role on consumer project, then disable and re-enable the Service Networking API. This action creates the service account necessary to continue with the process.
Disk is full
error: disk is full
The issue might be
The primary instance disk size can become full during replica creation.
Things to try
Edit the primary instance to upgrade it to a larger disk size.
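From the command line, the same change might look like the following sketch. The function name is hypothetical and the instance name and size are placeholders; it relies on the --storage-size flag of gcloud sql instances patch.

```shell
# Increases the storage allocated to the primary instance.
grow_primary_disk() {
  # $1 = primary instance name, $2 = new size, for example 100GB
  gcloud sql instances patch "$1" --storage-size="$2"
}

# Example with placeholder values:
#   grow_primary_disk my-primary 100GB
```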
Replica instance is using too much memory
The replica instance is using too much memory.
The issue might be
The replica uses temporary memory to cache often-requested read operations, which can lead it to use more memory than the primary instance.
Things to try
Restart the replica instance to reclaim the temporary memory space.
Replication stopped
Replication stopped.
The issue might be
The maximum storage limit was reached and automatic storage increase is disabled.
Things to try
Edit the instance to enable automatic storage increase.
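As a command-line sketch of that edit, the function below (a hypothetical name, with a placeholder instance name in the example) uses the --storage-auto-increase flag of gcloud sql instances patch.

```shell
# Turns on automatic storage increase for an instance.
enable_auto_storage_increase() {
  # $1 = instance name
  gcloud sql instances patch "$1" --storage-auto-increase
}

# Example with a placeholder name:
#   enable_auto_storage_increase my-replica
```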
Replication lag is consistently high
Replication lag is consistently high.
The issue might be
The write load is too high for the replica to handle. Replication lag occurs when the replica cannot apply changes as fast as it receives them from the primary. Some kinds of queries or workloads can cause temporary or permanent high replication lag for a given schema. Some of the typical causes of replication lag are:
- Slow queries on the replica. Find and fix them.
- Tables without a unique or primary key. Every update on such a table causes a full table scan on the replica.
- Queries like DELETE ... WHERE field < 50000000 cause replication lag with row-based replication since a huge number of updates are piled up on the replica.
Things to try
Some possible solutions:
- Edit the instance to increase the size of the replica.
- Reduce the load on the database.
- Index the tables.
- Identify and fix slow queries.
- Recreate the replica.
What's next
- Learn how to create a read replica.
- Learn more about requirements and best practices for replication.