Managing replicas

This page describes how you can disable and enable replication for a read replica, as well as how to promote a replica to a stand-alone instance or delete it. For information about working with read replicas, see Replication in Cloud SQL.

Disabling replication

By default, a replica starts with replication enabled. However, you can disable replication, for example, to debug or analyze the state of an instance. When you are ready, you explicitly re-enable replication. Disabling or re-enabling replication restarts the replica.

Disabling replication does not stop the replica instance; it becomes a read-only instance that is no longer replicating from its primary instance. You continue to be charged for the instance. You can re-enable replication on the disabled replica, delete the replica, or promote the replica to a stand-alone instance. You cannot stop the replica.

To disable replication:

Console

  1. Go to the Cloud SQL Instances page in the Google Cloud Console.

  2. Open a replica instance by clicking its name.
  3. Click Disable replication in the button bar.
  4. Click OK.

gcloud

gcloud sql instances patch [REPLICA_NAME] --no-enable-database-replication

REST v1beta4

To send this request with cURL from a command line, first acquire an access token by using the gcloud auth print-access-token command. You can also use the APIs Explorer on the Instances:patch page to send the REST API request.

Before using any of the request data below, make the following replacements:

  • project-id: The project ID
  • replica-name: The name of the replica instance

HTTP method and URL:

PATCH https://www.googleapis.com/sql/v1beta4/projects/project-id/instances/replica-name

Request JSON body:

{
  "settings":
  {
    "databaseReplicationEnabled": false
  }
}
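
The request above can be sent from a shell roughly as follows. This is a minimal sketch, not an official client: the function name set_replication and its argument order are assumptions made for illustration, while the URL, HTTP method, and request body come from this page. It assumes that gcloud and curl are installed and that the active gcloud account may patch the instance.

```shell
# Sketch: toggle replication on a replica through the v1beta4 REST API.
# Usage: set_replication PROJECT_ID REPLICA_NAME true|false
# (set_replication is a hypothetical helper name, not part of gcloud.)
set_replication() {
  local project_id="$1"
  local replica_name="$2"
  local enabled="$3"   # JSON boolean: true or false
  local token

  case "$enabled" in
    true|false) ;;
    *) echo "third argument must be true or false" >&2; return 2 ;;
  esac

  # gcloud prints a short-lived access token for the active account.
  token=$(gcloud auth print-access-token) || return 1

  curl --silent --show-error --request PATCH \
    --header "Authorization: Bearer ${token}" \
    --header "Content-Type: application/json; charset=utf-8" \
    --data "{\"settings\": {\"databaseReplicationEnabled\": ${enabled}}}" \
    "https://www.googleapis.com/sql/v1beta4/projects/${project_id}/instances/${replica_name}"
}
```

For example, set_replication my-project my-replica false disables replication, and passing true re-enables it; my-project and my-replica are placeholder names.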


Enabling replication

If a replica has not been replicating for a long time, it will take longer for it to catch up to the primary instance. In this case, delete the replica and create a new one.

To enable replication:

Console

  1. Go to the Cloud SQL Instances page in the Google Cloud Console.

  2. Select a replica instance by clicking its name.
  3. Click Enable replication in the button bar.
  4. Click OK.

gcloud

gcloud sql instances patch [REPLICA_NAME] --enable-database-replication

REST v1beta4

To send this request with cURL from a command line, first acquire an access token by using the gcloud auth print-access-token command. You can also use the APIs Explorer on the Instances:patch page to send the REST API request.

Before using any of the request data below, make the following replacements:

  • project-id: The project ID
  • replica-name: The name of the replica instance

HTTP method and URL:

PATCH https://www.googleapis.com/sql/v1beta4/projects/project-id/instances/replica-name

Request JSON body:

{
  "settings":
  {
    "databaseReplicationEnabled": true
  }
}


Promoting a replica

Promoting a read replica stops replication and converts the instance to a standalone Cloud SQL primary instance with read and write capabilities. This can't be undone.

Before promoting a read replica, if the primary is still available and serving clients, you should stop all writes to the primary instance and check the replication status of the replica (follow the instructions in the MySQL Client tab). You should verify that the replica is replicating and then wait until the replication lag reported by the Seconds_Behind_Master metric is 0. Otherwise, the newly promoted instance may be missing some of the transactions that were committed to the primary instance.
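
The readiness check described above can be scripted. The following is a minimal sketch under stated assumptions: ready_to_promote is a hypothetical helper name, and it only inspects the text that SHOW SLAVE STATUS \G prints, read from standard input. It succeeds only when both replication threads report Yes and Seconds_Behind_Master is 0. It does not stop writes on the primary; that step remains manual.

```shell
# Sketch: succeed only if SHOW SLAVE STATUS \G output (on stdin) shows a
# replica that is replicating and has zero reported lag.
# (ready_to_promote is a hypothetical helper name.)
ready_to_promote() {
  awk -F': ' '
    # Strip the leading padding that the \G format adds to field names.
    { gsub(/^[ \t]+/, "", $1) }
    $1 == "Slave_IO_Running"      { io  = $2 }
    $1 == "Slave_SQL_Running"     { sql = $2 }
    $1 == "Seconds_Behind_Master" { lag = $2 }
    # Exit status 0 means ready; a NULL or missing lag value is not ready.
    END { exit !(io == "Yes" && sql == "Yes" && lag == "0") }
  '
}
```

One possible use, with placeholder connection details, is to pipe the client output into the check and promote only on success: mysql --host=REPLICA_IP --user=root -p -e 'SHOW SLAVE STATUS\G' | ready_to_promote && gcloud sql instances promote-replica REPLICA_NAME.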

To promote a replica to a standalone instance:

Console

  1. Go to the Cloud SQL Instances page in the Google Cloud Console.

  2. Select a replica instance by clicking its name.
  3. Click Promote replica in the button bar.
  4. Click OK.

gcloud

gcloud sql instances promote-replica [REPLICA_NAME]

REST v1beta4

To send this request with cURL from a command line, first acquire an access token by using the gcloud auth print-access-token command. You can also use the APIs Explorer on the Instances:promoteReplica page to send the REST API request.

Before using any of the request data below, make the following replacements:

  • project-id: The project ID
  • replica-name: The name of the replica instance

HTTP method and URL:

POST https://www.googleapis.com/sql/v1beta4/projects/project-id/instances/replica-name/promoteReplica


Confirm that the promoted instance is configured correctly. In particular, enable automated backups and consider configuring the instance for high availability if needed.

Checking replication status

When you view a replica instance in the Google Cloud Console or log in to the instance with an administration client, you get details about replication, including status and metrics. When you use the gcloud command-line tool, you get a brief summary of the replication configuration.

The following metrics are available for replica instances. (Learn more about additional metrics available for all instances, including non-replica instances.)

Metric | Description
Replication Lag
(cloudsql.googleapis.com/database/replication/replica_lag)

The amount of time that the replica's state is lagging behind the state of the primary instance. This is the difference between (1) the current time and (2) the original timestamp at which the primary committed the transaction that is currently being applied on the replica. In particular, writes may be counted as lagging even if they have been received by the replica, if the replica has not yet applied the write to the database.

This metric reports the value of Seconds_Behind_Master when SHOW SLAVE STATUS is run on the replica. For more information, see Checking Replication Status in the MySQL Reference Manual.

Slave I/O thread running state
(cloudsql.googleapis.com/database/mysql/replication/slave_io_running_state)

Indicates whether the I/O thread for reading the primary instance's binary log is running on the replica. Possible values are:

  • Yes
  • No
  • Connecting

This metric reports the value of Slave_IO_Running when SHOW SLAVE STATUS is run on the replica. For more information, see Checking Replication Status in the MySQL Reference Manual.

Slave SQL thread running state
(cloudsql.googleapis.com/database/mysql/replication/slave_sql_running_state)

Indicates whether the SQL thread for executing events in the relay log is running on the replica. Possible values are:

  • Yes
  • No
  • Connecting

This metric reports the value of Slave_SQL_Running when SHOW SLAVE STATUS is run on the replica. For more information, see Checking Replication Status in the MySQL Reference Manual.

To check replication status:

Console

Cloud SQL reports the Replication Lag metric on the default Cloud SQL monitoring dashboard.

To view other metrics for in-region and cross-region replicas, and replicas of external servers, create a custom dashboard and add the metrics you wish to monitor to it:

  1. Go to the Monitoring page.
  2. Select the Dashboards tab.
  3. Click + CREATE DASHBOARD in the button bar at the top of the page.
  4. Give the dashboard a name and click OK.
  5. Click ADD CHART in the upper right-hand corner of the page.
  6. For Resource Type select Cloud SQL Database.
  7. Do any of the following:
    1. To monitor the replication lag metric: in the Select a metric field, type replica_lag. The chart shows the amount of time that the replica's state lags behind that of its primary.
    2. To monitor the status of the replica's I/O thread: in the Select a metric field, type Slave I/O thread running state. Then add a filter on state = "Yes". The chart shows 1 if the thread is running and 0 otherwise.
    3. To monitor the status of the replica's SQL thread: in the Select a metric field, type Slave SQL thread running state. Then add a filter on state = "Yes". The chart shows 1 if the thread is running and 0 otherwise.
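
The same replica_lag metric can also be read outside the Console. The sketch below assumes the Cloud Monitoring API v3 timeSeries.list method and an account that is allowed to read monitoring data. The function name read_replica_lag and its argument order are hypothetical; the metric type is the one listed in the table above. Both timestamps are RFC 3339 strings supplied by the caller.

```shell
# Sketch: list replica_lag time series for a project over a time interval.
# Usage: read_replica_lag PROJECT_ID START_TIME END_TIME
# (read_replica_lag is a hypothetical helper name.)
read_replica_lag() {
  local project_id="$1" start_time="$2" end_time="$3"
  local token
  token=$(gcloud auth print-access-token) || return 1

  # --get sends the --data-urlencode pairs as URL query parameters.
  curl --silent --show-error --get \
    --header "Authorization: Bearer ${token}" \
    --data-urlencode 'filter=metric.type="cloudsql.googleapis.com/database/replication/replica_lag"' \
    --data-urlencode "interval.startTime=${start_time}" \
    --data-urlencode "interval.endTime=${end_time}" \
    "https://monitoring.googleapis.com/v3/projects/${project_id}/timeSeries"
}
```

The response contains one time series per matching instance, so a project with several replicas returns several series.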

gcloud

For a replica instance, check the replication status with:

gcloud sql instances describe [REPLICA_NAME]

In the output, look for the properties databaseReplicationEnabled and masterInstanceName.

For a primary instance, check if there are replicas with:

gcloud sql instances describe [PRIMARY_INSTANCE_NAME]

In the output, look for the property replicaNames.
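
To print only those properties instead of scanning the full output, the standard gcloud --format flag can project individual fields. This is a small sketch; the function names are hypothetical, and the field path settings.databaseReplicationEnabled assumes that the property sits under settings, as it does in the REST request body on this page.

```shell
# Sketch: print only the replication-related properties of an instance.
# (Both function names are hypothetical helpers.)

# For a replica: whether replication is enabled, and the primary's name.
replica_summary() {
  gcloud sql instances describe "$1" \
    --format='value(settings.databaseReplicationEnabled,masterInstanceName)'
}

# For a primary: the names of its replicas.
list_replicas() {
  gcloud sql instances describe "$1" --format='value(replicaNames)'
}
```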

MySQL Client

  1. Connect to the replica with a MySQL client.

    For information, see Connection Options for External Applications.

  2. Check the replica's status:
    SHOW SLAVE STATUS \G

    Look for the following metrics in the output of the command:

    • Master_Host: The name of the primary instance.
    • Slave_IO_Running, Slave_SQL_Running: Whether the I/O and SQL threads, respectively, are running. These threads are responsible for transferring events from the primary to the replica's relay log and executing those events from the relay log. The value of the metric is Yes if the thread is running. Both threads must be running for replication to be active.
    • Seconds_Behind_Master: The amount of time, in seconds, by which the replica lags in processing the primary's transactions, i.e. the difference between (1) the current time and (2) the original timestamp at which the primary committed the transaction that is currently being applied on the replica. The value is NULL if replication is broken.
    • Master_Log_File, Read_Master_Log_Pos, Relay_Master_Log_File, Exec_Master_Log_Pos: These metrics show binary log coordinates (filename and offset). Master_Log_File and Read_Master_Log_Pos give the position up to which the I/O thread has read events. Relay_Master_Log_File and Exec_Master_Log_Pos give the position up to which the SQL thread has executed events. If the two positions are the same (that is, Master_Log_File equals Relay_Master_Log_File and Read_Master_Log_Pos equals Exec_Master_Log_Pos), the replica has processed all of the events it has received from the primary.

For more details about the output from this command, see the MySQL documentation on Checking Replication Status.
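
The coordinate comparison described in the last bullet can be automated. The following sketch reads SHOW SLAVE STATUS \G output from standard input and succeeds when the read and executed coordinates match. The function name applied_all_received is hypothetical. A match only means that the replica has applied everything it has received, not that it has received everything the primary has written.

```shell
# Sketch: succeed if the SQL thread has executed every event that the
# I/O thread has read, based on SHOW SLAVE STATUS \G output on stdin.
# (applied_all_received is a hypothetical helper name.)
applied_all_received() {
  awk -F': ' '
    # Strip the leading padding that the \G format adds to field names.
    { gsub(/^[ \t]+/, "", $1) }
    $1 == "Master_Log_File"       { read_file = $2 }
    $1 == "Read_Master_Log_Pos"   { read_pos  = $2 }
    $1 == "Relay_Master_Log_File" { exec_file = $2 }
    $1 == "Exec_Master_Log_Pos"   { exec_pos  = $2 }
    END {
      # An empty file name means the fields were not found in the input.
      exit !(read_file != "" && read_file == exec_file && read_pos == exec_pos)
    }
  '
}
```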

Troubleshooting

The following table summarizes common problems. The sections after the table give details for each one:

For this problem... | The issue might be... | Try this...
Read replica did not start replicating on creation. | At least one backup must have been created after binary logging was enabled. | Wait until at least one backup has been created after enabling binary logs.
Unable to create read replica - unknown error. | There could be many root causes. | Check the logs to find more information.
Disk is full. | The primary instance disk can become full during replica creation. | Upgrade the primary instance to a larger disk size.
Replica instance is using too much memory. | Replicas can cache often-requested read operations. | Restart the replica instance to reclaim the temporary memory space.
Replication stopped. | Max storage space was reached and automatic storage increase is not enabled. | Enable automatic storage increase.
Replication lag is consistently high. | There can be many different root causes. | See the section on this problem for a few things to try.

Read replica did not start replicating on creation

Read replica did not start replicating on creation.

The issue might be

The primary instance must have at least a week's worth of binlogs or else replicas cannot start replicating.

Things to try

Wait until there are enough binlogs.


Unable to create read replica - unknown error

Unable to create read replica - unknown error.

The issue might be

There is probably a more specific error in the log files.

Things to try

Inspect the logs in Cloud Logging to find the actual error. If the error is: set Service Networking service account as servicenetworking.serviceAgent role on consumer project, then disable and re-enable the Service Networking API. This action creates the service account necessary to continue with the process.
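
Both steps can be run from the command line. This is a sketch with hypothetical function names. It assumes that the Cloud SQL errors are logged under the cloudsql_database resource type and that the account may manage services in the project. Disabling an API can fail or affect other resources that depend on it, so check that before running the second function.

```shell
# Sketch: look for recent Cloud SQL errors, then reset the Service
# Networking API. (Both function names are hypothetical helpers.)

# Print the 20 most recent error-level log entries for Cloud SQL instances.
recent_cloudsql_errors() {
  gcloud logging read \
    'resource.type="cloudsql_database" AND severity>=ERROR' \
    --project="$1" --limit=20
}

# Disable, then re-enable, the Service Networking API for a project.
# The API is re-enabled only if the disable step succeeded.
reset_service_networking_api() {
  local project_id="$1"
  gcloud services disable servicenetworking.googleapis.com --project="${project_id}" &&
    gcloud services enable servicenetworking.googleapis.com --project="${project_id}"
}
```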


Disk is full

UPDATE_DISK_SIZE or mysqld: disk is full error.

The issue might be

The primary instance disk size can become full during replica creation.

Things to try

Edit the primary instance to upgrade it to a larger disk size.


Replica instance is using too much memory

The replica instance is using too much memory.

The issue might be

The replica uses temporary memory to cache often-requested read operations, which can lead it to use more memory than the primary instance.

Things to try

Restart the replica instance to reclaim the temporary memory space.


Replication stopped

Replication stopped.

The issue might be

The maximum storage limit was reached and automatic storage increase is disabled.

Things to try

Edit the instance to enable automatic storage increase.
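
From the command line, the setting can be checked and enabled with gcloud. This is a minimal sketch: the function names are hypothetical, and the field path settings.storageAutoResize is an assumption about where the describe output reports the setting.

```shell
# Sketch: check and enable automatic storage increase for an instance.
# (Both function names are hypothetical helpers.)

# Print the current automatic storage increase setting.
storage_auto_increase_status() {
  gcloud sql instances describe "$1" --format='value(settings.storageAutoResize)'
}

# Turn on automatic storage increase.
enable_storage_auto_increase() {
  gcloud sql instances patch "$1" --storage-auto-increase
}
```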


Replication lag is consistently high

Replication lag is consistently high.

The issue might be

The write load is too high for the replica to handle. Replication lag occurs when the SQL thread on a replica is unable to keep up with the I/O thread. Some kinds of queries or workloads can cause temporary or permanent high replication lag for a given schema. Some of the typical causes of replication lag are:

  • Slow queries on the replica. Find and fix them.
  • Tables without a unique or primary key. Every update on such a table causes a full table scan on the replica.
  • Queries like DELETE ... WHERE field < 50000000 cause replication lag with row-based replication since a huge number of updates are piled up on the replica.
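
To find tables that lack a unique or primary key, one option is to query information_schema. The sketch below only prints such a query so that it can be piped into a MySQL client. The function name tables_without_keys_sql is hypothetical, and the list of excluded system schemas is an assumption that may need adjusting for a given instance.

```shell
# Sketch: print a query that lists base tables with neither a PRIMARY KEY
# nor a UNIQUE constraint. (tables_without_keys_sql is a hypothetical name.)
tables_without_keys_sql() {
  cat <<'SQL'
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
  ON c.table_schema = t.table_schema
 AND c.table_name = t.table_name
 AND c.constraint_type IN ('PRIMARY KEY', 'UNIQUE')
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN
      ('mysql', 'information_schema', 'performance_schema', 'sys')
  AND c.constraint_name IS NULL
ORDER BY t.table_schema, t.table_name;
SQL
}
```

For example, tables_without_keys_sql | mysql --host=INSTANCE_IP --user=root -p runs the query, where INSTANCE_IP is a placeholder for the address of the instance.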

Things to try

Some possible solutions:

  • Edit the instance to increase the size of the replica.
  • Reduce the load on the database.
  • Index the tables.
  • Identify and fix slow queries.
  • Recreate the replica.

What's next