Overview of the High Availability Configuration

This page describes the high availability configuration for Second Generation instances.

For help with configuring high availability, see Configuring an Instance for High Availability.

What the high availability configuration is

A Second Generation instance is in a high availability configuration when it has a failover replica. The failover replica must be in a different zone than the original instance, also called the master. All changes made to the data on the master, including to user tables, are replicated to the failover replica using semisynchronous replication.

What the high availability configuration provides

If the zone where the master is located experiences an outage, Cloud SQL automatically fails over to the failover replica, and your data continues to be available to clients. This is called a 'failover'.

This capability is built in for First Generation instances. Cloud SQL Second Generation provides the high availability configuration as an option so you can reduce your costs for non-production instances.

How to configure an instance for high availability

The easiest time to configure a Second Generation instance for high availability is when you create the instance. You can also configure an existing instance for high availability. For more information, see Configuring an Instance for High Availability.

Which instances should be configured for high availability

You should configure all of your instances that contain production data for high availability.

Requirements for the high availability configuration

Failover replicas must be in the same project and region as the master instance.
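The location rules described on this page (same region as the master, but a different zone) can be checked mechanically. The following is a minimal sketch, assuming zone names follow the usual region-plus-suffix form such as us-central1-a. The function names are illustrative and are not part of any Cloud SQL API, and the same-project requirement is not checked here.

```python
def region_of(zone: str) -> str:
    """Return the region part of a zone name, e.g. 'us-central1-a' -> 'us-central1'."""
    region, sep, suffix = zone.rpartition("-")
    if not sep or not region or not suffix:
        raise ValueError(f"unexpected zone name: {zone!r}")
    return region


def valid_failover_placement(master_zone: str, replica_zone: str) -> bool:
    """True if the replica is in the master's region but in a different zone."""
    same_region = region_of(master_zone) == region_of(replica_zone)
    return same_region and master_zone != replica_zone
```

For example, valid_failover_placement("us-central1-a", "us-central1-b") is accepted, while a replica in the master's own zone or in another region is rejected.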

How configuring an instance for high availability affects your charges

The failover replica is billed as a separate instance.

When failover is triggered

If the zone where the master is located experiences an outage, Cloud SQL initiates a failover. If your master instance is experiencing issues not caused by a zone outage, a failover is not initiated.

You can also initiate failover manually. For information, see Initiating failover.

How failover affects your applications and your instances

When a master fails over to its failover replica, any existing connections to the instance are closed. However, your application can reconnect using the same connection string or IP address; you do not need to update your application after a failover.
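Because existing connections are closed during a failover, applications generally need some form of reconnect logic. The following is a minimal sketch of one common approach, retrying with exponential backoff. It assumes a hypothetical zero-argument connect callable that raises ConnectionError while the instance is unavailable; real database drivers raise their own exception types, so the except clause would need adjusting.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds, backing off exponentially between tries.

    connect is a hypothetical zero-argument callable that opens a connection
    using the unchanged connection string or IP address and raises
    ConnectionError while the instance is unavailable.
    """
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # give up after the final attempt
            # Wait 1x, 2x, 4x, ... the base delay before trying again.
            sleep(base_delay * 2 ** attempt)
```

The attempt count and delays are placeholders. Choose values that cover the outage duration you observe when you test failover yourself.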

For some time during the failover, your applications experience an outage. A major factor in outage duration is the size of the replication lag (how far the replica is behind the master) when the failover starts. This is because the replica cannot start servicing requests until it "catches up" to the state of the master. You should monitor replication lag and take steps to address it if it becomes too large for your failover duration requirements.
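One way to act on this advice is to compare recent replication lag readings against the largest lag you are willing to tolerate at failover time. The sketch below is illustrative only: the function name, the window of three samples, and the idea of requiring sustained lag before flagging it are assumptions, not Cloud SQL behavior.

```python
def lag_too_high(lag_samples, max_lag_seconds, window=3):
    """Return True if the last `window` lag readings all exceed the budget.

    lag_samples: replication lag readings in seconds, oldest first.
    max_lag_seconds: the largest lag you can tolerate when a failover starts.
    Requiring several consecutive readings avoids reacting to a single spike.
    """
    recent = lag_samples[-window:]
    if len(recent) < window:
        return False  # not enough data to call the lag sustained
    return all(sample > max_lag_seconds for sample in recent)
```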

After the failover, the instance resumes serving data from the zone where the replica was located. You can see what zone your instance is serving data from by going to its Overview page in the Cloud Platform Console.

You should initiate a failover in a test environment to see exactly how your applications are affected.

About using the failover replica as a read replica

You can use the failover replica as a read replica, to offload read operations from the master.

You can create only one failover replica for every master. You can create additional read replicas to offload read operations from the master.

For more information about creating read replicas, see Configuring Replication.

How the failover replica is configured

The failover replica is configured with the same database flags, users (including root) and passwords, maintenance window, authorized applications and networks, and databases as the master. You cannot change the replica's activation policy or maintenance window, nor can you enable backups on the replica. Backups must be performed on the master instance.

When replication can be disabled

A master instance falls out of high availability mode when the failover replica becomes unavailable. This can happen, for example, if the network connection between the master instance and failover replica is interrupted, or if the failover replica is down due to its own zone failure. During this time, the master instance is not in high availability mode, and you cannot fail over to the replica, because it is not safe to do so. The failover replica resumes replication when it reconnects, and high availability mode is re-enabled once the failover replica has caught up completely and returned to semisynchronous replication.

How the high availability configuration affects backups and restores

You can only perform backups on the master instance. Before you can restore a master instance from a backup, or perform a point-in-time recovery on a master, you must delete all replicas. After the restore completes, you must recreate the replicas.

How the high availability configuration differs between MySQL and PostgreSQL

There are some differences in the high availability configuration between MySQL and PostgreSQL instances that affect how you work with highly available instances:

  • Highly available PostgreSQL instances do not have a separate failover instance the way MySQL instances do.

    This has the following consequences:

    • There is no concept of replication lag, as there is for MySQL instances. As long as the secondary zone is healthy, failover can occur.

    • If you need to offload read operations, you must create a read replica.

  • If a failover occurs, read replicas replicating from a PostgreSQL regional instance do not change zones; they continue to serve data, even if they are now in a different zone than the primary instance. You can initiate another failover, in this case called a 'failback', to return the regional instance to serving data from its original zone.

  • Enabling automatic backups is not required for highly available PostgreSQL instances, as it is for MySQL. However, enabling automatic backups is recommended for increased data durability.

How you view the health of your high availability configuration

Replica availability

The state of failover replica availability (true or false) is available as a metric of the master:

cloudsql.googleapis.com/database/mysql/replication/available_for_failover

This state is also included in the response of the Get request of the master instance in the failoverReplica.available field.
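As a sketch of how a script might read this field, the function below takes the parsed JSON body of the Get response and returns the availability flag. The sample body is hypothetical and heavily trimmed; only the failoverReplica.available field comes from the description above, and the instance names are made up.

```python
import json


def failover_replica_available(instance_resource: dict) -> bool:
    """Read failoverReplica.available from a parsed instance Get response."""
    replica = instance_resource.get("failoverReplica") or {}
    # Treat a missing field as "not available" rather than raising.
    return bool(replica.get("available", False))


# Hypothetical, trimmed response body for illustration only.
sample = json.loads(
    '{"name": "my-master",'
    ' "failoverReplica": {"name": "my-master-failover", "available": true}}'
)
print(failover_replica_available(sample))
```

An instance with no failover replica has no failoverReplica object at all, which this sketch reports as not available.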

You can also see metrics for your high availability configuration by using Stackdriver. For a complete list of Cloud SQL metrics provided by Stackdriver, see the Cloud SQL metrics list. For more information about using Stackdriver with Cloud Platform, see the Stackdriver documentation.

Replication lag

Pending events are replicated to the failover replica instance as part of semisynchronous replication. A failover operation waits for these pending events to be committed on the failover replica. Consequently, the duration of a failover depends on the lag between the failover replica and its master.

The state of replication lag is available as a metric of the replica:

cloudsql.googleapis.com/database/mysql/replication/seconds_behind_master

The value for this metric represents the number of seconds the replica is behind the master.
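When querying or alerting on these metrics, you typically select them with a filter string. The sketch below builds such a filter from the two metric names given on this page. The metric.type comparison follows the general Stackdriver Monitoring filter syntax; the resource.labels.database_id label and the project:instance value format are assumptions that you should verify against the Cloud SQL metrics list.

```python
LAG_METRIC = (
    "cloudsql.googleapis.com/database/mysql/replication/seconds_behind_master"
)
AVAILABLE_METRIC = (
    "cloudsql.googleapis.com/database/mysql/replication/available_for_failover"
)


def metric_filter(metric_type: str, database_id: str = "") -> str:
    """Build a monitoring filter string for one metric type.

    database_id is optional; the label name used for it is an assumption.
    """
    parts = [f'metric.type = "{metric_type}"']
    if database_id:
        parts.append(f'resource.labels.database_id = "{database_id}"')
    return " AND ".join(parts)
```

For example, metric_filter(LAG_METRIC) selects the replication lag metric for every replica in the project, and passing a database_id narrows it to one instance.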

For information about setting up a Stackdriver alert for this metric, see Creating an alert for replication lag.

For information on viewing replication metrics, see Viewing and exporting MySQL error logs.
