High availability

This page describes high availability for Memorystore for Redis instances in the Standard Tier. The Standard Tier provides high availability through replication and automatic failover capability. Memorystore for Redis does not use Redis Sentinel for high availability.

What high availability is

Memorystore for Redis provides high availability by replicating a primary Redis node to a replica node. The replica node is a copy of the primary node that replicates any changes made to the primary node.

Each Memorystore for Redis instance in the Standard Tier is provisioned with a primary node and replica node. Application requests are directed to the primary node. Changes made to the data on the primary node are copied to the replica using the Redis asynchronous replication protocol.
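What "asynchronous" means here can be illustrated with a toy model: the primary acknowledges a write as soon as it has applied it locally, and the replica receives the change afterwards. The classes and the replicate function below are hypothetical and exist only for illustration. They do not reproduce the Redis replication protocol or any Memorystore API.

```python
from collections import deque


class ToyPrimary:
    """Hypothetical stand-in for a primary node (not a real Redis API)."""

    def __init__(self):
        self.data = {}
        self.pending = deque()  # writes applied locally but not yet shipped

    def set(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))
        return "OK"  # acknowledged before the replica has the write


class ToyReplica:
    """Hypothetical stand-in for a replica node."""

    def __init__(self):
        self.data = {}


def replicate(primary, replica, max_writes=None):
    """Ship pending writes to the replica in order; return how many were shipped."""
    shipped = 0
    while primary.pending and (max_writes is None or shipped < max_writes):
        key, value = primary.pending.popleft()
        replica.data[key] = value
        shipped += 1
    return shipped


if __name__ == "__main__":
    primary, replica = ToyPrimary(), ToyReplica()
    primary.set("a", "1")
    primary.set("b", "2")
    replicate(primary, replica, max_writes=1)
    # Keys the primary has acknowledged but the replica has not yet received.
    print(sorted(set(primary.data) - set(replica.data)))
```

In this model, a write that is still pending when the primary fails is not present on the replica, which is why the replica can briefly lag behind the primary.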

What high availability provides

If the primary node fails, the Memorystore for Redis service triggers a failover. The service promotes the replica to be the new primary and, after recovery, the former primary becomes the replica. In essence, the nodes switch roles.

To tolerate zone failures, the primary node and replica node are located in different zones within the same region. To specify the zones for the primary and replica nodes, you must create the Redis instance using the gcloud tool.
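As a rough sketch of what specifying both zones might look like, the helper below assembles a gcloud command line without running it. The function name, instance ID, region, and zone values are hypothetical. The flag names (--tier, --zone, --alternative-zone, --size, --region) are written as an assumption about the gcloud redis instances create command and should be checked against its --help output before use.

```python
import shlex


def build_create_command(instance_id, region, zone, alternative_zone, size_gb=1):
    """Build (but do not run) a gcloud command for a Standard Tier instance.

    Flag names are an assumption; confirm them with
    `gcloud redis instances create --help`.
    """
    if zone == alternative_zone:
        raise ValueError("primary and replica zones must differ")
    for z in (zone, alternative_zone):
        # Zone names such as "us-central1-a" start with their region name.
        if not z.startswith(region + "-"):
            raise ValueError(f"zone {z!r} is not in region {region!r}")
    return [
        "gcloud", "redis", "instances", "create", instance_id,
        f"--size={size_gb}",
        f"--region={region}",
        "--tier=standard",
        f"--zone={zone}",
        f"--alternative-zone={alternative_zone}",
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    cmd = build_create_command("my-instance", "us-central1",
                               "us-central1-a", "us-central1-b")
    print(shlex.join(cmd))
```

Validating the zones before building the command reflects the constraint stated above: the two nodes sit in different zones of the same region.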

A Standard Tier instance preserves instance data in the case of a single node failure because the node that doesn't fail holds a copy of the data. If the primary node and replica node fail at the same time due to a multi-zone failure, data cannot be recovered.

When a failover is triggered

A failover occurs when the primary Redis node fails. During failover, all requests to the primary are redirected automatically to the replica, which becomes the new primary, and the Memorystore for Redis instance continues to respond to your application.

How failover affects your applications

When the primary node fails over to the replica, existing connections to Memorystore for Redis are dropped. However, on reconnect, your application is redirected automatically to the new primary node using the same connection string or IP address. You do not need to update your application after a failover.

While the Memorystore for Redis service promotes the replica to be the primary, your Memorystore for Redis instance is temporarily unavailable. Each node is located in a single zone, so zonal failures might cause prolonged recovery time. During this time, there is only one copy of the data.

Retrying the instance connection after failover

A failover always leads to dropped connections. This is necessary because internally the primary and the replica switch roles and IP addresses. Your application is not affected by that internal switch: you still access your Redis instance using the instance's static IP address.

Due to this loss of connection, your application needs to retry to reestablish the connection. The retry logic should use exponential backoff so that you don't overload your instance with too many retry requests. In addition to including retry logic, you should test how a failover affects your application by running a manual failover.

Most Redis clients have built-in retry capabilities that you should leverage in the event of a connection drop due to failover.

A failover occurs in the following scenarios:

If you implement retry logic in your application to handle connection drops due to failovers, your instance should see no significant performance impact. Issues usually arise only when retry logic is not in place.
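The retry behavior described in this section can be sketched in a client-agnostic way. The call_with_backoff function, its parameters, and the default values below are hypothetical and chosen for illustration. The function retries a callable when it raises a connection error, with a delay ceiling that doubles on each attempt and random jitter so that many clients don't reconnect at the same moment.

```python
import random
import time


def call_with_backoff(operation, retryable=(ConnectionError, TimeoutError),
                      max_attempts=6, base_delay=0.1, max_delay=5.0,
                      sleep=time.sleep):
    """Run operation(), retrying on connection errors with exponential backoff.

    All names and defaults here are illustrative assumptions, not values
    recommended by Memorystore for Redis.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: wait a random time between 0 and the ceiling.
            sleep(random.uniform(0, ceiling))
```

If your Redis client raises its own exception types for dropped connections, pass those classes in the retryable tuple. If the client already provides built-in retry with backoff, prefer that over a wrapper like this one.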

How you view the status for high availability

You can view high availability metrics for your Redis instance by using Google Cloud's operations suite. For information about the metrics that it provides for Memorystore for Redis, see Monitoring Redis Instances. For more information about using Google Cloud's operations suite, see the Stackdriver Monitoring documentation.

To see the native replication status that Redis provides, you can issue the Redis INFO command to the Memorystore for Redis instance.
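The INFO command returns plain text made of "# Section" headers followed by field:value lines. As a minimal sketch, the parser below turns that text into nested dictionaries so that the replication fields can be inspected. The function names and the SAMPLE text are hypothetical. The sample is illustrative, not output captured from a Memorystore for Redis instance, and real output contains many more fields.

```python
def parse_info(text):
    """Parse Redis INFO text into {section: {field: value}} (values stay strings)."""
    sections = {}
    current = sections.setdefault("default", {})  # fields seen before any header
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # A header such as "# Replication" starts a new section.
            current = sections.setdefault(line.lstrip("# ").lower(), {})
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key] = value
    return sections


def has_connected_replica(info):
    """Return True if the node reports itself as primary with at least one replica."""
    repl = info.get("replication", {})
    return repl.get("role") == "master" and int(repl.get("connected_slaves", "0")) >= 1


# Illustrative sample only; not captured from a real instance.
SAMPLE = """\
# Replication
role:master
connected_slaves:1
master_repl_offset:1024
"""

if __name__ == "__main__":
    info = parse_info(SAMPLE)
    print(info["replication"]["role"], has_connected_replica(info))
```

Many Redis client libraries already return INFO output as a parsed structure, in which case only a check like has_connected_replica is needed.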

What's next