Zero-downtime migrations to Memorystore for Redis Cluster
Jason Massie
Customer Engineer, Data Management
Chris Mague
Customer Engineer, Data Management, West Coast
With the announcement of Memorystore for Redis Cluster at Google Next, many Redis administrators and developers have asked how they can migrate from their existing Redis cluster environments. We understand that these are business-critical applications that need a zero-downtime migration.
Adopting Memorystore helps remove repetitive tasks like scaling, patching, backing up and configuring observability. This frees up Redis developers and administrators to focus on activities that provide direct value to their users like releasing features and applications. It can also reduce costs.
Memorystore for Redis Cluster is a managed service offering that is fully OSS compatible and easy to set up. It serves the most demanding use cases, like caching, leaderboards and stream processing, and provides automatic zonal distribution of nodes for high availability, automated replica management and promotion, and zero-downtime scale in and out with automatic key redistribution. You can migrate from a variety of standalone or clustered Redis sources, including self-managed Redis on Compute Engine or Google Kubernetes Engine, and third-party platforms like Redis Enterprise or ElastiCache. You can learn more about Memorystore for Redis Cluster in our documentation.
In this blog, we will describe how to use RIOT (Redis Input/Output Tool) for an online migration from an existing Redis cluster to a fully managed Memorystore for Redis Cluster instance. We will also provide some guidelines for a problem-free migration.
What is RIOT?
RIOT is an open-source tool developed by Julien Ruaux, Principal Field Engineer at Redis. RIOT moves data between various sources and targets, including files, relational databases and Redis instances. We will focus on how it facilitates a hassle-free migration from one Redis cluster to another. Note that RIOT only migrates to a target running the same Redis version as the source or a newer one. For example, RIOT can't migrate from Redis 7.2 to Redis 7.0, but it can migrate from 7.0 to 7.0 (see Memorystore for Redis Cluster supported versions here).
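The version rule above can be checked before any data moves. The following is a minimal sketch, not part of RIOT: the function names are hypothetical, and it assumes that comparing only the major and minor version numbers is enough, which matches the 7.2 versus 7.0 example.

```python
def major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '7.0' or '6.2.14'."""
    parts = version.strip().split(".")
    major = int(parts[0])
    # A bare major version such as '7' is treated as '7.0'.
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def can_migrate(source_version: str, target_version: str) -> bool:
    """Return True when the target runs the same or a newer major.minor version."""
    # Assumption: patch-level differences do not matter for this check.
    return major_minor(target_version) >= major_minor(source_version)


# Usage: the two cases named in the text.
# can_migrate("7.0", "7.0") allows the migration.
# can_migrate("7.2", "7.0") rejects it because the target is older.
```

You can read the source version from the redis_version field of the INFO server output and compare it with the version you select for the Memorystore instance.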
Ensuring a smooth migration
We recommend the following practices to ensure a smooth migration:
- Planning - Write a detailed migration project plan including dependencies, time estimates and task owners.
- Automation - Script every action so that it can be repeated exactly.
- Testing - Test the migration and incorporate lessons learned into the migration plan and automation. Iterate the tests several times.
This methodology will help eliminate downtime and human error. Further information about migration planning can be found in this blog.
Migration workflow overview
Before we get started, let’s review a high-level plan for the migration. The following steps are a logical overview of using RIOT for a no-downtime migration, though there may be some backfill as replication catches up at cutover.
1. Deploy a Memorystore for Redis Cluster instance sized similarly to your existing cluster.
2. Deploy a Compute Engine VM with a Java Virtual Machine (JVM) and RIOT installed to manage the data movement. When you start RIOT, it will take a full snapshot of the current production Redis cluster instance and write the snapshot to your new Memorystore for Redis Cluster instance. This could take some time depending on the size of the cluster and the network connectivity.
3. RIOT propagates new changes from your existing Redis cluster to your new Memorystore for Redis Cluster instance while your application is live. Replication lag can range from milliseconds to seconds depending on the rate of change and network connectivity. A typical migration on the GCP network with a source and target that have adequate resources can have a replication latency measured in milliseconds.
4. When ready for cutover, stop traffic to the existing Redis cluster. Reconfigure the application to point to the new Memorystore for Redis Cluster instance.
Now that we’ve discussed the process at a high level, let’s get into the finer details.
Guide: Performing the migration
The following step-by-step instructions can be used as a guide for your near-zero-downtime migration.
Step 1: Create a VM to run RIOT
You can create the RIOT VM and the Memorystore for Redis Cluster instance from the console or with gcloud commands. Edit the project, zone, network and service account as needed.
Note: The RIOT VM needs network connectivity to both the Redis source and the Memorystore target on their Redis ports. Memorystore uses the default Redis port, 6379.
Create the RIOT VM on GCP
Create the Memorystore for Redis Cluster
Step 2: Install RIOT on the GCP VM
Run the following command to download RIOT. Check for the latest versions.
Extract RIOT:
Step 3: Install Certificate Authorities on the GCP VM with RIOT
Download the certificate authorities (CAs). You can do this through the console or with a gcloud command:
On the GCP VM, which acts as your client, create a file called server_ca.pem:
Paste the certificates into the server_ca.pem file. The text of the CAs must be formatted correctly: each certificate keeps its -----BEGIN CERTIFICATE----- and -----END CERTIFICATE----- lines, each on its own line.
Let’s set up the environment. Set the host and port variables for the Memorystore target and the Redis source. You can get the Memorystore information from the console.
Both commands should report 100% success.
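As a complementary check, a plain TCP connection test from the RIOT VM confirms that the Redis ports on both endpoints are reachable. This is a minimal sketch using only the Python standard library. The function names and the addresses in the usage comment are hypothetical, and a successful TCP connect says nothing about TLS or authentication.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_endpoints(endpoints: dict[str, tuple[str, int]]) -> dict[str, bool]:
    """Map each endpoint label to whether its host and port are reachable."""
    return {
        label: port_is_open(host, port)
        for label, (host, port) in endpoints.items()
    }


# Usage with placeholder addresses (replace with your own source and target):
# print(check_endpoints({
#     "redis-source": ("10.0.0.5", 6379),
#     "memorystore-target": ("10.0.0.9", 6379),
# }))
```

If either endpoint reports False, review the firewall rules and network configuration for the RIOT VM before continuing.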
Install the Redis CLI:
Enable keyspace notifications on the source Redis instance
RIOT uses keyspace notifications to capture any updates to the database for replication.
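Keyspace notifications are controlled by the notify-keyspace-events setting, and CONFIG SET applies to one node at a time, so in a clustered source the setting has to be applied on every node. The sketch below shows one way to do that from Python. The helper name is hypothetical, the "KA" flag value (keyspace events for all event classes) is an assumption you should confirm against the RIOT documentation for your version, and the clients are assumed to expose config_set and config_get the way redis-py clients do.

```python
def enable_keyspace_notifications(nodes, flags="KA"):
    """Set notify-keyspace-events on every node and return the values read back.

    `nodes` maps a label to a client object exposing config_set/config_get,
    for example a redis.Redis instance from the redis-py package.
    """
    applied = {}
    for label, client in nodes.items():
        client.config_set("notify-keyspace-events", flags)
        # Read the value back so a node that ignored the change is visible.
        current = client.config_get("notify-keyspace-events")
        applied[label] = current.get("notify-keyspace-events", "")
    return applied


# Usage with redis-py (third-party package) and placeholder addresses:
# import redis
# nodes = {
#     "shard-1": redis.Redis(host="10.0.0.5", port=6379, decode_responses=True),
#     "shard-2": redis.Redis(host="10.0.0.6", port=6379, decode_responses=True),
# }
# print(enable_keyspace_notifications(nodes))
```

Redis may return the flags in a normalized order, so the value read back can differ textually from the one you set while still meaning the same thing.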
Step 4: Use RIOT to begin the migration
Start RIOT
RIOT will report the status of the initial sync (Scanning) and of the changes being streamed in real time (Listening).
Step 5: Validation
There are many ways to validate the success of your migration, such as dumping each database and comparing the results, or checking the total number of keys. For this walkthrough, we validate by comparing key counts between source and target to confirm that replication has caught up. Note: On instances with a high rate of change, an exact match can be hard to obtain. We have written a simple tool to automate this comparison. Follow these steps:
- Download the repo from the releases page here on GitHub
- Unzip the file
- Execute the commands as shown on this page
Get Keyspace on all shards
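To illustrate what a key-count comparison involves, the sketch below parses the text returned by INFO keyspace on each primary shard and sums the keys on each side. It is not the tool referenced above; the function names and the tolerance parameter are hypothetical. Collect the output from primaries only, since replicas would count the same keys twice.

```python
import re

# Matches lines such as "db0:keys=120,expires=3,avg_ttl=0".
_DB_LINE = re.compile(r"^db\d+:keys=(\d+)")


def keys_in_keyspace_info(info_text: str) -> int:
    """Sum the keys= values across all dbN lines of one INFO keyspace output."""
    total = 0
    for line in info_text.splitlines():
        match = _DB_LINE.match(line.strip())
        if match:
            total += int(match.group(1))
    return total


def total_keys(shard_infos) -> int:
    """Sum key counts over the INFO keyspace outputs of several shards."""
    return sum(keys_in_keyspace_info(text) for text in shard_infos)


def counts_match(source_infos, target_infos, tolerance: int = 0) -> bool:
    """Return True when the two totals differ by at most `tolerance` keys."""
    return abs(total_keys(source_infos) - total_keys(target_infos)) <= tolerance


# Usage: source and target may have different shard counts.
source = ["# Keyspace\r\ndb0:keys=120,expires=3,avg_ttl=0\r\n",
          "# Keyspace\r\ndb0:keys=80,expires=0,avg_ttl=0\r\n"]
target = ["# Keyspace\r\ndb0:keys=200,expires=3,avg_ttl=0\r\n"]
print(total_keys(source), total_keys(target), counts_match(source, target))
```

A small non-zero tolerance can be useful while the application is still writing, because the two totals are sampled at slightly different moments.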
Step 6: Cut over production traffic and decommission the old instance
You have two options for production cutover:
- If your application does not require strong consistency between the source and destination, simply modify your Redis client to point to the new Memorystore for Redis Cluster instance and go live. This will result in zero downtime and your databases will be eventually consistent.
- For use cases where strong consistency is required, stop write traffic to your Redis database. Wait for RIOT to complete the replication of the remaining changes to the new Memorystore for Redis Cluster instance. Update your Redis client configuration to point to the new Memorystore for Redis Cluster instance and go live. This will result in a few seconds to a few minutes of downtime based on write frequency and replication lag, but will provide strong consistency.
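For the strong-consistency option, the "wait for RIOT to complete" step can be scripted as a polling loop that runs after writes have stopped. The sketch below is one possible approach, not a RIOT feature: it assumes that matching key counts on several consecutive checks is an acceptable signal that replication has drained, which does not prove that the values themselves are identical. All names and default values are hypothetical.

```python
import time


def wait_until_caught_up(source_count, target_count, stable_checks=3,
                         interval=1.0, max_checks=60, sleep=time.sleep):
    """Poll until source and target key counts agree on consecutive checks.

    `source_count` and `target_count` are zero-argument callables returning
    the current total number of keys on each side. Returns True once the
    counts have matched `stable_checks` times in a row, or False if that
    does not happen within `max_checks` polls.
    """
    streak = 0
    for _ in range(max_checks):
        if source_count() == target_count():
            streak += 1
            if streak >= stable_checks:
                return True
        else:
            # Any mismatch resets the streak; replication is still draining.
            streak = 0
        sleep(interval)
    return False


# Usage sketch: plug in callables that query your clusters, then switch the
# application to the new instance only when the function returns True.
# if wait_until_caught_up(get_source_keys, get_target_keys):
#     print("Counts are stable; safe to repoint the Redis client.")
```

Requiring several consecutive matches, rather than one, reduces the chance of cutting over on a momentary coincidence while changes are still in flight.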
You are now live on Memorystore for Redis Cluster. You can now review the Monitoring tab in the console to see usage metrics of your production workload.
Final considerations
The launch of Memorystore for Redis Cluster will allow you to take your applications to the highest scale while providing microsecond latency. Memorystore for Redis Cluster removes the burden of managing Redis, so that you can focus on shipping new features and applications that provide value to your users.
With this migration guide, you have a framework for an easy zero-downtime migration, with some backfill if there is replication lag. GCP is here to support your adoption of Memorystore. To learn about the latest releases for Memorystore for Redis Cluster, we suggest following our Release Notes.