This document describes how to plan and execute the process of migrating an unpartitioned MySQL 5.7 database to Cloud SQL, a fully managed database service on Google Cloud. This document assumes that you have a MySQL database and are generally familiar with MySQL and Google Cloud concepts.
The concepts discussed in this document apply to other versions of MySQL and to MariaDB, a popular open-source fork of MySQL. However, for versions other than MySQL 5.7, the processes might need minor modification. There are many third-party tools and Cloud SQL Partners that you can use for planning a database migration. This document discusses only native MySQL and Google Cloud functionality, and does not discuss any additional tools or partners.
A companion tutorial walks you through one of the migration processes described in this document.
Overview
Migrating a MySQL database to another environment requires you to move two elements:
- The data in the database. This is the storage layer of the database.
- The management system that handles consistent reads and writes to the storage layer. This is the management layer.
The challenge is that these elements have to be constantly in sync for the database to work. Even a small database might get thousands of requests per second. Any delay in transactions or any network problems between the storage and management layers will have negative consequences for the database, for dependent applications, and for user satisfaction.
Migration methods
MySQL has two features that can help you migrate to Cloud SQL. These features are the basis for two strategies for migrating MySQL databases: export/import migration and external replica promotion migration.
Export/import migration
In the export/import strategy, you export all of the data in the source database storage layer by using the MySQL mysqldump command. You then import this data directly to a new database management layer. This normally requires downtime for the database during the entire migration in order to keep all data in sync.
External replica promotion migration
In the migration strategy of external replica promotion, you create an external database replica and synchronize the existing data to that replica. This can happen with minimal downtime to the existing database.
When you have a replica database, the two databases have different roles that are referred to in this document as primary and replica.
After the data is synchronized, you promote the replica to be the primary in order to move the management layer with minimal impact to database uptime.
In Cloud SQL, an easy way to accomplish the external replica promotion is to use the automated migration workflow. This process automates many of the steps that are needed for this type of migration.
Preparing any MySQL database for migration
Before you perform the migration steps, you need to ensure that your database is prepared for either option. MySQL is installed with some default configurations, and can be tuned and customized according to the requirements of the application. Before you begin a migration, make sure MySQL is configured as described in this section. This helps lower the risk of migration failure and helps limit downtime.
Perform a logical backup
With either migration method, you'll be making changes to your MySQL database. To protect the data in your database, before you perform any migration actions, you should use the MySQL mysqldump command to create a data export that you keep as a backup. This backup is used as the starting point for both migration methods.
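The following is a minimal sketch of such a backup, assuming a self-managed MySQL 5.7 server; the user, database name, and output path are placeholders, and the --single-transaction option assumes InnoDB tables so that the export reads from a consistent snapshot without locking.

```bash
# Minimal backup sketch; user, database, and file names are placeholders.
# --single-transaction assumes InnoDB and avoids locking tables during the dump.
mysqldump --user=root --password \
  --single-transaction \
  --databases example_db > example_db_backup.sql
```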
The mysqldump command takes a snapshot of all data in the database. Depending on the size of your database, this process can take from less than a second to several hours. If any data is changed after the snapshot is taken, those changes are not captured in the final snapshot. If you're using the external replica promotion migration process, this will not be a concern. However, for the export/import migration process, this can cause data inconsistencies and lost data.
Store the complete backup in a safe location. Options include Cloud Storage, a local object storage system, and offsite backups.
Benchmark database performance metrics
Before you begin any migration, you should have a complete collection of the performance benchmarks of the existing application and database. This allows you to understand the expected behavior and current performance metrics.
Make sure that the benchmarks you have represent the use cases for your apps and your database. Later, when you use the benchmarks to compare performance of the migrated database, you want to be sure that you can make meaningful comparisons between the old and new database deployments.
Determine your connectivity options
Network connectivity requirements are different for each migration strategy. The most important question in deciding which method you can use is whether your database can be accessed through a public IPv4 address.
When you perform an external replica promotion migration using the Cloud SQL automated migration workflow, your existing database must be publicly accessible using an IPv4 address. The destination database in Cloud SQL requires a persistent network connection to the source throughout the promotion or migration process. Depending on the size of your database, this connection might need to remain open for an extended period of time.
You should also consider connectivity options on your destination database. Cloud SQL can use private IP addresses. You can also configure it to use public IP connectivity.
Determine your root password
During migration, you run many privileged MySQL commands and tasks. Therefore, you must have access to the MySQL root user, or access to an account with the SUPER and GRANT privileges. If you don't know the username and password for one of these privileged accounts, you need to consult with your database administrators or follow the MySQL password reset procedures.
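If you're not sure what privileges an account holds, a quick check such as the following can help; the username is a placeholder.

```bash
# List the grants for the account you plan to use for the migration.
mysql --user=admin_user --password -e "SHOW GRANTS FOR CURRENT_USER();"
```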
Perform a test migration and create a runbook
Before you perform a production migration, you should perform test migrations. Doing so lets you verify your steps, gain confidence in your methods, and predict the time required for the migration. You can use the lessons learned from this experience to generate a runbook that supports your migration. Use the runbook as a working document to support coordination, troubleshooting, and execution of your migration.
This document can serve as a primer for the creation of your database migration runbook.
Preparing the application for a database migration
The last preparatory step is to prepare your application for the migration. A discussion of all of the application-to-database connection methods is beyond the scope of this document. However, it's important to understand how IP addresses and domain name system (DNS) connectivity impact your application.
Using public or internal DNS is a common method to access a database; using DNS allows the underlying IP address of the destination database to change without an update to application code. If your application relies on DNS records for database connectivity, you usually want to lower the DNS time-to-live (TTL) values for those DNS records for the migration.
DNS TTL values are measured in seconds, and they represent the validity of a DNS record stored on a client or host. This time is also known as a DNS cache window. Default DNS TTL values are different for every DNS provider. For example, Cloud DNS sets the TTL by default at 5 minutes (300 seconds). Because database DNS entries rarely change, your DNS administrator might have set this much higher.
When you migrate your database, high DNS TTL values can prolong database downtime, because clients might continue to use a cached DNS value that points to the old database. By lowering the TTL value, you force clients to check more often for updated records. This action helps to keep database downtime to a minimum.
In anticipation of making changes to the DNS records for the database, you should set this number to the lowest possible configurable value. If your DNS provider is SaaS-based, this can lead to increased costs. Make sure that you check with your provider or administrator before making this change.
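For example, if your records are managed in Cloud DNS, a change like the following lowers the TTL of the database's A record; the zone, record name, IP address, and TTL values are placeholders, and the record data passed to the remove step must match the existing record exactly.

```bash
# Lower the TTL of the database's A record from 300 to 60 seconds (Cloud DNS example).
gcloud dns record-sets transaction start --zone=example-zone
gcloud dns record-sets transaction remove --zone=example-zone \
  --name=db.example.com. --type=A --ttl=300 "203.0.113.10"
gcloud dns record-sets transaction add --zone=example-zone \
  --name=db.example.com. --type=A --ttl=60 "203.0.113.10"
gcloud dns record-sets transaction execute --zone=example-zone
```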
Export/import migration
The export/import migration strategy relies on the MySQL mysqldump command to export a logical snapshot from the source environment database. You then import the snapshot into the target environment database.
The simplicity of this method is offset by the fact that the database experiences downtime during migration. Because transactions are continuously occurring in large databases, administrators commonly do the following:
- Disable the source database's ability to write new data.
- Create the snapshot from the source database.
- Import the snapshot to the destination database.
- Update the application configuration to point to the new database.
- Verify performance on the new database.
The following diagram illustrates this sequence:
This sequence ensures that no changes are made to the data between the time you stop writes on the source database and the time you start writes on the new database. Depending on the size of your data, this process can result in prolonged downtime for your application, which is unacceptable for many businesses and applications.
When you use the export/import migration strategy, your database does not need external connectivity to Cloud SQL. However, you need to be able to export the mysqldump files to a location that's accessible by Cloud SQL. Depending on the size of your mysqldump file, this data import/export process can take a long time, and it can require complex file-copy and verification techniques to ensure that all data rows and all transaction information are copied correctly.
In addition, your organization's data classification rules can have an impact on your ability to store the mysqldump files in other locations. Make sure that you consult these rules before you choose any method of storing data.
Export/import migration step by step
The following sections provide details about each of the steps of the overall sequence.
Disable writes
The administrator configures the database to prevent write operations. This is also referred to as locking or freezing the database. After this is accomplished, any write transaction that's sent to the database fails. However, any read transactions against the database, or against any replicas, continue to function normally. You should run a few test queries to verify this behavior before you move to subsequent steps.
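A minimal sketch of locking a self-managed MySQL 5.7 primary is shown below; run it as a privileged user. The super_read_only setting additionally blocks writes from accounts that hold the SUPER privilege.

```bash
# Put the source database into read-only mode (run as a privileged user).
mysql --user=root --password -e "SET GLOBAL read_only = ON; SET GLOBAL super_read_only = ON;"

# Verify that reads still succeed; also run a test write and confirm that it fails.
mysql --user=app_user --password -e "SELECT 1;"
```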
Make sure that you understand which applications might be affected by locking the database. You might need to notify users, customers, administrators, and other departments that depend on the database in order to prevent a larger system failure.
Export data
After the database is locked, you execute the mysqldump command against the old database to produce a full export of the data. When the export process is complete, you must move this file to a location that's accessible by the new database.
Make sure that you send backup files to a physical drive separate from the one that contains the production database. This separation reduces disk I/O contention and can increase the speed of the backup. Taking the write load away from the main drive decreases the effect on your production database operations. This separation also reduces the chance of running out of space on the disk containing your production database.
Compressing the output of the mysqldump command can also help reduce the total number of writes needed for the backup. Compression can require increased CPU resources, so be sure to check how this might affect the performance of the database management layer.
For more information, see Exporting Data for Import into Cloud SQL.
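As a sketch of those suggestions combined, the following pipes the export through gzip and copies the result to a Cloud Storage bucket; the user, database, bucket, and file names are placeholders, and --set-gtid-purged=OFF is commonly recommended for dumps destined for Cloud SQL import, so verify it against the current Cloud SQL guidance.

```bash
# Export, compress in a single pipeline, and then upload to Cloud Storage.
mysqldump --user=root --password \
  --single-transaction --set-gtid-purged=OFF \
  --databases example_db | gzip > example_db.sql.gz

gsutil cp example_db.sql.gz gs://example-migration-bucket/
```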
Import data
The new database must import the dump file from the old database. The import process re-creates the database tables and all of the rows from the old database. The time needed for an import is comparable to the time it took for the export, given similar system resources. If you're upgrading system resources as part of your migration, the import process could be faster than the export.
For more information, see Importing Data into Cloud SQL.
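One way to run the import from the command line is sketched below; the instance, bucket, and database names are placeholders, and the Cloud SQL instance's service account needs read access to the bucket.

```bash
# Import the compressed dump from Cloud Storage into the Cloud SQL instance.
gcloud sql import sql example-instance \
  gs://example-migration-bucket/example_db.sql.gz \
  --database=example_db
```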
After the data is imported, you should review the benchmark performance information that you got from testing the source database. You can then run read-based production queries against the new database to ensure that the new database meets the performance needs of your applications.
Update applications
When you're ready to start using the new database, you need to update your applications to support the new database. There are a few methods to accomplish this task:
- If your application has the database endpoint or IP address defined directly in the source code, update the code and deploy this update to all applications that need to access the database.
- If your application points to a DNS record to access the database, update the DNS record through your DNS provider to reflect the new database endpoint address.
- If your application points to a load balancer to access the database, register the new database endpoint with the load balancer.
When you cut over your applications to use Cloud SQL, you might want to warm up your buffer pool. This helps minimize the query latency that your applications might experience before the new database has loaded frequently accessed data into memory. You can use the MySQL slow-query log from your source production database to create a script that executes the SELECT statements that will populate the cache on your new instance. You should do this just before you enable writes.
There are many third-party tools and partners that can help with warming up your buffer pool.
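A rough sketch of this idea, assuming the slow-query log has been copied from the source server, is shown below; the log path, host, user, and database names are placeholders, and the naive grep only captures single-line SELECT statements, so the mysql --force option is used to skip anything that doesn't parse.

```bash
# Extract single-line SELECT statements from the source slow-query log.
grep -i "^SELECT" mysql-slow.log > warmup_queries.sql

# Replay them against the new instance to warm the buffer pool.
# --force continues past statements that fail (for example, truncated multi-line queries).
mysql --host=NEW_DB_HOST --user=app_user --password --force example_db < warmup_queries.sql
```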
Enable writes
After you verify that the updated application code can communicate with the new database, you need to remove the read-only lock on the new database. After that step, you can verify the proper function of the database through your application. You should also perform another round of benchmark testing to ensure that your database performance still meets the needs of your application.
When you've finished verification and determined that all of the data is consistent between both databases, you can complete a few clean-up tasks:
- Notify users that downtime is over.
- Update status pages and remove any status messages from the application.
- Revert any modified DNS TTLs to their original values.
Suggested practices for optimizing import time for export/import migrations
There are a number of things that you can do to help optimize the import speed of a mysqldump file. This in turn decreases the total amount of time that a server needs to be in a read-only state.
- Optimize your schema. Structure your data to avoid foreign keys. If possible, export the data in primary key order.
- Optimize your resources. If possible, launch your new database with enough RAM to support the entire dataset. In addition, using SSDs for your database storage can greatly increase your data throughput. SSDs allow more input/output operations per second (IOPS) than mechanical disks. Google Cloud SSDs also increase IOPS as disk size increases. This means that it might be worth the extra cost and over-provisioned space of a larger disk in order to gain additional IOPS.
- Optimize monitoring. Having a robust monitoring solution in place to detect any problems with the old or new database can be extremely useful to a migration. For example, you can use Cloud Monitoring and the MySQL plugin. Monitoring allows you to detect problems early, identify what normal database behavior looks like, and compare the new database against your previous database baseline.
Exporting and importing large databases
The mysqldump process works quickly on databases that are smaller than 10 GB. Importing databases larger than 10 GB can require you to use advanced strategies to minimize the amount of database downtime you experience during the final cutover.
Export the schema and data in subsections
One strategy is to export your database schema and data in sections, without secondary keys. This allows you to add the secondary keys using ALTER TABLE statements at the end of the migration. Using this approach, you export different sets of tables to different files based on their key relationships. You can then run the MySQL import processes in parallel from a Compute Engine instance. One way to export data for parallel import is by manually editing the export files themselves. However, that can be time consuming and error prone.
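The following sketch illustrates splitting the schema and data into separate files; the database and table names are placeholders, and the grouping of tables would follow your own key relationships.

```bash
# Export the schema only (table definitions, no rows).
mysqldump --user=root --password --no-data example_db > schema.sql

# Export the data only, split into groups of tables that can be imported in parallel.
mysqldump --user=root --password --no-create-info \
  example_db customers orders > data_group_1.sql
mysqldump --user=root --password --no-create-info \
  example_db invoices payments > data_group_2.sql
```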
Take advantage of timestamps
If any of your tables use timestamp columns, you can use a feature of the mysqldump command that allows you to specify a WHERE clause. This allows you to load time-stamped data into Cloud SQL over multiple imports.
When you're ready for the final migration cutover, you can use this technique to export only rows in the source database that have changed since your last export. If you don't have timestamped tables, and you are comfortable introducing a timestamp column to your tables in the source database, you can add a MySQL trigger on each source table to set the timestamp whenever a row changes.
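A sketch of such an incremental export is shown below; it assumes a hypothetical updated_at timestamp column, and the table name, database name, and cutoff time are placeholders.

```bash
# Export only rows that changed since the previous export, based on a timestamp column.
mysqldump --user=root --password --no-create-info \
  --where="updated_at >= '2024-01-01 00:00:00'" \
  example_db orders > orders_incremental.sql
```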
A concern with this method is tracking deleted rows. The timestamp column tracks only rows that have changed through INSERT or UPDATE statements; rows that are deleted are removed from the table. Therefore, using a timestamp to find rows that have been deleted after the initial migration works only if your application and database were built to use a soft delete mechanism. In this technique, instead of issuing a DELETE statement, you set an is_deleted column to true to indicate that a row is deleted. Alternatively, you can create a table to track deleted rows for each of your real tables, and a matching on_delete trigger that inserts deleted rows into the deletion-tracking table. During the final phase of your migration, you can create the necessary DELETE statements to remove those rows from your Cloud SQL instance.
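As a sketch of the deletion-tracking approach, the following creates a tracking table and trigger for a hypothetical orders table; all names are placeholders, and you would repeat the pattern for each table you need to track.

```bash
# Create a deletion-tracking table and an AFTER DELETE trigger for one table.
mysql --user=root --password example_db <<'SQL'
CREATE TABLE orders_deleted (
  id BIGINT NOT NULL PRIMARY KEY,
  deleted_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE TRIGGER orders_on_delete
AFTER DELETE ON orders
FOR EACH ROW
  INSERT INTO orders_deleted (id) VALUES (OLD.id);
SQL
```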
Validating your export files
Whenever you modify your export files manually, or take exports of different tables, you should verify that your data was accurately exported.
Comparing export files
One way to validate the exported data is to implement an idle test database. You can create an idle test database by restoring a recent production backup to a new, stand-alone database instance. Then you restore your mysqldump file to the new idle test database. Next, you can test your multi-phased mysqldump procedure, and then restore that in parallel to yet another idle test database. This should provide you with two identical copies of the database. You can then verify the data by running the mysqldump command on the two idle test databases and comparing the resulting exports, along with your original backup file, using a diff command or file comparison tool.
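A minimal sketch of the comparison step follows; the hosts and database name are placeholders, and the dump options reduce nondeterministic differences (such as dump timestamps) between the two files.

```bash
# Dump both idle test databases with deterministic options and compare the results.
mysqldump --host=test-db-1 --user=root --password \
  --skip-comments --order-by-primary example_db > test_db_1.sql
mysqldump --host=test-db-2 --user=root --password \
  --skip-comments --order-by-primary example_db > test_db_2.sql

diff test_db_1.sql test_db_2.sql
```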
The following diagram illustrates this sequence and shows the artifacts created during the process.
External replica promotion
The second option for migrating your MySQL database is to use an external replica promotion. In this strategy, you create a replica database and set your existing database as the primary. You wait until the two databases are in sync, and you then promote your MySQL replica database to be the primary. This process minimizes database downtime related to the database migration.
Prerequisites
Using the external replica promotion migration method requires two additional prerequisites: creating a replication user and enabling GTID replication.
Establish a replication user
Certain versions of MySQL, including version 5.7 (which is used for the examples in this document), store the username and password of any user account used for replication in clear text in the primary database's logs. If you use your root user account to perform replication, your root password is stored in plain text in those logs. Using the root user for replication therefore poses a security risk.
To help limit the possibility of compromising other aspects of your database, you should create a completely separate account for performing the replication. The new account should have only the privileges that are needed for the replication process. This account should also be limited to allow replication connections only from the IP address of the replica database.
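A minimal sketch of creating such an account on the source primary is shown below; the username, password, and IP address are placeholders, and the host restriction should match the address that the Cloud SQL replica connects from (if you don't know that address yet, you can create the account with a broader host value and tighten it later).

```bash
# Create a dedicated replication account restricted to the replica's address.
mysql --user=root --password <<'SQL'
CREATE USER 'replication_user'@'203.0.113.20' IDENTIFIED BY 'a-strong-password';
GRANT REPLICATION SLAVE ON *.* TO 'replication_user'@'203.0.113.20';
FLUSH PRIVILEGES;
SQL
```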
Configure GTID
Your source database must have GTID replication enabled in order to perform an external replica promotion migration. If you haven't configured replication in your source database, GTID replication might not be enabled.
A global transaction identifier (GTID) is a unique identifier that's associated with each transaction committed on the primary database server. These identifiers are used to build the replication process between a replica and primary database.
New instances of Cloud SQL for MySQL use GTID replication by default. You cannot disable this feature, and therefore certain SQL statements and operations are not allowed. For more information, read the Differences between Cloud SQL and standard MySQL functionality and the MySQL restrictions on replication with GTIDs.
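You can check whether GTID replication is already enabled on the source with a query like the following; on a self-managed MySQL 5.7 server, enabling it typically means setting gtid_mode=ON and enforce_gtid_consistency=ON, which on a live server requires MySQL's documented staged procedure.

```bash
# Both values should report ON before you start an external replica promotion migration.
mysql --user=root --password \
  -e "SHOW VARIABLES WHERE Variable_name IN ('gtid_mode', 'enforce_gtid_consistency');"
```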
External replica promotion migration overview
Compared to an export/import migration, external replica promotion migration can be significantly easier. Cloud SQL can help automate this process if you have completed all of the prerequisites described earlier. The automated migration workflow still requires an export/import process, but it abstracts away some of the manual steps as part of the automation.
The automated migration workflow supports the following scenarios:
- Migrating from on-premises to Cloud SQL.
- Migrating from another cloud provider to Cloud SQL.
- Migrating from one Google Cloud project to another Google Cloud project.
During the automated migration workflow, you do the following:
- Provide details about your data source.
- Create a Cloud SQL read replica.
- Synchronize the read replica with the source.
- Promote the Cloud SQL read replica to be the primary instance.
Provide details on your data source
When you run the automated migration workflow with Cloud SQL, you provide a reference name for the data source. This name is used to refer to the data in the migration, and does not need to match the actual name of your source.
You also provide the public IP address and port of the source database. You need to ensure that your Cloud SQL database has direct connectivity to the source database using an IPv4 address. You also need to ensure that any firewalls or security appliances protecting the source database are configured to allow connectivity from the new Cloud SQL database IP address.
You must also provide the automated migration workflow with your MySQL replication user credentials and with the version of MySQL that you want to migrate. This is needed in order to authenticate with the server and to support a secure transfer in the next step. You can also configure SSL/TLS options as needed.
Create a Cloud SQL read replica
When you run the automated migration workflow, you set options for creating the new Cloud SQL instance in Google Cloud. This requires a few steps:
- Setting a database instance ID.
- Choosing a Google Cloud region and zone.
- Selecting an instance type.
- Selecting the disk storage type.
- Choosing total storage capacity. We recommend enabling automatic storage increases. This option allows your database to grow as needed, and reduces the risk that your database will run out of disk space.
Finally, you provide a link to the snapshot that you stored in Cloud Storage. Make sure you have the latest export files in a Cloud Storage bucket that's in the same project as the Cloud SQL instance.
Synchronize the read replica with the source
After you begin the migration process, the new Cloud SQL VM launches and begins to replicate from the source. When the synchronization is completed, you can compare the data in the database to confirm that replication is complete. You can also begin benchmarking to test the performance of the Cloud SQL database against the requirements of your application and against the previously measured baseline performance of your source database.
Promote the read replica to be the primary instance
When the data is replicated, you place the old database into read-only mode. This prevents any writes to the database, and ensures that no data is changed during the promotion process. Then you update the application code or DNS entries to support the new endpoint in Cloud SQL, as described earlier.
When the source database is in read-only mode and your application has been updated with the new Cloud SQL endpoint address, you can go to your new database instance in Cloud SQL and promote the replica to be the primary server. This promotion process automatically re-enables writes on the newly promoted Cloud SQL primary.
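If you prefer the command line, one way to perform the promotion is sketched below; the replica instance name is a placeholder.

```bash
# Promote the Cloud SQL read replica to be a standalone primary instance.
gcloud sql instances promote-replica example-replica
```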
After you promote the replica to be the primary server, your database might experience several minutes of downtime, regardless of the database size. This is the amount of time that Cloud SQL needs in order to perform all of the tasks and configuration work involved in promoting the database.
When the promotion is finished, your primary database is running in Cloud SQL and your migration is complete. This is a good time to run another set of performance tests against the application requirements and against the source database baseline performance. When you have the new set of Cloud SQL performance benchmarks, you can plan for other database optimizations.
Optimizing after the migration
When migration is complete, regardless of your migration method, you can optimize your Cloud SQL database in these ways:
- Right-size your VM. By analyzing your monitoring data, you can see if you need to choose a differently sized VM for the database. You might be able to reduce costs by using a smaller instance. Conversely, your database might perform better on a larger VM with more resources.
- Add additional indexes. If your database could benefit from additional indexes or other optimizations, you can add them after you've finished the migration.
- Turn on additional transactional logging. If you explore your server logging level, you can investigate the benefits from enabling binary logging or other enhanced logging options.
- For the external replica promotion migration method, re-enable binary logging and take advantage of automated backups. (After replica promotion migration, Cloud SQL automated backups are disabled because binary logging is off.)
What's next
- Walk through a tutorial on migrating MySQL to Cloud SQL using the automated migration workflow.
- Explore migrating a monolithic application to microservices on Google Kubernetes Engine.
- Learn about Cloud SQL performance tips and how to diagnose issues with Cloud SQL instances.
- Browse the Cloud SQL partners page for help extending the power of Google Cloud.
- Try out other Google Cloud features for yourself. Have a look at our tutorials.