When the unexpected happens, your business can suffer. Hardware failures, database corruption, user mistakes, and even malicious attacks can create risk of data loss. Database backups help you recover and keep your business up and running–even if something goes wrong. And when things are going right, database backups help you meet compliance and audit requirements by letting you restore your database to a specific point in the past.
MySQL supports several options to back up your databases, each with different strengths. You can choose any combination of methods that best suit your needs.
There are two main ways to backup a MySQL database: physical and logical. The physical backup copies the actual data files and logical backup generates statements such as CREATE TABLE and INSERT that can recreate the data.
A physical backup contains the raw copies of all of the files and directories as they exist on the disk. This type of backup is suitable for large databases because copying the raw files is much faster compared to taking a logical backup.
Advantages:
Disadvantages:
Some common physical backup tools across the MySQL ecosystem
A logical backup contains the data in the database in a form that MySQL can interpret either as SQL or as delimited text. It is a representation of your database as a sequence of SQL statements that can be executed to recreate database objects and import data. This type of backup is suitable for small databases because logical backups can take substantially longer to restore than restoring a physical backup. This backup type is also useful when migrating to or from managed database services in the cloud.
Advantages:
Disadvantages:
Some common logical backup tools
As the name suggests, point-in-time recovery helps you recover an instance to a specific point in time. For example, if an error causes a loss of data, you can recover a database to its state before the error occurred.
PITR is a two step process, which relies on binary logs:
Binary logs contain any changes made to the database instance, such as create table, row insert/update/delete, and others. For example, say your daily backups run at 6:00 AM and you want to be able to recover your instance at any point until 10:15 AM. To recover your state at 10:15 AM, you would first restore the full backup from 6:00 AM, and then you would replay the binary log events from 6:00 AM until 10:15 AM. This will bring the server up to the desired state at the desired time.
Binary logs are required for replication and PITR; they are as important as the underlying data. Binary logs need to be backed up in real time for them to be effective in your recovery planning, and you can download them using the mysqlbinlog command.
For example:
mysqlbinlog --host=<hostname> --port=<port> --user=<user> --password=<password> --read-from-remote-server --raw --stop-never <binlog_filename> |
The <binlog_filename> will be the first file to download, and mysqlbinlog will automatically switch to the next file(s) afterward. Those binary logs get written in the current directory of the server which runs the mysqlbinlog command. You can change the file name and location using the --result-file option. You can also use the --stop-never option to enable mysqlbinlog binlog to remain connected to the server and download new changes as they are written.
You can learn more from the mysqlbinlog manual about how to connect to a remote server and download the binary logs as they are written.
As Google Cloud’s managed database service for MySQL, Cloud SQL offers automated backups and point-in-time-recovery (PITR) for data protection. These are enabled by default and required for enabling high availability (HA) on a Cloud SQL for MySQL instance.
Any Cloud SQL backup is a type of physical backup, as they are snapshots taken on Persistent Disk (PD). Cloud SQL offers the flexibility to make backups happen automatically, or you can take them on demand at any time. You can still take a logical backup by using the standard MySQL logical backup tools such as mysqldump, mydumper, mysqlpump, and others without interfering with your managed backups.
Let’s dig into how Cloud SQL’s backups work.
Cloud SQL uses Persistent Disk (PD) for storage, and every Cloud SQL instance has a persistent data disk attached for storing the database files and directories.
Cloud SQL uses PD snapshots for backups, and those snapshots refer to the state of the data disk at a given point in time. The first snapshot of a PD is a full snapshot that contains all the data on the PD, the subsequent snapshots are incremental and only contain any new data or modified data since the previous snapshot. You can think of these snapshots as cloud-managed physical copies of persistent data disks, and they are the basis of all of Cloud SQL’s backup and restore features, including PITR.
Cloud SQL performs two types of managed backups: automated, and on demand. Both types are stored in the closest multi-region location to the instance by default. For example, if your Cloud SQL instance is in us-central1, your backups are stored in the US multi-region by default. However, a default location like australia-southeast1 is outside of a multi-region and will be placed in the closest multi-region, which is asia. You can also choose a custom location for your backups.
Automated backups
Automated backups are taken daily during a 4-hour window of your choosing. The backup starts during the backup window and may continue outside the backup window until it completes. You can configure how many automated backups to retain, from 1 to 365. The default retention policy is to retain the seven most recent backups.
On-demand backups
You can also create on-demand backups anytime. These are useful if you need a backup and don’t want to wait for your backup window. Also, unlike automated backups, on-demand backups are not automatically deleted. They persist until you delete them or until their instance is deleted.
You can restore a backup to the same instance where it was taken or to a different instance in the same project. Note that the target instance shouldn’t be a read replica itself, nor it should have read replica(s) at the time of restoring the backup, as read replicas are copies of a primary instance and creates the risk that the replicas would get out of sync with the primary. You can always add read replicas afterward.
Restoring the backup creates a new PD from that backup snapshot and attaches it to the instance. The database starts using the new disk and performs the standard MySQL crash recovery process before coming online as a full-fledged MySQL database freshly restored from the snapshot.
Restore to the same instance
When you restore from a backup to the same instance, you return the data on that instance to its state when you took the backup.
Restore to a different instance
When you restore from a backup to a different instance, you update the data on the target instance to the state of the source instance when you took the backup.
Important: Restoring a backup overwrites all the current data on the target instance, including previous point-in-time recovery logs. Overwritten data cannot be recovered.
In Cloud SQL, PITR always creates a new instance which inherits the settings of the source instance, similar to the clone instance operation. This feature requires both automated backups and point-in-time recovery (binary logs) to be enabled on the source instance. The default retention policy on binary logs is 7 days, and you can configure the retention period to be anywhere from 1 to 7 days.
PITR is achieved by creating a new instance from backup of the original instance and replaying the binary logs stored on the original instance data disk to the given point.
When performing PITR in Cloud SQL, customers can choose to either clone the current state of the instance or clone to a timestamp in the past.
The UI to perform the PITR looks like the following.
Start building on Google Cloud with $300 in free credits and 20+ always free products.