Cloud Spanner Backup and Restore lets you create backups of Cloud Spanner databases on demand, and restore them to provide protection against operator and application errors which result in logical data corruption. Backups are highly available, encrypted, and can be retained for up to a year from the time they are created. If you need longer retention times, we recommend exporting your database.
You can use Backup and Restore in the following ways:
- In the Google Cloud Console
- Using the gcloud command-line tool
- Using the client libraries
- Using the REST or RPC APIs
For logical data corruption, Cloud Spanner also offers point-in-time recovery.
Data consistency: A backup is a transactionally and externally consistent copy of a Cloud Spanner database at the
version_time of the backup.
Replication: Backups reside in the same instance as their source database and are replicated in the same geographic locations. For regional instances, a copy of the backup is stored in each of the three read-write zones. For multi-regional instances, a copy is stored in all zones that contain either a read-write or read-only replica.
Automatic expiration: Every backup has a user-specified expiration date, which determines when the backup is automatically deleted.
Choose between Backup and Restore or Import and Export
|Backup and Restore||Import and Export|
|Data consistency||Both backups and exported databases are transactionally and externally consistent.|
|Performance impact||Neither runs at high priority, which minimizes the impact on database performance. Export runs at medium priority, whereas backup runs at low priority. For more information, see task priority.|
|Storage format||Uses a proprietary, encrypted format designed for fast restore.||Supports both CSV and Avro file formats.|
|Portability||Backups reside in the same instance as their source database and cannot be moved. You can restore a database to any instance in the project with the same instance configuration as the backup.||Exported databases reside in Google Cloud Storage and the data can be migrated to any system that supports CSV or Avro.|
|Retention||Backups can be retained for up to 1 year.||Exported databases are stored in Cloud Storage where, by default, they are retained until they are deleted. You can customize lifecycle and retention policies.|
|Billing||Backups are billed to your Cloud Spanner project based on the storage used per unit time. For more details, see the Billing section.||Billing for import and export is more complicated due to its use of Google Cloud Storage and Dataflow. For more information, see Database export and import pricing.|
|Restore time||Restore happens in two operations: restore and optimize. The restore operation offers fast time-to-first-byte because the database directly mounts the backup without copying the data. After the restore operation completes, the database is ready for use, though read latency might be slightly higher while it is optimizing. For more information, see How restore works.||Import is slower. You need to wait for all the data to be written into the database.|
How backup works
Users can create a backup of any Cloud Spanner database. These backups are complete, in the sense that they contain all of the data in the database (including the schema and secondary indexes) at the
version_time of the backup. Any modifications to the data or schema after the
version_time will not be included in the backup. Backups do not contain database metadata such as Identity and Access Management (IAM) policies.
Cloud Spanner backups, like databases, can be protected by CMEK or Google-managed encryption. By default, a backup uses the same encryption config as its database, but you can override this behavior by specifying a different encryption config when creating the backup. If the backup is CMEK-enabled, it is encrypted using the primary version of the KMS key at the time of backup creation. Once the backup is created, its key and key version cannot be modified, even if the KMS key is rotated. For more information, see create a CMEK-enabled backup.
When you create a backup, you must specify a source database, a name for the backup resource, and an expiration date (up to 1 year from backup creation time). You can also optionally specify a
version_time, which lets you back up your database as of an earlier point in time. The
version_time field is typically used to either synchronize the backups of multiple databases or recover data using point-in-time recovery. If
version_time is not specified, then it is set to the
create_time of the backup. The system creates a backup resource and a long-running backup operation to track the progress of the backup.
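The rules in the paragraph above can be sketched as a small validation helper. This is an illustrative model only, not the Cloud Spanner client library: `BackupRequest` and `resolve_backup_times` are hypothetical names, "1 year" is approximated as 365 days, and the check that `version_time` is not later than `create_time` is inferred from the phrase "an earlier point in time".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# Assumption: "up to 1 year" is modeled as 365 days.
MAX_RETENTION = timedelta(days=365)


@dataclass(frozen=True)
class BackupRequest:
    """Hypothetical container for the fields the text says you must or may supply."""
    source_database: str
    backup_name: str
    expire_time: datetime
    version_time: Optional[datetime] = None  # optional, defaults to create_time


def resolve_backup_times(request: BackupRequest,
                         create_time: datetime) -> Tuple[datetime, datetime]:
    """Return (version_time, expire_time) after applying the documented rules."""
    if request.expire_time <= create_time:
        raise ValueError("expire_time must be after create_time")
    if request.expire_time - create_time > MAX_RETENTION:
        raise ValueError("expire_time must be within 1 year of create_time")

    # If version_time is not specified, it is set to the backup's create_time.
    version_time = request.version_time or create_time
    if version_time > create_time:
        # Inferred rule: version_time refers to an earlier point in time.
        raise ValueError("version_time cannot be later than create_time")
    return version_time, request.expire_time


if __name__ == "__main__":
    now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    req = BackupRequest("orders-db", "orders-backup", now + timedelta(days=30))
    print(resolve_backup_times(req, now))
```

Passing the same explicit `version_time` for several databases is one way to model the "synchronize the backups of multiple databases" use case mentioned above.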
To ensure external consistency of the backup, Cloud Spanner pins the contents of the database at create time. This prevents the garbage collection system from removing the relevant data values for the duration of the backup operation. Then, every zone in the instance begins copying the data in parallel. If a zone is temporarily unavailable, the backup is not complete until the zone comes back online and finishes. Backups are restorable as soon as the operation is done.
Backups are resources in Cloud Spanner. Each backup resource is organized under the same instance as its source database in the resource hierarchy and has a resource path in the form
projects/<project>/instances/<instance>/backups/<backup>. A backup continues to exist even after its source database has been deleted, but cannot outlive its parent instance. To prevent accidental deletion of backups, you cannot delete a Cloud Spanner instance if there are backups. For users who want to delete the instance, we recommend restoring the backup and then exporting the restored database, before deleting the backup and the instance.
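As a rough illustration of the resource hierarchy described above, the following sketch builds and parses backup resource paths and models the rule that an instance with backups cannot be deleted. The helper names (`backup_path`, `parse_backup_path`, `can_delete_instance`) are hypothetical and are not part of any Cloud Spanner API.

```python
import re
from typing import Dict, Iterable

# Path shape taken from the text: projects/<project>/instances/<instance>/backups/<backup>
_BACKUP_PATH = re.compile(
    r"^projects/(?P<project>[^/]+)/instances/(?P<instance>[^/]+)/backups/(?P<backup>[^/]+)$"
)


def backup_path(project: str, instance: str, backup: str) -> str:
    """Build the resource path of a backup under its parent instance."""
    return f"projects/{project}/instances/{instance}/backups/{backup}"


def parse_backup_path(path: str) -> Dict[str, str]:
    """Split a backup resource path into its project, instance, and backup IDs."""
    match = _BACKUP_PATH.match(path)
    if match is None:
        raise ValueError(f"not a backup resource path: {path!r}")
    return match.groupdict()


def can_delete_instance(instance_path: str, backup_paths: Iterable[str]) -> bool:
    """An instance can be deleted only if no backup is organized under it."""
    prefix = instance_path + "/backups/"
    return not any(p.startswith(prefix) for p in backup_paths)


if __name__ == "__main__":
    path = backup_path("my-project", "prod", "nightly")
    print(parse_backup_path(path))
    print(can_delete_instance("projects/my-project/instances/prod", [path]))
```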
Backup time and performance
The time it takes to create a backup depends on various factors, but is primarily determined by the size of the database relative to the number of nodes. If you need faster backups, you can increase the number of nodes, but keep in mind that changes to node count take effect only for subsequent backups.
Backup creation also uses idle CPU, so make sure your CPU usage falls within the recommended guidelines. Overloading the CPU could result in very long backup times and could also adversely affect database latency.
The CPU in an instance is shared by all of the ongoing backups in the instance. Creating backups of different databases in the same instance at the same time can result in long backup times.
How restore works
When you restore, you must specify a source backup and a new target database. You cannot restore to an existing database. The new database must be in the same project as the backup and be in an instance with the same instance configuration as the backup. For example, if a backup is in an instance configured
us-west3, it can be restored to any instance in the project that is also configured
us-west3. The node count of the instances does not need to be the same. The restored database will have all the data and schema from the original database at the
version_time of the backup. It will not have any IAM permissions (except for those inherited from the instance containing the restored database), so users should apply appropriate IAM permissions after the restore completes. The restore process is designed for high availability: the database can be restored as long as a majority quorum of the regions and zones in the instance is available.
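The restore preconditions above can be expressed as a simple pre-flight check. This is a sketch under stated assumptions: the `Instance` type and `restore_blockers` function are hypothetical, and the check covers only the rules named in the text (same project, same instance configuration, new target database), not availability or encryption requirements.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Instance:
    """Hypothetical, minimal description of a Cloud Spanner instance."""
    project: str
    name: str
    instance_config: str  # for example "us-west3"
    node_count: int       # deliberately ignored by the check below


def restore_blockers(backup_instance: Instance,
                     target_instance: Instance,
                     target_database: str,
                     existing_databases: Set[str]) -> List[str]:
    """Return the reasons a restore would be rejected (empty list means allowed)."""
    blockers = []
    if target_instance.project != backup_instance.project:
        blockers.append("target instance is in a different project")
    if target_instance.instance_config != backup_instance.instance_config:
        blockers.append("instance configurations differ")
    if target_database in existing_databases:
        # You cannot restore to an existing database.
        blockers.append("target database already exists")
    return blockers


if __name__ == "__main__":
    source = Instance("my-project", "prod", "us-west3", node_count=3)
    target = Instance("my-project", "staging", "us-west3", node_count=1)
    print(restore_blockers(source, target, "orders-restored", {"orders"}))
```

Node count is carried in the model only to show that it plays no part in the decision.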
To restore a CMEK-enabled backup, both the key and key version must be available to Cloud Spanner. The restored database, by default, uses the same encryption config as the backup. You can override this behavior by specifying a different encryption config when restoring the database. For more information, see restore from a CMEK-enabled backup.
A restored database transitions through three states, which are tracked by two long-running operations.
CREATING state: When you initiate a restore, the system creates a new database and a long-running database operation with RestoreDatabaseMetadata to track the progress of the restoration. The new database begins and remains in the CREATING state, which means it is not ready for use, until the restore operation is complete. To provide fast restore times (typically under 10 minutes), the restore operation works by mounting the files in the backup without copying them to the database.
READY_OPTIMIZING state: Once the restore operation completes, the database transitions to the READY_OPTIMIZING state. In this state, it is ready for use, but you may experience slightly higher read latencies while the database reads data from the backup. Any attempt to delete the backup will fail while it is still in use for database restoration or optimization.
When the restore operation completes, you will get another long-running database operation with OptimizeRestoredDatabaseMetadata to track the progress of the optimization. The optimize operation copies the data from the backup to the database. If you would like to speed up the optimize process, you can add more nodes to the instance.
READY state: Once the optimization operation completes, the database transitions to the READY state. At this point, the restored database is fully performant and no longer references the backup.
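The three states and two operations described above form a small state machine, sketched below. The state names and operation metadata names come from the text; the enum, the transition table, and the helper functions are hypothetical and only model the documented behavior.

```python
from enum import Enum


class RestoreState(Enum):
    CREATING = "CREATING"
    READY_OPTIMIZING = "READY_OPTIMIZING"
    READY = "READY"


# For each state: the operation whose completion ends it, and the next state.
_TRANSITIONS = {
    RestoreState.CREATING: ("RestoreDatabaseMetadata", RestoreState.READY_OPTIMIZING),
    RestoreState.READY_OPTIMIZING: ("OptimizeRestoredDatabaseMetadata", RestoreState.READY),
}


def on_operation_done(state: RestoreState, metadata_type: str) -> RestoreState:
    """Advance the restored database when the matching operation completes."""
    expected = _TRANSITIONS.get(state)
    if expected is None or expected[0] != metadata_type:
        raise ValueError(f"{metadata_type} does not complete in state {state.name}")
    return expected[1]


def is_usable(state: RestoreState) -> bool:
    """The database is ready for use in every state except CREATING."""
    return state is not RestoreState.CREATING


def blocks_backup_deletion(state: RestoreState) -> bool:
    """The backup cannot be deleted while restore or optimization still uses it."""
    return state is not RestoreState.READY


if __name__ == "__main__":
    state = RestoreState.CREATING
    for done in ("RestoreDatabaseMetadata", "OptimizeRestoredDatabaseMetadata"):
        state = on_operation_done(state, done)
        print(state.name, is_usable(state), blocks_backup_deletion(state))
```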
An instance can have at most one database in the restore CREATING state. You will not be able to restore another backup to the instance until the restored database leaves the CREATING state.
Billing
You are billed based on the amount of storage used by your backups per unit time. Billing begins once the backup operation is complete and will continue until the backup has been deleted. There is no charge for restoring from a backup.
A completed backup is billed for a minimum of 24 hours. If you create a backup, then delete it a minute after it finishes, you are still billed for 24 hours.
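The 24-hour minimum can be illustrated with a short calculation. This sketch models only the billed duration, not the price: `billed_duration` is a hypothetical helper, and actual charges also depend on storage size and the rates on the pricing page.

```python
from datetime import datetime, timedelta

# From the text: a completed backup is billed for a minimum of 24 hours.
MIN_BILLED = timedelta(hours=24)


def billed_duration(completed_at: datetime, deleted_at: datetime) -> timedelta:
    """Return how long a backup is billed for.

    Billing starts when the backup operation completes and stops when the
    backup is deleted, subject to the 24-hour minimum.
    """
    if deleted_at < completed_at:
        raise ValueError("deleted_at cannot be earlier than completed_at")
    return max(deleted_at - completed_at, MIN_BILLED)


if __name__ == "__main__":
    done = datetime(2024, 1, 1, 12, 0)
    # Deleted one minute after completion: still billed for the full minimum.
    print(billed_duration(done, done + timedelta(minutes=1)))
    # Kept for three days: billed for the actual lifetime.
    print(billed_duration(done, done + timedelta(days=3)))
```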
For more complete information on backup costs, see the Cloud Spanner Pricing page.
Access control (IAM)
IAM lets you control access to Cloud Spanner resources, which include backups and restored databases. If you are new to IAM, roles, and permissions, see IAM Overview for an introduction.
Backup resources are organized under instances in the Cloud Spanner resource hierarchy. We recommend applying IAM policies at the project level or instance level. If you need finer-grained control, IAM policies can also be applied at the backup and database level, but this is not recommended because of the added complexity. Remember that backups do not contain database metadata such as IAM policies, so when you restore a database, the database initially inherits policies from its parent instance.
This section describes the predefined roles that have access to backup and restore.
The following roles are designed specifically for backup and restore:
spanner.backupAdmin: has access to create, view, update, and delete backups. This role can also view and manage a backup's IAM policy. This role cannot restore a database from a backup.
spanner.restoreAdmin: has access to restore databases from backups. If you need to restore a backup to a different instance, apply this role at the project level or to both instances. This role cannot create backups.
spanner.backupWriter: has access to create backups, but cannot update or delete them. This role is intended to be used by scripts that automate backup creation.
The following roles also have access to backup and restore:
spanner.admin: has full access to backup and restore. This role has complete access to all Cloud Spanner resources.
owner: has full access to backup and restore.
editor: has full access to backup and restore.
viewer: has access to view backups, backup operations, and restore operations. This role cannot create, update, delete, or restore a backup.
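The role descriptions above can be summarized as a lookup table. This is a simplified model built only from the statements in this section, not the actual IAM permission lists: the action names (`create`, `view`, `update`, `delete`, `restore`) and the `stated_access` helper are hypothetical, and a result of `None` means the text above does not say either way.

```python
from typing import Dict, Optional, Set

FULL: Set[str] = {"create", "view", "update", "delete", "restore"}

# Actions each role is explicitly described as having.
GRANTED: Dict[str, Set[str]] = {
    "spanner.backupAdmin": {"create", "view", "update", "delete"},
    "spanner.restoreAdmin": {"restore"},
    "spanner.backupWriter": {"create"},
    "spanner.admin": set(FULL),
    "owner": set(FULL),
    "editor": set(FULL),
    "viewer": {"view"},
}

# Actions each role is explicitly described as lacking.
DENIED: Dict[str, Set[str]] = {
    "spanner.backupAdmin": {"restore"},
    "spanner.restoreAdmin": {"create"},
    "spanner.backupWriter": {"update", "delete"},
    "viewer": {"create", "update", "delete", "restore"},
}


def stated_access(role: str, action: str) -> Optional[bool]:
    """Return True or False if the text states it, or None if it is silent."""
    if role not in GRANTED:
        raise KeyError(f"unknown role: {role}")
    if action in GRANTED[role]:
        return True
    if action in DENIED.get(role, set()):
        return False
    return None


if __name__ == "__main__":
    print(stated_access("spanner.backupWriter", "create"))
    print(stated_access("spanner.backupAdmin", "restore"))
    print(stated_access("spanner.restoreAdmin", "view"))
```

Keeping grants and denials separate avoids implying that a role lacks an action merely because this section does not mention it.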
For more information, see Cloud Spanner IAM.