Cloud Spanner backup and restore features let you create backups of Cloud Spanner databases on demand and restore them, providing protection against operator and application errors that can result in logical data corruption. Backups are highly available and encrypted, and can be retained for up to a year from the time they are created. If you need longer retention times, we recommend exporting your database.
You can perform backup and restore in the following ways:
- In the Google Cloud console
- Using the gcloud command-line tool
- Using the client libraries
- Using the REST or RPC APIs
For logical data corruption, Cloud Spanner also offers point-in-time recovery.
Data consistency: A backup is a transactionally and externally consistent copy of a Cloud Spanner database at the version_time of the backup.
Replication: Backups reside in the same instance as their source database and are replicated in the same geographic locations. For regional instances, a copy of the backup is stored in each of the three read-write zones. For multi-regional instances, a copy is stored in all zones that contain either a read-write or read-only replica.
Automatic expiration: Every backup has a user-specified expiration date that determines when it is automatically deleted. Cloud Spanner deletes expired backups asynchronously, so there can be a lag between when a backup expires and when it is actually deleted.
Choose between backup and restore or import and export
| ||Backup and Restore||Import and Export|
|Data consistency||Both backups and exported databases are transactionally and externally consistent.|
|Performance impact||Backups have no impact on an instance's performance. Cloud Spanner performs backups using dedicated jobs that do not draw upon an instance's server resources.||Export runs as a medium-priority task to minimize impact on database performance. For more information, see task priority.|
|Storage format||Uses a proprietary, encrypted format designed for fast restore.||Supports both CSV and Avro file formats.|
|Portability||Backups reside in the same instance as their source database and cannot be moved. You can restore a database to any instance in the project with the same instance configuration as the backup.||Exported databases reside in Google Cloud Storage and the data can be migrated to any system that supports CSV or Avro.|
|Retention||Backups can be retained for up to 1 year.||Exported databases are stored in Cloud Storage where, by default, they are retained until they are deleted. You can customize lifecycle and retention policies.|
|Billing||Backups are billed to your Cloud Spanner project based on the storage used per unit time. For more details, see the Billing section.||Billing for import and export is more complicated due to its use of Google Cloud Storage and Dataflow. For more information, see Database export and import pricing.|
|Restore time||Restore happens in two operations: restore and optimize. The restore operation offers fast time-to-first-byte because the database directly mounts the backup without copying the data. After the restore operation completes, the database is ready for use, though read latency might be slightly higher while it is optimizing. For more information, see How restore works.||Import is slower. You need to wait for all the data to be written into the database.|
How backup works
Users can create a backup of any Cloud Spanner database. These backups are complete, in the sense that they contain all of the data in the database (including the schema and secondary indexes) at the version_time of the backup. Any modifications to the data or schema after the version_time will not be included in the backup.
Backups include all database options that are set with the ALTER DATABASE SET OPTIONS command, but do not include Identity and Access Management (IAM) policies.
Backups also include the schema of a database's change streams, but not any existing change records. Change stream data is meant to be streamed out and consumed near-simultaneously with the changes it describes. As such, Spanner excludes this data from backups.
Cloud Spanner backups, like databases, are encrypted by either Google-managed or customer-managed (CMEK) encryption. By default, a backup uses the same encryption config as its database, but you can override this behavior by specifying a different encryption config when creating the backup. If the backup is CMEK-enabled, it is encrypted using the primary version of the KMS key at the time of backup creation. Once the backup is created, its key and key version cannot be modified, even if the KMS key is rotated. For more information, see create a CMEK-enabled backup.
When you create a backup, you must specify a source database, a name for the backup resource, and an expiration date (up to 1 year from backup creation time). You can also optionally specify a version_time, which lets you back up your database at an earlier point in time. The version_time field is typically used to either synchronize the backups of multiple databases or recover data using point-in-time recovery. If version_time is not specified, then it is set to the create_time of the backup. The system creates a backup resource and a long-running backup operation to track the progress of the backup.
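The parameter rules above can be illustrated with a minimal sketch. It is not the Cloud Spanner API: `BackupRequest` and `plan_backup` are hypothetical names, and "up to 1 year" is approximated here as 365 days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: "up to 1 year" is modeled as 365 days for illustration.
MAX_RETENTION = timedelta(days=365)


@dataclass(frozen=True)
class BackupRequest:
    """Hypothetical container for the fields the text says a backup needs."""
    database: str
    backup_id: str
    create_time: datetime
    expire_time: datetime
    version_time: datetime


def plan_backup(database: str, backup_id: str, create_time: datetime,
                retention: timedelta,
                version_time: Optional[datetime] = None) -> BackupRequest:
    """Validate the inputs and fill in the defaults described in the text."""
    if retention <= timedelta(0) or retention > MAX_RETENTION:
        raise ValueError("retention must be positive and at most 1 year")
    if version_time is None:
        # An unspecified version_time defaults to the backup's create_time.
        version_time = create_time
    if version_time > create_time:
        # version_time selects an earlier point in time, never a later one.
        raise ValueError("version_time cannot be later than create_time")
    return BackupRequest(database, backup_id, create_time,
                         create_time + retention, version_time)
```

For example, `plan_backup("db", "b1", now, timedelta(days=30))` yields a request whose `version_time` equals `now` and whose `expire_time` is 30 days later.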
To ensure external consistency of the backup, Cloud Spanner pins the contents of the database at create time. This prevents the garbage collection system from removing the relevant data values for the duration of the backup operation. Then, every read/write and read-only zone in the instance begins copying the data in parallel. If any zone is temporarily unavailable, the backup is not complete until the zone comes back online and finishes. Backups are restorable as soon as the operation is done. For multi-region instances, all read/write and read-only zones in all regions must complete their backup replicas before the backup is marked as restorable.
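The completion rule described above (every zone must finish its copy before the backup is restorable) can be modeled in a few lines. This is only a sketch: the function names and the zone names used in the example are hypothetical, and the real service tracks this internally.

```python
def backup_is_restorable(zone_copy_done: dict) -> bool:
    """zone_copy_done maps a zone name to whether that zone finished its copy.

    The backup is restorable only when every participating zone is done.
    """
    return bool(zone_copy_done) and all(zone_copy_done.values())


def pending_zones(zone_copy_done: dict) -> list:
    """Return the zones still copying, which are what hold the backup open."""
    return sorted(zone for zone, done in zone_copy_done.items() if not done)
```

With `{"zone-a": True, "zone-b": False, "zone-c": True}`, the backup is not yet restorable and `pending_zones` reports `["zone-b"]`.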
Backups are resources in Cloud Spanner. Each backup resource is organized under the same instance as its source database in the resource hierarchy and has a resource path in the form
projects/<project>/instances/<instance>/backups/<backup>. A backup continues to exist even after its source database has been deleted, but cannot outlive its parent instance. To prevent accidental deletion of backups, you cannot delete a Cloud Spanner instance if there are backups. For users who want to delete the instance, we recommend restoring the backup and then exporting the restored database, before deleting the backup and the instance.
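As a small illustration of the resource path format, the following sketch builds and parses backup paths and models the rule that an instance with backups cannot be deleted. The helper names are hypothetical and not part of any client library.

```python
import re

# Matches the path form given in the text:
# projects/<project>/instances/<instance>/backups/<backup>
_BACKUP_PATH = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/instances/(?P<instance>[^/]+)"
    r"/backups/(?P<backup>[^/]+)$"
)


def backup_path(project: str, instance: str, backup: str) -> str:
    """Build the resource path of a backup under its parent instance."""
    return f"projects/{project}/instances/{instance}/backups/{backup}"


def parse_backup_path(path: str) -> dict:
    """Split a backup resource path into its project, instance, and backup IDs."""
    match = _BACKUP_PATH.match(path)
    if match is None:
        raise ValueError(f"not a backup resource path: {path!r}")
    return match.groupdict()


def can_delete_instance(backup_ids: list) -> bool:
    """An instance can be deleted only when it holds no backups."""
    return len(backup_ids) == 0
```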
Backup time and performance
When performing a backup, Cloud Spanner creates a backup job to copy data directly from the database to backup storage, and sizes this job based on the size of the database. This backup job does not use CPU resources allocated to the database's instance, so it does not affect the instance's performance. Moreover, compute load on the database's instance does not affect the speed of the backup operation.
To track progress and completion of a backup operation, see Show backup progress.
If a backup is taking longer than usual when no other factors have changed, it might be due to a delay in scheduling the backup task in a zone. This can sometimes take up to 30 minutes. We recommend that you do not cancel and restart the backup, as it's likely you'll encounter the same scheduling delay with the new backup as well.
How restore works
When you restore a Cloud Spanner database, you must specify a source backup and a new target database. You cannot restore to an existing database. The new database must be in the same project as the backup and be in an instance with the same instance configuration as the backup. For example, if a backup is in an instance configured us-west3, it can be restored to any instance in the project that is also configured us-west3. The compute capacity of the instances does not need to be the same.
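The restore preconditions above can be expressed as a simple pre-flight check. This is a sketch under stated assumptions: the `Instance` type and `restore_blockers` function are hypothetical, and the service itself enforces these rules when you issue a restore.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """Hypothetical, minimal description of a Cloud Spanner instance."""
    project: str
    name: str
    config: str                       # instance configuration, e.g. "us-west3"
    databases: frozenset = frozenset()  # IDs of databases that already exist


def restore_blockers(backup_instance: Instance, target_instance: Instance,
                     new_database_id: str) -> list:
    """Return the reasons a restore would be rejected (empty if none)."""
    problems = []
    if target_instance.project != backup_instance.project:
        problems.append("target instance is in a different project")
    if target_instance.config != backup_instance.config:
        problems.append("target instance has a different instance configuration")
    if new_database_id in target_instance.databases:
        problems.append("target database already exists")
    # Compute capacity is deliberately not compared: it does not need to match.
    return problems
```

A restore from an instance configured us-west3 into another us-west3 instance in the same project returns an empty list, while a target with a different configuration returns one blocker.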
The restored database will have all the data and schema from the original database at the create_time of the backup, including all database options that are set with the ALTER DATABASE SET OPTIONS command, and all change stream configurations. It will not have any IAM permissions (except for those inherited from the instance containing the restored database), and you must apply appropriate IAM permissions after the restore completes. It will not include the internal data of any change streams.
The restore process is designed for high availability. The database can be restored provided that a majority quorum of the regions and zones in the instance is available.
To restore a CMEK-enabled backup, both the key and key version must be available to Cloud Spanner. The restored database, by default, uses the same encryption config as the backup. You can override this behavior by specifying a different encryption config when restoring the database. For more information, see restore from a CMEK-enabled backup.
CREATING: Cloud Spanner begins the restoration by creating a new database and mounting files from the backup. This typically takes ten minutes or less to complete. During this initial CREATING state, the restored database is not yet ready for use.
Please note the following caveats regarding the CREATING state:
- If you are restoring to a different instance, the restore operation belongs to the instance containing the restored database, not the instance containing the backup.
- Cloud Spanner will not allow you to delete the backup while it is being restored. You can delete it after the restore completes and the database enters the READY state.
- An instance can have at most one database in the CREATING state due to a restoration from backup. You will not be able to restore another backup to the instance until the restored database transitions to the READY state, described below.
READY_OPTIMIZING: After Cloud Spanner mounts the backup, it starts to copy the backup's data into the new database while optimizing its stored size. Your database is ready for use during this process. Depending on the amount of data involved, this phase of the restore might take days to complete.
While you can use your database as usual during
READY_OPTIMIZING, the following caveats apply:
- Read latencies might be slightly higher than usual.
- Storage metrics display the size of the new database, not the backup. Therefore, with the data transfer still in progress, Cloud Spanner storage metrics might show results that do not reflect the total size of all your data.
- As with the CREATING state, Cloud Spanner will not allow you to delete the mounted backup.
READY: Once the copy-and-optimize operation completes, the database transitions to the READY state. The database is fully restored, and no longer references or requires the backup.
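The three restore states can be summarized as a small state machine. The sketch below is illustrative only: the state names come from the text, but the enum and helper functions are hypothetical and do not mirror a client-library type.

```python
from enum import Enum
from typing import Optional


class RestoreState(Enum):
    CREATING = "CREATING"
    READY_OPTIMIZING = "READY_OPTIMIZING"
    READY = "READY"


# Restored databases move through the states in this fixed order.
_NEXT = {
    RestoreState.CREATING: RestoreState.READY_OPTIMIZING,
    RestoreState.READY_OPTIMIZING: RestoreState.READY,
}


def next_state(state: RestoreState) -> Optional[RestoreState]:
    """Return the state that follows, or None once the database is READY."""
    return _NEXT.get(state)


def database_usable(state: RestoreState) -> bool:
    """The database can serve traffic in every state except CREATING."""
    return state is not RestoreState.CREATING


def backup_deletable(state: RestoreState) -> bool:
    """The source backup stays mounted, and undeletable, until READY."""
    return state is RestoreState.READY
```

Under this model, a database in READY_OPTIMIZING is usable, but its backup cannot be deleted yet.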
You are billed based on the amount of storage used by your backups per unit time. Billing begins once the backup operation is complete and will continue until the backup has been deleted. There is no charge for restoring from a backup.
A completed backup is billed for a minimum of 24 hours. If you create a backup, then delete it a minute after it finishes, you are still billed for 24 hours.
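The 24-hour minimum can be made concrete with a short calculation. This sketch only computes the billed duration, not a price, and `billed_duration` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

MIN_BILLED = timedelta(hours=24)  # a completed backup is billed for at least this long


def billed_duration(completed_at: datetime, deleted_at: datetime) -> timedelta:
    """Billing runs from backup completion to deletion, with a 24-hour floor."""
    if deleted_at < completed_at:
        raise ValueError("a backup cannot be deleted before it completes")
    return max(deleted_at - completed_at, MIN_BILLED)
```

A backup deleted one minute after it completes is still billed for 24 hours, while one kept for three days is billed for the full three days.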
For more complete information on backup costs, see the Cloud Spanner Pricing page.
Access control with IAM
IAM lets you control access to Cloud Spanner resources, which include backups and restored databases. If you are new to IAM, roles, and permissions, see IAM Overview for an introduction.
Backup resources are organized under instances in the Cloud Spanner resource hierarchy. We recommend applying IAM policies at the project level or instance level. If you need finer-grained control, you can also apply IAM policies at the backup and database level, but this is not recommended because of the added complexity. Remember that backups do not contain database metadata such as IAM policies, so when you restore a database, the database initially inherits policies from its parent instance.
This section describes the predefined roles that have access to backup and restore.
The following roles are designed specifically for backup and restore:
spanner.backupAdmin: has access to create, view, update, and delete backups. This role can also view and manage a backup's IAM policy. This role cannot restore a database from a backup.
spanner.restoreAdmin: has access to restore databases from backups. If you need to restore a backup to a different instance, apply this role at the project level or to both instances. This role cannot create backups.
spanner.backupWriter: has access to create backups, but cannot update or delete them. This role is intended to be used by scripts that automate backup creation.
The following roles also have access to backup and restore:
spanner.admin: has full access to backup and restore. This role has complete access to all Cloud Spanner resources.
owner: has full access to backup and restore
editor: has full access to backup and restore
viewer: has access to view backups, backup operations, and restore operations. This role cannot create, update, delete, or restore a backup.
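The role descriptions above can be condensed into a lookup table. The sketch below encodes only the capabilities stated in this section, using coarse action names chosen for illustration; the actual predefined roles are defined by detailed IAM permission lists and may grant more than is shown here.

```python
# Coarse, illustrative action names; these are not IAM permission strings.
ALL_ACTIONS = frozenset({"create", "view", "update", "delete", "restore"})

# Assumption: each entry lists only what this section explicitly states.
ROLE_ACTIONS = {
    "spanner.backupAdmin": frozenset({"create", "view", "update", "delete"}),
    "spanner.restoreAdmin": frozenset({"restore"}),
    "spanner.backupWriter": frozenset({"create"}),
    "spanner.admin": ALL_ACTIONS,
    "owner": ALL_ACTIONS,
    "editor": ALL_ACTIONS,
    "viewer": frozenset({"view"}),
}


def is_allowed(role: str, action: str) -> bool:
    """Check whether the summarized role description covers an action."""
    return action in ROLE_ACTIONS.get(role, frozenset())


def roles_allowing(action: str) -> list:
    """List, in sorted order, the roles whose description covers an action."""
    return sorted(role for role, actions in ROLE_ACTIONS.items()
                  if action in actions)
```

For instance, `is_allowed("spanner.backupAdmin", "restore")` is `False`, which matches the note that the backup administrator role cannot restore a database.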
For more information, see Cloud Spanner IAM.