Backup and Restore

Overview

Cloud Spanner Backup and Restore lets you create backups of Cloud Spanner databases on demand and restore them, which protects against operator and application errors that result in logical data corruption. Backups are highly available, encrypted, and can be retained for up to a year from the time they are created. If you need longer retention times, we recommend exporting your database.

You can use Backup and Restore in the following ways:

Key features

  • Data consistency: Backups are a transactionally and externally consistent copy of a Cloud Spanner database at the create_time of the backup.

  • Replication: Backups reside in the same instance as their source database and are replicated in the same geographic locations. For regional instances, a copy of the backup is stored in each of the three read-write zones. For multi-regional instances, a copy is stored in all zones that contain either a read-write or read-only replica.

  • Automatic expiration: Every backup has a user-specified expiration date that determines when the backup is automatically deleted.

Choose between Backup and Restore or Import and Export

Cloud Spanner Import and Export serves use cases similar to those of Backup and Restore. The following comparison describes the similarities and differences between them to help you decide which one to use.

  • Data consistency
      Both: Backups and exported databases are transactionally and externally consistent.

  • Performance impact
      Both: Run at low priority to minimize impact on database performance. Export uses low-priority-user CPU whereas backup uses low-priority-system CPU. For more information, see task priority.

  • Storage format
      Backup and Restore: Uses a proprietary, encrypted format designed for fast restore.
      Import and Export: Supports both CSV and Avro file formats.

  • Portability
      Backup and Restore: Backups reside in the same instance as their source database and cannot be moved. You can restore a database to any instance in the project with the same instance configuration as the backup.
      Import and Export: Exported databases reside in Google Cloud Storage and the data can be migrated to any system that supports CSV or Avro.

  • Retention
      Backup and Restore: Backups can be retained for up to 1 year.
      Import and Export: Exported databases are stored in Cloud Storage where, by default, they are retained until they are deleted. You can customize lifecycle and retention policies.

  • Billing
      Backup and Restore: Backups are billed to your Cloud Spanner project based on the storage used per unit time. For more details, see the Billing section.
      Import and Export: Billing for import and export is more complicated due to its use of Google Cloud Storage and Dataflow. For more information, see Database export and import pricing.

  • Restore time
      Backup and Restore: Restore happens in two operations: restore and optimize. The restore operation offers fast time-to-first-byte because the database directly mounts the backup without copying the data. After the restore operation completes, the database is ready for use, though read latency might be slightly higher while it is optimizing. For more information, see How restore works.
      Import and Export: Import is slower. You need to wait for all the data to be written into the database.

How backup works

Contents

Users can create a backup of any Cloud Spanner database. These backups are complete, in the sense that they contain all of the data in the database (including the schema and secondary indexes) at the create_time of the backup. Any modifications to the data or schema after backup creation has started will not be included in the backup. Backups do not contain database metadata such as Cloud Identity and Access Management (Cloud IAM) policies.

Creation process

When you create a backup, you must specify a source database, a name for the backup resource, and an expiration date (up to 1 year from backup creation time). The system creates a backup resource and a long-running backup operation to track the progress of the backup.
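
The three required inputs and the retention limit can be checked before a request is sent. The following is a minimal sketch, not the Cloud Spanner client API: the function names and the request dictionary shape are hypothetical, and "1 year" is modeled as 365 days, which may differ from the exact limit the service enforces.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "up to 1 year" is modeled as 365 days; the service's exact limit may differ.
MAX_RETENTION = timedelta(days=365)


def check_expire_time(create_time: datetime, expire_time: datetime) -> None:
    """Raise ValueError if expire_time falls outside the allowed retention window."""
    if expire_time <= create_time:
        raise ValueError("expire_time must be later than create_time")
    if expire_time - create_time > MAX_RETENTION:
        raise ValueError("expire_time must be within 1 year of create_time")


def build_backup_request(database: str, backup_id: str,
                         create_time: datetime, expire_time: datetime) -> dict:
    """Collect the three required inputs into a plain dict (hypothetical shape)."""
    check_expire_time(create_time, expire_time)
    return {
        "database": database,                      # source database
        "backup_id": backup_id,                    # name for the backup resource
        "expire_time": expire_time.isoformat(),    # user-specified expiration date
    }


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(build_backup_request("example-db", "example-backup", now, now + timedelta(days=30)))
```

Validating locally only gives earlier feedback; the service remains the authority on what it accepts.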

To ensure external consistency of the backup, Cloud Spanner pins the contents of the database at create time. This prevents the garbage collection system from removing the relevant data values for the duration of the backup operation. Then, every zone in the instance begins copying the data in parallel. If a zone is temporarily unavailable, the backup is not complete until the zone comes back online and finishes. Backups are restorable as soon as the operation is done.

Resource hierarchy

Backups are resources in Cloud Spanner. Each backup resource is organized under the same instance as its source database in the resource hierarchy and has a resource path in the form projects/<project>/instances/<instance>/backups/<backup>. A backup continues to exist even after its source database has been deleted, but cannot outlive its parent instance. To prevent accidental deletion of backups, you cannot delete a Cloud Spanner instance that still contains backups. If you want to delete the instance, we recommend restoring the backup, exporting the restored database, and then deleting the backup and the instance.
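
As an illustration of the resource path format, the sketch below builds and parses backup paths with the Python standard library. The helper names are hypothetical, and the assumption that an ID is any non-empty string without a slash is a simplification; the service applies its own naming rules.

```python
import re

# Matches projects/<project>/instances/<instance>/backups/<backup>.
# Assumption: each ID is any non-empty run of characters other than "/".
_BACKUP_PATH = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/instances/(?P<instance>[^/]+)"
    r"/backups/(?P<backup>[^/]+)$"
)


def backup_path(project: str, instance: str, backup: str) -> str:
    """Build the resource path of a backup."""
    return f"projects/{project}/instances/{instance}/backups/{backup}"


def parse_backup_path(path: str) -> dict:
    """Split a backup resource path into its project, instance, and backup IDs."""
    match = _BACKUP_PATH.match(path)
    if match is None:
        raise ValueError(f"not a backup resource path: {path!r}")
    return match.groupdict()


def instance_is_deletable(backup_ids: list) -> bool:
    """An instance cannot be deleted while it still contains backups."""
    return len(backup_ids) == 0


if __name__ == "__main__":
    path = backup_path("example-project", "example-instance", "example-backup")
    print(path)
    print(parse_backup_path(path))
```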

Backup time and performance

The amount of time it takes to create a backup depends on various factors, but is primarily determined by the size of the database relative to the number of nodes. If you need faster backup times, you can increase the number of nodes, but keep in mind that changes to node count take effect only for subsequent backups.

Backup creation also uses idle CPU, so make sure your CPU usage falls within the recommended guidelines. Overloading the CPU could result in very long backup times and could also adversely affect database latency.

The CPU in an instance is shared by all of the ongoing backups in the instance. Creating backups of different databases in the same instance at the same time can result in long backup times.

How restore works

When you restore, you must specify a source backup and a new target database. You cannot restore to an existing database. The new database must be in the same project as the backup and in an instance with the same instance configuration as the backup. For example, if a backup is in an instance configured us-west3, it can be restored to any instance in the project that is also configured us-west3. The node counts of the two instances do not need to match.

The restored database has all the data and schema from the original database at the create_time of the backup. It does not have any Cloud IAM permissions (except for those inherited from the instance containing the restored database), so you should apply appropriate Cloud IAM permissions after the restore completes.

The restore process is designed for high availability: the database can be restored as long as a majority quorum of the regions and zones in the instance is available.
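
The placement rules for a restore target can be expressed as a small check. This is a sketch under stated assumptions: the Instance record and the can_restore function are hypothetical illustrations, not part of any Cloud Spanner library, and the instance names are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """Hypothetical record of the instance attributes that matter for restore."""
    project: str
    name: str
    config: str       # instance configuration, for example "us-west3"
    node_count: int   # not considered by the restore rules


def can_restore(backup_instance: Instance, target_instance: Instance,
                target_database_exists: bool) -> bool:
    """Return True if a backup can be restored into the target instance."""
    if target_database_exists:
        # Restoring to an existing database is not allowed.
        return False
    return (
        backup_instance.project == target_instance.project
        and backup_instance.config == target_instance.config
    )


if __name__ == "__main__":
    source = Instance("example-project", "prod", "us-west3", node_count=5)
    target = Instance("example-project", "staging", "us-west3", node_count=1)
    print(can_restore(source, target, target_database_exists=False))
```

Node count is carried in the record only to show that it plays no part in the decision.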

A restored database transitions through three states, which are tracked by two long-running operations.

  • CREATING state: When you initiate a restore, the system creates a new database and long-running database operation with RestoreDatabaseMetadata to track the progress of the restoration. The new database begins and remains in the CREATING state, which means it is not ready for use, until the restore operation is complete. To provide fast restore times (typically under 10 minutes), the restore operation works by mounting the files in the backup without copying them to the database.

  • READY_OPTIMIZING state: Once the restore operation completes, the database transitions to the READY_OPTIMIZING state. In this state, it is ready for use, but you may experience slightly higher read latencies while the database reads data from the backup. Any attempt to delete the backup will fail while it is still in use for database restoration or optimization.

    When the restore operation completes, you will get another long-running database operation with OptimizeRestoredDatabaseMetadata to track the progress of the optimization. The optimize operation copies the data from the backup to the database. If you would like to speed up the optimize process, you can add more nodes to the instance.

  • READY state: Once the optimization operation completes, the database transitions to the READY state. At this point, the restored database is fully performant and no longer references the backup.
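
The state transitions above can be summarized as a small state machine. The sketch below is illustrative only: the state and metadata names come from the text, while the State enum, the lookup tables, and the helper functions are hypothetical and do not call any Cloud Spanner API.

```python
from enum import Enum


class State(Enum):
    CREATING = "CREATING"
    READY_OPTIMIZING = "READY_OPTIMIZING"
    READY = "READY"


# Metadata type of the long-running operation that is active in each state.
TRACKED_BY = {
    State.CREATING: "RestoreDatabaseMetadata",
    State.READY_OPTIMIZING: "OptimizeRestoredDatabaseMetadata",
}

# The only allowed transitions; READY is terminal.
NEXT_STATE = {
    State.CREATING: State.READY_OPTIMIZING,
    State.READY_OPTIMIZING: State.READY,
}


def advance(state: State) -> State:
    """Return the state that follows once the current operation completes."""
    if state not in NEXT_STATE:
        raise ValueError(f"{state.value} is a terminal state")
    return NEXT_STATE[state]


def is_usable(state: State) -> bool:
    """The database is ready for use in every state except CREATING."""
    return state is not State.CREATING


def backup_in_use(state: State) -> bool:
    """The backup is referenced (and cannot be deleted) until READY."""
    return state is not State.READY


if __name__ == "__main__":
    state = State.CREATING
    while True:
        print(state.value, "usable:", is_usable(state), "backup in use:", backup_in_use(state))
        if state is State.READY:
            break
        state = advance(state)
```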

Billing

You are billed based on the amount of storage used by your backups per unit time. Billing begins once the backup operation is complete and will continue until the backup has been deleted. There is no charge for restoring from a backup.

A completed backup is billed for a minimum of 24 hours. If you create a backup, then delete it a minute after it finishes, you are still billed for 24 hours.
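
The 24-hour minimum can be illustrated with a short calculation. This sketch only computes the billed duration; the function name is hypothetical, and storage prices are deliberately left out because they are not stated here.

```python
from datetime import datetime, timedelta

# Minimum billed duration for a completed backup, as described above.
MIN_BILLED = timedelta(hours=24)


def billed_duration(completed_at: datetime, deleted_at: datetime) -> timedelta:
    """Return the duration a backup is billed for.

    Billing starts when the backup operation completes and ends when the
    backup is deleted, subject to the 24-hour minimum.
    """
    if deleted_at < completed_at:
        raise ValueError("deleted_at must not be earlier than completed_at")
    return max(deleted_at - completed_at, MIN_BILLED)


if __name__ == "__main__":
    done = datetime(2024, 1, 1, 12, 0)
    # Deleted one minute after completion: still billed for the minimum.
    print(billed_duration(done, done + timedelta(minutes=1)))
    # Kept for three days: billed for the full three days.
    print(billed_duration(done, done + timedelta(days=3)))
```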

For more complete information on backup costs, see the Cloud Spanner Pricing page.

Access control (Cloud IAM)

Cloud IAM lets you control access to Cloud Spanner resources, which include backups and restored databases. If you are new to Cloud IAM, roles, and permissions, see Cloud IAM Overview for an introduction.

Backup resources are organized under instances in the Cloud Spanner resource hierarchy. We recommend applying Cloud IAM policies at the project level or instance level. If you need finer-grained control, Cloud IAM policies can also be applied at the backup and database level, but this is not recommended because of the added complexity. Remember that backups do not contain database metadata such as Cloud IAM policies, so when you restore a database, the database initially inherits policies from its parent instance.

This section describes the predefined roles that have access to backup and restore.

The following roles are designed specifically for backup and restore:

  • spanner.backupAdmin: has access to create, view, update, and delete backups. This role can also view and manage a backup's Cloud IAM policy. This role cannot restore a database from a backup.
  • spanner.restoreAdmin: has access to restore databases from backups. If you need to restore a backup to a different instance, apply this role at the project level or to both instances. This role cannot create backups.
  • spanner.backupWriter: has access to create backups, but cannot update or delete them. This role is intended to be used by scripts that automate backup creation.

The following roles also have access to backup and restore:

  • spanner.admin: has full access to backup and restore. This role has complete access to all Cloud Spanner resources.
  • owner: has full access to backup and restore.
  • editor: has full access to backup and restore.
  • viewer: has access to view backups, backup operations, and restore operations. This role cannot create, update, delete, or restore a backup.
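
The role descriptions above can be condensed into a lookup table. The sketch below is a simplification, not the IAM permission model: it records only the capabilities stated explicitly in the two lists, the action names ("create", "view", "update", "delete", "restore") are hypothetical labels, and the real roles are defined by finer-grained permissions.

```python
# Coarse actions used in this sketch (hypothetical labels, not IAM permission names).
FULL_ACCESS = frozenset({"create", "view", "update", "delete", "restore"})

# Only capabilities stated explicitly in the role descriptions are listed.
ROLE_ACTIONS = {
    "spanner.backupAdmin": frozenset({"create", "view", "update", "delete"}),
    "spanner.restoreAdmin": frozenset({"restore"}),
    "spanner.backupWriter": frozenset({"create"}),
    "spanner.admin": FULL_ACCESS,
    "owner": FULL_ACCESS,
    "editor": FULL_ACCESS,
    "viewer": frozenset({"view"}),
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if the role description grants the given coarse action."""
    return action in ROLE_ACTIONS.get(role, frozenset())


if __name__ == "__main__":
    for role in ROLE_ACTIONS:
        print(role, sorted(ROLE_ACTIONS[role]))
```

A table like this can help when deciding which role to grant to an automation script, for example confirming that spanner.backupWriter can create backups but cannot delete them.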

For more information, see Cloud Spanner IAM.