About disk snapshots


Snapshots incrementally back up data from your persistent disks. After you create a snapshot to capture the current state of the disk, you can use it to restore that data to a new disk. Compute Engine stores multiple copies of each snapshot across multiple locations with automatic checksums to ensure the integrity of your data.

You can create snapshots from disks even while they are attached to running virtual machine (VM) instances. The lifecycle of a snapshot created from a disk attached to a running VM instance is independent of the lifecycle of the VM instance.

Note that snapshots are different from custom images and machine images, which are useful for creating instance boot disks. To learn more, see the table comparing the use of images, snapshots, and instance templates.

Working with snapshots

  • To learn how to back up disks with snapshots, see Creating snapshots. You can create a snapshot of your disk before you attempt a potentially dangerous operation, so that you can revert the change in case your results are unexpected.

  • To learn how to restore the contents of a snapshot to a new disk, see Restoring snapshots.

  • If you no longer need a specific snapshot, you can reduce storage costs by deleting the snapshot.

  • To reduce the risk of unexpected data loss, consider the best practice of setting up a snapshot schedule to ensure your data is regularly backed up.

Accessing snapshots

Restrictions

  • You cannot change the storage location of an existing snapshot. See Selecting the storage location for a snapshot.

  • You can snapshot your disks at most once every 10 minutes. If you want to issue a burst of requests to snapshot your disks, you can issue at most 6 requests in 60 minutes. For more information, see Snapshot frequency limits.

  • You cannot edit the data stored in a snapshot.

  • You cannot recover deleted snapshots.
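The burst limit above (at most 6 requests in 60 minutes) can be illustrated with a sliding-window check. This is a minimal client-side sketch, not how Compute Engine enforces the limit; the class name `SnapshotRequestTracker` and its `allow` method are hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta

MAX_REQUESTS = 6                # burst limit stated in the restriction above
WINDOW = timedelta(minutes=60)  # window stated in the restriction above


class SnapshotRequestTracker:
    """Hypothetical client-side tracker for one disk's snapshot requests."""

    def __init__(self):
        self._times = deque()

    def allow(self, now: datetime) -> bool:
        # Forget requests that have aged out of the 60-minute window.
        while self._times and now - self._times[0] >= WINDOW:
            self._times.popleft()
        if len(self._times) >= MAX_REQUESTS:
            return False
        self._times.append(now)
        return True
```

A caller would check `tracker.allow(datetime.now())` before issuing a snapshot request and back off when it returns `False`.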

How incremental snapshots work

Snapshots are incremental and automatically compressed, so you can create regular snapshots on a persistent disk faster and at a lower cost than if you regularly created a full image of the disk.

Incremental snapshots work as follows:

  • The first successful snapshot of a persistent disk is a full snapshot that contains all the data on the persistent disk.
  • The second snapshot only contains any new data or modified data since the first snapshot. Data that hasn't changed since snapshot 1 isn't included. Instead, snapshot 2 contains references to snapshot 1 for any unchanged data.
  • Snapshot 3 contains any new or changed data since snapshot 2 but won't contain any unchanged data from snapshot 1 or 2. Instead, snapshot 3 contains references to blocks in snapshot 1 and snapshot 2 for any unchanged data.

This repeats for all subsequent snapshots of the persistent disk. Snapshots are always created based on the last successful snapshot taken.
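The behavior described above can be modeled in a few lines. The following is a simplified sketch, assuming a disk is a mapping from block number to contents; the functions `take_snapshot` and `restore` and the `data`/`refs` layout are illustrative only and do not reflect Compute Engine's internal storage format.

```python
def restore(chain, index):
    """Rebuild the disk contents captured by chain[index]."""
    snapshot = chain[index]
    blocks = dict(snapshot["data"])
    # Unchanged blocks are fetched from the earlier snapshot that stores them.
    for block, owner in snapshot["refs"].items():
        blocks[block] = chain[owner]["data"][block]
    return blocks


def take_snapshot(disk, chain):
    """Create a snapshot of `disk` based on the last snapshot in `chain`."""
    if not chain:
        # The first snapshot is a full snapshot.
        return {"data": dict(disk), "refs": {}}
    last = len(chain) - 1
    previous = restore(chain, last)
    data, refs = {}, {}
    for block, contents in disk.items():
        if block in previous and previous[block] == contents:
            # Unchanged: reference whichever snapshot already holds the block.
            if block in chain[last]["data"]:
                refs[block] = last
            else:
                refs[block] = chain[last]["refs"][block]
        else:
            # New or modified: store the block in this snapshot.
            data[block] = contents
    return {"data": data, "refs": refs}
```

In this model, the size of a snapshot corresponds to the number of entries in its `data` mapping, which is why later snapshots of a mostly unchanged disk stay small.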

How to create a snapshot

Snapshot deletion

Compute Engine uses incremental snapshots so that each snapshot contains only the data that has changed since the previous snapshot. For unchanged data, snapshots reference the data in previous snapshots. You are charged only for the total size of each snapshot.

When you delete a snapshot, Compute Engine immediately marks the snapshot as DELETED in the system. If the snapshot has no dependent snapshots, it is deleted outright. However, if the snapshot does have dependent snapshots:

  1. Any data that is required for restoring other snapshots is moved into the next snapshot, increasing its size.
  2. Any data that is not required for restoring other snapshots is deleted. This lowers the total size of all your snapshots.
  3. The next snapshot no longer references the snapshot marked for deletion, and instead references the snapshot before it.

Because subsequent snapshots might require information stored in a previous snapshot, keep in mind that deleting a snapshot does not necessarily delete all the data on the snapshot. To definitively delete data from your snapshots, you should delete all snapshots.
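The three deletion steps above can be sketched with a simplified model. It assumes each snapshot is a mapping of the blocks it stores, that a snapshot is restored by layering all snapshots up to it in chain order, and that blocks are never removed from the disk. The function names are hypothetical and the model is not Compute Engine's actual implementation.

```python
def restore(chain, index):
    """Rebuild the disk contents captured by chain[index]."""
    blocks = {}
    for snapshot in chain[: index + 1]:
        blocks.update(snapshot)  # later snapshots override earlier blocks
    return blocks


def delete_snapshot(chain, index):
    """Return a new chain without chain[index], preserving later restores."""
    remaining = [dict(s) for i, s in enumerate(chain) if i != index]
    if index < len(chain) - 1:
        # The snapshot that followed the deleted one now sits at `index`,
        # directly after the snapshot that preceded the deleted one (step 3).
        next_snapshot = remaining[index]
        for block, contents in chain[index].items():
            # Step 1: data the next snapshot still needs moves into it.
            # Step 2: data the next snapshot already overrides is dropped.
            next_snapshot.setdefault(block, contents)
    return remaining
```

Deleting a snapshot in this model can make the next snapshot larger while the total stored across the chain shrinks or stays the same, which matches the behavior described above.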

If your disk has a snapshot schedule, you must detach the schedule from the disk before you can delete the schedule, because you cannot delete a schedule that is attached to a disk. Detaching the schedule prevents further scheduled snapshots of that disk. You can manually delete snapshots at any time.

The following diagram shows this process:

[Figure: The process for deleting a snapshot.]

Snapshot size and deleted blocks

Snapshots capture parts of the disk that were written to and not discarded. Depending on the disk file system configuration, sometimes deleted files are not discarded. If this happens, you might see that the size of your snapshot is larger than the used space on the disk reported by the file system. To avoid this, it is a best practice to enable the discard option or run fstrim on your disk.

Snapshot chains

Using the gcloud CLI or the Compute Engine API, you can create snapshots in distinct snapshot chains by specifying a snapshot chainName. When you create multiple snapshots of a persistent disk using a chain name, each snapshot is based incrementally on the last successful snapshot created with that chain name. This feature is available in beta. Use the chainName field only if you are an advanced service owner who needs to create separate snapshot chains, for example, for chargeback tracking.

Archive snapshots

When you create a snapshot, you have the option of creating a standard snapshot or an archive snapshot. Archive snapshots have the same benefits as standard snapshots including incremental chains, compression, and encryption. However, archive snapshots are lower-cost and are better suited for use cases related to compliance, audit, and long-term cold storage. If you require snapshot retention for many months or years and rarely need to access snapshots, consider using archive snapshots instead of standard snapshots. Each snapshot type is stored in separate incremental snapshot chains, and archive snapshots are listed separately in the Google Cloud console.

Snapshot storage location

When you create a snapshot, you can specify a storage location. The location of a snapshot affects its availability and can incur networking costs when creating the snapshot or restoring it to a new disk.

Snapshots can be stored in either one Cloud Storage multi-regional location, such as asia, or one Cloud Storage regional location, such as asia-south1.

A multi-regional storage location provides the highest availability and resilience. A regional storage location gives you more control over the physical location of your data because you specify a single region.

A snapshot can be used to create a new disk in any region and zone, regardless of the storage location of the snapshot.

If you have an organization policy that includes the resource locations constraint, any snapshot storage location that you specify must be in the set of locations defined by the constraint. See Compute Engine resource locations for more information.

If you do not specify a storage location for a snapshot, Google Cloud uses the default location, which stores your snapshot in a Cloud Storage multi-regional location closest to the region of the source disk. If you need to choose regional storage, or if you need to specify a different multi-regional location, store your snapshot in a custom location.
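As an illustration of the choice between the default and a custom location, the sketch below builds a request body for creating a snapshot through the Compute Engine API. The helper `snapshot_request_body` and the example disk path are hypothetical, and the field names (`name`, `sourceDisk`, `storageLocations`) are given as understood here, so check them against the API reference before relying on them.

```python
def snapshot_request_body(name, source_disk_url, storage_location=None):
    """Build a request body for creating a snapshot (illustrative helper)."""
    body = {"name": name, "sourceDisk": source_disk_url}
    if storage_location is not None:
        # Custom location: one multi-region (such as "asia") or one region
        # (such as "asia-south1"). Omitting the field keeps the default.
        body["storageLocations"] = [storage_location]
    return body


# Hypothetical disk path, used only for the example.
DISK = "projects/my-project/zones/asia-south1-a/disks/my-disk"
default_body = snapshot_request_body("snap-default", DISK)
regional_body = snapshot_request_body("snap-regional", DISK, "asia-south1")
```

Leaving `storage_location` unset corresponds to accepting the default location; passing a value corresponds to storing the snapshot in a custom location.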

Default location

If you do not specify a storage location, your snapshot is stored in the multi-region that is geographically closest to the location of your persistent disk.

For example, if your persistent disk is stored in us-central1, your snapshot is stored in the us multi-region by default.

However, a region like australia-southeast1 is outside of every multi-region. The closest multi-region is asia, which becomes the default location, so creating or restoring a snapshot there generates network costs.

Some example use cases for choosing a default location to store your snapshots include the following:

  • The default multi-region location meets corporate or government data-placement policies.
  • Your persistent disk is stored in a regional location that is part of a default multi-region location. For example, your persistent disk is in the us-central1 region, so the default multi-region is us. In this case, higher snapshot availability takes precedence at the risk of slower snapshot restoration performance.
  • You do not expect your snapshots to be frequently restored to disks that are located outside of the default snapshot storage location.

Custom location

Select a custom location to store your snapshot in a regional location, or if you need to specify a different multi-regional location.

Some example cases for selecting a custom storage location for your snapshots are:

  • The custom multi-region location meets corporate or government data-placement policies.
  • Your app is deployed in a region that is not included in one of the Cloud Storage multi-regional locations and you want to prioritize snapshot restoration performance over snapshot availability.
  • You restore your snapshots multiple times to disks located outside of the default snapshot storage location.

If you need to comply with corporate or government data-placement policies, store your snapshot in the nearest regional location that complies with these policies.

If your app is not deployed in part of a multi-region and you want to prioritize low networking costs over high snapshot availability, store your snapshot in the region where your source disk is located. Storing your snapshot in the region where your source disk is located minimizes networking costs for restoring and creating snapshots from that source disk.

However, unlike a multi-regional storage location, a regional storage location doesn't store your data redundantly across multiple data centers, so your data might not be accessible if a large-scale disruption occurs. To ensure the availability of your data, you might also want to store a redundant snapshot in a second location.

Network costs

Network charges apply for the creation or restore of all multi-regional snapshots when a disk is in a member region of the multi-region. If you don't require the additional replication and resilience of multi-regional snapshots, we recommend using regional snapshots by specifying a regional location when snapshots are created.

Selecting your snapshot storage location is vital to minimizing network costs. If you store your snapshot in the same region as your source disk, there is no network charge when you access that snapshot from the same region. If you access the snapshot from a different region, there is a network cost. Network costs are incurred when a snapshot is created in a different region from the source disk, and when a snapshot is restored to a disk in a different region from the snapshot.

There is a network charge for cross-region access. For example, if your source disk is in asia-east1 and you store your snapshots in asia-east2, you will incur a network cost when you access your snapshot between those two regions.

Two regions, australia-southeast1 and southamerica-east1, have a default multi-region snapshot storage location that will incur network costs unless you override the default when creating a snapshot:

  • If your source disk is in australia-southeast1, the default snapshot storage location is in the asia multi-region. To reduce costs, override this default location and store your snapshots in the australia-southeast1 region.
  • If your source disk is in southamerica-east1, the default snapshot storage location is in the us multi-region. To reduce costs, override this default location and store your snapshots in the southamerica-east1 region.
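The defaults mentioned in this document can be collected in a small lookup. This is a sketch limited to the three regions named here; the table is not a complete list of regions, and the function name `suggested_storage_location` is hypothetical.

```python
# Region -> (default multi-region, whether the region is part of it).
# Only the regions discussed in this document are listed.
DEFAULTS = {
    "us-central1": ("us", True),
    "australia-southeast1": ("asia", False),
    "southamerica-east1": ("us", False),
}


def suggested_storage_location(disk_region):
    """Return the default location, or the disk's own region when the
    default multi-region does not include that region."""
    multi_region, is_member = DEFAULTS[disk_region]
    if is_member:
        return multi_region
    # The default would sit outside the disk's region, so override it
    # with the region itself to reduce network costs.
    return disk_region
```

For a disk in australia-southeast1 the function suggests australia-southeast1 rather than the asia default, which mirrors the recommendation in the list above.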

If you restore a snapshot to a disk in a region that isn't included in the snapshot's storage location, you will incur a network cost. For example, if you create a new regional persistent disk in australia-southeast1 from a snapshot stored in asia, a multi-regional location, you will incur network costs.

What's next