Backups overview

This page describes what a backup is, how it works, some common use cases, and best practices when creating and using backups. To learn how to create and manage backups, as well as how to restore a Filestore instance from a backup, see Back up data for disaster recovery.

What is a backup?

A Filestore backup is a copy of a file share that includes all file data and metadata of the file share from the point in time when the backup is created.

After creating a backup of a file share, you can modify or delete the original file share without affecting the backup.

You can use a backup to restore a file share to a new Filestore instance or, for basic tier instances, to the source or to an existing file share.

Backups are regional resources that remain within the region that you specify at the time of creation. You can create backups in the same region as the Filestore instance or in another region to help reduce the risk of data loss.

Backups are globally addressable and can be used to restore file shares to any region, but they cannot be shared across projects.

Network transfer charges apply to cross-region network traffic. For details, see the Pricing page.

Backup creation

The first backup you create is a complete copy of all file data and metadata on a file share. Each subsequent backup copies any successive changes made to the data since the previous backup.

A group of backups associated with the same instance, region, and CMEK (if used) is called a backup chain.

A backup chain resides in a single Cloud Storage bucket and region and can be located outside of the region used to store the source instance.

All service tiers support multiple backup chains allowing you to store an instance's backups in multiple regions.

Every time a backup is created, the previous backup is scanned for both differential and incremental changes:

Differential changes: includes changes made to files on the share such as file edits, additions, or deletions.
Incremental changes: includes changes to storage in the bucket where backup data is located. This might include deduplication of data previously referenced in the chain.

Every time you save a backup to the same backup chain, the previous backup is scanned for differential and incremental changes. In such cases, a full copy is not needed.

However, storing an instance's data to multiple backup chains implies that you are saving and storing backups to alternating locations.

Every time you create a new backup to an alternating location, a complete copy of the backup is generated again. Expect higher latency on backup create operations when alternating between backup chains.

Unchanged data contained in previous backups is referenced in, but not copied to, newer backups. If an older backup is deleted, its unique data is copied to the next most recent backup and all internal data references are automatically updated.
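The reference behavior described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not Filestore's implementation: the BackupChain class, its method names, and the use of an in-memory dictionary to stand in for a file share are all hypothetical.

```python
class BackupChain:
    """Toy model of an incremental backup chain (hypothetical, illustration only)."""

    def __init__(self):
        self.order = []   # backup names, oldest first
        self.stored = {}  # backup name -> {path: content, or None for a deleted path}

    def restore(self, name):
        """Rebuild the full share view by replaying changes up to the named backup."""
        state = {}
        for backup in self.order[: self.order.index(name) + 1]:
            for path, content in self.stored[backup].items():
                if content is None:
                    state.pop(path, None)  # path was deleted at this backup
                else:
                    state[path] = content
        return state

    def create(self, name, share):
        """Store only what changed since the previous backup in the chain."""
        prev = self.restore(self.order[-1]) if self.order else {}
        changes = {p: c for p, c in share.items() if prev.get(p) != c}
        changes.update({p: None for p in prev if p not in share})
        self.stored[name] = changes
        self.order.append(name)

    def delete(self, name):
        """Fold the deleted backup's data into the next most recent backup."""
        i = self.order.index(name)
        if i + 1 < len(self.order):
            nxt = self.order[i + 1]
            merged = dict(self.stored[name])
            merged.update(self.stored[nxt])  # newer changes take precedence
            self.stored[nxt] = merged
        del self.stored[name]
        self.order.pop(i)


chain = BackupChain()
chain.create("backup-1", {"a.txt": "v1", "b.txt": "v1"})
chain.create("backup-2", {"a.txt": "v1", "b.txt": "v2"})
# backup-2 physically stores only the changed file.
print(chain.stored["backup-2"])
chain.delete("backup-1")
# backup-2 can still be restored in full after the older backup is removed.
print(chain.restore("backup-2"))
```

In this model, the first backup is a full copy, later backups store only changes, and deleting an older backup moves the data that newer backups still depend on forward in the chain.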

Internally, a backup chain's history is tracked using snapshots, which consume capacity on the source instance.

A backup create operation starts immediately, but the backup isn't available for use until its data is copied, which takes a period proportional to the amount of data. During this period, the backup transitions through three states:

State	Duration	Description
Creating	A few seconds	Capturing the current state of the file share. Any new changes to file share data may or may not be included in the backup. Stable writes acknowledged by the instance before the backup is initiated are included.
Finalizing	Depends on size	Uploading data to the backup. Any new changes to file share data are not included in the backup.
Ready	Until the backup is deleted	The backup is ready for use.
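A script that depends on a backup typically needs to wait for the Ready state before using it. The following is a minimal sketch of such a wait loop. The BackupState enum mirrors only the three states in the table above, and get_state is a hypothetical callable that you would implement with whichever client or CLI you use to look up the backup; neither name comes from a Filestore library.

```python
import time
from enum import Enum


class BackupState(Enum):
    # Only the three states from the table above are modeled here.
    CREATING = "Creating"
    FINALIZING = "Finalizing"
    READY = "Ready"


def wait_until_ready(get_state, poll_seconds=30.0, timeout_seconds=3600.0,
                     sleep=time.sleep):
    """Poll get_state() until it returns READY.

    get_state is a hypothetical, caller-supplied function that returns the
    current BackupState. Returns the distinct states observed, in order.
    Raises TimeoutError if READY is not reached before the timeout.
    """
    seen = []
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = get_state()
        if not seen or seen[-1] is not state:
            seen.append(state)
        if state is BackupState.READY:
            return seen
        if time.monotonic() >= deadline:
            raise TimeoutError(f"backup still in state {state.value}")
        sleep(poll_seconds)
```

Because the Finalizing duration depends on the amount of data, choose the timeout based on the size of the share rather than a fixed value.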

After creation, basic tier backups are automatically compressed to reduce cost. Instance performance may be reduced while creating a backup for instances in zonal, regional, and enterprise service tiers. Creating a backup does not affect the availability or performance of basic tier instances.

Addressing redundant data

By default, backups are incremental to avoid billing for redundant data and to minimize the use of storage space. To ensure the reliability of the underlying change history, a backup may occasionally capture a full copy of the instance.

For more information, see Compare snapshots and backups.

Backup deletion

Backups are project-level resources, not a sub-resource of the source instance, and require their own separate storage. As a result, a backup's lifecycle is not tied to that of the source instance. Deleting the source does not delete its associated backups. If you want to delete a backup, you must explicitly perform a delete operation on the backup, not the instance.

Be sure to delete any unwanted backups. If a source instance is deleted, any remaining backups continue to accrue fees.

Deleting a backup is permanent and can't be undone.

Backup consistency

Filestore backups have NFSv3 and NFSv4.1 consistency semantics. Before a backup is initiated, any write that the Filestore instance acknowledges as written to stable storage or that is followed by an acknowledged COMMIT is included in the backup. For details, see NFSv3 RFC-1813 section 3.3.7 or About supported file system protocols.

Common use cases

The following sections describe common use cases for backups.

Backing up data for disaster recovery

Imagine that you have a Filestore instance in us-west1-c, and you want to protect your data against disasters that affect this region. You can schedule a job that regularly creates backups of this instance to a remote region, such as us-east1. If a disaster occurs involving us-west1-c, you can create a new instance in another location from any previous backup.

Backing up data to protect against accidental changes

If you want to protect your Filestore data against unintended changes, you can schedule a job that regularly creates backups of the instance. If you lose data, you can browse through the list of backups to identify the one with the version of the file needed. Then, you can create a new Filestore instance from the backup, mount it to the same client as the original instance, and copy the file over.

Before copying the file over, you can use the Linux diff command on the two mount points to check the differences between the data on the original instance and the data restored from the backup. After the data is recovered, you can delete the restored instance and create a new backup to preserve your data's present state for future use.
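If you prefer a script to the diff command, the same comparison can be sketched with the Python standard library. This is an illustrative example, not a Filestore tool: the compare_trees function and the mount paths in the usage comment are hypothetical names.

```python
import filecmp
import os


def compare_trees(original, restored):
    """Compare regular files under two directories by relative path and content."""

    def files_under(root):
        found = set()
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                found.add(os.path.relpath(os.path.join(dirpath, name), root))
        return found

    in_original = files_under(original)
    in_restored = files_under(restored)
    changed = sorted(
        path
        for path in in_original & in_restored
        # shallow=False forces a byte-by-byte comparison of file contents.
        if not filecmp.cmp(os.path.join(original, path),
                           os.path.join(restored, path), shallow=False)
    )
    return {
        "changed": changed,
        "only_in_original": sorted(in_original - in_restored),
        "only_in_restored": sorted(in_restored - in_original),
    }


# Example usage with hypothetical mount points:
# report = compare_trees("/mnt/original", "/mnt/restored")
# print(report["only_in_restored"])  # candidates for files lost since the backup
```

Files listed under only_in_restored exist in the backup but not on the original instance, which is usually where to look for accidentally deleted data.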

Alternatively, you can do an in-place restore where the backup data is directly restored to the original Filestore instance, replacing all data on it with data from the backup. We recommend that you create a backup of the latest data before performing an in-place restoration because any data that isn't backed up is lost.

Creating clones for development and testing

Imagine you have a database set up on a Filestore instance that serves production traffic. If you want to run a test with a database as an input, you can create a new Filestore instance from a backup of the production instance for the test. In this way, test usage does not interfere with production.

Similarly, you can use backups for offline analysis and investigation without affecting production.

Migrating data

After you create a Filestore instance, you cannot change its location or service tier. To migrate your data to another region, you can create a backup of the instance and use the backup to create a new Filestore instance or restore it to an existing instance.

Also, when you create a new Filestore instance from a backup, you can choose between basic HDD and basic SSD tiers regardless of the tier of the source instance.

Feature limitations

Filestore backups are generally available (GA) for all service tiers.

The following limitations apply:

Filestore backups cannot be combined with the Filestore multishares feature.
Users should create a new backup or backups to replace those created in Preview. Backups created in Preview are subject to deletion. Backups created in Preview reflect feature behavior available at the time of creation. Existing backups are not updated when new capabilities are released.

The following sections cover other feature limitations related to performance, storage, capacity, encryption, and other topics in detail:

Performance

Numerous changes made through many hard links to the same file (for example, tens or hundreds of thousands) may degrade performance.
For highly utilized instances, the performance may be reduced by as much as 15% while a backup is uploaded. Basic tier instance performance is not affected by backup create operations.
Storing an instance's data to multiple backup chains does impact backup performance. Expect higher latency on backup create operations when alternating between backup chains.
Instance operations such as instance restore or instance delete may be delayed until a backup create operation completes.
In some cases, delete operations may take up to 24 hours to complete.

Operations concurrency

Backup delete operations associated with the same source instance must be performed one at a time.

Bulk backup delete operations within a backup chain are not supported. While a delete operation is pending, any new delete operations within the same backup chain return a RESOURCE_EXHAUSTED error. This is regardless of whether the source instance has been deleted.
- If the source instance has been deleted, users receive a similar FAILED_PRECONDITION error.
- This limitation applies to every service tier but basic SSD and basic HDD.
- Note that Filestore does support concurrent backup delete operations when backups reference separate source instances.
  
  For example, an instance labeled Source1 has backup data referenced in Backup1 and Backup2. Source2 has backup data referenced in Backup3 and Backup4. Backup1 and Backup2 can't be deleted in parallel; however, Backup2 and Backup3 can.
For more information, see Rate limits for backups.
Backup create and backup delete operations initiated within the same backup chain can run concurrently. However, users can't complete a backup create operation while the most recent backup is being deleted.
- If the user attempts to create a new backup of the instance while the most recent backup is being deleted, they will receive a FAILED_PRECONDITION error. For example, if Source1 has a backup chain composed of Backup1 and Backup2, and the user begins a create operation for Backup3, they won't be able to delete Backup2 until the create operation completes. This is because the most recent backup contains the most critical data needed to successfully complete the backup create operation.
For more information regarding operation rate limits, see Operation rate limits for backups.
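When cleaning up many backups, the one-at-a-time rule per source instance can be respected by grouping delete requests into waves. The following is a minimal sketch that only plans the order and doesn't call any API. The plan_delete_waves function and the tuple layout are hypothetical, and the sample data follows the Source1 and Source2 example above.

```python
from collections import defaultdict
from itertools import zip_longest


def plan_delete_waves(backups):
    """Group backups into waves with at most one backup per source instance.

    backups is an iterable of (backup_name, source_instance) pairs.
    Backups in the same wave reference different source instances, so
    their delete operations can run in parallel. Each wave should finish
    before the next one starts.
    """
    by_source = defaultdict(list)
    for name, source in backups:
        by_source[source].append(name)
    waves = []
    for group in zip_longest(*by_source.values()):
        waves.append([name for name in group if name is not None])
    return waves


backups = [
    ("Backup1", "Source1"),
    ("Backup2", "Source1"),
    ("Backup3", "Source2"),
    ("Backup4", "Source2"),
]
print(plan_delete_waves(backups))
# [['Backup1', 'Backup3'], ['Backup2', 'Backup4']]
```

Because a delete operation can take a long time to complete, a real cleanup job would also need to wait for each wave's operations to finish and retry on RESOURCE_EXHAUSTED or FAILED_PRECONDITION errors.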

Storage

Backup restore operations to the source instance, or to an existing instance, are not supported for zonal, regional, and enterprise instances. If you want to restore a backup of an instance in any of these service tiers, you must create a new instance.
- The new instance must match the source instance's service tier and capacity range. For example, if the source was created using the zonal service tier with lower capacity range, the new instance must use the same service tier and capacity range.
- If you need to create an instance using the legacy high scale SSD service tier, you must run your operations directly through the Filestore API.
- If you need to create an instance using the legacy enterprise service tier, you can run your operations directly through the Filestore API or from the Restore backup > New instance page in the Google Cloud console.
  
  For example, if you want to create a regional resource with 10 TiB instance capacity, you must use the legacy enterprise service tier.
Backup operations, such as restore, edit, or delete, may not be available for select backups created in Preview.
After a RestoreInstance operation is applied to a regional or enterprise instance, you can't create snapshots with the same names as snapshots that existed before the operation.
Attempts to restore an instance from a backup while either a backup deletion or a snapshot deletion is in progress will fail.
If the deletion of a backup fails, the status is marked as invalid. In such cases, you will need to retry the delete operation.

Capacity

Each backup occupies instance capacity. This capacity varies relative to the scope of changes made to the data since the last backup was created.

More specifically, when a backup is created, Filestore creates an internal snapshot of the file system which also occupies a portion of available instance capacity.

Snapshot size is also relative to the scope of changes made to data within the share since the last backup was created. This snapshot continues to exist until the next subsequent backup is created and uploaded.

All data referenced by the backup persists in the state it was in when captured and continues to take up capacity on the file system. For example, deleting data from the mounted file system doesn't by itself free up capacity. To free up capacity, create a new backup after deleting or overwriting significant amounts of data.

For a detailed description of differential and incremental changes and how they are handled, see Backup creation.

To ensure sufficient capacity for your workloads, consider the following:

Increase instance capacity for workloads with significant, frequent data changes or a high change rate.

Encryption

When using CMEK to encrypt your backup chains, the following limitations apply:

An entire backup chain is encrypted using the same CMEK.
A CMEK must reside in the same region as the resource it encrypts.
If storing a backup chain in a region separate from the source instance, you may need to apply separate keys, one for the source and one for the backup chain.
- All service tiers support multiple backup chains, or the ability to store an instance's backups in multiple regions. If electing to use CMEK for encryption, a CMEK key must reside in the same region as the resource it encrypts. If you're storing backups in a region separate from the source, and the CMEK is not a multiregion key, you must use separate CMEK keys. For more information, see CMEK restrictions and Choosing the best CMEK location.
A single CMEK is applied to the Cloud Storage bucket where the backup chain is stored and cannot be combined or replaced.
CMEK support is not available for basic tier backups.

For more information, see CMEK support for backup chains.
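The same-region requirement can be checked before a backup is created by reading the location out of the key's resource name. The following is a minimal sketch that assumes the standard Cloud KMS resource name layout (projects/.../locations/.../keyRings/.../cryptoKeys/...). The function names and the sample project, key ring, and key names are hypothetical, and multi-region key locations are deliberately not handled.

```python
import re

# Standard Cloud KMS key resource name layout; the location is captured.
_KEY_NAME = re.compile(
    r"^projects/[^/]+/locations/([^/]+)/keyRings/[^/]+/cryptoKeys/[^/]+$"
)


def key_location(key_name):
    """Return the location segment of a Cloud KMS key resource name."""
    match = _KEY_NAME.match(key_name)
    if match is None:
        raise ValueError(f"not a Cloud KMS key resource name: {key_name!r}")
    return match.group(1)


def key_matches_region(key_name, backup_region):
    """True only for an exact location match.

    Assumption: multi-region keys are not evaluated here and are reported
    as a mismatch, so treat False as "needs review", not as a hard failure.
    """
    return key_location(key_name) == backup_region


key = "projects/my-project/locations/us-east1/keyRings/my-ring/cryptoKeys/my-key"
print(key_matches_region(key, "us-east1"))  # True
print(key_matches_region(key, "us-west1"))  # False
```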

Protocols

When restoring a backup, the new instance must use the same protocol as the source instance.
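The restore constraints listed under Storage and Protocols can be collected into a single pre-check. This is a hedged sketch of that logic only: the InstanceSpec class, the restore_problems function, and the tier, protocol, and capacity range strings are illustrative values chosen for this example, and the check covers only the rules stated on this page.

```python
from dataclasses import dataclass

# Tiers for which this page says a restore must target a new instance.
NEW_INSTANCE_ONLY_TIERS = {"ZONAL", "REGIONAL", "ENTERPRISE"}


@dataclass(frozen=True)
class InstanceSpec:
    tier: str                 # illustrative label, for example "ZONAL"
    protocol: str             # illustrative label, for example "NFS_V3"
    capacity_range: str = ""  # illustrative label, for example "lower"


def restore_problems(source, target, target_is_new):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    if source.protocol != target.protocol:
        problems.append("target must use the same protocol as the source")
    if source.tier in NEW_INSTANCE_ONLY_TIERS:
        if not target_is_new:
            problems.append("this tier can only be restored to a new instance")
        if target.tier != source.tier:
            problems.append("target must use the same service tier")
        if target.capacity_range != source.capacity_range:
            problems.append("target must use the same capacity range")
    return problems


source = InstanceSpec("ZONAL", "NFS_V3", "lower")
target = InstanceSpec("ZONAL", "NFS_V4_1", "lower")
print(restore_problems(source, target, target_is_new=True))
# ['target must use the same protocol as the source']
```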

Best practices

The following sections cover recommended best practices.

Preparing your file share for the best backup consistency

The quality of a backup depends on the ability of your application to recover from backups that are created during heavy write workloads. In most situations, you can create backups that have good consistency even while your applications write data to the file share. However, if your applications require strict consistency, we recommend doing one or more of the following:

Use sync mount. For more information, see "The sync mount option" section in nfs(5). Alternatively, you can open files with the O_DIRECT|O_SYNC flags. For more information, see open(2).
Pause applications or operating system processes that write data to the file share and cause them to flush their changes to the file share before initiating the backup. For more information, see fsync(2).
If your applications require consistency between multiple shares, pause all applications on all instances that are writing to all file shares and create backups of all file shares before resuming your applications.
If you require application level consistency, stop your applications and unmount the file share before creating a backup.
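For applications that you control, flushing writes before a backup is initiated can be done explicitly. The following is a minimal sketch of a write followed by fsync using the Python standard library. The write_durably function name and the path in the usage comment are hypothetical, and the sketch uses fsync rather than the O_DIRECT and O_SYNC flags, which have additional buffer alignment requirements.

```python
import os


def write_durably(path, data):
    """Write bytes to path and ask the OS to flush them to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Blocks until the OS reports that the data has been flushed.
        os.fsync(fd)
    finally:
        os.close(fd)


# Example usage with a hypothetical path on a mounted file share:
# write_durably("/mnt/filestore/checkpoint.bin", b"application state")
```

A write that completes this way before the backup is initiated falls under the consistency guarantee described in Backup consistency, because the data is acknowledged as written to stable storage.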

Using existing backups as a baseline for new backups to reduce backup creation time

Existing backups of a file share within a region are used as baselines for creating new backups of the file share, reducing backup creation time. Therefore, we recommend that you do the following:

Take a new backup of a file share before you delete the previous backup of that file share.
Wait for new backups to be in the Ready state before creating subsequent backups of the same file share.

Scheduling backups during off-peak hours to reduce backup creation time

Creating backups during off-peak hours reduces the time that it takes to create a backup. If you schedule regular backups of your file shares, we recommend scheduling them during off-peak hours when possible.

Peak hours for backup creation are the end of each business day and midnight in the region where the Filestore instance is located. We recommend creating your backups either in the early morning or during the business day.

Organizing your data on separate Filestore instances to maximize efficiency

The more data on the file share, the larger the backup and the more it costs. To back up only the data that you need to back up, we recommend organizing your data on separate file shares, namely:

Storing critical data with different write patterns or with different backup requirements on different file shares.
Reducing the number of backups that you need to create by keeping similar data in one file share.

Quota

A quota limit exists regarding the number of backups per region for basic SSD and basic HDD service tiers.

Backup quota limits don't apply to zonal, regional, and enterprise service tiers.

For more information, see Service tiers and quota.

Get started with Filestore backups

To get started using the feature, see Back up data for disaster recovery.

What's next

Learn how to back up and restore file shares.
Learn how to schedule backups using Cloud Scheduler.
Learn about Google Cloud regions and zones.
Learn about backups pricing.