Save and load environment snapshots


This page explains how to save and load the state of your environment using environment snapshots.

You can use snapshots together with Cloud Scheduler to automatically save snapshots of your environment. For more information, see Configure scheduled snapshots.

About environment snapshots

Environment snapshots store the state of your environment. You can save and load environment snapshots on demand.

You can use snapshots to:

How snapshots are stored

An environment snapshot is a set of files that describe the state of your environment and store the backup of the environment data.

You can create multiple snapshots of your environment. Environment snapshots are non-incremental. You can use any snapshot independently of other snapshots.

Cloud Composer does not delete snapshots when you delete your environment.

By default, Cloud Composer stores snapshots in the snapshots/ folder in your environment's bucket. You can also specify a custom location when you create a snapshot.

Security considerations for snapshots

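Snapshots include a backup of the Airflow database and the environment's fernet key, so they can contain sensitive information. To mitigate this security risk, you can store sensitive information that is used by Airflow DAGs, such as keys or passwords, in Secret Manager. For more information, see Configure Secret Manager for your environment.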

Check the access permissions for your environment's bucket. If you store environment snapshots in a custom bucket, make sure that access permissions for it are configured properly in your project, and that the environment's service account has enough permissions to save snapshots to the bucket and load them from it.
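As an illustration of granting such permissions on a custom bucket, the following sketch wraps a gcloud command in a small helper. The helper name, the bucket name, and the service account address are hypothetical. The role shown, Storage Object Admin (roles/storage.objectAdmin), is one role that has read and write access to objects; a narrower role might suit your project better.

```shell
# Hypothetical helper: grant a service account read and write access to
# objects in a single bucket by binding the Storage Object Admin role.
# Assumes the gcloud CLI is installed and authenticated.
grant_snapshot_bucket_access() {
  local bucket="$1"
  local service_account="$2"
  gcloud storage buckets add-iam-policy-binding "gs://${bucket}" \
    --member="serviceAccount:${service_account}" \
    --role="roles/storage.objectAdmin"
}

# Example call (placeholder values, not run here):
# grant_snapshot_bucket_access example-bucket \
#   example-sa@example-project.iam.gserviceaccount.com
```

Binding the role on one bucket instead of the whole project keeps the grant limited to the location where snapshots are stored.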

What data is saved in snapshots

Cloud Composer saves the following data in snapshots:

  • Airflow configuration overrides.
  • Environment variables.
  • List of custom PyPI packages, as requirements.
  • A backup of the Airflow database, including states of executed tasks and DAG run history.
  • A backup of the /dags, /data, and /plugins folders from the environment's bucket.
  • The environment's fernet key.
  • Other information about the environment's configuration, such as the environment's scale and performance parameters. Cloud Composer does not use this information when it loads snapshots.

What data is loaded from snapshots

Cloud Composer loads the following data from snapshots:

  • Airflow configuration overrides.
  • Environment variables.
  • Custom PyPI packages (unless you choose to skip installing them).
  • The contents of the Airflow database, including states of executed tasks and DAG run history.
  • The contents of the /dags, /data, and /plugins folders, which are loaded into the environment's bucket.
  • The fernet key from the snapshot. Cloud Composer uses it to re-encrypt the data from the snapshot with the environment's own fernet key. The fernet key of the environment remains unchanged.

Although Cloud Composer stores some information about the environment's configuration in snapshots, it is not used when loading snapshots. The following parameters of your environment do not change when you load a snapshot:

  • Environment configuration, such as environment scale and performance parameters.
  • Environment's networking configuration.
  • Contents of the environment's bucket outside of the /dags, /data, and /plugins folders.
  • Environment labels.

Any settings that you applied to the Cloud Composer infrastructure without using the Cloud Composer API might be lost when you load a snapshot.

About partially completed operations

When you load a snapshot, the operation can be successful, failed, or partially completed:

  • Successful operations load all data from the snapshot.
  • Failed operations do not introduce any changes.
  • Partially completed operations load a subset of data from the snapshot. Such operations are reported as failed, but the error message indicates what data was successfully loaded. For example, if PyPI packages were installed but applying Airflow configuration overrides failed, the error message indicates this.

For a partially completed operation, you can try to load the same snapshot again. Cloud Composer skips steps that were successful on the previous attempt. For example, if an operation failed because of a timeout, but the database was successfully loaded, then the next attempt does not load the database again.
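Because a repeated attempt skips the steps that already succeeded, retrying the same load command is a reasonable recovery strategy. The following is a minimal sketch of a generic retry helper; the helper name and the attempt count are illustrative. The load command in the usage comment is an assumption, because the command and its flags are not shown in this section; check the gcloud reference before you use it.

```shell
# Hypothetical helper: run a command up to max_attempts times and stop at
# the first success. Returns 1 if every attempt fails.
retry() {
  local max_attempts="$1"
  shift
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "Attempt ${attempt} of ${max_attempts} failed." >&2
    attempt=$((attempt + 1))
  done
  return 1
}

# Example call (assumed command and flag, placeholder values, not run here):
# retry 3 gcloud beta composer environments snapshots load \
#   example-environment \
#   --location us-central1 \
#   --snapshot-path "SNAPSHOT_PATH"
```

The helper retries immediately. If failures are caused by timeouts, adding a pause between attempts might be more appropriate.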

Before you begin

  • In Cloud Composer 1, you can save snapshots but cannot load them. You can load snapshots saved in a Cloud Composer 1 environment to Cloud Composer 2 environments.
  • Snapshots are supported in Cloud Composer 2 version 2.0.9 and later. Cloud Composer 1 supports saving environment snapshots in version 1.18.5 and later.

  • Snapshots do not create an environment. If you want to load a snapshot from an environment to a different environment, you first need to create a new environment and then load the snapshot to it.

  • You cannot load snapshots to environments that are in the error state. It is not possible to fix such environments by loading a snapshot. You can still load an existing snapshot to a new environment.

  • You can only load snapshots to the same or later version of Cloud Composer or Airflow. For example, you cannot load a snapshot from Cloud Composer 2.0.2 to an environment with Cloud Composer 2.0.1. As another example, you cannot load a snapshot from Airflow 2.2.3 to Airflow 2.1.4.

  • Snapshots do not change the Cloud Composer version. If you upgrade your environment to a later version of Cloud Composer and then load a snapshot from an earlier version, your environment keeps its current version of Cloud Composer. For example, loading a snapshot from Cloud Composer 2.0.1 to Cloud Composer 2.0.2 does not revert the environment to Cloud Composer 2.0.1.

  • The maximum size of the Airflow database that supports snapshots is 20 GB. If your environment's database is larger than 20 GB, reduce the size of the Airflow database before saving a snapshot.

  • If you save snapshots in a location outside your environment's bucket, the service account of your environment must have read and write permissions for the specified location. For example, the Storage Object Admin role has such permissions. You can apply it to a project or to a specific bucket.
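The "same or later version" rule from the list above can be checked before you attempt a load. The following is a minimal sketch that compares two version strings; the function name is hypothetical, and it relies on `sort -V` (version sort) from GNU coreutils, which might not be available on every system. Apply the check to both the Cloud Composer version and the Airflow version.

```shell
# Hypothetical helper: succeed when target_version is the same as or later
# than snapshot_version. The smaller of the two versions sorts first, so the
# load is allowed when the first sorted line is the snapshot's version.
can_load_snapshot() {
  local snapshot_version="$1"
  local target_version="$2"
  local lowest
  lowest="$(printf '%s\n%s\n' "$snapshot_version" "$target_version" | sort -V | head -n 1)"
  [ "$lowest" = "$snapshot_version" ]
}

# Example: a snapshot from 2.0.2 cannot be loaded to an environment on 2.0.1.
if can_load_snapshot 2.0.2 2.0.1; then
  echo "Versions are compatible."
else
  echo "The target version is earlier than the snapshot version."
fi
```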

Save an environment snapshot

Cloud Composer saves each environment snapshot in a subfolder of the folder that you specify. The subfolder name contains the project ID, the environment's location and name, and the timestamp when the snapshot was saved. For example: /snapshots/example-project_us-central1_example-environment_2022-01-05T18-59-00.
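To make the naming scheme concrete, the following sketch builds a subfolder name from its parts and picks the most recent snapshot from a list of names. Both function names are hypothetical, and the PROJECT_LOCATION_ENVIRONMENT_TIMESTAMP layout is inferred only from the example above, so treat it as an assumption rather than a documented format.

```shell
# Hypothetical helper: join the parts of a snapshot subfolder name with
# underscores, following the layout seen in the example above.
snapshot_folder_name() {
  local project_id="$1"
  local location="$2"
  local environment="$3"
  local timestamp="$4"
  printf '%s_%s_%s_%s\n' "$project_id" "$location" "$environment" "$timestamp"
}

# Hypothetical helper: read subfolder names of one environment from stdin and
# print the most recent one. Names that share a prefix differ only in the
# timestamp, which sorts chronologically as plain text.
latest_snapshot() {
  LC_ALL=C sort | tail -n 1
}

# Example call with the values from the example above:
snapshot_folder_name example-project us-central1 example-environment 2022-01-05T18-59-00
```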

Console

To create a snapshot of your environment:

  1. In the Google Cloud console, go to the Environments page.

    Go to Environments

  2. In the list of environments, click the name of your environment. The Environment details page opens.

  3. Click Save snapshot.

  4. In the Save snapshot dialog, select where to store the snapshot:

    • To store the snapshot in the /snapshots folder in the environment's bucket, select Use snapshot folder in environment bucket (default).

    • To store the snapshot in a custom folder, select Use custom folder in another bucket, then specify a location.

  5. Click Save.

gcloud

The gcloud beta composer environments snapshots save command saves a snapshot of your environment.

  • The --snapshot-location argument specifies the folder where the snapshot is saved. By default, snapshots are saved in the /snapshots folder in your environment's bucket, for example, gs://us-central1-example-916807e1-bucket/snapshots. You can also specify any other folder.

To save a snapshot of your environment, run:

gcloud beta composer environments snapshots save \
  ENVIRONMENT_NAME \
  --location LOCATION \
  --snapshot-location "SNAPSHOTS_FOLDER"

Replace:

  • ENVIRONMENT_NAME with the name of the environment.
  • LOCATION with the region where the environment is located.
  • (Optional) SNAPSHOTS_FOLDER with the URI of the bucket folder in which to store the snapshot. If you omit the --snapshot-location argument, Cloud Composer saves the snapshot in the /snapshots folder in your environment's bucket.

The following example uses the default location:

gcloud beta composer environments snapshots save \
  example-environment \
  --location us-central1

The following example saves to a custom folder:

gcloud beta composer environments snapshots save \
  example-environment \
  --location us-central1 \
  --snapshot-location "gs://example-bucket/environment_snapshots"

API

  1. Construct an environments.SaveSnapshot API request.

  2. In the request body, in the snapshotLocation field, specify the folder where you want to save the snapshot.

{
  "snapshotLocation": "SNAPSHOTS_FOLDER"
}

Replace:

  • SNAPSHOTS_FOLDER with the URI of the bucket folder in which to save the snapshot.

Example:

// POST https://composer.googleapis.com/v1beta1/projects/example-project/
// locations/us-central1/environments/example-environment:saveSnapshot

{
  "snapshotLocation": "gs://us-central1-example-916807e1-bucket/snapshots"
}
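One way to send this request from a terminal is with curl. The following is a minimal sketch: the helper function name is hypothetical, the URL follows the example above, and the commented curl call assumes that the gcloud CLI is installed and authenticated so that it can print an access token.

```shell
# Hypothetical helper: build the saveSnapshot request URL from the project ID,
# the region, and the environment name.
save_snapshot_url() {
  local project_id="$1"
  local location="$2"
  local environment="$3"
  printf 'https://composer.googleapis.com/v1beta1/projects/%s/locations/%s/environments/%s:saveSnapshot\n' \
    "$project_id" "$location" "$environment"
}

# Example request (placeholder values, not run here):
# curl -X POST \
#   -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#   -H "Content-Type: application/json" \
#   -d '{"snapshotLocation": "gs://us-central1-example-916807e1-bucket/snapshots"}' \
#   "$(save_snapshot_url example-project us-central1 example-environment)"
```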

Terraform

It is not possible to save and load environment snapshots from Terraform.

Saving and loading snapshots are actions performed on an environment, and the resulting snapshots are not a part of an environment's definition. Because Terraform manages only the configuration of Cloud Composer environments, you cannot save or load environment snapshots from it.

Load an environment snapshot

In Cloud Composer 1, you can save snapshots but cannot load them. You can load snapshots saved in a Cloud Composer 1 environment to Cloud Composer 2 environments, for example, when you migrate your environments to Cloud Composer 2.

What's next