Set up highly resilient Cloud Composer environments

This page describes how to set up highly resilient Cloud Composer environments.

About resiliency for zonal failures in Cloud Composer

Highly resilient Cloud Composer environments are Cloud Composer 2 environments that use built-in redundancy and failover mechanisms to reduce the environment's susceptibility to zonal failures and single-point-of-failure outages.

For example, a zonal outage interrupts Airflow tasks that run in the affected zone. A highly resilient environment then recovers: it restarts the affected components in a different zone and switches its database to a secondary zone. Airflow can then reschedule and restart the failed tasks, and the history of DAG runs and other settings is preserved.

A highly resilient environment runs across at least two zones of a selected region. Cloud Composer automatically distributes the components of your environment between zones.

You can use highly resilient Cloud Composer environments for critical business processes.

About the highly available database of your environment

In highly resilient Cloud Composer environments, the Cloud SQL instance that stores the database of your environment runs in high availability mode. A Cloud SQL instance configured for high availability is also called a regional instance and is located in a primary and a secondary zone within the configured region. A regional instance consists of a primary instance and a standby instance.

In case of an outage, the Cloud SQL instance of your environment automatically fails over to the standby Cloud SQL instance. You do not need to perform any additional actions in your Cloud Composer environment. Once the primary zone is operational again, the environment returns to running in two zones (primary and secondary). In some cases, the primary and secondary zones are swapped. The Cloud SQL instance in high availability mode uses the same IP address after a failover.

About highly available Airflow components

Highly resilient Cloud Composer environments run Airflow components that are distributed between zones.

Your environment always runs exactly two Airflow schedulers, two web servers, and at least two (but no more than ten) triggerers if triggerers are enabled. These pairs of components run in separate zones. The minimum number of workers is set to two, and your environment's cluster distributes worker instances between zones. In case of a zonal outage, affected worker instances are rescheduled in a different zone.

For more information about the architecture of highly resilient environments, see Highly resilient environment architecture.

Before you begin

  • Highly resilient environments are available in Cloud Composer version 2.2.0 and later versions.

  • Highly resilient environments are available only in Private IP environments with Cloud Composer 2.

  • Highly resilient Cloud Composer environments incur an additional charge compared to standard environments.

  • If you want to update a standard environment to a highly resilient one, make sure the environment meets the following configuration:

    • minimum number of Airflow workers set to 2 or more
    • exactly 2 Airflow schedulers
    • if you need to use deferrable operators in your DAGs, at least 2 (but no more than 10) triggerers

    If your environment doesn't meet these requirements, you can update its settings first. See Scaling environments.
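The configuration requirements above can be expressed as a small validation helper. The following is a minimal sketch, not part of any Cloud Composer tooling: the function name and its parameters are hypothetical, and you would need to read the actual values from your environment's configuration yourself.

```python
from typing import List


def high_resilience_blockers(
    min_workers: int, schedulers: int, triggerers: int = 0
) -> List[str]:
    """Return the reasons an environment cannot yet switch to high resilience.

    Hypothetical helper for illustration only. An empty list means that the
    three requirements listed above are met.
    """
    problems = []
    if min_workers < 2:
        problems.append("minimum number of workers must be 2 or more")
    if schedulers != 2:
        problems.append("number of schedulers must be exactly 2")
    # Triggerers are optional: 0 means deferrable operators are not used.
    if triggerers != 0 and not 2 <= triggerers <= 10:
        problems.append("number of triggerers must be between 2 and 10")
    return problems
```

An empty result, for example from high_resilience_blockers(min_workers=2, schedulers=2), suggests that the environment is ready for the update. Any returned message points to a setting to change first.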

Create a highly resilient Cloud Composer environment

To create a highly resilient environment, enable high resilience mode when you create an environment.

Update a standard environment to high resilience mode

Console

  1. In Google Cloud console, go to the Environments page.

    Go to Environments

  2. In the list of environments, click the name of your environment. The Environment details page opens.

  3. Select the Environment configuration tab.

  4. In the Resilience mode section, click Edit.

  5. Select High resilience and click Save.

gcloud

  gcloud composer environments update ENVIRONMENT_NAME \
    --location LOCATION \
    --enable-high-resilience

Replace the following:

  • ENVIRONMENT_NAME: the name of your Cloud Composer environment
  • LOCATION: the region where the environment is located

API

Construct an environments.patch API request:

  • Use the updateMask=config.resilienceMode query string in the URL to mark the field that is updated by the request.

  • Use a JSON file in the request body to set the resilienceMode field to HIGH_RESILIENCE in the Environment resource.

Example:

// PATCH https://composer.googleapis.com/v1/{name=projects/*/locations/*/environments/*}?updateMask=config.resilienceMode

{
  "config": {
    "resilienceMode": "HIGH_RESILIENCE"
  }
}
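
As an illustration, the request described above could be assembled with Python's standard library. This is a sketch under assumptions: build_resilience_patch is a hypothetical helper, the project, location, and environment values are placeholders, and the access token must be obtained separately (for example, with gcloud auth print-access-token). The function only builds the request and does not send it.

```python
import json
import urllib.request

API_ROOT = "https://composer.googleapis.com/v1"


def build_resilience_patch(
    project: str,
    location: str,
    environment: str,
    access_token: str,
    mode: str = "HIGH_RESILIENCE",
) -> urllib.request.Request:
    """Build (but do not send) an environments.patch request.

    Hypothetical helper: only the URL pattern, the updateMask value, and the
    resilienceMode field come from the steps above.
    """
    name = f"projects/{project}/locations/{location}/environments/{environment}"
    # updateMask marks config.resilienceMode as the only field to update.
    url = f"{API_ROOT}/{name}?updateMask=config.resilienceMode"
    body = json.dumps({"config": {"resilienceMode": mode}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

To send the request, pass the returned object to urllib.request.urlopen. Keeping construction separate from sending makes it possible to inspect the URL and body before anything changes in the environment.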

Terraform

Setting the resilience_mode field in the config block to HIGH_RESILIENCE enables high resilience mode.

resource "google_composer_environment" "example" {
  provider = google-beta
  name = "ENVIRONMENT_NAME"
  region = "LOCATION"

  config {

    resilience_mode = "HIGH_RESILIENCE"

  }
}

Example:

resource "google_composer_environment" "example" {
  provider = google-beta
  name = "example-environment"
  region = "us-central1"

  config {

    resilience_mode = "HIGH_RESILIENCE"

  }
}

Change a highly resilient environment to standard resilience mode

You can change your environment to standard resilience mode at any time. This operation:

  • Reduces the number of web servers in your environment to 1.
  • Switches off the high availability mode of your environment's Airflow database.
  • Doesn't change the settings for minimum number of Airflow workers, schedulers, or triggerers.

Console

  1. In Google Cloud console, go to the Environments page.

    Go to Environments

  2. In the list of environments, click the name of your environment. The Environment details page opens.

  3. Select the Environment configuration tab.

  4. In the Resilience mode section, click Edit.

  5. Select Standard resilience (default) and click Save.

gcloud

  gcloud composer environments update ENVIRONMENT_NAME \
    --location LOCATION \
    --disable-high-resilience

Replace the following:

  • ENVIRONMENT_NAME: the name of your Cloud Composer environment
  • LOCATION: the region where the environment is located

API

Construct an environments.patch API request:

  • Use the updateMask=config.resilienceMode query string in the URL to mark the field that is updated by the request.

  • Use a JSON file in the request body to set the resilienceMode field to RESILIENCE_MODE_UNSPECIFIED in the Environment resource.

Example:

// PATCH https://composer.googleapis.com/v1/{name=projects/*/locations/*/environments/*}?updateMask=config.resilienceMode

{
  "config": {
    "resilienceMode": "RESILIENCE_MODE_UNSPECIFIED"
  }
}

Terraform

The resilience_mode field in the config block specifies the resilience mode. To use the standard resilience mode, set this value to STANDARD_RESILIENCE.

resource "google_composer_environment" "example" {
  provider = google-beta
  name = "ENVIRONMENT_NAME"
  region = "LOCATION"

  config {

    resilience_mode = "STANDARD_RESILIENCE"

  }
}

Example:

resource "google_composer_environment" "example" {
  provider = google-beta
  name = "example-environment"
  region = "us-central1"

  config {

    resilience_mode = "STANDARD_RESILIENCE"

  }
}

Check if your environment runs in the high resilience mode

Console

  1. In Google Cloud console, go to the Environments page.

    Go to Environments

  2. In the list of environments, click the name of your environment. The Environment details page opens.

  3. Select the Environment configuration tab.

  4. In the Resilience mode section, view the resilience mode of your environment.

gcloud

To check whether high resilience mode is enabled in your environment, run the following Google Cloud CLI command. The command prints the value of the resilienceMode field. A value of HIGH_RESILIENCE means that high resilience mode is enabled in your environment.

gcloud composer environments describe ENVIRONMENT_NAME \
  --location LOCATION \
  --format="value(config.resilienceMode)"

Replace the following:

  • ENVIRONMENT_NAME: the name of your Cloud Composer environment
  • LOCATION: the region where the environment is located
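
If you want to run this check from a script, the command output can be interpreted in Python. The following is a minimal sketch, assuming that the command prints the value of the resilienceMode field (HIGH_RESILIENCE for a highly resilient environment). Both helper names are hypothetical, and the second function requires the Google Cloud CLI to be installed and authenticated.

```python
import subprocess


def is_high_resilience(describe_output: str) -> bool:
    """Interpret the printed config.resilienceMode value.

    Assumption: a highly resilient environment prints HIGH_RESILIENCE.
    """
    return describe_output.strip() == "HIGH_RESILIENCE"


def check_environment(environment: str, location: str) -> bool:
    """Run the describe command shown above and interpret its output."""
    result = subprocess.run(
        [
            "gcloud", "composer", "environments", "describe", environment,
            "--location", location,
            "--format=value(config.resilienceMode)",
        ],
        check=True,  # raise CalledProcessError if gcloud exits with an error
        capture_output=True,
        text=True,
    )
    return is_high_resilience(result.stdout)
```

Separating the parsing from the command call keeps the comparison logic usable without access to a real environment.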

What's next