Cloud Composer environment architecture


This page describes the architecture of Cloud Composer 2 environments.

Environment architecture configurations

Cloud Composer 2 environments can have the following architecture configurations:

  • Public IP environment architecture
  • Private IP environment architecture
  • Highly resilient Private IP architecture

Each configuration slightly alters the architecture of environment resources.

Customer and tenant projects

When you create an environment, Cloud Composer distributes the environment's resources between a tenant and a customer project.

The customer project is a Google Cloud project where you create your environments. You can create more than one environment in a single customer project.

The tenant project is a Google-managed project that provides unified access control and an additional layer of data security for your environment. Each Cloud Composer environment has its own tenant project.

Environment components

A Cloud Composer environment consists of environment components.

An environment component is an element of the managed Airflow infrastructure that runs on Google Cloud as part of your environment.

Environment components run either in the tenant or in the customer project of your environment.

Some of your environment's components are based on standalone Google Cloud products. Quotas and limits for these products also apply to your environments. For example, Cloud Composer environments use VPC peerings. Quotas on the maximum number of VPC peerings apply to your customer project, so once your project reaches this maximum number of peerings, you cannot create additional environments.

Environment's cluster

The environment's cluster is a VPC-native Google Kubernetes Engine cluster that runs in Autopilot mode:

  • Environment nodes are VMs in the environment's cluster.

  • Pods in the environment's cluster run containers with other environment components, such as Airflow workers and schedulers. Pods run on environment nodes.

  • Workload resources of your environment's cluster manage sets of pods in your environment's cluster. Many components of your environment are implemented as different types of workload resources. For example, Airflow schedulers run as Deployments (see the sketch after this list). In addition to Deployments, your environment also has StatefulSets, DaemonSets, and Jobs workload types.
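Because these are standard Kubernetes workload resources, you can inspect them with the Kubernetes API. The following is a minimal sketch using the Python Kubernetes client, not an official procedure; it assumes you have already fetched credentials for the environment's cluster (for example, with gcloud container clusters get-credentials) into a local kubeconfig.

```python
# A minimal sketch: list the Deployments in the environment's cluster.
# Assumes a local kubeconfig already points at the cluster.
from kubernetes import client, config

config.load_kube_config()  # read the default kubeconfig

apps = client.AppsV1Api()
for deployment in apps.list_deployment_for_all_namespaces().items:
    print(deployment.metadata.namespace, deployment.metadata.name)
```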

By default, Cloud Composer enables node auto-upgrades and node auto-repair to protect your environment's cluster from security vulnerabilities. These operations happen during maintenance windows that you specify for your environment.

Airflow schedulers, triggerer, workers, and Redis queue

Airflow schedulers control the scheduling of DAG runs and individual tasks from DAGs. Schedulers distribute tasks to Airflow workers by using a Redis queue, which runs as an application in your environment's cluster. Airflow schedulers run as Deployments in your environment's cluster.

Airflow workers execute individual tasks from DAGs by taking them from the Redis queue. Airflow workers run as Custom Resources in your environment's cluster.

Airflow triggerer asynchronously monitors all deferred tasks in your environment. You can configure the number of triggerer instances when you create or update Cloud Composer environments. Cloud Composer supports the following triggerer configurations:

  • Triggerer count:

    • Standard resilience: you can run up to 10 triggerers
    • High resilience: at least 2 triggerers, up to a maximum of 10

    You can set the number of triggerers to zero, but you need at least one triggerer instance in your environment (or at least two in highly resilient environments) to use deferrable operators in your DAGs. If the number of triggerers is set to zero, your environment's cluster still runs a workload for them, but with zero pods; this doesn't incur any costs. For an example of a deferrable operator, see the sketch after this list.

  • Triggerer resource allocation:

    • Maximum 1 vCPU per triggerer
    • Maximum memory (in GB) equal to the number of triggerer CPUs multiplied by 6.5
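Deferrable operators are what the triggerer monitors. The following minimal sketch shows a DAG with a deferrable sensor; the DAG id and timings are illustrative. While the sensor waits, it releases its worker slot and the triggerer tracks it instead.

```python
# A minimal sketch of a DAG with a deferrable operator. Running it requires
# at least one triggerer instance in the environment. All names and timings
# below are illustrative.
import datetime

from airflow import DAG
from airflow.sensors.time_delta import TimeDeltaSensorAsync

with DAG(
    dag_id="deferrable_example",
    start_date=datetime.datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # While waiting, this sensor defers itself to the triggerer instead of
    # occupying an Airflow worker slot.
    wait_one_hour = TimeDeltaSensorAsync(
        task_id="wait_one_hour",
        delta=datetime.timedelta(hours=1),
    )
```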

Redis queue holds a queue of individual tasks from your DAGs. Airflow schedulers fill the queue; Airflow workers take their tasks from it. Redis queue runs as a StatefulSet application in your environment's cluster, so that messages persist across container restarts.

Airflow web server

Airflow web server runs the Airflow UI of your environment.

In Cloud Composer 2, the Airflow web server runs as a Deployment in your environment's cluster.

Cloud Composer 2 provides access to the interface based on user identities and IAM policy bindings defined for users. Compared to Cloud Composer 1, Cloud Composer 2 uses a different mechanism that does not rely on Identity-Aware Proxy.

Airflow database

The Airflow database is a Cloud SQL instance that runs in the tenant project of your environment and hosts the Airflow metadata database.

To protect sensitive connection and workflow information, Cloud Composer allows database access only to the service account of your environment.

Environment's bucket

The environment's bucket is a Cloud Storage bucket that stores DAGs, plugins, data dependencies, and Airflow logs. The bucket resides in the customer project.

When you upload your DAG files to the dags/ folder in your environment's bucket, Cloud Composer synchronizes the DAGs to the workers, schedulers, and web server of your environment. You can store your workflow artifacts in the data/ and logs/ folders without worrying about size limitations, and you retain full access control of your data.
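As an example of this flow, the following minimal sketch uploads a DAG file to the dags/ folder with the Cloud Storage Python client. The bucket and file names are hypothetical placeholders; use your environment's actual bucket name.

```python
# A minimal sketch: upload a DAG file to the environment's bucket so that
# Cloud Composer synchronizes it to the Airflow components. The bucket and
# file names below are hypothetical.
from google.cloud import storage

bucket_name = "us-central1-example-environment-bucket"  # hypothetical bucket

client = storage.Client()
bucket = client.bucket(bucket_name)
bucket.blob("dags/my_dag.py").upload_from_filename("my_dag.py")
print(f"Uploaded my_dag.py to gs://{bucket_name}/dags/")
```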

Other environment components

A Cloud Composer environment has several additional environment components:

  • Cloud SQL Storage. Stores the Airflow database backups. Cloud Composer backs up the Airflow metadata daily to minimize potential data loss.

    Cloud SQL Storage runs in the tenant project of your environment. You cannot access the Cloud SQL Storage contents.

  • Cloud SQL Proxy. Connects other components of your environment to the Airflow database.

    Your Public IP environment can have one or more Cloud SQL Proxy instances, depending on the volume of traffic to the Airflow database.

    In a Public IP environment, the Cloud SQL Proxy runs as a Deployment in your environment's cluster.

    When deployed in your environment's cluster, Cloud SQL Proxy also authorizes access to your Cloud SQL instance from an application, client, or other Google Cloud service.

  • Airflow monitoring. Reports environment metrics to Cloud Monitoring and triggers the airflow_monitoring DAG. The airflow_monitoring DAG reports the environment health data, which is later used, for example, on the monitoring dashboard of your environment. Airflow monitoring runs as a Deployment in your environment's cluster.

  • Composer Agent. Performs environment operations such as creating, updating, upgrading, and deleting environments. In general, this component is responsible for introducing changes to your environment. Runs as a Job in your environment's cluster.

  • Airflow InitDB. Creates a Cloud SQL instance and the initial database schema. Airflow InitDB runs as a Job in your environment's cluster.

  • FluentD. Collects logs from all environment components and uploads the logs to Cloud Logging. Runs as a DaemonSet in your environment's cluster.

  • Pub/Sub subscriptions. Your environment communicates with its GKE service agent through Pub/Sub subscriptions. It relies on Pub/Sub's default behavior to manage messages. Do not delete .*-composer-.* Pub/Sub topics. Pub/Sub supports a maximum of 10,000 topics per project.

  • PSC endpoint. Connects Airflow schedulers and workers to the Airflow database in the Private IP with PSC architecture.

  • Customer Metrics Stackdriver Adapter. Reports metrics of your environment for autoscaling. This component runs as a Deployment in your environment's cluster.

  • Airflow Worker Set Controller. Automatically scales your environment based on metrics from the Customer Metrics Stackdriver Adapter. This component runs as a Deployment in your environment's cluster.

  • Cloud Storage FUSE. Mounts your environment's bucket as a file system on Airflow workers, schedulers, and the web server, so that these components can access the data from the bucket (see the sketch after this list). Runs as a DaemonSet in your environment's cluster.
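To illustrate the Cloud Storage FUSE mount, the following minimal sketch is a DAG whose task reads a file from the data/ folder of the environment's bucket through the local mount point on a worker. The file name is a hypothetical placeholder.

```python
# A minimal sketch: read a bucket file through the Cloud Storage FUSE mount.
# The environment's bucket is mounted under /home/airflow/gcs/ on Airflow
# components; the file name below is hypothetical.
import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def read_from_bucket():
    # data/ in the environment's bucket maps to this local path.
    with open("/home/airflow/gcs/data/example.txt") as f:
        print(f.read())


with DAG(
    dag_id="read_fuse_example",
    start_date=datetime.datetime(2024, 1, 1),
    schedule_interval=None,
) as dag:
    PythonOperator(task_id="read_file", python_callable=read_from_bucket)
```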

Public IP environment architecture

Figure 1. Public IP environment architecture

In a Public IP environment architecture for Cloud Composer 2:

  • The tenant project hosts a Cloud SQL instance and Cloud SQL storage.
  • The customer project hosts all other components of the environment.
  • Airflow schedulers and workers in the customer project communicate with the Airflow database through a Cloud SQL proxy instance located in the customer project.

Private IP environment architecture

Figure 2. Private IP Cloud Composer environment resources in the tenant project and the customer project

By default, Cloud Composer 2 uses Private Service Connect, so that your Private IP environments communicate internally without VPC peerings. As a non-default option, you can use VPC peerings instead of Private Service Connect in your environment.

In the Private IP environment architecture:

  • The tenant project hosts a Cloud SQL instance and Cloud SQL storage.
  • The customer project hosts all other components of the environment.
  • Airflow schedulers and workers connect to the Airflow database through the configured PSC endpoint.

Highly resilient Private IP architecture

Figure 3. Highly resilient Private IP Cloud Composer environment resources in the tenant project and the customer project

Highly resilient Cloud Composer environments are Cloud Composer 2 environments that use built-in redundancy and failover mechanisms to reduce the environment's susceptibility to zonal failures and single-point-of-failure outages.

In this type of Private IP environment:

  • The Cloud SQL instance of your environment is configured for high availability (it is a regional instance). Within a regional instance, the configuration consists of a primary instance and a standby instance.
  • Your environment runs two Airflow schedulers, two web servers, and, if triggerers are used, a minimum of two triggerers (up to ten in total). These pairs of components run in two separate zones.
  • The minimum number of workers is set to two, and your environment's cluster distributes worker instances between zones. In the case of a zonal outage, affected worker instances are rescheduled in a different zone.

Integration with Cloud Logging and Cloud Monitoring

Cloud Composer integrates with Cloud Logging and Cloud Monitoring of your Google Cloud project, so that you have a central place to view Airflow service and workflow logs.

Cloud Monitoring collects and ingests metrics, events, and metadata from Cloud Composer to generate insights through dashboards and charts.

Because of the streaming nature of Cloud Logging, you can view the logs that the Airflow scheduler and workers emit immediately instead of waiting for Airflow logs to appear in the Cloud Storage bucket of your environment. Because the Cloud Logging logs for Cloud Composer are based on google-fluentd, you have access to all logs produced by Airflow schedulers and workers.
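As an illustration of this integration, the following minimal sketch reads the environment health metric that the airflow_monitoring DAG reports. It uses the Cloud Monitoring Python client; the project id is a hypothetical placeholder.

```python
# A minimal sketch: read the Cloud Composer environment health metric from
# Cloud Monitoring for the last hour. The project id below is hypothetical.
import time

from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {"start_time": {"seconds": now - 3600}, "end_time": {"seconds": now}}
)

results = client.list_time_series(
    request={
        "name": "projects/example-project",  # hypothetical project id
        "filter": 'metric.type = "composer.googleapis.com/environment/healthy"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)
for series in results:
    # Each point of this metric is a boolean health indicator.
    print(series.resource.labels.get("environment_name"),
          series.points[0].value.bool_value)
```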

To limit the number of logs in your Google Cloud project, you can stop all logs ingestion by using exclusion filters, as in the sketch below. Do not disable the Cloud Logging API itself.
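The exclusion approach can be sketched with the Cloud Logging API client. This is a minimal sketch, not an official procedure; the project id, exclusion name, and filter are illustrative, so adapt the filter to the logs you actually want to exclude.

```python
# A minimal sketch: create a logs exclusion instead of disabling Cloud Logging.
# The project id, exclusion name, and filter below are hypothetical examples.
from google.cloud.logging_v2.services.config_service_v2 import ConfigServiceV2Client
from google.cloud.logging_v2.types import LogExclusion

client = ConfigServiceV2Client()
exclusion = LogExclusion(
    name="exclude-composer-environment-logs",  # hypothetical exclusion name
    filter='resource.type="cloud_composer_environment"',  # hypothetical filter
    description="Exclude Cloud Composer environment logs from ingestion.",
)
client.create_exclusion(
    parent="projects/example-project",  # hypothetical project id
    exclusion=exclusion,
)
```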
