When you create an environment, Cloud Composer distributes the environment's resources between a Google-managed tenant project and a customer project.
Cloud Composer environments can be created in three configurations: Public IP, Private IP, and Private IP with Domain restricted sharing (DRS). Each configuration slightly alters the architecture of project resources.
In a Public IP configuration, environment resources are distributed between the customer and tenant projects. The tenant project hosts a Cloud SQL instance to run the Airflow database, and a Google App Engine Flex VM to run the Airflow web server.
The Airflow scheduler, workers, and web server connect to the Airflow database using Cloud SQL proxy processes, as shown in the figure below:
In a Private IP configuration, Cloud Composer resources are distributed between the customer and tenant projects. The tenant project hosts a Cloud SQL instance to run the Airflow database, and a Google App Engine Flex VM to run the Airflow web server and Cloud SQL Proxy processes.
The Airflow scheduler and workers connect to Cloud SQL proxy processes running in the tenant project through an HAProxy process running in the GKE cluster. The HAProxy process load balances traffic to the Cloud SQL instance between two Cloud SQL Proxy instances running in the tenant project, as shown in the figure below:
Private IP with Domain restricted sharing
If the Domain restricted sharing organization policy is turned on, Cloud Composer creates an additional bucket in the tenant project that the web server can access directly. An additional process also runs on the Cloud Composer GKE cluster to sync data from the Cloud Storage bucket in the customer project to the Cloud Storage bucket in the tenant project.
Tenant project resources
For unified Identity and Access Management access control and an additional layer of data security, Cloud Composer deploys Cloud SQL and App Engine in the tenant project.
Cloud SQL stores the Airflow metadata. To protect sensitive connection and workflow information, Cloud Composer limits database access to the default or the specified custom service account used to create the environment. Cloud Composer backs up the Airflow metadata daily to minimize potential data loss.
The service account used to create the Cloud Composer environment is the only account that can access your data in the Cloud SQL database. To remotely authorize access to your Cloud SQL database from an application, client, or other Google Cloud service, Cloud Composer provides the Cloud SQL proxy in the GKE cluster.
App Engine flexible environment hosts the Airflow web server. By default, the Airflow web server is integrated with Identity-Aware Proxy. Cloud Composer hides the IAP integration details and enables you to use the Cloud Composer IAM policy to manage web server access. To grant access only to the Airflow web server, you can assign the composer.user role, or you can assign different Cloud Composer roles that provide access to other resources in your environment.
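As a rough sketch of what assigning the role looks like at the policy level, the snippet below adds a member to the roles/composer.user binding in an IAM policy held as a plain dictionary (the shape printed by `gcloud projects get-iam-policy --format=json`). The helper name `add_binding` and the example email addresses are hypothetical, conditional bindings are ignored, and writing the updated policy back to the project is left out.

```python
import copy

# Role ID for the composer.user role mentioned above.
COMPOSER_USER_ROLE = "roles/composer.user"


def add_binding(policy, role, member):
    """Return a copy of `policy` with `member` added to the binding for `role`."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding.get("role") == role:
            members = binding.setdefault("members", [])
            if member not in members:
                members.append(member)
            return updated
    # No binding for this role yet, so create one.
    bindings.append({"role": role, "members": [member]})
    return updated


# Hypothetical starting policy and member.
policy = {"bindings": [{"role": "roles/viewer", "members": ["user:alice@example.com"]}]}
updated_policy = add_binding(policy, COMPOSER_USER_ROLE, "user:bob@example.com")
```

Returning a modified copy keeps the original policy available for comparison before anything is applied.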
Customer project resources
Cloud Composer deploys Cloud Storage, Google Kubernetes Engine, Cloud Logging, and Cloud Monitoring in your customer project.
Cloud Storage provides the storage bucket for staging DAGs, dependencies, and logs. To deploy workflows (DAGs), you copy your files to the bucket for your environment. Cloud Composer takes care of synchronizing the DAGs among workers, schedulers, and the web server. With Cloud Storage, you can store your workflow artifacts in the logs/ folders without worrying about size limitations and retain full access control of your data.
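Copying a DAG file to the environment bucket might look like the sketch below, which uses the google-cloud-storage client library. It assumes DAG files live under a `dags/` prefix in the bucket, which the text above does not state; the helper names and the bucket name in the usage note are hypothetical.

```python
from pathlib import Path, PurePosixPath


def dag_object_name(local_path, dags_prefix="dags"):
    """Build the object name for a local DAG file (assumes a dags/ prefix)."""
    return str(PurePosixPath(dags_prefix) / Path(local_path).name)


def upload_dag(bucket_name, local_path):
    """Copy one DAG file into the environment's bucket and return its object name."""
    # Imported lazily so dag_object_name works without the client library.
    from google.cloud import storage  # requires the google-cloud-storage package

    client = storage.Client()  # uses application default credentials
    blob = client.bucket(bucket_name).blob(dag_object_name(local_path))
    blob.upload_from_filename(str(local_path))
    return blob.name
```

A call such as `upload_dag("my-environment-bucket", "example_dag.py")` would place the file at `dags/example_dag.py`; Cloud Composer then handles synchronization to the workers, schedulers, and web server.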
Google Kubernetes Engine
By default, Cloud Composer deploys core components—such as Airflow scheduler, worker nodes, and CeleryExecutor—in a GKE cluster. For additional scale and security, Cloud Composer also supports VPC-native clusters using alias IPs.
Redis, the message broker for the CeleryExecutor, runs as a StatefulSet application so that messages persist across container restarts.
Running the scheduler and workers on GKE enables you to use the KubernetesPodOperator to run any container workload. By default, Cloud Composer enables auto-upgrade and auto-repair to protect the GKE clusters from security vulnerabilities.
Cloud Composer runs most of its internal processes in Google Kubernetes Engine pods, which are subject to the Kubernetes Pod Lifecycle. One important consequence is that pods can be evicted, for example when a particular pod consumes a lot of memory on a node. If the evicted pod was running a DAG task, that task fails. Task spikes and co-scheduling of workers are the two most common causes of pod eviction in Cloud Composer. For more information about troubleshooting DAG tasks that fail because of pod eviction, see Troubleshooting DAGs.
Note that the Airflow worker and scheduler nodes and the Airflow web server run on different service accounts.
- Scheduler and workers: If you do not specify a service account during environment creation, the environment runs under the default Compute Engine service account.
- Web server: The service account is auto-generated during environment creation and derived from the web server domain (for example, a domain such as foo-tp.appspot.com). You can see the airflowUri information in the environment details.
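To find the web server domain, you can read the airflowUri value from the environment details. The sketch below assumes the details are available as a dictionary with the URI under `config.airflowUri`; that layout and the helper name are assumptions, not something the text above specifies.

```python
from urllib.parse import urlparse


def web_server_domain(environment):
    """Extract the web server domain from an environment description."""
    # Assumed layout: {"config": {"airflowUri": "https://..."}}
    airflow_uri = environment["config"]["airflowUri"]
    return urlparse(airflow_uri).hostname


# Example using the domain from the text above.
environment = {"config": {"airflowUri": "https://foo-tp.appspot.com"}}
print(web_server_domain(environment))  # prints foo-tp.appspot.com
```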
Cloud Logging and Cloud Monitoring
Cloud Composer integrates with Cloud Logging and Cloud Monitoring, so you have a central place to view all Airflow service and workflow logs.
Because of the streaming nature of Cloud Logging, you can view the logs that the Airflow scheduler and workers emit immediately instead of waiting for Airflow logging module synchronization. And because the Cloud Logging logs for Cloud Composer are based on google-fluentd, you have access to all logs the scheduler and worker containers produce. These logs greatly improve debugging and contain useful system-level and Airflow dependency information.
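To narrow the streamed logs to a single environment and component, a Cloud Logging filter can be assembled as in the sketch below. The resource type `cloud_composer_environment`, its label names, and log IDs such as `airflow-worker` are assumptions that are not stated in this text, so verify them against your own log entries before relying on the filter.

```python
def composer_log_filter(project_id, environment_name, location, log_id):
    """Build a Cloud Logging filter string for one environment and one log."""
    clauses = [
        # Assumed monitored resource type and labels for Cloud Composer.
        'resource.type="cloud_composer_environment"',
        f'resource.labels.environment_name="{environment_name}"',
        f'resource.labels.location="{location}"',
        # Assumed log ID, for example "airflow-worker" or "airflow-scheduler".
        f'logName="projects/{project_id}/logs/{log_id}"',
    ]
    return " AND ".join(clauses)


# Hypothetical project, environment, and location.
worker_filter = composer_log_filter(
    "my-project", "my-environment", "us-central1", "airflow-worker"
)
```

The resulting string can be pasted into the Logs Explorer query box or passed to a Logging client.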
Cloud Monitoring collects and ingests metrics, events, and metadata from Cloud Composer to generate insights via dashboards and charts.
Airflow configuration information
Cloud Composer relies on the following configurations to execute workflows successfully:
- The Cloud Composer service backend coordinates with its GKE service agent through Pub/Sub using subscriptions and relies on Pub/Sub's default behavior to manage messages. Do not delete .*-composer-.* topics. Pub/Sub supports a maximum of 10,000 topics per project.
- The Cloud Composer service coordinates logging with Cloud Logging. To limit the number of logs in your Google Cloud project, you can stop all logs ingestion. However, do not disable Logging.
- Do not modify the Identity and Access Management policy binding for the Cloud Composer service account.
- Do not change the Airflow database schema.
- Any quotas or limits that apply to the standalone Google Cloud products that Cloud Composer uses for your Airflow deployment apply also to your environment.
- A Cloud Composer release running a stable Airflow version can include Airflow updates that are backported from a later Airflow version.
- The worker and scheduler nodes have a different capacity and run under a different service account than the Airflow web server. To avoid DAG failures on the Airflow web server, do not perform heavyweight computation or access Google Cloud resources that the web server does not have access to at DAG parse time.
- Deleting your environment does not delete the following data in your customer project: the Cloud Storage bucket for your environment, Logging logs, and the Persistent Disk used by the Redis Queue running on the GKE cluster. To avoid incurring charges to your Google Cloud account, export and delete your data, as needed.
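When you clean up Pub/Sub topics in a project, a small guard can help avoid deleting the topics that Cloud Composer depends on. The sketch below filters a list of topic names against the .*-composer-.* pattern from the list above; the helper name and the example topic names are hypothetical.

```python
import re

# Pattern for topics that must not be deleted, taken from the list above.
PROTECTED_TOPIC = re.compile(r".*-composer-.*")


def deletable_topics(topic_names):
    """Return the topic names that do not match the protected pattern."""
    safe = []
    for name in topic_names:
        # Accept full resource paths or bare topic IDs; match on the ID only.
        topic_id = name.rsplit("/", 1)[-1]
        if not PROTECTED_TOPIC.fullmatch(topic_id):
            safe.append(name)
    return safe


# Hypothetical topic names.
topics = [
    "projects/my-project/topics/us-central1-composer-demo-1234",
    "projects/my-project/topics/orders",
]
print(deletable_topics(topics))  # only the "orders" topic remains
```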