About environment scaling

This page describes how environment scaling works in Cloud Composer 2.

Autoscaling environments

Cloud Composer 2 environments automatically scale in response to the demands of your executed DAGs and tasks:

  • If your environment experiences a heavy load, Cloud Composer automatically increases the number of workers in your environment.
  • If your environment does not use some of its workers, these workers are removed to save environment resources and costs.
  • You can set the minimum and maximum number of workers for your environment. Cloud Composer automatically scales your environment within the set limits. You can adjust these limits at any time.
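
The effect of the minimum and maximum limits can be illustrated with a short sketch. This is only a model of the behavior described above, not Cloud Composer code: the function name `clamp_worker_count` and the example limits (2 and 6) are hypothetical.

```python
def clamp_worker_count(desired: int, min_workers: int, max_workers: int) -> int:
    """Limit a desired worker count to the range [min_workers, max_workers].

    Hypothetical helper for illustration only; it is not part of any
    Cloud Composer API.
    """
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    # Never go below the minimum or above the maximum.
    return max(min_workers, min(desired, max_workers))


if __name__ == "__main__":
    # Assumed limits: at least 2 workers, at most 6.
    for desired in (0, 4, 10):
        print(desired, "->", clamp_worker_count(desired, 2, 6))
```

Whatever worker count the load suggests, the resulting count stays within the limits that you set.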

The number of workers is adjusted based on the Scaling Factor Target metric. This metric is calculated based on:

  • Current number of workers
  • Number of Celery tasks in the Celery queue that are not assigned to a worker
  • Number of idle workers
  • celery.worker_concurrency Airflow configuration option
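
This page does not give the formula for the Scaling Factor Target metric. The following sketch shows one way the four inputs listed above could combine into a scale factor. It is an assumption made for illustration and should not be read as the calculation that Cloud Composer actually performs. The function name and the model (busy workers plus the workers needed to drain the queue, divided by the current worker count) are hypothetical.

```python
import math


def illustrative_scale_factor(
    current_workers: int,
    queued_tasks: int,
    idle_workers: int,
    worker_concurrency: int,
) -> float:
    """Return a hypothetical scale factor for the number of workers.

    ASSUMPTION: this model is illustrative only. It is NOT the formula
    that Cloud Composer uses for the Scaling Factor Target metric.
    A result above 1.0 suggests adding workers; below 1.0, removing them.
    """
    if current_workers <= 0 or worker_concurrency <= 0:
        raise ValueError("current_workers and worker_concurrency must be positive")
    if queued_tasks < 0:
        raise ValueError("queued_tasks must not be negative")
    if not 0 <= idle_workers <= current_workers:
        raise ValueError("idle_workers must be between 0 and current_workers")

    # Workers that are currently doing something.
    busy_workers = current_workers - idle_workers
    # Workers needed to pick up the unassigned tasks, given that each
    # worker can run up to worker_concurrency tasks at once.
    workers_for_queue = math.ceil(queued_tasks / worker_concurrency)
    needed_workers = busy_workers + workers_for_queue
    return needed_workers / current_workers


if __name__ == "__main__":
    # Assumed inputs: 4 busy workers, 32 unassigned tasks, concurrency of 16.
    print(illustrative_scale_factor(4, 32, 0, 16))
```

In this model, a growing queue pushes the factor above 1, and idle workers with an empty queue pull it below 1, which matches the scale-up and scale-down behavior described earlier in this section.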

Cloud Composer autoscaling uses three different autoscalers provided by GKE:

  • Horizontal Pod Autoscaler (HPA)
  • Cluster Autoscaler (CA)
  • Node auto-provisioning (NAP)

Cloud Composer configures these autoscalers in the environment's cluster. Together, they automatically adjust the number of nodes in the cluster, the machine type of the nodes, and the number of workers.

Scale and performance parameters

In addition to autoscaling, you can control the scale and performance parameters of your environment by adjusting the CPU, memory, and disk limits for schedulers, the web server, and workers. Doing so scales your environment vertically, which complements the horizontal scaling that autoscaling provides. You can adjust these parameters at any time.

The environment size parameter controls the performance parameters of the managed Cloud Composer infrastructure, which includes the Airflow database. Consider selecting a larger environment size if you want to run a large number of DAGs and tasks with higher infrastructure performance. For example, a larger environment size increases the number of Airflow task log entries that your environment can process with minimal delay.

Multiple schedulers

Airflow 2 can use more than one Airflow scheduler at the same time. This Airflow feature is also known as the HA scheduler. In Cloud Composer 2, you can set the number of schedulers for your environment and adjust it at any time. Cloud Composer does not automatically scale the number of schedulers in your environment.

For more information about configuring the number of schedulers for your environment, see Scale environments.

Database disk space

Disk space for the Airflow database increases automatically to accommodate demand.