This page describes how to scale Cloud Composer environments in Cloud Composer 2.
Other pages about scaling:
For a guide about selecting optimal scale and performance parameters for your environment, see Optimize environment performance and costs.
For information about how environment scaling works, see Environment scaling.
Scale vertically and horizontally
Options for horizontal scaling:
- Adjust the minimum and maximum number of workers
- Adjust the number of schedulers
- Adjust the number of triggerers
Options for vertical scaling:
- Adjust worker, scheduler, triggerer, and web server scale and performance parameters
- Adjust the environment size
Adjust the minimum and maximum number of workers
You can set the minimum and maximum number of workers for your environment. Cloud Composer automatically scales your environment within the set limits. You can adjust these limits at any time.
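In other words, whatever worker count autoscaling computes from the current load, Cloud Composer clamps it to the limits you set. A one-line Python sketch of that clamping behavior (illustrative only, not the actual autoscaler code):

```python
def effective_worker_count(desired, workers_min, workers_max):
    """Clamp the autoscaler's desired worker count to the configured limits."""
    return max(workers_min, min(desired, workers_max))

# With limits 2..6: a desired count of 1 is raised to 2,
# 10 is lowered to 6, and 4 is left unchanged.
```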
Console
Go to the Environments page in the Google Cloud console:
Select your environment.
Go to the Environment configuration tab.
In the Resources > Workloads configuration item, click Edit.
In the Workloads configuration dialog, in the Workers autoscaling section, adjust the limits for Airflow workers:
In the Minimum number of workers field, specify the number of Airflow workers that your environment must always run. The number of workers in your environment does not go below this number, even if a lower number of workers can handle the load.
In the Maximum number of workers field, specify the maximum number of Airflow workers that your environment can run. The number of workers in your environment does not go above this number, even if a higher number of workers is required to handle the load.
Click Save.
gcloud
Run the following Google Cloud CLI command:
gcloud composer environments update ENVIRONMENT_NAME \
--location LOCATION \
--min-workers WORKERS_MIN \
--max-workers WORKERS_MAX
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- WORKERS_MIN with the minimum number of Airflow workers that your environment can run. The number of workers in your environment does not go below this number, even if a lower number of workers can handle the load.
- WORKERS_MAX with the maximum number of Airflow workers that your environment can run. The number of workers in your environment does not go above this number, even if a higher number of workers is required to handle the load.
Example:
gcloud composer environments update example-environment \
--location us-central1 \
--min-workers 2 \
--max-workers 6
API
Construct an environments.patch API request. In this request:
- In the updateMask parameter, specify the config.workloadsConfig.worker.minCount,config.workloadsConfig.worker.maxCount mask.
- In the request body, in the minCount and maxCount fields, specify the new worker limits.
"config": {
"workloadsConfig": {
"worker": {
"minCount": WORKERS_MIN,
"maxCount": WORKERS_MAX
}
}
}
Replace:
- WORKERS_MIN with the minimum number of Airflow workers that your environment can run. The number of workers in your environment does not go below this number, even if a lower number of workers can handle the load.
- WORKERS_MAX with the maximum number of Airflow workers that your environment can run. The number of workers in your environment does not go above this number, even if a higher number of workers is required to handle the load.
Example:
// PATCH https://composer.googleapis.com/v1/projects/example-project/
// locations/us-central1/environments/example-environment?updateMask=
// config.workloadsConfig.worker.minCount,
// config.workloadsConfig.worker.maxCount
"config": {
"workloadsConfig": {
"worker": {
"minCount": 2,
"maxCount": 6
}
}
}
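If you script this API call, the request body and the updateMask must stay in sync. The helper below is a minimal sketch (not part of any official Google client library; the function name is an assumption) that assembles both pieces for the worker-limits PATCH shown above:

```python
def build_worker_patch(workers_min, workers_max):
    """Assemble the updateMask string and request body for an
    environments.patch call that changes the worker autoscaling limits."""
    update_mask = ",".join([
        "config.workloadsConfig.worker.minCount",
        "config.workloadsConfig.worker.maxCount",
    ])
    body = {
        "config": {
            "workloadsConfig": {
                "worker": {"minCount": workers_min, "maxCount": workers_max}
            }
        }
    }
    return update_mask, body

# Matches the example above: min 2 workers, max 6 workers.
mask, body = build_worker_patch(2, 6)
```

Sending the request still requires authenticated HTTP, for example through an official Google API client; the sketch only shows how the two parts fit together.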
Terraform
The min_count and max_count fields in the workloads_config.worker block specify the minimum and maximum number of workers in your environment:
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
workloads_config {
worker {
min_count = WORKERS_MIN
max_count = WORKERS_MAX
}
}
}
}
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- WORKERS_MIN with the minimum number of Airflow workers that your environment can run. The number of workers in your environment does not go below this number, even if a lower number of workers can handle the load.
- WORKERS_MAX with the maximum number of Airflow workers that your environment can run. The number of workers in your environment does not go above this number, even if a higher number of workers is required to handle the load.
Example:
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
workloads_config {
worker {
min_count = 2
max_count = 6
}
}
}
}
Adjust the number of schedulers
Your environment can run more than one Airflow scheduler at the same time. Use multiple schedulers to distribute load between several scheduler instances for better performance and reliability.
You can have up to 10 schedulers in your environment.

Increasing the number of schedulers does not always improve Airflow performance. For example, having only one scheduler might provide better performance than having two. This can happen when the extra scheduler is not used, and thus consumes your environment's resources without contributing to overall performance. Actual scheduler performance depends on the number of Airflow workers, the number of DAGs and tasks that run in your environment, and the configuration of both Airflow and the environment.
We recommend starting with two schedulers and then monitoring the performance of your environment. If you change the number of schedulers, you can always scale your environment back to the original number of schedulers.
For more information about configuring multiple schedulers, see Airflow documentation.
To change the number of schedulers for your environment:
Console
Go to the Environments page in the Google Cloud console:
Select your environment.
Go to the Environment configuration tab.
In the Resources > Workloads configuration item, click Edit.
In the Workloads configuration dialog, in the Number of schedulers drop-down list, set the number of schedulers for your environment.
Click Save.
gcloud
Run the following Google Cloud CLI command:
gcloud composer environments update ENVIRONMENT_NAME \
--location LOCATION \
--scheduler-count SCHEDULER_COUNT
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- SCHEDULER_COUNT with the number of schedulers.
Example:
gcloud composer environments update example-environment \
--location us-central1 \
--scheduler-count 2
API
Create an environments.patch API request. In this request:
- In the updateMask parameter, specify the config.workloadsConfig.scheduler mask.
- In the request body, in the count field, specify the number of schedulers.
"config": {
"workloadsConfig": {
"scheduler": {
"count": SCHEDULER_COUNT
}
}
}
Replace:
- SCHEDULER_COUNT with the number of schedulers.
Example:
// PATCH https://composer.googleapis.com/v1/projects/example-project/
// locations/us-central1/environments/example-environment?updateMask=
// config.workloadsConfig.scheduler
"config": {
"workloadsConfig": {
"scheduler": {
"count": 2
}
}
}
Terraform
The count field in the workloads_config.scheduler block specifies the number of schedulers in your environment:
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
workloads_config {
scheduler {
count = SCHEDULER_COUNT
}
}
}
}
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- SCHEDULER_COUNT with the number of schedulers.
Example:
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
workloads_config {
scheduler {
count = 2
}
}
}
}
Adjust the number of triggerers
By default, the Airflow triggerer is disabled in your environment, and the number of triggerers is set to 0. After you set the number of triggerers to 1, the triggerer is enabled and you can use deferrable operators in your DAGs.
Even if the triggerer is disabled, your environment's cluster still runs a workload for it, with zero pods. When the triggerer is enabled, it is billed in the same way as other environment components, using Cloud Composer Compute SKUs.
Console
In Google Cloud console, go to the Environments page.
In the list of environments, click the name of your environment. The Environment details page opens.
Go to the Environment configuration tab.
In the Resources > Workloads item, click Edit. The Workloads configuration pane opens.
Select Enable triggerer. As an option, you can also adjust the CPU and memory for the triggerer.
Click Save and wait until your environment is updated.
gcloud
Run the following Google Cloud CLI command:
gcloud composer environments update ENVIRONMENT_NAME \
--location LOCATION \
--triggerer-count TRIGGERER_COUNT
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- TRIGGERER_COUNT with the number of triggerers.
Example:
gcloud composer environments update example-environment \
--location us-central1 \
--triggerer-count 1
API
Create an environments.patch API request. In this request:
- In the updateMask parameter, specify the config.workloadsConfig.triggerer mask.
- Your environment can have only one triggerer. In the request body, specify the number of triggerers in the following way:
  - To enable the Airflow triggerer, set the count value to 1.
  - To disable the Airflow triggerer, set the count value to 0.
"config": {
"workloadsConfig": {
"triggerer": {
"count": 1
}
}
}
The following example enables the triggerer and sets default CPU and memory parameters for it. If you want to use custom parameters, specify them in the same API call.
// PATCH https://composer.googleapis.com/v1beta1/projects/example-project/
// locations/us-central1/environments/example-environment?updateMask=
// config.workloadsConfig.triggerer
"config": {
"workloadsConfig": {
"triggerer": {
"count": 1
}
}
}
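Because an environment has at most one triggerer, the request body only ever carries a count of 0 or 1. A small Python sketch of that rule (the helper name is an assumption for illustration, not part of any Google client library):

```python
def build_triggerer_patch(enabled):
    """Return the environments.patch request body that enables (count=1)
    or disables (count=0) the environment's single Airflow triggerer."""
    return {
        "config": {
            "workloadsConfig": {
                "triggerer": {"count": 1 if enabled else 0}
            }
        }
    }
```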
Terraform
The count field in the workloads_config.triggerer block specifies the number of triggerers in your environment:
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
workloads_config {
triggerer {
count = TRIGGERER_COUNT
}
}
}
}
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- TRIGGERER_COUNT with the number of triggerers.
Example:
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
workloads_config {
triggerer {
count = 1
}
}
}
}
Adjust worker, scheduler, triggerer, and web server scale and performance parameters
You can specify the number of CPUs and the amount of memory and disk space used by your environment. In this way, you can increase the performance of your environment, in addition to the horizontal scaling provided by multiple workers and schedulers.
Console
In Google Cloud console, go to the Environments page.
In the list of environments, click the name of your environment. The Environment details page opens.
Go to the Environment configuration tab.
In the Resources > Workloads item, click Edit. The Workloads configuration pane opens.
In the Number of schedulers and Number of triggerers drop-down lists, select the number of schedulers and triggerers in your environment.
In the Workloads configuration pane, in the CPU, Memory, and Storage fields, specify the number of CPUs, memory, and storage for Airflow schedulers, the triggerer, the web server, and workers.
Click Save.
gcloud
The following arguments control the CPU, memory, and disk space parameters of Airflow schedulers, triggerers, the web server, and workers. Each scheduler, triggerer, and worker uses the specified amount of resources.
- --scheduler-cpu specifies the number of CPUs for an Airflow scheduler.
- --scheduler-memory specifies the amount of memory for an Airflow scheduler.
- --scheduler-storage specifies the amount of disk space for an Airflow scheduler.
- --triggerer-cpu specifies the number of CPUs for an Airflow triggerer.
- --triggerer-memory specifies the amount of memory for an Airflow triggerer.
- --web-server-cpu specifies the number of CPUs for the Airflow web server.
- --web-server-memory specifies the amount of memory for the Airflow web server.
- --web-server-storage specifies the amount of disk space for the Airflow web server.
- --worker-cpu specifies the number of CPUs for an Airflow worker.
- --worker-memory specifies the amount of memory for an Airflow worker.
- --worker-storage specifies the amount of disk space for an Airflow worker.
gcloud composer environments update ENVIRONMENT_NAME \
--location LOCATION \
--scheduler-cpu SCHEDULER_CPU \
--scheduler-memory SCHEDULER_MEMORY \
--scheduler-storage SCHEDULER_STORAGE \
--triggerer-cpu TRIGGERER_CPU \
--triggerer-memory TRIGGERER_MEMORY \
--web-server-cpu WEB_SERVER_CPU \
--web-server-memory WEB_SERVER_MEMORY \
--web-server-storage WEB_SERVER_STORAGE \
--worker-cpu WORKER_CPU \
--worker-memory WORKER_MEMORY \
--worker-storage WORKER_STORAGE
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- SCHEDULER_CPU with the number of CPUs for a scheduler, in vCPU units.
- SCHEDULER_MEMORY with the amount of memory for a scheduler.
- SCHEDULER_STORAGE with the disk size for a scheduler.
- TRIGGERER_CPU with the number of CPUs for a triggerer, in vCPU units.
- TRIGGERER_MEMORY with the amount of memory for a triggerer.
- WEB_SERVER_CPU with the number of CPUs for the web server, in vCPU units.
- WEB_SERVER_MEMORY with the amount of memory for the web server.
- WEB_SERVER_STORAGE with the disk size for the web server.
- WORKER_CPU with the number of CPUs for a worker, in vCPU units.
- WORKER_MEMORY with the amount of memory for a worker.
- WORKER_STORAGE with the disk size for a worker.
Example:
gcloud composer environments update example-environment \
--location us-central1 \
--scheduler-cpu 0.5 \
--scheduler-memory 2.5GB \
--scheduler-storage 2GB \
--triggerer-cpu 1 \
--triggerer-memory 1GB \
--web-server-cpu 1 \
--web-server-memory 2.5GB \
--web-server-storage 2GB \
--worker-cpu 1 \
--worker-memory 2GB \
--worker-storage 2GB
API
Create an environments.patch API request. In this request:
- In the updateMask parameter, specify the fields that you want to update. For example, to update all parameters for schedulers, specify the config.workloadsConfig.scheduler.cpu,config.workloadsConfig.scheduler.memoryGb,config.workloadsConfig.scheduler.storageGb mask. When you update triggerer parameters, specify the config.workloadsConfig.triggerer mask. It is not possible to specify masks for individual parameters of the triggerer.
- In the request body, specify the scale and performance parameters.
"config": {
"workloadsConfig": {
"scheduler": {
"cpu": SCHEDULER_CPU,
"memoryGb": SCHEDULER_MEMORY,
"storageGb": SCHEDULER_STORAGE
},
"triggerer": {
"count": 1,
"cpu": TRIGGERER_CPU,
"memoryGb": TRIGGERER_MEMORY
},
"webServer": {
"cpu": WEB_SERVER_CPU,
"memoryGb": WEB_SERVER_MEMORY,
"storageGb": WEB_SERVER_STORAGE
},
"worker": {
"cpu": WORKER_CPU,
"memoryGb": WORKER_MEMORY,
"storageGb": WORKER_STORAGE
}
}
}
Replace:
- SCHEDULER_CPU with the number of CPUs for a scheduler, in vCPU units.
- SCHEDULER_MEMORY with the amount of memory for a scheduler, in GB.
- SCHEDULER_STORAGE with the disk size for a scheduler, in GB.
- TRIGGERER_CPU with the number of CPUs for a triggerer, in vCPU units.
- TRIGGERER_MEMORY with the amount of memory for a triggerer, in GB.
- WEB_SERVER_CPU with the number of CPUs for the web server, in vCPU units.
- WEB_SERVER_MEMORY with the amount of memory for the web server, in GB.
- WEB_SERVER_STORAGE with the disk size for the web server, in GB.
- WORKER_CPU with the number of CPUs for a worker, in vCPU units.
- WORKER_MEMORY with the amount of memory for a worker, in GB.
- WORKER_STORAGE with the disk size for a worker, in GB.
Example:
// PATCH https://composer.googleapis.com/v1/projects/example-project/
// locations/us-central1/environments/example-environment?updateMask=
// config.workloadsConfig.scheduler.cpu,
// config.workloadsConfig.scheduler.memoryGb,
// config.workloadsConfig.scheduler.storageGb,
// config.workloadsConfig.triggerer,
// config.workloadsConfig.webServer.cpu,
// config.workloadsConfig.webServer.memoryGb,
// config.workloadsConfig.webServer.storageGb,
// config.workloadsConfig.worker.cpu,
// config.workloadsConfig.worker.memoryGb,
// config.workloadsConfig.worker.storageGb
"config": {
"workloadsConfig": {
"scheduler": {
"cpu": 0.5,
"memoryGb": 2.5,
"storageGb": 2
},
"triggerer": {
"count": 1,
"cpu": 1,
"memoryGb": 1
},
"webServer": {
"cpu": 0.5,
"memoryGb": 2.5,
"storageGb": 2
},
"worker": {
"cpu": 1,
"memoryGb": 2,
"storageGb": 2
}
}
}
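The updateMask rules above have one asymmetry: scheduler, web server, and worker fields are masked individually, while the triggerer must always be masked as a whole block. If you build masks in a script, that special case is easy to miss. The sketch below encodes the rule; it is an illustrative helper under the assumptions stated in its docstring, not an official API:

```python
def build_update_mask(components):
    """Build an environments.patch updateMask from a mapping of
    component name -> list of field names to update. Assumes component
    names match the workloadsConfig keys (scheduler, webServer, worker,
    triggerer); the triggerer is always masked as a whole block."""
    paths = []
    for component, fields in components.items():
        if component == "triggerer":
            # Individual triggerer fields cannot be masked.
            paths.append("config.workloadsConfig.triggerer")
        else:
            for field in fields:
                paths.append(f"config.workloadsConfig.{component}.{field}")
    return ",".join(paths)

mask = build_update_mask({
    "scheduler": ["cpu", "memoryGb", "storageGb"],
    "triggerer": ["cpu", "memoryGb"],  # collapsed to the whole block
})
```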
Terraform
The following fields in the workloads_config block control the CPU, memory, and disk space parameters of Airflow schedulers, triggerers, the web server, and workers. Each scheduler, triggerer, and worker uses the specified amount of resources.
- The scheduler.cpu field specifies the number of CPUs for an Airflow scheduler.
- The scheduler.memory_gb field specifies the amount of memory for an Airflow scheduler.
- The scheduler.storage_gb field specifies the amount of disk space for an Airflow scheduler.
- The triggerer.cpu field specifies the number of CPUs for an Airflow triggerer.
- The triggerer.memory_gb field specifies the amount of memory for an Airflow triggerer.
- The web_server.cpu field specifies the number of CPUs for the Airflow web server.
- The web_server.memory_gb field specifies the amount of memory for the Airflow web server.
- The web_server.storage_gb field specifies the amount of disk space for the Airflow web server.
- The worker.cpu field specifies the number of CPUs for an Airflow worker.
- The worker.memory_gb field specifies the amount of memory for an Airflow worker.
- The worker.storage_gb field specifies the amount of disk space for an Airflow worker.
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
workloads_config {
scheduler {
cpu = SCHEDULER_CPU
memory_gb = SCHEDULER_MEMORY
storage_gb = SCHEDULER_STORAGE
}
triggerer {
cpu = TRIGGERER_CPU
memory_gb = TRIGGERER_MEMORY
count = 1
}
web_server {
cpu = WEB_SERVER_CPU
memory_gb = WEB_SERVER_MEMORY
storage_gb = WEB_SERVER_STORAGE
}
worker {
cpu = WORKER_CPU
memory_gb = WORKER_MEMORY
storage_gb = WORKER_STORAGE
}
}
}
}
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- SCHEDULER_CPU with the number of CPUs for a scheduler, in vCPU units.
- SCHEDULER_MEMORY with the amount of memory for a scheduler, in GB.
- SCHEDULER_STORAGE with the disk size for a scheduler, in GB.
- TRIGGERER_CPU with the number of CPUs for a triggerer, in vCPU units.
- TRIGGERER_MEMORY with the amount of memory for a triggerer, in GB.
- WEB_SERVER_CPU with the number of CPUs for the web server, in vCPU units.
- WEB_SERVER_MEMORY with the amount of memory for the web server, in GB.
- WEB_SERVER_STORAGE with the disk size for the web server, in GB.
- WORKER_CPU with the number of CPUs for a worker, in vCPU units.
- WORKER_MEMORY with the amount of memory for a worker, in GB.
- WORKER_STORAGE with the disk size for a worker, in GB.
Example:
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
workloads_config {
scheduler {
cpu = 0.5
memory_gb = 1.875
storage_gb = 1
}
triggerer {
cpu = 0.5
memory_gb = 0.5
count = 1
}
web_server {
cpu = 0.5
memory_gb = 1.875
storage_gb = 1
}
worker {
cpu = 0.5
memory_gb = 1.875
storage_gb = 1
}
}
}
}
Adjust the environment size
The Environment size controls the performance parameters of the managed Cloud Composer infrastructure that includes the Airflow database. Consider selecting a larger environment size if you want to run a large number of DAGs and tasks.
Console
Go to the Environments page in the Google Cloud console:
Select your environment.
Go to the Environment configuration tab.
In the Resources > Core infrastructure item, click Edit.
In the Core infrastructure dialog, in the Environment size field, specify the environment size.
Click Save.
gcloud
The --environment-size argument controls the environment size:
gcloud composer environments update ENVIRONMENT_NAME \
--location LOCATION \
--environment-size ENVIRONMENT_SIZE
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- ENVIRONMENT_SIZE with small, medium, or large.
Example:
gcloud composer environments update example-environment \
--location us-central1 \
--environment-size medium
API
Create an environments.patch API request. In this request:
- In the updateMask parameter, specify the config.environmentSize mask.
- In the request body, specify the environment size.
"config": {
"environmentSize": "ENVIRONMENT_SIZE"
}
Replace:
- ENVIRONMENT_SIZE with the environment size: ENVIRONMENT_SIZE_SMALL, ENVIRONMENT_SIZE_MEDIUM, or ENVIRONMENT_SIZE_LARGE.
Example:
// PATCH https://composer.googleapis.com/v1/projects/example-project/
// locations/us-central1/environments/example-environment?updateMask=
// config.environmentSize
"config": {
"environmentSize": "ENVIRONMENT_SIZE_MEDIUM"
}
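Note that gcloud and the API name the sizes differently: gcloud takes small, medium, or large, while the API expects ENVIRONMENT_SIZE_SMALL, and so on. A small sketch of that mapping (the helper itself is an assumption for illustration):

```python
def to_api_environment_size(gcloud_size):
    """Map a gcloud --environment-size value to the API enum value."""
    sizes = {
        "small": "ENVIRONMENT_SIZE_SMALL",
        "medium": "ENVIRONMENT_SIZE_MEDIUM",
        "large": "ENVIRONMENT_SIZE_LARGE",
    }
    return sizes[gcloud_size]
```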
Terraform
The environment_size field in the config block controls the environment size:
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
environment_size = "ENVIRONMENT_SIZE"
}
}
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- ENVIRONMENT_SIZE with the environment size: ENVIRONMENT_SIZE_SMALL, ENVIRONMENT_SIZE_MEDIUM, or ENVIRONMENT_SIZE_LARGE.
Example:
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
environment_size = "ENVIRONMENT_SIZE_SMALL"
}
}
What's next
- Environment scaling and performance
- Cloud Composer pricing
- Update environments
- Environment architecture