This page describes how to scale Cloud Composer environments in Cloud Composer 2.
Other pages about scaling:
For a guide about selecting optimal scale and performance parameters for your environment, see Optimize environment performance and costs.
For information about how environment scaling works, see Environment scaling.
Scale vertically and horizontally
Options for horizontal scaling:
- Adjust the minimum and maximum number of workers
- Adjust the number of schedulers
Options for vertical scaling:
- Adjust worker, scheduler, and web server scale and performance parameters
- Adjust the environment size
Adjust the minimum and maximum number of workers
You can set the minimum and maximum number of workers for your environment. Cloud Composer automatically scales your environment within the set limits. You can adjust these limits at any time.
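As a rough illustration of how the two limits bound the worker count, the following Python sketch clamps a requested number of workers to the configured range. The function name `clamp_worker_count` is hypothetical, and the sketch does not model how Cloud Composer decides how many workers the load requires; it only shows the effect of the limits.

```python
def clamp_worker_count(needed: int, workers_min: int, workers_max: int) -> int:
    """Return the worker count after applying the configured limits.

    Illustration only: `needed` stands in for whatever number of workers
    the autoscaler would otherwise choose for the current load.
    """
    if workers_min > workers_max:
        raise ValueError("workers_min must not exceed workers_max")
    # Never go below the minimum, never go above the maximum.
    return max(workers_min, min(needed, workers_max))


if __name__ == "__main__":
    # Limits of 2 and 6, as in the examples on this page.
    for needed in (1, 4, 9):
        print(needed, "->", clamp_worker_count(needed, 2, 6))
```

With limits of 2 and 6, a load that needs one worker still runs two, and a load that needs nine is capped at six.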
Console
Go to the Environments page in the Google Cloud console.
Select your environment.
Go to the Environment configuration tab.
In the Resources > Workloads configuration item, click Edit.
In the Workloads configuration dialog, in the Workers autoscaling section, adjust the limits for Airflow workers:
In the Minimum number of workers field, specify the number of Airflow workers that your environment must always run. The number of workers in your environment does not go below this number, even if a lower number of workers can handle the load.
In the Maximum number of workers field, specify the maximum number of Airflow workers that your environment can run. The number of workers in your environment does not go above this number, even if a higher number of workers is required to handle the load.
Click Save.
gcloud
Run the following gcloud composer command:
gcloud composer environments update ENVIRONMENT_NAME \
--location LOCATION \
--min-workers WORKERS_MIN \
--max-workers WORKERS_MAX
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- WORKERS_MIN with the minimum number of Airflow workers that your environment can run. The number of workers in your environment does not go below this number, even if a lower number of workers can handle the load.
- WORKERS_MAX with the maximum number of Airflow workers that your environment can run. The number of workers in your environment does not go above this number, even if a higher number of workers is required to handle the load.
Example:
gcloud composer environments update example-environment \
--location us-central1 \
--min-workers 2 \
--max-workers 6
API
Construct an environments.patch API request.
In this request:
- In the updateMask parameter, specify the config.workloadsConfig.worker.minCount,config.workloadsConfig.worker.maxCount mask.
- In the request body, in the minCount and maxCount fields, specify the new worker limits.
"config": {
"workloadsConfig": {
"worker": {
"minCount": WORKERS_MIN,
"maxCount": WORKERS_MAX
}
}
}
Replace:
- WORKERS_MIN with the minimum number of Airflow workers that your environment can run. The number of workers in your environment does not go below this number, even if a lower number of workers can handle the load.
- WORKERS_MAX with the maximum number of Airflow workers that your environment can run. The number of workers in your environment does not go above this number, even if a higher number of workers is required to handle the load.
Example:
// PATCH https://composer.googleapis.com/v1/projects/example-project/
// locations/us-central1/environments/example-environment?updateMask=
// config.workloadsConfig.worker.minCount,
// config.workloadsConfig.worker.maxCount
"config": {
"workloadsConfig": {
"worker": {
"minCount": 2,
"maxCount": 6
}
}
}
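The request in the example above can also be assembled programmatically. The following Python sketch builds the PATCH URL and the request body for new worker limits; it only constructs the values and does not send the request or handle authentication. The helper name `worker_limits_patch` is hypothetical, and the check that the minimum does not exceed the maximum is a sanity check added for this sketch, not documented API behavior.

```python
import json

API_ROOT = "https://composer.googleapis.com/v1"


def worker_limits_patch(project, location, environment, workers_min, workers_max):
    """Return (url, body) for an environments.patch call that sets worker limits."""
    if workers_min > workers_max:
        raise ValueError("workers_min must not exceed workers_max")
    resource = (
        f"{API_ROOT}/projects/{project}/locations/{location}"
        f"/environments/{environment}"
    )
    # The mask lists exactly the fields that the request body updates.
    update_mask = ",".join(
        [
            "config.workloadsConfig.worker.minCount",
            "config.workloadsConfig.worker.maxCount",
        ]
    )
    body = {
        "config": {
            "workloadsConfig": {
                "worker": {"minCount": workers_min, "maxCount": workers_max}
            }
        }
    }
    return f"{resource}?updateMask={update_mask}", body


if __name__ == "__main__":
    url, body = worker_limits_patch(
        "example-project", "us-central1", "example-environment", 2, 6
    )
    print("PATCH", url)
    print(json.dumps(body, indent=2))
```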
Terraform
The min_count and max_count fields in the workloads_config.worker block specify the minimum and maximum number of workers in your environment:
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
workloads_config {
worker {
min_count = WORKERS_MIN
max_count = WORKERS_MAX
}
}
}
}
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- WORKERS_MIN with the minimum number of Airflow workers that your environment can run. The number of workers in your environment does not go below this number, even if a lower number of workers can handle the load.
- WORKERS_MAX with the maximum number of Airflow workers that your environment can run. The number of workers in your environment does not go above this number, even if a higher number of workers is required to handle the load.
Example:
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
workloads_config {
worker {
min_count = 2
max_count = 6
}
}
}
}
Adjust the number of schedulers
Your environment can run more than one Airflow scheduler at the same time. Use multiple schedulers to distribute load between several scheduler instances for better performance and reliability. You can specify a number of schedulers up to the number of nodes in your environment.
Increasing the number of schedulers does not always improve Airflow performance. For example, having only one scheduler might provide better performance than having two. This might happen when the extra scheduler is not utilized, and thus consumes resources of your environment without contributing to overall performance. The actual scheduler performance depends on the number of Airflow workers, the number of DAGs and tasks that run in your environment, and the configuration of both Airflow and the environment.
We recommend starting with two schedulers and then monitoring the performance of your environment. If you change the number of schedulers, you can always scale your environment back to the original number of schedulers.
For more information about configuring multiple schedulers, see Airflow documentation.
To change the number of schedulers for your environment:
Console
Go to the Environments page in the Google Cloud console.
Select your environment.
Go to the Environment configuration tab.
In the Resources > Workloads configuration item, click Edit.
In the Workloads configuration dialog, in the Number of schedulers field, set the number of schedulers for your environment.
Click Save.
gcloud
Run the following gcloud composer command:
gcloud composer environments update ENVIRONMENT_NAME \
--location LOCATION \
--scheduler-count SCHEDULER_COUNT
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- SCHEDULER_COUNT with the number of schedulers.
Example:
gcloud composer environments update example-environment \
--location us-central1 \
--scheduler-count 2
API
Create an environments.patch API request.
In this request:
- In the updateMask parameter, specify the config.workloadsConfig.scheduler mask.
- In the request body, in the count field, specify the number of schedulers.
"config": {
"workloadsConfig": {
"scheduler": {
"count": SCHEDULER_COUNT
}
}
}
Replace:
- SCHEDULER_COUNT with the number of schedulers.
Example:
// PATCH https://composer.googleapis.com/v1/projects/example-project/
// locations/us-central1/environments/example-environment?updateMask=
// config.workloadsConfig.scheduler
"config": {
"workloadsConfig": {
"scheduler": {
"count": 2
}
}
}
Terraform
The count field in the workloads_config.scheduler block specifies the number of schedulers in your environment:
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
workloads_config {
scheduler {
count = SCHEDULER_COUNT
}
}
}
}
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- SCHEDULER_COUNT with the number of schedulers.
Example:
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
workloads_config {
scheduler {
count = 2
}
}
}
}
Adjust worker, scheduler, and web server scale and performance parameters
You can specify the number of CPUs and the amount of memory and disk space used by your environment's schedulers, web server, and workers. This lets you increase the performance of your environment in addition to the horizontal scaling provided by multiple workers and schedulers.
Console
Go to the Environments page in the Google Cloud console.
Select your environment.
Go to the Environment configuration tab.
In the Resources > Workloads configuration item, click Edit.
In the Workloads configuration dialog, in the CPU, Memory, and Storage fields, specify the number of CPUs and the amount of memory and storage for Airflow schedulers, web server, and workers.
Click Save.
gcloud
The following arguments control the CPU, memory, and disk space parameters of Airflow schedulers, web server, and workers. Each scheduler and worker uses the specified amount of resources.
- --scheduler-cpu specifies the number of CPUs for an Airflow scheduler.
- --scheduler-memory specifies the amount of memory for an Airflow scheduler.
- --scheduler-storage specifies the amount of disk space for an Airflow scheduler.
- --web-server-cpu specifies the number of CPUs for the Airflow web server.
- --web-server-memory specifies the amount of memory for the Airflow web server.
- --web-server-storage specifies the amount of disk space for the Airflow web server.
- --worker-cpu specifies the number of CPUs for an Airflow worker.
- --worker-memory specifies the amount of memory for an Airflow worker.
- --worker-storage specifies the amount of disk space for an Airflow worker.
gcloud composer environments update ENVIRONMENT_NAME \
--location LOCATION \
--scheduler-cpu SCHEDULER_CPU \
--scheduler-memory SCHEDULER_MEMORY \
--scheduler-storage SCHEDULER_STORAGE \
--web-server-cpu WEB_SERVER_CPU \
--web-server-memory WEB_SERVER_MEMORY \
--web-server-storage WEB_SERVER_STORAGE \
--worker-cpu WORKER_CPU \
--worker-memory WORKER_MEMORY \
--worker-storage WORKER_STORAGE
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- SCHEDULER_CPU with the number of CPUs for a scheduler, in vCPU units.
- SCHEDULER_MEMORY with the amount of memory for a scheduler.
- SCHEDULER_STORAGE with the disk size for a scheduler.
- WEB_SERVER_CPU with the number of CPUs for the web server, in vCPU units.
- WEB_SERVER_MEMORY with the amount of memory for the web server.
- WEB_SERVER_STORAGE with the disk size for the web server.
- WORKER_CPU with the number of CPUs for a worker, in vCPU units.
- WORKER_MEMORY with the amount of memory for a worker.
- WORKER_STORAGE with the disk size for a worker.
Example:
gcloud composer environments update example-environment \
--location us-central1 \
--scheduler-cpu 0.5 \
--scheduler-memory 2.5 \
--scheduler-storage 2 \
--web-server-cpu 1 \
--web-server-memory 2.5 \
--web-server-storage 2 \
--worker-cpu 1 \
--worker-memory 2 \
--worker-storage 2
API
Create an environments.patch API request.
In this request:
- In the updateMask parameter, specify the fields that you want to update. For example, to update all parameters for schedulers, specify the config.workloadsConfig.scheduler.cpu,config.workloadsConfig.scheduler.memoryGb,config.workloadsConfig.scheduler.storageGb mask.
- In the request body, specify the scale and performance parameters.
"config": {
"workloadsConfig": {
"scheduler": {
"cpu": SCHEDULER_CPU,
"memoryGb": SCHEDULER_MEMORY,
"storageGb": SCHEDULER_STORAGE
},
"webServer": {
"cpu": WEB_SERVER_CPU,
"memoryGb": WEB_SERVER_MEMORY,
"storageGb": WEB_SERVER_STORAGE
},
"worker": {
"cpu": WORKER_CPU,
"memoryGb": WORKER_MEMORY,
"storageGb": WORKER_STORAGE
}
}
}
Replace:
- SCHEDULER_CPU with the number of CPUs for a scheduler, in vCPU units.
- SCHEDULER_MEMORY with the amount of memory for a scheduler, in GB.
- SCHEDULER_STORAGE with the disk size for a scheduler, in GB.
- WEB_SERVER_CPU with the number of CPUs for the web server, in vCPU units.
- WEB_SERVER_MEMORY with the amount of memory for the web server, in GB.
- WEB_SERVER_STORAGE with the disk size for the web server, in GB.
- WORKER_CPU with the number of CPUs for a worker, in vCPU units.
- WORKER_MEMORY with the amount of memory for a worker, in GB.
- WORKER_STORAGE with the disk size for a worker, in GB.
Example:
// PATCH https://composer.googleapis.com/v1/projects/example-project/
// locations/us-central1/environments/example-environment?updateMask=
// config.workloadsConfig.scheduler.cpu,
// config.workloadsConfig.scheduler.memoryGb,
// config.workloadsConfig.scheduler.storageGb,
// config.workloadsConfig.webServer.cpu,
// config.workloadsConfig.webServer.memoryGb,
// config.workloadsConfig.webServer.storageGb,
// config.workloadsConfig.worker.cpu,
// config.workloadsConfig.worker.memoryGb,
// config.workloadsConfig.worker.storageGb
"config": {
"workloadsConfig": {
"scheduler": {
"cpu": 0.5,
"memoryGb": 2.5,
"storageGb": 2
},
"webServer": {
"cpu": 0.5,
"memoryGb": 2.5,
"storageGb": 2
},
"worker": {
"cpu": 1,
"memoryGb": 2,
"storageGb": 2
}
}
}
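Because the updateMask has to name each field that the request body updates, it can be convenient to derive the mask from the body instead of typing it by hand. The following Python sketch walks a nested dictionary and joins the dotted paths of its leaf fields. The helper name `field_paths` is hypothetical, and the sketch assumes that every leaf in the body is a field you intend to update.

```python
def field_paths(value, prefix=""):
    """Yield the dotted path of every leaf field in a nested dict."""
    for key, child in value.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(child, dict):
            # Descend into nested objects such as "workloadsConfig".
            yield from field_paths(child, path)
        else:
            yield path


if __name__ == "__main__":
    # Update only the scheduler parameters; other workloads stay unchanged.
    body = {
        "config": {
            "workloadsConfig": {
                "scheduler": {"cpu": 0.5, "memoryGb": 2.5, "storageGb": 2}
            }
        }
    }
    print(",".join(field_paths(body)))
```

For the scheduler-only body shown here, the result is the three-field scheduler mask: config.workloadsConfig.scheduler.cpu, config.workloadsConfig.scheduler.memoryGb, and config.workloadsConfig.scheduler.storageGb, joined with commas.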
Terraform
The following blocks in the workloads_config block control the CPU, memory, and disk space parameters of Airflow schedulers, web server, and workers. Each scheduler and worker uses the specified amount of resources.
- The scheduler.cpu field specifies the number of CPUs for an Airflow scheduler.
- The scheduler.memory_gb field specifies the amount of memory for an Airflow scheduler.
- The scheduler.storage_gb field specifies the amount of disk space for an Airflow scheduler.
- The web_server.cpu field specifies the number of CPUs for the Airflow web server.
- The web_server.memory_gb field specifies the amount of memory for the Airflow web server.
- The web_server.storage_gb field specifies the amount of disk space for the Airflow web server.
- The worker.cpu field specifies the number of CPUs for an Airflow worker.
- The worker.memory_gb field specifies the amount of memory for an Airflow worker.
- The worker.storage_gb field specifies the amount of disk space for an Airflow worker.
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
workloads_config {
scheduler {
cpu = SCHEDULER_CPU
memory_gb = SCHEDULER_MEMORY
storage_gb = SCHEDULER_STORAGE
}
web_server {
cpu = WEB_SERVER_CPU
memory_gb = WEB_SERVER_MEMORY
storage_gb = WEB_SERVER_STORAGE
}
worker {
cpu = WORKER_CPU
memory_gb = WORKER_MEMORY
storage_gb = WORKER_STORAGE
}
}
}
}
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- SCHEDULER_CPU with the number of CPUs for a scheduler, in vCPU units.
- SCHEDULER_MEMORY with the amount of memory for a scheduler, in GB.
- SCHEDULER_STORAGE with the disk size for a scheduler, in GB.
- WEB_SERVER_CPU with the number of CPUs for the web server, in vCPU units.
- WEB_SERVER_MEMORY with the amount of memory for the web server, in GB.
- WEB_SERVER_STORAGE with the disk size for the web server, in GB.
- WORKER_CPU with the number of CPUs for a worker, in vCPU units.
- WORKER_MEMORY with the amount of memory for a worker, in GB.
- WORKER_STORAGE with the disk size for a worker, in GB.
Example:
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
workloads_config {
scheduler {
cpu = 0.5
memory_gb = 1.875
storage_gb = 1
}
web_server {
cpu = 0.5
memory_gb = 1.875
storage_gb = 1
}
worker {
cpu = 0.5
memory_gb = 1.875
storage_gb = 1
}
}
}
}
Adjust the environment size
The Environment size controls the performance parameters of the managed Cloud Composer infrastructure that includes the Airflow database. Consider selecting a larger environment size if you want to run a large number of DAGs and tasks.
Console
Go to the Environments page in the Google Cloud console.
Select your environment.
Go to the Environment configuration tab.
In the Resources > Core infrastructure item, click Edit.
In the Core infrastructure dialog, in the Environment size field, specify the environment size.
Click Save.
gcloud
The --environment-size argument controls the environment size:
gcloud composer environments update ENVIRONMENT_NAME \
--location LOCATION \
--environment-size ENVIRONMENT_SIZE
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- ENVIRONMENT_SIZE with small, medium, or large.
Example:
gcloud composer environments update example-environment \
--location us-central1 \
--environment-size medium
API
Create an environments.patch API request.
In this request:
- In the updateMask parameter, specify the config.environmentSize mask.
- In the request body, specify the environment size.
"config": {
"environmentSize": "ENVIRONMENT_SIZE"
}
Replace:
- ENVIRONMENT_SIZE with the environment size: ENVIRONMENT_SIZE_SMALL, ENVIRONMENT_SIZE_MEDIUM, or ENVIRONMENT_SIZE_LARGE.
Example:
// PATCH https://composer.googleapis.com/v1/projects/example-project/
// locations/us-central1/environments/example-environment?updateMask=
// config.environmentSize
"config": {
"environmentSize": "ENVIRONMENT_SIZE_MEDIUM"
}
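The gcloud argument takes the short values small, medium, and large, while the API request body takes the ENVIRONMENT_SIZE_* values. If you script against both, a small lookup keeps the two spellings in sync. The following Python sketch is one way to do that; the names `GCLOUD_TO_API_SIZE` and `environment_size_body` are hypothetical.

```python
# Short gcloud values mapped to the values used in the API request body.
GCLOUD_TO_API_SIZE = {
    "small": "ENVIRONMENT_SIZE_SMALL",
    "medium": "ENVIRONMENT_SIZE_MEDIUM",
    "large": "ENVIRONMENT_SIZE_LARGE",
}


def environment_size_body(size: str) -> dict:
    """Return an environments.patch request body for the given short size name."""
    try:
        api_value = GCLOUD_TO_API_SIZE[size.lower()]
    except KeyError:
        raise ValueError(f"unknown environment size: {size!r}") from None
    return {"config": {"environmentSize": api_value}}


if __name__ == "__main__":
    print(environment_size_body("medium"))
```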
Terraform
The environment_size field in the config block controls the environment size:
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
environment_size = "ENVIRONMENT_SIZE"
}
}
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- ENVIRONMENT_SIZE with the environment size: ENVIRONMENT_SIZE_SMALL, ENVIRONMENT_SIZE_MEDIUM, or ENVIRONMENT_SIZE_LARGE.
Example:
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
environment_size = "ENVIRONMENT_SIZE_SMALL"
}
}
What's next
- Environment scaling and performance
- Cloud Composer pricing
- Update environments
- Environment architecture