This page explains how to update an environment.
About update operations
When you change parameters of your environment, such as specifying new scaling and performance parameters, or installing custom PyPI packages, your environment updates.
After this operation is completed, changes become available in your environment.
For a single Cloud Composer environment, you can start only one update operation at a time. You must wait for an update operation to complete before starting another environment operation.
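For example, you can check whether any operation is still in progress before starting a new one. A minimal sketch with gcloud, assuming the us-central1 region:

gcloud composer operations list \
    --locations us-central1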
How updates affect running Airflow tasks
When you run an update operation, Airflow schedulers and workers in your environment might require a restart. In this case, all currently running tasks are terminated. After the update operation is completed, Airflow schedules these tasks for a retry, depending on the way you configure retries for your DAGs.
The following changes cause Airflow task termination:
- Upgrading your environment to a new version.
- Adding, changing, or deleting custom PyPI packages.
- Changing Cloud Composer environment variables.
- Adding or removing Airflow configuration options overrides, or changing their values.
- Changing Airflow workers' CPU, memory, or storage.
- Reducing the maximum number of Airflow workers, if the new value is lower than the number of currently running workers. For example, if an environment currently runs three workers, and the maximum is reduced to two.
- Changing the environment's resilience mode.
The following changes do not cause Airflow task termination:
- Creating, updating, or deleting a DAG (not an update operation).
- Pausing or unpausing DAGs (not an update operation).
- Changing Airflow variables (not an update operation).
- Changing Airflow connections (not an update operation).
- Enabling or disabling Dataplex Data Lineage integration.
- Changing the environment's size.
- Changing the number of schedulers.
- Changing Airflow schedulers' CPU, memory, or storage.
- Changing the number of triggerers.
- Changing Airflow triggerers' CPU, memory, or storage.
- Changing Airflow web server's CPU, memory, or storage.
- Increasing or decreasing the minimum number of workers.
- Reducing the maximum number of Airflow workers, if the new value is greater than or equal to the number of currently running workers. For example, if an environment currently runs two workers, and the maximum is reduced to three.
- Changing maintenance windows.
- Changing scheduled snapshots settings.
- Changing environment labels.
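To illustrate the two categories, here is a hedged gcloud sketch with placeholder names and values: the first command (updating labels) does not terminate running tasks, while the second (updating a PyPI package) does.

gcloud composer environments update example-environment \
    --location us-central1 \
    --update-labels env=dev

gcloud composer environments update example-environment \
    --location us-central1 \
    --update-pypi-package "scipy>=1.11"

Note that gcloud composer environments update accepts only one type of update flag per invocation, so different kinds of changes run as separate update operations.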
Updating with Terraform
Run terraform plan before terraform apply to see if Terraform creates a new environment instead of updating it.
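As a minimal sketch (the resource name google_composer_environment.example is a placeholder), the plan output typically marks a resource that will be recreated with -/+ and a "must be replaced" note:

terraform plan
# In the output, recreation typically looks like:
#   # google_composer_environment.example must be replaced
#   -/+ resource "google_composer_environment" "example" {
terraform apply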
Before you begin
Check that your account, the service account of your environment, and the Cloud Composer Service Agent account in your project have the required permissions:
Your account must have a role that can trigger environment update operations.
The service account of your environment must have a role that has enough permissions to perform update operations.
The gcloud composer environments update command terminates when the operation is finished. You can use the --async flag to avoid waiting for the operation to complete.
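For example, a minimal sketch that starts an update without waiting for it to finish (the environment name, region, and variable are placeholders):

gcloud composer environments update example-environment \
    --location us-central1 \
    --update-env-variables EXAMPLE_KEY=example-value \
    --async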
Update environments
For more information about updating your environment, see other documentation pages about specific update operations. For example:
- Override Airflow configuration options
- Set environment variables
- Install Python dependencies
- Scale environments
- Configure authorized networks
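As one concrete example, a hedged sketch that overrides an Airflow configuration option with gcloud (the environment name, region, and option value are placeholders):

gcloud composer environments update example-environment \
    --location us-central1 \
    --update-airflow-configs=core-dags_are_paused_at_creation=True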
View environment details
Console
In Google Cloud console, go to the Environments page.
In the list of environments, click the name of your environment. The Environment details page opens.
gcloud
Run the following gcloud command:
gcloud composer environments describe ENVIRONMENT_NAME \
--location LOCATION
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
API
Construct an environments.get
API request.
Example:
GET https://composer.googleapis.com/v1/projects/example-project/
locations/us-central1/environments/example-environment
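If you prefer the command line, a minimal sketch that sends the same request with curl, using gcloud for authentication (project, region, and environment names are placeholders):

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://composer.googleapis.com/v1/projects/example-project/locations/us-central1/environments/example-environment"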
Terraform
Run the terraform state show
command for your environment's resource.
The name of your environment's Terraform resource might be different than the name of your environment.
terraform state show google_composer_environment.RESOURCE_NAME
Replace:
- RESOURCE_NAME with the name of your environment's resource.
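To find the resource name, you can list the matching resources tracked in your Terraform state. A minimal sketch:

terraform state list | grep google_composer_environment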
Rolling back update changes
In some rare situations, an update operation might be interrupted (for example, because of a timeout) and the requested changes might not be rolled back in all environment components (such as the Airflow webserver).
For example, an update operation might be installing or removing additional PyPI modules, redefining existing or defining new Airflow or Cloud Composer environment variables, or changing some Airflow-related parameters.
Such a situation might occur if an update operation is triggered when other operations are in progress, for example Cloud Composer cluster's autoscaling or a maintenance operation.
In such a situation, we recommend repeating the operation.
Duration of update or upgrade operations
Most update or upgrade operations require restarting Airflow components like Airflow schedulers, workers and web servers.
Once a component is restarted, it must be initialized. During the
initialization, Airflow schedulers and workers download the contents of /dags
and /plugins
folders from the environment's bucket. The process of syncing
files to Airflow schedulers and workers is not instantaneous and depends on
the total size and number of all objects in these folders.
We recommend keeping only DAG and plugin files in the /dags and /plugins
folders (respectively) and removing all other files. Too much data in the
/dags and /plugins folders might slow down the initialization of Airflow
components and in certain cases might make the initialization impossible.
We recommend keeping less than 30 MB of data in the /dags and /plugins
folders, and definitely not exceeding 100 MB.
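To check how much data the folders currently hold, a hedged sketch (the environment name and region are placeholders) that reads the bucket path from the environment's configuration and then sums the folder sizes:

gcloud composer environments describe example-environment \
    --location us-central1 \
    --format="get(config.dagGcsPrefix)"
# The previous command prints a path like gs://BUCKET_NAME/dags; the
# /plugins folder lives in the same bucket:
gsutil du -s -h gs://BUCKET_NAME/dags gs://BUCKET_NAME/plugins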
Upgrading the machine type for GKE nodes
You can manually upgrade the machine type for your environment's GKE cluster by deleting the existing default-pool and creating a new default-pool with the desired machine type.
We recommend specifying a suitable machine type for the kind of computing that occurs in your Cloud Composer environment when you create it.
If you are running jobs that perform resource-intensive computation, you might want to use GKE Operators.
After an upgrade, the previous machine type is still listed in your environment's details. For example, the Environment details page does not reflect the new machine type.
Console
To upgrade the machine type:
In Google Cloud console, go to the Environments page.
In the list of environments, click the name of your environment. The Environment details page opens.
Get information about the default node pool:
Go to the Environment configuration tab.
Click the view cluster details link.
On the Clusters page in the Nodes section, click default-pool.
Note all the information for default-pool on the Node pool details page. You use this information to create a new default node pool for your environment.
To delete default-pool:
On the Node pool details page, click the back arrow to return to the Clusters page for your environment.
In the Node Pools section, click the trash icon for the default-pool. Then click Delete to confirm the operation.
To create the new default-pool:
On the Clusters page, click Add node pool.
For Name, enter default-pool. You must use the default-pool name so that workflows in your environment can run in this pool.
Enter the Size and Nodes settings.
(Only for default Compute Engine service accounts) For Access scopes, select Allow full access to all Cloud APIs.
Click Save.
If you notice that workloads are distributed unevenly, scale down the airflow-worker deployment to zero and scale up again.
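A minimal sketch of that rescaling with kubectl, assuming you have kubectl access to the environment's cluster; CLUSTER_NAME, ZONE, NAMESPACE, and the replica count are placeholders:

# Fetch credentials for the environment's cluster:
gcloud container clusters get-credentials CLUSTER_NAME --zone ZONE
# Scale the worker deployment down to zero, then back up:
kubectl scale deployment airflow-worker --replicas=0 --namespace NAMESPACE
kubectl scale deployment airflow-worker --replicas=3 --namespace NAMESPACE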