Troubleshooting environment updates and upgrades

This page provides troubleshooting information for problems that you might encounter while updating or upgrading Cloud Composer environments.

For troubleshooting information related to creating environments, see Troubleshooting environment creation.

When Cloud Composer environments are updated, most issues are caused by one of the following:

  • Service account permission problems
  • PyPI dependency issues
  • Size of the Airflow database

Insufficient permissions to update or upgrade an environment

If Cloud Composer cannot update or upgrade an environment because of insufficient permissions, it outputs the following error message:

ERROR: (gcloud.composer.environments.update) PERMISSION_DENIED: The caller does not have permission

Solution: Assign roles to both your account and the service account of your environment, as described in Access control.

The service account of the environment has insufficient permissions

When creating a Cloud Composer environment, you specify a service account that runs the environment's GKE cluster nodes. If this service account does not have enough permissions for the requested operation, Cloud Composer outputs an error:

    UPDATE operation on this environment failed 3 minutes ago with the
    following error message:
    Composer Backend timed out. Currently running tasks are [stage:
    CP_COMPOSER_AGENT_RUNNING
    description: "No agent response published."
    response_timestamp {
      seconds: 1618203503
      nanos: 291000000
    }
    ].

Solution: Assign roles to both your account and the service account of your environment, as described in Access control.
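
To correlate a timeout like the one above with log entries, it can help to convert the response_timestamp field into a readable time. The following is a minimal sketch that assumes seconds counts Unix epoch seconds and nanos is the sub-second part, as in a protobuf-style timestamp; the function name response_time_utc is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def response_time_utc(seconds: int, nanos: int = 0) -> datetime:
    """Convert a seconds/nanos pair into a timezone-aware UTC datetime.

    Assumption: `seconds` is Unix epoch seconds and `nanos` is the
    fractional part in nanoseconds (truncated to microseconds here).
    """
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(seconds=seconds, microseconds=nanos // 1000)


# Values copied from the example error message above.
print(response_time_utc(1618203503, 291000000).isoformat())
# prints 2021-04-12T04:58:23.291000+00:00
```

The resulting time can then be used to narrow the time range when searching the environment's logs.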

The size of the Airflow database is too big to perform the operation

A Cloud Composer upgrade operation might fail if the Airflow database is too large.

If the size of the Airflow database is more than 16 GB, Cloud Composer outputs the following error:

Airflow database uses more than 16 GB. Please clean the database before upgrading.

Solution: Perform the Airflow database cleanup, as described in Airflow database maintenance.

An upgrade to a new Cloud Composer version fails because of PyPI package conflicts

When you upgrade an environment with installed custom PyPI packages, you might encounter errors related to PyPI package conflicts. This might happen because the new Cloud Composer image contains newer versions of preinstalled packages that cause dependency conflicts with PyPI packages that you installed in your environment.

Solution:

  • To get detailed information about package conflicts, run an upgrade check.
  • Loosen version constraints for installed custom PyPI packages. For example, instead of specifying a version as ==1.0.1, specify it as >=1.0.1.
  • For more information about changing version requirements to resolve conflicting dependencies, see pip documentation.
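
As an illustration of loosening version constraints, the following minimal sketch rewrites an exact pin such as ==1.0.1 into a lower bound such as >=1.0.1. The helper name loosen_pin and the package name example-package are hypothetical, and the pattern only handles simple name==version lines (no environment markers or wildcard versions); anything else is returned unchanged. Loosening a pin does not by itself guarantee that a conflict is resolved.

```python
import re

# Matches a simple exact pin such as "example-package==1.0.1",
# optionally with extras like "example-package[extra]==1.0.1".
# Wildcard pins ("==1.0.*") and lines with markers are deliberately not matched.
_EXACT_PIN = re.compile(
    r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*(?:\[[^\]]*\])?)\s*==\s*([^\s;,#*]+)\s*$"
)


def loosen_pin(requirement: str) -> str:
    """Turn 'name==X' into 'name>=X'; return other requirements unchanged."""
    match = _EXACT_PIN.match(requirement)
    if match is None:
        return requirement.strip()
    name, version = match.groups()
    return f"{name}>={version}"


for line in ["example-package==1.0.1", "other-package>=2.0"]:
    print(loosen_pin(line))
# prints:
# example-package>=1.0.1
# other-package>=2.0
```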

Lack of connectivity to DNS can cause problems while performing upgrades or updates

Such connectivity problems might result in log entries like the following:

WARNING - Compute Engine Metadata server unavailable attempt 1 of 5. Reason: [Errno -3] Temporary failure in name resolution Error

This usually means that there is no route to DNS. Make sure that the metadata.google.internal DNS name can be resolved to an IP address from within the cluster, Pods, and Services networks. Also check whether Private Google Access is turned on in the VPC (in the host or service project) where your environment is created.
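
One way to test name resolution from inside a given network is a short script like the following. This is a minimal sketch that uses only the Python standard library; the function name resolve_host and its resolver parameter are hypothetical and exist only so that the lookup can be swapped out. Where and how you run it (for example, from a Pod in the environment's cluster) depends on your setup.

```python
import socket

METADATA_HOST = "metadata.google.internal"


def resolve_host(hostname=METADATA_HOST, resolver=socket.getaddrinfo):
    """Return the sorted unique IP addresses for hostname.

    Returns an empty list if name resolution fails, which is the same
    condition reported as "Temporary failure in name resolution".
    """
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # the IP address is the first element of sockaddr.
    return sorted({info[4][0] for info in results})
```

Calling resolve_host() with no arguments looks up metadata.google.internal. An empty list indicates that the name cannot be resolved from the network where the script runs.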

What's next