Cloud Composer Features

This page provides an overview of the features and capabilities of Cloud Composer.

Cloud Composer is a managed Apache Airflow service that helps you create, schedule, monitor and manage workflows.

Airflow environments

A Cloud Composer environment is a wrapper around Apache Airflow. Cloud Composer creates the following components for each environment:

  • Web server: The web server runs the Apache Airflow web interface, and Cloud Identity-Aware Proxy protects the interface. For more information, see Airflow Web Interface.
  • Database: The database holds the Apache Airflow metadata.
  • Cloud Storage bucket: Cloud Composer associates a Cloud Storage bucket with the environment. The associated bucket stores the DAGs, logs, custom plugins, and data for the environment. For more information about the storage bucket for Cloud Composer, see Cloud Storage.

Airflow management

To access and manage your Airflow environments, you can use the following Airflow-native tools:

  • Web interface: You can access the Airflow web interface from the Google Cloud Platform Console or by direct URL with the appropriate permissions. For information, see Airflow Web Interface.
  • Command line tools: After you install the Cloud SDK, you can run gcloud composer environments commands to issue Airflow CLI commands to Cloud Composer environments. For information, see Airflow Command-line Interface.
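As a rough illustration of the command line route, the following Python sketch assembles a gcloud composer environments run invocation as an argument list. The helper name, the environment name example-environment, the location us-central1, and the list_dags subcommand are assumptions chosen for illustration; the Airflow subcommands that are available depend on the Airflow version in your environment. The sketch only builds and prints the command. It does not run it.

```python
import shlex


def build_airflow_command(environment, location, subcommand, airflow_args=()):
    """Build a `gcloud composer environments run` argument list (hypothetical helper)."""
    command = [
        "gcloud", "composer", "environments", "run", environment,
        "--location", location,
        subcommand,
    ]
    if airflow_args:
        # Arguments after `--` are passed through to the Airflow subcommand.
        command += ["--", *airflow_args]
    return command


# Placeholder environment name and location; replace with your own values.
command = build_airflow_command("example-environment", "us-central1", "list_dags")
print(shlex.join(command))
```

To execute the command, you could pass the list to subprocess.run on a machine where the Cloud SDK is installed and authenticated.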

In addition to native tools, the Cloud Composer REST or RPC APIs provide programmatic access to your Airflow environments. For more information, see APIs & References.

Airflow configuration

In general, the configurations that Cloud Composer provides for Apache Airflow are the same as those for a locally hosted Airflow deployment. Some Airflow configurations are preconfigured in Cloud Composer, and you cannot change those properties. You specify other configurations when you create or update your environment. For more information, see Airflow Configurations.

Airflow DAGs (workflows)

An Apache Airflow DAG is a workflow: a collection of tasks together with the dependencies between them. Cloud Composer uses Cloud Storage to store DAGs. To add or remove DAGs from your Cloud Composer environment, you add or remove the DAGs from the Cloud Storage bucket associated with the environment. After you move DAGs to the storage bucket, Cloud Composer automatically adds and schedules them in your environment.
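To make the idea of tasks and dependencies concrete, the following sketch models a three-task workflow in plain Python and derives a valid execution order. It uses the standard library graphlib module (Python 3.9 or later), not the Airflow API, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks it depends on (hypothetical task names).
dependencies = {
    "transform": {"extract"},
    "load": {"transform"},
}

# A valid run order places every task after the tasks it depends on.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

An actual Airflow DAG expresses the same relationships with operators and dependency declarations in a Python file that you place in the environment's bucket.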

In addition to scheduling DAGs, you can trigger DAGs manually or in response to events, such as changes that occur in the associated Cloud Storage bucket. For more information, see Triggering DAGs.

Plugins

You can install custom plugins, such as custom, in-house Apache Airflow operators, hooks, sensors, or interfaces, into your Cloud Composer environment. For more information, see Cloud Composer Plugins.

Python dependencies

You can install Python dependencies in your environment from the Python Package Index or from a private package repository. For more information, see Installing Python Dependencies.

If the dependencies are not in the package index, you can also use the plugins feature.

Access control

You manage security at the GCP project level and can assign Cloud Identity and Access Management (IAM) roles that prevent individual users from modifying or creating environments. If someone does not have access to your project or does not have an appropriate Cloud Composer IAM role, that person cannot access any of your environments. For more information, see Cloud Composer Access Control.

Logging and monitoring

You can view the Airflow logs for individual DAG tasks in the Airflow web interface and in the logs folder of the associated Cloud Storage bucket.

Streaming logs are available for Cloud Composer. You can access the streaming logs in Logs Viewer in the Google Cloud Platform Console and by using Stackdriver. For information about using Stackdriver, see Monitoring Cloud Composer Environments.

Cloud Composer also provides audit logs, such as Admin Activity audit logs, for your GCP projects. For information, see Viewing Audit Logs.

Networking and security

During environment creation, Cloud Composer provides the following configuration options:

Features not yet available

VPC Service Controls

VPC Service Controls enables service perimeter configuration around VPC resources and Google-managed services to control the movement of data across the perimeter boundary.

Currently, VPC Service Controls does not support Cloud Composer.

If your deployment requires VPC Service Controls, you must explicitly grant the following user identities access to the services that the service perimeter protects.

To create an environment:

Include the Google APIs service account and the service account that Logging uses to write log entries.

- members:
  - serviceAccount:{your-project-number}@cloudservices.gserviceaccount.com
  - serviceAccount:cloud-logs@system.gserviceaccount.com
- ipSubnetworks:
  - 0.0.0.0/0
  - ::/0
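The member entries above follow a fixed pattern, so they can be generated from a project number. The following sketch is illustrative only: the helper name and the project number 123456789012 are assumptions, and the output covers only the members list, not a complete access level definition.

```python
def environment_creation_members(project_number):
    """Return the member entries listed above for a project number (hypothetical helper)."""
    return [
        # Google APIs service account for the project.
        f"serviceAccount:{project_number}@cloudservices.gserviceaccount.com",
        # Service account that Logging uses to write log entries.
        "serviceAccount:cloud-logs@system.gserviceaccount.com",
    ]


# Placeholder project number; replace with your own.
for member in environment_creation_members("123456789012"):
    print(f"  - {member}")
```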

To install PyPI dependencies:

Include the Airflow webserver service account and, for PyPI dependencies to work, enable private PyPI repository support in your Cloud Composer environment.

- members:
  - serviceAccount:{tenant-project-id}@appspot.gserviceaccount.com

To update your VPC Service Controls policy and access context, refer to the following Cloud SDK commands:

gcloud access-context-manager policies --help
gcloud access-context-manager levels --help