Cloud Composer overview

Cloud Composer is a fully managed workflow orchestration service that lets you create, schedule, monitor, and manage workflows spanning clouds and on-premises data centers.

Cloud Composer is built on the popular open source Apache Airflow project, and its workflows are written in the Python programming language.

By using Cloud Composer instead of a local instance of Apache Airflow, you can benefit from the best of Airflow with no installation or management overhead. Cloud Composer helps you create Airflow environments quickly and use Airflow-native tools, such as the powerful Airflow web interface and command-line tools, so you can focus on your workflows and not your infrastructure.

Workflows, DAGs, and tasks

In data analytics, a workflow represents a series of tasks for ingesting, transforming, analyzing, or utilizing data. In Airflow, workflows are created using DAGs, or "Directed Acyclic Graphs".

Figure 1. Relationship between DAGs and tasks

A DAG is a collection of tasks that you want to schedule and run, organized in a way that reflects their relationships and dependencies. DAGs are created in Python scripts, which define the DAG structure (tasks and their dependencies) using code.

Each task in a DAG can represent almost anything—for example, one task might perform any of the following functions:

  • Preparing data for ingestion
  • Monitoring an API
  • Sending an email
  • Running a pipeline

A DAG should not be concerned with what each constituent task does. Its purpose is to ensure that each task is executed at the right time, in the right order, and with the right issue handling.
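The ordering role of a DAG can be illustrated without Airflow itself. The following is a minimal sketch that uses only the Python standard library (graphlib, Python 3.9 or later), not the Airflow API. The task names are hypothetical and loosely mirror the list above; each task maps to the set of tasks it depends on.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks that must finish first.
dependencies = {
    "prepare_data": set(),
    "run_pipeline": {"prepare_data"},
    "monitor_api": {"run_pipeline"},
    "send_email": {"run_pipeline", "monitor_api"},
}


def execution_order(deps):
    """Return a run order in which every task comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def is_acyclic(deps):
    """Return True if the dependency graph has no cycles, as a DAG requires."""
    try:
        execution_order(deps)
        return True
    except CycleError:
        return False


order = execution_order(dependencies)
print(order)
```

The graph knows nothing about what each task does. It only records which tasks must finish before another can start, which is the same separation of concerns described above. A graph with a cycle, such as two tasks that depend on each other, has no valid run order, which is why the structure must be acyclic.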

For more information on DAGs and tasks, see the Apache Airflow documentation.

Cloud Composer environments

To run workflows, you first need to create an environment. Airflow depends on many microservices to run, so Cloud Composer provisions Google Cloud components to run your workflows. These components are collectively known as a Cloud Composer environment.

Environments are self-contained Airflow deployments based on Google Kubernetes Engine. They work with other Google Cloud services using connectors built into Airflow. You can create one or more environments in a single Google Cloud project. You can create Cloud Composer environments in any supported region.

For an in-depth look at the components of an environment, see Environment architecture.

What version of Apache Airflow does Cloud Composer use?

Cloud Composer supports both Airflow 1 and Airflow 2.

Cloud Composer environments are based on Cloud Composer images. When you create an environment, you can select an image with a specific Airflow version.

You have control over the Apache Airflow version of your environment. You can decide to upgrade your environment to a newer version of the Cloud Composer image. Each Cloud Composer release supports several Apache Airflow versions.

Can I use the native Airflow UI and CLI?

You can access the Apache Airflow web interface of your environment. Each of your environments has its own Airflow UI. For more information about accessing the Airflow UI, see Airflow web interface.

To run Airflow CLI commands in your environments, you use gcloud commands. For more information about running Airflow CLI commands in Cloud Composer environments, see Airflow command-line interface.
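As a rough sketch of what such a command looks like, the helper below assembles a gcloud invocation as an argument list. It assumes the general form gcloud composer environments run ENVIRONMENT --location LOCATION SUBCOMMAND -- ARGUMENTS; check the gcloud reference for the exact syntax your version supports. The helper name composer_run_command, the environment name example-environment, and the region us-central1 are hypothetical. The subcommand dags list is the Airflow 2 form; Airflow 1 uses different subcommand names.

```python
import shlex


def composer_run_command(environment, location, subcommand, *subcommand_args):
    """Build the argument list for running an Airflow CLI subcommand via gcloud.

    Hypothetical helper: it only assembles the command and does not run it.
    """
    command = [
        "gcloud", "composer", "environments", "run", environment,
        "--location", location,
        *subcommand.split(),
    ]
    if subcommand_args:
        # Arguments intended for the Airflow CLI follow a "--" separator.
        command += ["--", *subcommand_args]
    return command


cmd = composer_run_command("example-environment", "us-central1", "dags list")
print(shlex.join(cmd))
```

To execute the command, you could pass the list to subprocess.run(cmd, check=True) on a machine where gcloud is installed and authenticated.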

What's next