Quickstart

This page shows you how to create a Cloud Composer environment and run an Apache Airflow DAG in Cloud Composer 1.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.

  4. Enable the Cloud Composer API.

    Enable the API

Create an environment

Console

  1. In the Google Cloud Console, go to the Create environment page.

    Go to Create environment

  2. In the Name field, enter example-environment.

  3. In the Location drop-down list, select a region for the Cloud Composer environment. See Available regions for information about selecting a region.

  4. For other environment configuration options, use the provided defaults.

  5. To create the environment, click Create.

  6. Wait until the environment is created. When creation finishes, a green check mark appears next to the environment name.

gcloud

gcloud composer environments create ENVIRONMENT_NAME \
    --location LOCATION

Replace:

  • ENVIRONMENT_NAME with the name of the environment. This quickstart uses example-environment.

  • LOCATION with a region for the Cloud Composer environment. See Available regions for information about selecting a region.

Example:

gcloud composer environments create example-environment \
    --location us-central1
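If you drive environment creation from a script, the command above could be assembled as an argument list. The following is only a sketch: the helper name build_create_command is hypothetical, and it assumes the gcloud CLI is installed and authenticated before the command is actually run.

```python
import shlex
from typing import List


def build_create_command(environment_name: str, location: str) -> List[str]:
    """Build the argv for `gcloud composer environments create`.

    Hypothetical helper: it only assembles the command shown in this
    quickstart and does not call gcloud itself.
    """
    if not environment_name or not location:
        raise ValueError("environment_name and location must be non-empty")
    return [
        "gcloud", "composer", "environments", "create",
        environment_name,
        "--location", location,
    ]


command = build_create_command("example-environment", "us-central1")
# Print the command as a single shell-quoted string for review.
print(shlex.join(command))
```

To run the command instead of printing it, you could pass the list to subprocess.run(command, check=True).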

Terraform

To configure this environment using Terraform, add the following resource block to your Terraform configuration and run terraform apply.

For more information about using Terraform to create a Cloud Composer environment, refer to the Terraform documentation.

resource "google_composer_environment" "example" {
  name = "ENVIRONMENT_NAME"
  region = "LOCATION"
}

Replace:

  • ENVIRONMENT_NAME with the name of the environment.

  • LOCATION with a region for the Cloud Composer environment. See Available regions for information about selecting a region.

Example:

resource "google_composer_environment" "example" {
  name = "example-environment"
  region = "us-central1"
}

View environment details

After the environment creation finishes, you can view your environment's information, such as the Cloud Composer version, the URL for the Airflow web interface, and the DAGs folder in Cloud Storage.

To view the environment information:

  1. In the Google Cloud Console, go to the Environments page.

    Go to Environments

  2. To view the Environment details page, click the name of your environment, example-environment.
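The same details can also be read from the command line. The sketch below assumes you saved the output of `gcloud composer environments describe example-environment --location us-central1 --format=json`, and that the fields are named config.softwareConfig.imageVersion, config.airflowUri, and config.dagGcsPrefix; verify these names against your own output. The helper name and every value in SAMPLE are made-up placeholders.

```python
import json
from typing import Dict


def summarize_environment(describe_json: str) -> Dict[str, str]:
    """Extract the version, Airflow URL, and DAGs folder from describe output.

    Assumption: field names follow the Environment resource as described
    in the lead-in; missing fields are returned as empty strings.
    """
    environment = json.loads(describe_json)
    config = environment.get("config", {})
    return {
        "image_version": config.get("softwareConfig", {}).get("imageVersion", ""),
        "airflow_uri": config.get("airflowUri", ""),
        "dags_folder": config.get("dagGcsPrefix", ""),
    }


# Placeholder data for illustration only; not real output.
SAMPLE = json.dumps({
    "name": "example-environment",
    "config": {
        "softwareConfig": {"imageVersion": "PLACEHOLDER_IMAGE_VERSION"},
        "airflowUri": "https://example.invalid/airflow",
        "dagGcsPrefix": "gs://example-bucket/dags",
    },
})

for key, value in summarize_environment(SAMPLE).items():
    print(f"{key}: {value}")
```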

Create a DAG

An Airflow DAG is a collection of organized tasks that you want to schedule and run. DAGs are defined in standard Python files.

The Python code in quickstart.py:

  1. Creates a DAG, composer_sample_dag. The DAG runs once per day.
  2. Executes one task, print_dag_run_conf. The task prints the DAG run's ID by using the bash operator.

To create a DAG, create a copy of the quickstart.py file on your local machine.

Airflow 1

import datetime

import airflow
from airflow.operators import bash_operator

YESTERDAY = datetime.datetime.now() - datetime.timedelta(days=1)

default_args = {
    'owner': 'Composer Example',
    'depends_on_past': False,
    'email': [''],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': datetime.timedelta(minutes=5),
    'start_date': YESTERDAY,
}

with airflow.DAG(
        'composer_sample_dag',
        catchup=False,
        default_args=default_args,
        schedule_interval=datetime.timedelta(days=1)) as dag:

    # Print the dag_run ID to the Airflow logs
    print_dag_run_conf = bash_operator.BashOperator(
        task_id='print_dag_run_conf', bash_command='echo {{ dag_run.id }}')

Airflow 2

import datetime

import airflow
from airflow.operators import bash

YESTERDAY = datetime.datetime.now() - datetime.timedelta(days=1)

default_args = {
    'owner': 'Composer Example',
    'depends_on_past': False,
    'email': [''],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': datetime.timedelta(minutes=5),
    'start_date': YESTERDAY,
}

with airflow.DAG(
        'composer_sample_dag',
        catchup=False,
        default_args=default_args,
        schedule_interval=datetime.timedelta(days=1)) as dag:

    # Print the dag_run ID to the Airflow logs
    print_dag_run_conf = bash.BashOperator(
        task_id='print_dag_run_conf', bash_command='echo {{ dag_run.id }}')
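Setting start_date to one day in the past with a one-day schedule_interval is what makes a run due as soon as the DAG is picked up. The sketch below is a deliberately simplified model of that timing, not Airflow's scheduler code; the function name first_run_is_due and the fixed timestamp are illustrative assumptions.

```python
import datetime


def first_run_is_due(start_date: datetime.datetime,
                     schedule_interval: datetime.timedelta,
                     now: datetime.datetime) -> bool:
    """Simplified model: the first run becomes due once one full
    schedule interval has elapsed after start_date."""
    return start_date + schedule_interval <= now


# Arbitrary fixed time so that the example is reproducible.
now = datetime.datetime(2021, 10, 4, 15, 0, 0)
interval = datetime.timedelta(days=1)
yesterday = now - datetime.timedelta(days=1)

print(first_run_is_due(yesterday, interval, now))  # True: one interval has elapsed
print(first_run_is_due(now, interval, now))        # False: the interval has not elapsed
```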

Upload the DAG to Cloud Storage

Cloud Composer schedules only the DAGs that are located in the /dags folder in the environment's Cloud Storage bucket.

To schedule your DAG, upload quickstart.py from your local machine to your environment's /dags folder.
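Before uploading, it can help to confirm that the file parses and declares the DAG ID you expect. The following sketch uses only Python's standard ast module, so Airflow does not need to be installed locally. The helper name find_dag_ids is hypothetical, and the check only recognizes DAG IDs passed as the first positional string argument to a call named DAG.

```python
import ast
from typing import List


def find_dag_ids(source: str) -> List[str]:
    """Return the first positional string argument of each call named DAG.

    ast.parse raises SyntaxError if the source is not valid Python, so a
    successful return also means that the file parses.
    """
    tree = ast.parse(source)
    dag_ids = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both `airflow.DAG(...)` and a bare `DAG(...)`.
        if isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = getattr(func, "id", None)
        if name != "DAG" or not node.args:
            continue
        first = node.args[0]
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            dag_ids.append(first.value)
    return dag_ids


# Shortened stand-in for quickstart.py, used only for this example.
SAMPLE_SOURCE = """
import airflow

with airflow.DAG('composer_sample_dag', catchup=False) as dag:
    pass
"""

print(find_dag_ids(SAMPLE_SOURCE))
```

To check your local copy, you could read quickstart.py into a string and pass it to find_dag_ids.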

Console

  1. In the Google Cloud Console, go to the Environments page.

    Go to Environments

  2. To open the /dags folder, follow the DAGs folder link for example-environment.

  3. On the Bucket details page, click Upload files and then select your local copy of quickstart.py.

  4. To upload the file, click Open.

    After you upload your DAG, Cloud Composer adds the DAG to Airflow and schedules a DAG run immediately. It might take a few minutes for the DAG to show up in the Airflow web interface.

gcloud

To upload quickstart.py with gcloud, run the following command:

gcloud composer environments storage dags import \
    --environment example-environment \
    --location us-central1 \
    --source quickstart.py
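If you script the upload, the sketch below assembles the import command shown above and works out where the file should land in the /dags folder. Both helper names are hypothetical, and the DAGs folder URI is built from the example bucket name that appears later on this page; use your environment's own DAGs folder.

```python
from pathlib import PurePath
from typing import List


def build_import_command(environment: str, location: str, source: str) -> List[str]:
    """Build the argv for `gcloud composer environments storage dags import`."""
    return [
        "gcloud", "composer", "environments", "storage", "dags", "import",
        "--environment", environment,
        "--location", location,
        "--source", source,
    ]


def dag_destination(dags_folder: str, local_path: str) -> str:
    """Return the expected object URI for a DAG file uploaded to dags_folder."""
    return dags_folder.rstrip("/") + "/" + PurePath(local_path).name


command = build_import_command("example-environment", "us-central1", "quickstart.py")
print(" ".join(command))

# Assumed DAGs folder, based on the example bucket name on this page.
print(dag_destination(
    "gs://us-central1-example-environ-c1616fe8-bucket/dags", "quickstart.py"))
```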

View the DAG in the Airflow web interface

Each Cloud Composer environment has a web server that runs the Airflow web interface. You can manage DAGs from the Airflow web interface.

To view the DAG in the Airflow web interface:

Airflow 1

  1. In the Google Cloud Console, go to the Environments page.

    Go to Environments

  2. To open the Airflow web interface, click the Airflow link for example-environment. The Airflow UI opens in a new browser window.

  3. In the Airflow toolbar, go to the DAGs page.

  4. To open the DAG details page, click composer_sample_dag.

    The page for the DAG shows the Tree View, a graphical representation of the workflow's tasks and dependencies.

Airflow 2

  1. In the Google Cloud Console, go to the Environments page.

    Go to Environments

  2. To open the Airflow web interface, click the Airflow link for example-environment. The Airflow UI opens in a new browser window.

  3. In the Airflow toolbar, go to the DAGs page.

  4. To open the DAG details page, click composer_sample_dag.

    The page for the DAG shows the Tree View, a graphical representation of the workflow's tasks and dependencies.

View task instance details in the Airflow logs

The DAG that you scheduled includes the print_dag_run_conf task. The task prints the DAG run's ID, which you can see in the Airflow logs for the task instance.

To view the task instance details:

Airflow 1

  1. In the DAG's Tree View in the Airflow web interface, click Graph View.

    If you hold the pointer over the print_dag_run_conf task, its status displays.

  2. Click the print_dag_run_conf task.

    In the Task Instance context menu, you can get metadata and perform some actions.

  3. In the Task Instance context menu, click View Log.

  4. In the Log, look for Running: ['bash' to see the output from the bash operator.

Airflow 2

  1. In the DAG's Tree View in the Airflow web interface, click Graph View.

    If you hold the pointer over the print_dag_run_conf task, its status displays.

  2. Click the print_dag_run_conf task.

    In the Task Instance context menu, you can get metadata and perform some actions.

  3. In the Task Instance context menu, click Log.

  4. In the log, look for Running command: ['bash' to see the output from the bash operator.

    [2021-10-04 15:27:21,029] {subprocess.py:63} INFO - Running command: ['bash', '-c', 'echo 735']
    [2021-10-04 15:27:21,167] {subprocess.py:74} INFO - Output:
    [2021-10-04 15:27:21,168] {subprocess.py:78} INFO - 735
    [2021-10-04 15:27:21,168] {subprocess.py:82} INFO - Command exited with return code 0
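If you download the task log and want to pull out the bash output programmatically, a sketch like the following could work. It assumes each log record sits on a single line (wrapped lines rejoined) and follows the `[timestamp] {file:line} LEVEL - message` layout of the excerpt above; the function name bash_output is hypothetical, and other Airflow versions may format logs differently.

```python
import re
from typing import Iterable, List

# Matches records such as:
# [2021-10-04 15:27:21,168] {subprocess.py:78} INFO - 735
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] \{(?P<source>[^}]+)\} (?P<level>[A-Z]+) - (?P<message>.*)$"
)


def bash_output(log_lines: Iterable[str]) -> List[str]:
    """Return the messages logged between 'Output:' and 'Command exited with'."""
    output = []
    capturing = False
    for line in log_lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # Skip lines that do not look like log records.
        message = match.group("message")
        if message.startswith("Output:"):
            capturing = True
        elif message.startswith("Command exited with"):
            capturing = False
        elif capturing:
            output.append(message)
    return output


SAMPLE_LOG = [
    "[2021-10-04 15:27:21,029] {subprocess.py:63} INFO - Running command: ['bash', '-c', 'echo 735']",
    "[2021-10-04 15:27:21,167] {subprocess.py:74} INFO - Output:",
    "[2021-10-04 15:27:21,168] {subprocess.py:78} INFO - 735",
    "[2021-10-04 15:27:21,168] {subprocess.py:82} INFO - Command exited with return code 0",
]

print(bash_output(SAMPLE_LOG))  # ['735']
```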
    

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

Choose one of the options:

  • The most straightforward way to clean up is to delete the project that you created for the quickstart.
  • As an alternative, you can delete the individual resources.

Delete the project

  1. In the Cloud Console, go to the Manage resources page.

    Go to Manage resources

  2. If the project that you plan to delete is attached to an organization, expand the Organization list in the Name column.
  3. In the project list, select the project that you want to delete, and then click Delete.
  4. In the dialog, type the project ID, and then click Shut down to delete the project.

Delete individual resources

Instead of deleting the project, you can delete the resources used in this quickstart:

  1. Delete the Cloud Composer environment:

    1. In the Google Cloud Console, go to the Environments page.

      Go to Environments

    2. Select example-environment and click Delete.

    3. Wait until the environment is deleted.

  2. Delete your environment's bucket. Deleting the Cloud Composer environment does not delete its bucket.

    1. In the Google Cloud Console, go to the Storage > Browser page.

      Go to Storage > Browser

    2. Select the environment's bucket and click Delete. For example, this bucket can be named us-central1-example-environ-c1616fe8-bucket.

  3. Delete the persistent disk of your environment's Redis queue. Deleting the Cloud Composer environment does not delete its persistent disk.

    1. In the Google Cloud Console, go to the Compute Engine > Disks page.

      Go to Disks

    2. Select the environment's Redis queue persistent disk and click Delete. For example, this disk can be named gke-us-central1-exampl-pvc-b12055b6-c92c-43ff-9de9-10f2cc6fc0ee. Such disks always have the Standard persistent disk type and a size of 2 GB.
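The three deletions above could also be scripted with the gcloud CLI. The sketch below only assembles the commands, in the same order as the steps. The helper name is hypothetical, the bucket and disk names are the examples from this page, and the zone is a placeholder because this page does not say which zone the disk is in. Review each command before running it, because these operations are destructive.

```python
from typing import List


def build_cleanup_commands(environment: str, location: str, bucket: str,
                           disk: str, zone: str) -> List[List[str]]:
    """Return gcloud commands that delete the environment, its bucket,
    and its Redis queue disk, in that order.

    Assumption: `zone` must be looked up by the caller; it is not
    derived from the environment.
    """
    return [
        ["gcloud", "composer", "environments", "delete", environment,
         "--location", location],
        ["gcloud", "storage", "rm", "--recursive", f"gs://{bucket}"],
        ["gcloud", "compute", "disks", "delete", disk, "--zone", zone],
    ]


commands = build_cleanup_commands(
    environment="example-environment",
    location="us-central1",
    bucket="us-central1-example-environ-c1616fe8-bucket",
    disk="gke-us-central1-exampl-pvc-b12055b6-c92c-43ff-9de9-10f2cc6fc0ee",
    zone="ZONE",  # Placeholder: replace with the disk's actual zone.
)

for command in commands:
    print(" ".join(command))
```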

What's next