
Quickstart

This page shows you how to create a Cloud Composer environment in the Google Cloud Platform Console and run a simple Apache Airflow DAG (also called a workflow).

Before you begin

  1. Sign in to your Google Account.

    If you don't already have an account, sign up for a new one.

  2. Select or create a Google Cloud Platform project.

    Go to the Manage resources page

  3. Make sure that billing is enabled for your Google Cloud Platform project.

    Learn how to enable billing

  4. Enable the Cloud Composer API.

    Enable the API

Creating an environment

  1. In the GCP Console, go to the Create environment page.

    Open the Create environment page

  2. In the Name field, enter example-environment.

  3. In the Location drop-down list, select a region for the Cloud Composer environment. See Available regions for information on selecting a region.

  4. For other environment configuration options, use the provided defaults.

  5. To create the environment, click Create.

  6. Wait for environment creation to complete. When it's done, a green check mark appears to the left of the environment name.
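
You can also create the environment from a script. The following is a minimal sketch, assuming the google-cloud-orchestration-airflow Python client library and application default credentials; the project ID and region shown are placeholders that you would replace with your own values.

# Sketch: create a Cloud Composer environment programmatically.
# Assumes `pip install google-cloud-orchestration-airflow`; my-project and
# us-central1 below are placeholder values.
from google.cloud.orchestration.airflow import service_v1

client = service_v1.EnvironmentsClient()
parent = 'projects/my-project/locations/us-central1'

request = service_v1.CreateEnvironmentRequest(
    parent=parent,
    environment=service_v1.Environment(
        name=parent + '/environments/example-environment'),
)

# create_environment returns a long-running operation; result() blocks
# until the environment is ready, which can take a while.
operation = client.create_environment(request=request)
environment = operation.result()
print('Created environment:', environment.name)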

Viewing environment details

After environment creation is completed, you can view your environment's deployment information, such as the Cloud Composer version, the URL for the Airflow web interface, and the DAGs folder in Cloud Storage.

To view deployment information:

  1. In the GCP Console, go to the Environments page.

    Open the Environments page

  2. To view the Environment details page, click example-environment.
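
You can also read the same deployment details programmatically. The following is a minimal sketch, assuming the google-cloud-orchestration-airflow Python client library; the project ID and region are placeholders.

# Sketch: fetch the environment and print its deployment details.
# Assumes `pip install google-cloud-orchestration-airflow`; my-project and
# us-central1 below are placeholder values.
from google.cloud.orchestration.airflow import service_v1

client = service_v1.EnvironmentsClient()
name = ('projects/my-project/locations/us-central1'
        '/environments/example-environment')

environment = client.get_environment(
    request=service_v1.GetEnvironmentRequest(name=name))
print('Airflow web UI:', environment.config.airflow_uri)
print('DAGs folder:', environment.config.dag_gcs_prefix)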

Creating a DAG

An Airflow DAG is a collection of organized tasks that you want to schedule and run. DAGs are defined in standard Python files.
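
For example, a DAG with more than one task declares the order in which the tasks run in the same Python file. The following is a minimal sketch (not part of this quickstart) of a two-task DAG; the DAG ID, start date, and commands are illustrative only.

import datetime

import airflow
from airflow.operators import bash_operator

# Sketch of a two-task DAG that runs 'second' only after 'first' completes.
with airflow.DAG(
        'two_task_example',
        start_date=datetime.datetime(2018, 1, 1),
        schedule_interval=None) as dag:

    first = bash_operator.BashOperator(
        task_id='first', bash_command='echo first')
    second = bash_operator.BashOperator(
        task_id='second', bash_command='echo second')

    first >> second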

The Python code in quickstart.py:

  1. Creates a DAG, composer_sample_dag. The DAG runs once per day.
  2. Executes one task, print_dag_run_conf. The task prints the DAG run's ID by using the bash operator.

To create the DAG, save a copy of the quickstart.py file to your local machine.

import datetime

import airflow
from airflow.operators import bash_operator

YESTERDAY = datetime.datetime.now() - datetime.timedelta(days=1)

default_args = {
    'owner': 'Composer Example',
    'depends_on_past': False,
    'email': [''],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': datetime.timedelta(minutes=5),
    'start_date': YESTERDAY,
}

with airflow.DAG(
        'composer_sample_dag',
        catchup=False,
        default_args=default_args,
        schedule_interval=datetime.timedelta(days=1)) as dag:

    # Print the dag_run's ID in the Airflow task logs
    print_dag_run_conf = bash_operator.BashOperator(
        task_id='print_dag_run_conf', bash_command='echo {{ dag_run.id }}')
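
Before uploading the file, you can check that it parses and defines the expected DAG. One way to do this, sketched below, is to load it with Airflow's DagBag in a local environment where the apache-airflow package is installed and quickstart.py is in the current directory.

# Sketch: confirm that quickstart.py loads without errors.
from airflow.models import DagBag

dag_bag = DagBag(dag_folder='.', include_examples=False)

# Any syntax or import problems in the DAG file show up in import_errors.
print('Import errors:', dag_bag.import_errors)
print('DAGs found:', list(dag_bag.dags))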

Uploading the DAG to Cloud Storage

Cloud Composer schedules only the DAGs that are in the DAGs folder in the environment's Cloud Storage bucket.

To schedule your DAG, move quickstart.py from your local machine to your environment's DAGs folder:

  1. In the GCP Console, go to the Environments page.

    Open the Environments page

  2. To open the /dags folder, click the DAGs folder link for example-environment.

  3. On the Bucket details page, click Upload files and then select your local copy of quickstart.py.

  4. To upload the file, click Open.

    After you upload your DAG, Cloud Composer adds the DAG to Airflow and schedules the DAG immediately. It might take a few minutes for the DAG to show up in the Airflow web interface.
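
If you prefer to upload the file from a script instead of the console, the following is a minimal sketch that uses the Cloud Storage Python client; the bucket name is a placeholder that you would replace with your environment's bucket (the target of the DAGs folder link).

# Sketch: upload quickstart.py to the environment's DAGs folder.
# Assumes `pip install google-cloud-storage` and application default
# credentials; the bucket name below is a placeholder.
from google.cloud import storage

client = storage.Client()
bucket = client.bucket('us-central1-example-environ-bucket')
blob = bucket.blob('dags/quickstart.py')
blob.upload_from_filename('quickstart.py')
print('Uploaded to gs://{}/{}'.format(bucket.name, blob.name))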

Viewing the DAG in the Airflow web interface

Each Cloud Composer environment has a web server that runs the Airflow web interface, which you can use to manage DAGs.

To view the DAG in the Airflow web interface:

  1. In the GCP Console, go to the Environments page.

    Open the Environments page

  2. To open the Airflow web interface, click the Airflow link for example-environment. The interface opens in a new browser window.

  3. In the Airflow toolbar, click DAGs.

  4. To open the DAG details page, click composer_sample_dag.

    The page for the DAG shows the Tree View, a graphical representation of the workflow's tasks and dependencies.

Viewing task instance details in the Airflow logs

The DAG that you scheduled includes the print_dag_run_conf task. The task prints the DAG run's ID, which you can see in the Airflow logs for the task instance.
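
The bash_command value is a Jinja template that Airflow renders when the task instance runs, which is how the run's details end up in the command output. The following minimal sketch (not part of the quickstart) shows a task that echoes other built-in template variables; the DAG ID and start date are illustrative only.

import datetime

import airflow
from airflow.operators import bash_operator

# Sketch: Airflow substitutes template variables such as the execution
# date ({{ ds }}) and run ID ({{ run_id }}) before running the command.
with airflow.DAG(
        'templating_example',
        start_date=datetime.datetime(2018, 1, 1),
        schedule_interval=None) as dag:

    print_run_details = bash_operator.BashOperator(
        task_id='print_run_details',
        bash_command='echo "date={{ ds }} run_id={{ run_id }}"')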

To view the task instance details:

  1. In the DAG's Tree View in the Airflow web interface, click Graph View.

    If you hold the pointer over the graphic for the print_dag_run_conf task, its status is displayed. Note that the border around the task also indicates the status (for example, a light-green border means the task is running).

  2. Click the print_dag_run_conf task.

    The Task Instance Context Menu appears. Here you can view metadata about the task instance and perform some actions.

  3. In the Task Instance Context Menu, click View Log.

  4. In the Log, look for Running: ['bash' to see the output from the bash operator.

Clean up

To avoid incurring charges to your GCP account for the resources used in this quickstart:

  1. In the GCP Console, go to the Projects page.

    Go to the Projects page

  2. If the project that you want to delete is attached to an organization, select the organization from the organization drop-down list at the top of the page.
  3. In the project list, select the project that you want to delete and click Delete.
  4. In the dialog, type the project ID and then click Shut down to delete the project.

Alternatively, you can delete the resources used in this tutorial:

  1. Delete the Cloud Composer environment.
  2. Delete the Cloud Storage bucket for the Cloud Composer environment. Deleting the Cloud Composer environment does not delete its bucket.
  3. Delete the Cloud Pub/Sub topics for the Cloud Composer environment (composer-agent and composer-backend). Deleting the Cloud Composer environment does not delete these topics.
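
If you delete individual resources rather than the project, the bucket and topic cleanup can also be scripted. The following is a minimal sketch using the Cloud Storage and Pub/Sub Python clients; the bucket name and project ID are placeholders, and the topic names are the ones listed above.

# Sketch: delete the environment's Cloud Storage bucket and Pub/Sub topics
# after the environment itself has been deleted. The bucket name and
# project ID below are placeholders.
from google.cloud import pubsub_v1, storage

storage_client = storage.Client()
bucket = storage_client.bucket('us-central1-example-environ-bucket')

# A bucket must be empty before it can be deleted.
for blob in bucket.list_blobs():
    blob.delete()
bucket.delete()

publisher = pubsub_v1.PublisherClient()
for topic_id in ('composer-agent', 'composer-backend'):
    topic_path = publisher.topic_path('my-project', topic_id)
    publisher.delete_topic(topic=topic_path)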

What's next
