Testing DAGs (workflows)

Before deploying DAGs to production, you can execute Airflow CLI sub-commands to parse DAG code in the same context under which the DAG is executed.

Testing during DAG creation

You can run a single task instance locally and view the log output. Viewing the output enables you to check for syntax and task errors. Testing locally does not check dependencies or communicate status to the database.

We recommend that you put your DAGs in a data/test folder in your test environment.
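
The examples on this page use a hello_world DAG with a print_date task. A minimal sketch of what such a DAG might look like follows; the schedule and start date are illustrative, and the import path assumes Airflow 2 (in Airflow 1.10, BashOperator lives in airflow.operators.bash_operator):

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator  # Airflow 2 import path

with DAG(
    dag_id="hello_world",
    start_date=datetime(2021, 4, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # {{ ds }} renders to the execution date that you pass when testing,
    # so the task's behavior is easy to verify from the log output.
    print_date = BashOperator(
        task_id="print_date",
        bash_command="echo {{ ds }}",
    )

Save the file in the data/test folder of your environment's bucket so that the commands in the following sections can find it.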

Checking for syntax errors

  1. In the Cloud Storage bucket for your environment, create a test directory inside the data folder and copy your DAGs to it.

  2. To check for syntax errors, enter the following gcloud command:

    Airflow 1.10 CLI

    gcloud composer environments run \
      ENVIRONMENT_NAME \
      --location ENVIRONMENT_LOCATION \
      list_dags -- -sd /home/airflow/gcs/data/test
    

    Airflow 2.0 CLI

    gcloud beta composer environments run \
      ENVIRONMENT_NAME \
      --location ENVIRONMENT_LOCATION \
      dags list -- --subdir /home/airflow/gcs/data/test
    

    Where:

    • ENVIRONMENT_NAME is the name of the environment.
    • ENVIRONMENT_LOCATION is the Compute Engine region where the environment is located.

    For example:

    Airflow 1.10 CLI

    gcloud composer environments run \
    test-environment --location us-central1 \
    list_dags -- -sd /home/airflow/gcs/data/test
    

    Airflow 2.0 CLI

    gcloud beta composer environments run \
    test-environment --location us-central1 \
    dags list -- --subdir /home/airflow/gcs/data/test
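
If you have Airflow installed locally, you can run an equivalent check before uploading anything. The following is a minimal sketch using Airflow's DagBag loader; the local data/test path is an assumption about where you keep the files on your workstation:

from airflow.models import DagBag

# Import every DAG file in the folder. Any syntax or import failure is
# collected in import_errors, keyed by file path.
dag_bag = DagBag(dag_folder="data/test", include_examples=False)

for path, error in dag_bag.import_errors.items():
    print(f"{path}: {error}")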
    

Checking for task errors

To check for task-specific errors, enter the following gcloud command:

Airflow 1.10 CLI

gcloud composer environments run \
  ENVIRONMENT_NAME \
  --location ENVIRONMENT_LOCATION \
  test -- -sd /home/airflow/gcs/data/test DAG_ID \
  TASK_ID DAG_EXECUTION_DATE

Airflow 2.0 CLI

gcloud beta composer environments run \
  ENVIRONMENT_NAME \
  --location ENVIRONMENT_LOCATION \
  tasks test -- --subdir /home/airflow/gcs/data/test \
  DAG_ID TASK_ID \
  DAG_EXECUTION_DATE

Where:

  • ENVIRONMENT_NAME is the name of the environment.
  • ENVIRONMENT_LOCATION is the Compute Engine region where the environment is located.
  • DAG_ID is the ID of the DAG.
  • TASK_ID is the ID of the task.
  • DAG_EXECUTION_DATE is the execution date of the DAG. This date is used for templating purposes; regardless of the date you specify here, the task runs immediately. For example, passing 2021-04-22 renders the {{ ds }} template as 2021-04-22.

For example:

Airflow 1.10 CLI

gcloud composer environments run test-environment \
  --location us-central1 \
  test -- -sd /home/airflow/gcs/data/test \
  hello_world print_date 2021-04-22

Airflow 2.0 CLI

gcloud beta composer environments run \
  test-environment \
  --location us-central1 \
  tasks test -- --subdir /home/airflow/gcs/data/test \
  hello_world print_date 2021-04-22
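
The same check works as a local unit test for task-level wiring. Here is a hedged pytest-style sketch that verifies the hello_world DAG loads and contains the print_date task (the test name and folder path are illustrative):

from airflow.models import DagBag

def test_hello_world_contains_print_date():
    dag_bag = DagBag(dag_folder="data/test", include_examples=False)
    dag = dag_bag.get_dag("hello_world")
    assert dag is not None, "hello_world failed to load"
    assert "print_date" in dag.task_ids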

Updating and testing a deployed DAG

To test updates to your DAGs in your test environment:

  1. Copy the deployed DAG that you want to update to data/test.
  2. Update the DAG.
  3. Test the DAG.
    1. Check for syntax errors.
    2. Check for task-specific errors.
  4. Make sure the DAG runs successfully.
  5. Turn off the DAG in your test environment.
    1. Go to the Airflow UI > DAGs page.
    2. If the DAG that you're modifying runs constantly, turn it off.
    3. To expedite outstanding tasks, click the task and then click Mark Success.
  6. Deploy the DAG to your production environment.
    1. Turn off the DAG in your production environment.
    2. Upload the updated DAG to the dags/ folder in your production environment (a scripted version is sketched below).
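
If you prefer to script the upload in step 6, you can use the Cloud Storage Python client. This is a minimal sketch; BUCKET_NAME and the local file name are placeholders for your production environment's bucket and DAG file:

from google.cloud import storage

client = storage.Client()
# BUCKET_NAME stands in for your production environment's bucket.
bucket = client.bucket("BUCKET_NAME")
bucket.blob("dags/hello_world.py").upload_from_filename("hello_world.py")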
