Monitoring Cloud Composer Environments in Stackdriver

This page describes how to use Stackdriver to monitor Cloud Composer environments and walks through an example.

Stackdriver is a monitoring and logging service that enables you to view logs and metrics for your environment. For more information, see the Stackdriver documentation.

Viewing metrics

Go to Stackdriver in the Google Cloud Platform Console to view the Stackdriver monitoring dashboards or to define Stackdriver alerts. You can also use the Stackdriver monitoring API to query and view metrics.
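As an illustration of querying metrics programmatically, the sketch below builds a `timeSeries.list` request URL for the Monitoring API v3 REST endpoint. This is a hedged, plain-Python sketch: the project ID and metric type are placeholders, and authentication (an OAuth bearer token on the request) is omitted.

```python
# Hedged sketch: build a Stackdriver Monitoring API v3 timeSeries.list
# request URL for a Cloud Composer metric. The project ID and metric type
# below are placeholders; authentication is omitted.
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def time_series_url(project_id, metric_type, hours=1):
    """Return a REST URL that queries `metric_type` over the last `hours` hours."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    params = urlencode({
        'filter': 'metric.type = "{}"'.format(metric_type),
        'interval.startTime': start.isoformat(),
        'interval.endTime': end.isoformat(),
    })
    return 'https://monitoring.googleapis.com/v3/projects/{}/timeSeries?{}'.format(
        project_id, params)


url = time_series_url('my-project', 'composer.googleapis.com/environment/healthy')
```

Sending a GET request to this URL with valid credentials returns the matching time series as JSON.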

Metrics and resource types

The Stackdriver documentation includes the following information about Cloud Composer metrics and resources:

  • For the usage metrics that Cloud Composer reports to Stackdriver, see Metrics List.
  • For details about the cloud_composer_environment resource type, see Monitored Resource Types in the Stackdriver documentation.

Health checking your Cloud Composer environment

Because of Cloud Composer's microservice architecture, the end-to-end liveness monitoring of your Airflow setup involves multiple metrics.

The example in this section shows one approach to monitoring your entire Airflow setup by using Stackdriver's logs-based metrics. The approach involves:

  • Deploying a simple DAG that runs every minute in the Cloud Composer environment.
  • Using logs-based metrics to create a custom metric from the logs that Stackdriver collects for Cloud Composer.
  • Setting up monitoring and alerting based on the custom metric.

Step 1. Add the DAG to your environment.

  1. Upload the following DAG to the Cloud Storage bucket for the environment:

    import airflow
    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator
    from datetime import timedelta

    default_args = {
        'retries': 1,
        'retry_delay': timedelta(minutes=5),
        'start_date': airflow.utils.dates.days_ago(0),
    }

    # Runs every minute so its logs provide a steady liveness signal.
    dag = DAG(
        'liveness',
        default_args=default_args,
        description='liveness monitoring dag',
        schedule_interval=timedelta(minutes=1))

    # Each run emits a log entry that the logs-based metric counts.
    t1 = BashOperator(
        task_id='echo', bash_command='echo test', dag=dag, depends_on_past=False)

  2. Wait until Airflow schedules the DAG. You can check the status in the Airflow web interface.

Airflow web interface
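As an aside on step 1, the upload can be scripted. The sketch below builds the Cloud Storage JSON API media-upload URL that places the file in the environment's dags/ folder; the bucket name is hypothetical (use the bucket shown on your environment's details page), and authentication is omitted.

```python
# Hedged sketch: build a Cloud Storage JSON API media-upload URL for
# placing the DAG file in the dags/ folder of the environment's bucket.
# The bucket name used below is hypothetical; authentication is omitted.
from urllib.parse import quote


def dag_upload_url(bucket, filename):
    """Cloud Composer reads DAGs from the dags/ folder of the environment bucket."""
    object_name = quote('dags/' + filename, safe='')
    return ('https://storage.googleapis.com/upload/storage/v1/b/{}/o'
            '?uploadType=media&name={}'.format(bucket, object_name))


url = dag_upload_url('us-central1-example-environ-bucket', 'liveness.py')
```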

Step 2. Create the logs-based counter metric.

  1. Go to the Logs Viewer.


  2. Make sure the correct project is selected; if not, use the drop-down menu at the top of the page to select one.

  3. In the resource field, select Cloud Composer Environment and then select the location and environment name.

  4. In the log type field, select airflow-worker.

  5. In the filter bar, create a filter to show only logs that the liveness DAG emits by entering the following label: label:workflow:liveness.

  6. At the top of the page, click Create Metric. The Metric Editor displays on the right side of the page, and the viewer panel showing your logs displays on the left side.

    Create metric

  7. In the Metric Editor panel, set the following fields:

    1. Name: Choose a name that is unique among logs-based metrics in your project.
    2. Description: Describe the metric.
    3. Type: Counter.
  8. Click Create Metric. The Logs-based metrics page displays.
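The filter from step 5 can also be expressed in the Logs Viewer's advanced filter syntax. The location and environment name below are examples; substitute your own:

```
resource.type="cloud_composer_environment"
resource.labels.location="us-central1"
resource.labels.environment_name="example-environment"
log_id("airflow-worker")
labels.workflow="liveness"
```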

Step 3. Create the alerting policy.

  1. In the Logs-based metrics page, check the metric you want to create a policy for, and in the menu at the right side of the metric's listing, select Create alert from metric. A pre-populated Add Metric Threshold Condition window displays in Stackdriver Monitoring.

    Create alert

  2. In the Add Metric Threshold Condition window, set the Threshold value and make any other adjustments, as needed.

    Alert policy

  3. Click Save Condition, which displays the Create new alerting policy panel with your completed condition. You can add additional conditions, as needed.

  4. Fill in the Notifications, Documentation, and Name sections of the alerting policy.

  5. Click Save Policy.
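For reference, the policy these steps produce corresponds to an alertPolicies.create request body in the Monitoring API. The sketch below is a hedged example: the logs-based metric name (liveness-log-count), the threshold value, and the duration are assumptions, not values from this page.

```python
# Hedged sketch of a Monitoring API alertPolicies.create request body
# corresponding to the policy above. The metric name 'liveness-log-count',
# the threshold, and the duration are assumptions.
policy = {
    'displayName': 'Liveness DAG stopped emitting logs',
    'combiner': 'OR',
    'conditions': [{
        'displayName': 'Log entries per minute below threshold',
        'conditionThreshold': {
            # User-defined logs-based metrics live under logging.googleapis.com/user/.
            'filter': ('metric.type = "logging.googleapis.com/user/liveness-log-count" '
                       'AND resource.type = "cloud_composer_environment"'),
            'comparison': 'COMPARISON_LT',
            'thresholdValue': 1,
            'duration': '300s',
        },
    }],
}
```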
