Cloud Monitoring provides powerful logging and diagnostics. Dataflow integration with Monitoring lets you access Dataflow job metrics such as Job Status, Element Counts, System Lag (for streaming jobs), and User Counters from the Monitoring dashboards. You can also employ Monitoring alerting capabilities to notify you of various conditions, such as long streaming system lag or failed jobs.
Before you begin
Follow one of the quickstarts to set up your Dataflow project and to construct and run your pipeline.
To see metrics in Metrics Explorer, the controller service account must have the roles/monitoring.metricWriter role.
Custom metrics
Any metric you define in your Apache Beam pipeline is reported by Dataflow to Monitoring as a custom metric. There are three types of Apache Beam pipeline metrics: Counter, Distribution, and Gauge. Dataflow currently reports only Counter and Distribution to Monitoring. Distribution is reported as four submetrics suffixed with _MAX, _MIN, _MEAN, and _COUNT. Dataflow does not support creating a histogram from Distribution metrics.
Dataflow reports incremental updates to Monitoring approximately every 30 seconds. All Dataflow custom metrics are exported as a double data type to avoid conflicts. They appear in Monitoring as dataflow.googleapis.com/job/user_counter with metric_name: metric-name and ptransform: ptransform-name as labels, and they are subject to the cardinality limitations in Monitoring.
For backward compatibility, Dataflow also reports custom metrics to Monitoring as custom.googleapis.com/dataflow/metric-name. There is a limit of 100 Dataflow custom metrics per project published as custom.googleapis.com/dataflow/metric-name.
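When you query these metrics programmatically, the metric type and labels described above combine into a Monitoring time-series filter string. A minimal, stdlib-only sketch; the helper name user_counter_filter is illustrative:

```python
from typing import Optional


def user_counter_filter(metric_name: str,
                        ptransform: Optional[str] = None) -> str:
    """Build a Monitoring filter for a Dataflow custom (user counter) metric.

    metric_name and ptransform are the label values that Dataflow
    attaches to dataflow.googleapis.com/job/user_counter.
    """
    parts = [
        'metric.type = "dataflow.googleapis.com/job/user_counter"',
        f'metric.labels.metric_name = "{metric_name}"',
    ]
    if ptransform is not None:
        parts.append(f'metric.labels.ptransform = "{ptransform}"')
    return ' AND '.join(parts)


filter_str = user_counter_filter('word_count', 'MeasureWords')
```

A string like this can be passed as the filter of a Monitoring API timeSeries.list request or pasted into Metrics Explorer's filter field.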
Custom metrics reported to Monitoring incur charges based on Cloud Monitoring pricing.
Explore metrics
You can explore Dataflow metrics using Monitoring. Follow the steps in this section to observe the standard metrics provided for each of your Apache Beam pipelines.
In the Google Cloud console, select Monitoring:
In the left navigation pane, click
Metrics Explorer.
In the Find resource type and metric pane, select the dataflow_job resource type.
From the list that appears, select a metric you'd like to observe for one of your jobs.
Create alerting policies and dashboards
Monitoring provides access to Dataflow-related metrics. You can create dashboards to chart time series of metrics, and you can create alerting policies that notify you when metrics reach specified values.
Create groups of resources
You can create resource groups that include multiple Apache Beam pipelines so that you can easily set alerts and build dashboards.
In the Google Cloud console, select Monitoring:
In the Groups menu, select Create Group.
Add filter criteria that define the Dataflow resources included in the group. For example, one of your filter criteria can be the name prefix of your pipelines.
After the group is created, you can see the basic metrics related to resources in that group.
Create alerting policies for Dataflow metrics
Monitoring gives you the ability to create alerts and be notified when a certain metric crosses a specified threshold, such as when the system lag of a streaming pipeline increases above a predefined value.
In the Google Cloud console, select Monitoring:
In the Alerting menu, click Create Policy.
In the Create new alerting policy page, you can define the alerting conditions and notification channels.
For example, to set an alert on the system lag for the WindowedWordCount Apache Beam pipeline group, complete the following steps:
- Select Add condition.
- In the Find resource type or metric field, enter and select dataflow_job.
- In the Find resource type or metric field, select System lag.
After you've created an alert, you can review the events related to Dataflow by selecting See all events in the Events section. Every time an alert is triggered, an incident and a corresponding event are created. If you specified a notification mechanism in the alert, such as email or SMS, you also receive a notification.
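An alerting policy like the one above can also be expressed as a Monitoring API AlertPolicy resource. The sketch below shows only the request body as a plain dictionary; the display names and the 30-second threshold sustained for five minutes are illustrative assumptions, not values from this document:

```python
import json

# Sketch of a Monitoring API AlertPolicy body for a system-lag alert.
# Display names, threshold, and duration are illustrative choices.
policy = {
    "displayName": "Dataflow system lag too high",
    "combiner": "OR",
    "conditions": [
        {
            "displayName": "System lag above threshold",
            "conditionThreshold": {
                "filter": (
                    'metric.type = "dataflow.googleapis.com/job/system_lag" '
                    'AND resource.type = "dataflow_job"'
                ),
                "comparison": "COMPARISON_GT",
                "thresholdValue": 30,   # seconds; pick your own limit
                "duration": "300s",     # lag must persist for 5 minutes
            },
        }
    ],
}

body = json.dumps(policy, indent=2)
```

A body like this would be submitted through the alertPolicies.create method of the Monitoring API; creating the policy in the console, as described above, achieves the same result.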
Build your own custom monitoring dashboard
You can build Monitoring dashboards with the most relevant Dataflow-related charts.
Go to the Google Cloud console, and select Monitoring:
Select Dashboards > Create Dashboard.
Click Add Chart.
In the Add Chart window, select dataflow_job and the metric you want to chart.
In the Filter field, select a group that contains Apache Beam pipelines.
You can add as many charts to the dashboard as you like.
Receive worker VM metrics from the Monitoring agent
If you would like to monitor persistent disk, CPU, network, and process metrics from your Dataflow worker VM instances, you can enable the Monitoring agent when you run your pipeline. See the list of available Monitoring agent metrics.
To enable the Monitoring agent, use the --experiments=enable_stackdriver_agent_metrics option when running your pipeline. The controller service account must have the roles/monitoring.metricWriter role.
To disable the Monitoring agent without stopping your pipeline, update your pipeline by launching a replacement job without the --experiments=enable_stackdriver_agent_metrics parameter.
Storage and retention
Information about completed or cancelled Dataflow jobs is stored for 30 days.
Operational logs are stored in the _Default log bucket. The logging API service name is dataflow.googleapis.com. For more information about the Google Cloud monitored resource types and services used in Cloud Logging, see Monitored resources and services.
For details about how long log entries are retained by Logging, see the retention information in Quotas and limits: Logs retention periods.
For information about viewing operational logs, see Monitor and view pipeline logs.
What's next
To learn more, consider exploring these other resources: