Stackdriver Monitoring provides visibility into the performance, uptime, and overall health of cloud-powered applications. Stackdriver collects and ingests metrics, events, and metadata from Dataproc clusters to generate insights via dashboards and charts.
Use Stackdriver cluster metrics to monitor the performance and health of Dataproc clusters.
See Stackdriver Pricing to understand your costs.
See Monitoring Quotas and limits for information on metric data retention.
Stackdriver cluster metrics
Dataproc cluster resource metrics are automatically enabled on Dataproc clusters. Use Monitoring to view these metrics.
After creating a cluster, go to Monitoring in the Cloud Console to view cluster monitoring data.
When you first access Monitoring, it creates a Workspace and associates your Google Cloud project with that Workspace. If you've never used Monitoring, this process is automatic. If you have used Monitoring, then the Add your project to a Workspace dialog is displayed. To create a new Workspace, from the New Workspace list, select your Google Cloud project, and then click Add.
After setting up the Workspace, the Monitoring console appears. At this point, you can install the Monitoring agent on VMs in your project as an additional set-up step. You don't need to install the agent on VMs in Dataproc clusters because this step is performed for you when you create a Dataproc cluster.
- Select Metrics Explorer. From the "Find resource type and metric" drop-down list, select the "Cloud Dataproc Cluster" resource (or type "cloud_dataproc_cluster" in the box).
- Click again in the input box, and then select a metric from the drop-down list, for example, "YARN memory size". Hovering over the metric name displays information about the metric.
- You can select filters, group by metric labels, perform aggregations, and select chart viewing options (see the Monitoring documentation).
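The resource and metric selections above correspond to a Monitoring filter string. The following is a minimal sketch of how such a filter could be assembled. The helper `build_metric_filter`, the cluster name `example-cluster`, and the `cluster_name` label are illustrative assumptions, and the metric type shown is assumed to be the one behind "YARN memory size"; check the metric list for the exact names.

```python
def build_metric_filter(metric_type, resource_type="cloud_dataproc_cluster", labels=None):
    """Return a Monitoring filter string that ANDs the given clauses together.

    This helper is hypothetical; it only formats text and calls no API.
    """
    clauses = [
        f'resource.type = "{resource_type}"',
        f'metric.type = "{metric_type}"',
    ]
    # Optional resource-label restrictions, sorted for a stable result.
    for key, value in sorted((labels or {}).items()):
        clauses.append(f'resource.labels.{key} = "{value}"')
    return " AND ".join(clauses)


yarn_filter = build_metric_filter(
    "dataproc.googleapis.com/cluster/yarn/memory_size",  # assumed metric type
    labels={"cluster_name": "example-cluster"},  # hypothetical cluster name
)
print(yarn_filter)
```

Dropping the `labels` argument yields a filter that matches the metric across all Dataproc clusters in the project.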
You can use the Monitoring timeSeries.list API to capture and list metrics defined by a filter expression. Use the Try this API template on the API page to send an API request and display the response.
Example: A templated request that uses the following Monitoring timeSeries.list parameters returns a JSON response with the matching time series:
- name: projects/example-project-id
- filter: metric.type="dataproc.googleapis.com/cluster/hdfs/storage_capacity"
- interval.endTime: 2018-02-27T11:54:00.000-08:00
- interval.startTime: 2018-02-20T00:00:00.000-08:00
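The same parameters can be expressed as a REST request URL. The sketch below only builds the URL and sends nothing. It assumes the v3 endpoint `https://monitoring.googleapis.com/v3/{name}/timeSeries`; the helper `build_list_url` is hypothetical, and a real call would also need an OAuth 2.0 access token in the `Authorization` header.

```python
from urllib.parse import urlencode

# Assumed base URL of the Monitoring v3 REST API.
BASE_URL = "https://monitoring.googleapis.com/v3"


def build_list_url(name, metric_filter, start_time, end_time):
    """Build a timeSeries.list request URL from the example parameters."""
    params = {
        "filter": metric_filter,
        "interval.startTime": start_time,
        "interval.endTime": end_time,
    }
    # urlencode percent-escapes the quotes, slashes, and colons in the values.
    return f"{BASE_URL}/{name}/timeSeries?{urlencode(params)}"


url = build_list_url(
    name="projects/example-project-id",
    metric_filter='metric.type="dataproc.googleapis.com/cluster/hdfs/storage_capacity"',
    start_time="2018-02-20T00:00:00.000-08:00",
    end_time="2018-02-27T11:54:00.000-08:00",
)
print(url)
```

Building the query string with `urlencode` rather than by hand avoids escaping mistakes in the filter expression, which contains quotes and an equals sign.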
Building a custom Monitoring dashboard
You can build a custom Monitoring dashboard that displays charts of selected Cloud Dataproc cluster metrics.
Select + CREATE DASHBOARD from the Monitoring Dashboards Overview page. Provide a name for the dashboard, then click Add Chart in the upper-right menu to open the Add Chart window. Select "Cloud Dataproc Cluster" as the resource type. Select one or more metrics, set the metric and chart properties, and then save the chart.
You can add additional charts to your dashboard. After you Save the dashboard, its title appears in the Monitoring Dashboards Overview page. Dashboard charts can be viewed, updated, and deleted from the dashboard display page.
Using Monitoring alerts
You can create a Monitoring alert that notifies you when a Dataproc cluster or job metric crosses a specified threshold, for example, when HDFS free capacity is low.
Creating an alert
Open Monitoring Alerting in the Cloud Console. Click + CREATE POLICY to open the Create new alerting policy form. Define an alert by adding alert conditions, policy triggers, notification channels, and documentation.
Select ADD CONDITION to open the alert condition form with the Metric tab selected. Fill in the fields to define an alert condition, then click ADD. For example, an alert condition can trigger when Dataproc cluster HDFS capacity stays below a threshold of 930 GiB (998,579,896,320 bytes) for 1 minute.
After adding the alert condition, complete the alert policy by setting notification channels, policy triggers, documentation, and the alert policy name.
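The threshold arithmetic and the shape of such a policy can be sketched in code. This example converts 930 GiB to bytes and assembles a dictionary modeled on the Monitoring API's AlertPolicy resource. The display names are hypothetical, the filter reuses the HDFS storage capacity metric from the API example on this page, and the sketch assumes the metric is reported in bytes, as the threshold in the text implies. Treat it as an illustration, not a definitive policy.

```python
import json

GIB = 2 ** 30  # bytes in one GiB (binary gigabyte)


def gib_to_bytes(gib):
    """Convert a size in GiB to bytes."""
    return gib * GIB


threshold_bytes = gib_to_bytes(930)

# Policy body modeled on the AlertPolicy resource; names are hypothetical.
policy = {
    "displayName": "Dataproc HDFS capacity low",
    "combiner": "OR",
    "conditions": [
        {
            "displayName": "HDFS capacity below 930 GiB",
            "conditionThreshold": {
                "filter": (
                    'resource.type = "cloud_dataproc_cluster" AND '
                    'metric.type = "dataproc.googleapis.com/cluster/hdfs/storage_capacity"'
                ),
                "comparison": "COMPARISON_LT",  # trigger when value < threshold
                "thresholdValue": threshold_bytes,
                "duration": "60s",  # condition must hold for 1 minute
            },
        }
    ],
}

print(threshold_bytes)
print(json.dumps(policy, indent=2))
```

Notification channels and documentation are omitted here; in the console they are set in the remaining sections of the policy form.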
When an alert is triggered by a metric threshold condition, Monitoring creates an incident and a corresponding event. You can view incidents from the Monitoring Alerting page in the Cloud Console. If you defined a notification mechanism in the alert policy, such as an email or SMS notification, Monitoring also sends a notification of the incident.