Stackdriver Monitoring provides visibility into the performance, uptime, and overall health of cloud-powered applications. Stackdriver collects and ingests metrics, events, and metadata from Cloud Dataproc clusters to generate insights via dashboards and charts. You can use Monitoring to understand the performance and health of your Cloud Dataproc clusters and examine HDFS, YARN, and Cloud Dataproc job and operation metrics.
Cloud Dataproc Cluster resource metrics are automatically enabled on Cloud Dataproc clusters, and you can use Monitoring to view these metrics.
Using Monitoring on Cloud Dataproc clusters
After creating a cluster, go to the Monitoring console to view cluster monitoring data.
When you first access Monitoring, you are asked to create a Stackdriver account and select a project. You can optionally install the Monitoring agent on VMs in your project as an additional setup step. You do not need to install the agent on VMs in Cloud Dataproc clusters because this step is performed for you when you create a Cloud Dataproc cluster.
After setting up the Stackdriver account, the Monitoring console appears.
Select Resources→Metrics Explorer, then click in the "Find resource type and metric" input box to display the resource drop-down list. Select the "Cloud Dataproc Cluster" resource (or type "cloud_dataproc_cluster" in the box).
Click again in the input box, and then select a metric from the drop-down list (for example, "YARN memory size"). Hovering over the metric name displays information about the metric.
You can select filters, group by metric labels, perform aggregations, and select chart viewing options (see the Monitoring documentation).
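The filter, grouping, and aggregation choices made in Metrics Explorer correspond roughly to a filter string and an aggregation specification in the Monitoring API. The following is a minimal sketch of how they might be composed in Python. The helper names `build_filter` and `build_aggregation` are hypothetical, and the `cluster_name` label shown in the usage line is an assumption about the "Cloud Dataproc Cluster" resource's labels.

```python
def build_filter(metric_type, resource_type="cloud_dataproc_cluster",
                 resource_labels=None):
    """Compose a Monitoring filter string (hypothetical helper).

    Clauses are joined with AND; resource labels are optional and are
    emitted in sorted order so the result is deterministic.
    """
    clauses = [
        f'resource.type = "{resource_type}"',
        f'metric.type = "{metric_type}"',
    ]
    for key, value in sorted((resource_labels or {}).items()):
        clauses.append(f'resource.labels.{key} = "{value}"')
    return " AND ".join(clauses)


def build_aggregation(period_seconds=300, aligner="ALIGN_MEAN",
                      reducer=None, group_by=None):
    """Build an aggregation spec as a plain dict (hypothetical helper).

    Group-by fields only apply when a cross-series reducer is given,
    so they are added only in that case.
    """
    aggregation = {
        "alignmentPeriod": f"{period_seconds}s",
        "perSeriesAligner": aligner,
    }
    if reducer:
        aggregation["crossSeriesReducer"] = reducer
        aggregation["groupByFields"] = list(group_by or [])
    return aggregation


# Assumed label name "cluster_name"; the cluster name is a placeholder.
example_filter = build_filter(
    "dataproc.googleapis.com/cluster/hdfs/storage_capacity",
    resource_labels={"cluster_name": "example-cluster"},
)
print(example_filter)
```

Keeping the filter and the aggregation as separate values mirrors the console, where the metric selection and the aggregation options are set independently.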
You can use the Monitoring API to capture and list metrics defined by a filter. Use the Try this API template on the API page to send an API request and display the response.
Example: A templated request with the following Monitoring API parameters returns a JSON response:
- name: projects/example-project-id
- filter: metric.type="dataproc.googleapis.com/cluster/hdfs/storage_capacity"
- interval.endTime: 2018-02-27T11:54:00.000-08:00
- interval.startTime: 2018-02-20T00:00:00.000-08:00
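The parameters listed above can also be assembled into a REST request outside the Try this API template. The sketch below only builds the request URL; it assumes the request targets the Monitoring API v3 `timeSeries.list` method at `monitoring.googleapis.com`, and the function name `time_series_list_url` is hypothetical. Sending the request additionally requires an OAuth 2.0 access token, which is not shown here.

```python
from urllib.parse import urlencode

# Assumed base URL of the Monitoring API v3 REST endpoint.
MONITORING_API = "https://monitoring.googleapis.com/v3"


def time_series_list_url(name, metric_filter, start_time, end_time):
    """Build a timeSeries.list request URL (hypothetical helper).

    `name` is the project resource name, for example
    "projects/example-project-id". Timestamps are RFC 3339 strings.
    """
    params = {
        "filter": metric_filter,
        "interval.startTime": start_time,
        "interval.endTime": end_time,
    }
    # urlencode percent-encodes quotes, slashes, and colons in the values.
    return f"{MONITORING_API}/{name}/timeSeries?{urlencode(params)}"


url = time_series_list_url(
    name="projects/example-project-id",
    metric_filter='metric.type="dataproc.googleapis.com/cluster/hdfs/storage_capacity"',
    start_time="2018-02-20T00:00:00.000-08:00",
    end_time="2018-02-27T11:54:00.000-08:00",
)
print(url)
```

The resulting URL can be issued as a GET request with an `Authorization: Bearer` header by any HTTP client.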
Building a custom Monitoring dashboard
You can build a custom Monitoring dashboard that displays charts of selected Cloud Dataproc cluster metrics.
Select Dashboards→Create Dashboard from the Monitoring console.
An "Untitled Dashboard" opens. Click Add Chart. In the Add Chart window, select "Cloud Dataproc Cluster" as the resource type. Select one or more metrics, then set the metric and chart properties. Confirm or type a new chart title, then Save the chart.
You can add additional charts to your dashboard. After you Save the dashboard, its title appears in the Monitoring Dashboards menu.
Dashboard charts can be viewed, updated, and deleted from the dashboard display page.
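The console steps above build a dashboard interactively. As a rough illustration only, a dashboard with one Cloud Dataproc chart could also be described as JSON. The sketch below assumes the layout used by the Monitoring Dashboards API (`displayName`, `gridLayout`, `xyChart`), which this page does not describe; the helper names `dataproc_chart` and `dataproc_dashboard` are hypothetical.

```python
import json


def dataproc_chart(title, metric_type, period_seconds=300):
    """Describe one line chart for a Cloud Dataproc cluster metric.

    Hypothetical helper; field names follow the assumed Dashboards
    API JSON layout.
    """
    metric_filter = (
        f'metric.type="{metric_type}" AND '
        'resource.type="cloud_dataproc_cluster"'
    )
    return {
        "title": title,
        "xyChart": {
            "dataSets": [
                {
                    "timeSeriesQuery": {
                        "timeSeriesFilter": {
                            "filter": metric_filter,
                            "aggregation": {
                                "alignmentPeriod": f"{period_seconds}s",
                                "perSeriesAligner": "ALIGN_MEAN",
                            },
                        }
                    }
                }
            ]
        },
    }


def dataproc_dashboard(display_name, charts, columns=2):
    """Wrap a list of chart widgets in a grid-layout dashboard."""
    return {
        "displayName": display_name,
        "gridLayout": {"columns": str(columns), "widgets": list(charts)},
    }


dashboard = dataproc_dashboard(
    "Example Dataproc dashboard",
    [dataproc_chart("HDFS storage capacity",
                    "dataproc.googleapis.com/cluster/hdfs/storage_capacity")],
)
print(json.dumps(dashboard, indent=2))
```

Keeping the definition as data makes it easy to review or store alongside cluster configuration, but the field names should be checked against the current API reference before use.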
Using Monitoring alerts
You can create a Monitoring alert that notifies you when a Cloud Dataproc cluster or job metric crosses a specified threshold (for example, when HDFS free capacity is low).
Creating an alert
Select "Alerting→Create a Policy" from the Monitoring console.
From the Create a new alerting policy page, define an alert by adding alert conditions, notification channels, and documentation.
Select "Conditions→+ Add Condition", then from the Select condition type page, select "Metric Threshold/Rate Change/Absence".
In the Add monitoring.v3 Condition page, select the "Cloud Dataproc Cluster" metric and the alert trigger condition, then click "Save Condition".
After setting the alert condition, complete the alert policy by setting notification channels, documentation, and the name for the new alert policy from the Create a new alerting policy page.
When an alert is triggered by a metric threshold condition, Monitoring creates an incident (and a corresponding event). You can review incidents from the Monitoring Alerting→Incidents page. If you defined a notification mechanism in the alert policy, such as an email or SMS notification, Monitoring also sends a notification of the incident.
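The policy created through the console steps above can be pictured as a structured document with conditions and notification channels. The following is a minimal sketch, assuming the Monitoring API v3 alert policy JSON layout (`conditions`, `conditionThreshold`, `combiner`). The helper name `low_capacity_policy`, the threshold value, and the notification channel name are hypothetical, and whether this metric alone represents free capacity (rather than needing an additional label filter) is an assumption to verify.

```python
def low_capacity_policy(display_name, threshold_value,
                        duration_seconds=300, notification_channels=None):
    """Describe a metric-threshold alert policy as a dict.

    Hypothetical helper. The condition fires when the aligned metric
    stays below `threshold_value` for `duration_seconds`.
    """
    condition = {
        "displayName": "HDFS capacity below threshold",
        "conditionThreshold": {
            "filter": (
                'metric.type="dataproc.googleapis.com/cluster/hdfs/storage_capacity" '
                'AND resource.type="cloud_dataproc_cluster"'
            ),
            "comparison": "COMPARISON_LT",
            "thresholdValue": threshold_value,
            "duration": f"{duration_seconds}s",
            "aggregations": [
                {"alignmentPeriod": "300s", "perSeriesAligner": "ALIGN_MEAN"}
            ],
        },
    }
    return {
        "displayName": display_name,
        "combiner": "OR",
        "conditions": [condition],
        "notificationChannels": list(notification_channels or []),
    }


# Placeholder threshold and channel name, for illustration only.
policy = low_capacity_policy(
    "Example low HDFS capacity alert",
    threshold_value=100.0,
    notification_channels=[
        "projects/example-project-id/notificationChannels/EXAMPLE_CHANNEL_ID"
    ],
)
print(policy["conditions"][0]["conditionThreshold"]["comparison"])
```

The threshold's unit depends on the metric's own unit, so pick the value after inspecting the metric in Metrics Explorer.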