Cloud Monitoring provides visibility into the performance, uptime, and overall health of cloud-powered applications. Google Cloud's operations suite collects and ingests metrics, events, and metadata from Dataproc clusters, including per-cluster HDFS, YARN, job, and operation metrics, to generate insights via dashboards and charts (see Cloud Monitoring Dataproc metrics).
Use Cloud Monitoring cluster metrics to monitor the performance and health of Dataproc clusters.
See Cloud Monitoring Pricing to understand your costs.
See Monitoring Quotas and limits for information on metric data retention.
Cloud Monitoring cluster metrics
Dataproc cluster resource metrics are automatically enabled on Dataproc clusters. Use Monitoring to view these metrics.
After creating a cluster, go to Monitoring in the Cloud Console to view cluster monitoring data.
If you have never used Cloud Monitoring, then on your first access of Monitoring in the Google Cloud Console, a Workspace is automatically created and your project is associated with that Workspace. Otherwise, if your project isn't associated with a Workspace, then a dialog appears and you can either create a Workspace or add your project to an existing Workspace. We recommend that you create a Workspace. After you make your selection, click Add.
After the Monitoring console appears, you can install the Monitoring agent on VMs in your project as an additional set-up step. You don't need to install the agent on VMs in Dataproc clusters because this step is performed for you when you create a Dataproc cluster.
- Select Metrics Explorer. From the "Find resource type and metric" drop-down list, select the "Cloud Dataproc Cluster" resource (or type "cloud_dataproc_cluster" in the box).
- Click again in the input box, and then select a metric from the drop-down list.
For example, when "YARN memory size" is selected, hovering over the metric name displays information about the metric.
You can select filters, group by metric labels, perform aggregations, and select chart viewing options (see the Monitoring documentation).
You can use the Monitoring API to capture and list metrics defined by a filter expression. Use the Try this API template on the API page to send an API request and display the response.
Example: Here's a snapshot of a templated request and the returned JSON response for the following Monitoring API request parameters:
- name: projects/example-project-id
- filter: metric.type="dataproc.googleapis.com/cluster/hdfs/storage_capacity"
- interval.endTime: 2018-02-27T11:54:00.000-08:00
- interval.startTime: 2018-02-20T00:00:00.000-08:00
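
The parameters above can also be assembled into a request URL outside the Try this API template. The following is a minimal sketch, assuming the request targets the Monitoring API v3 `projects.timeSeries.list` method (which the parameter names suggest) at `https://monitoring.googleapis.com/v3`. The helper name `build_time_series_list_url` is hypothetical, and the sketch only builds the URL: it does not send the request or handle authentication.

```python
from urllib.parse import urlencode

# Assumed base URL of the Cloud Monitoring API v3 REST surface.
BASE_URL = "https://monitoring.googleapis.com/v3"


def build_time_series_list_url(name, metric_filter, start_time, end_time):
    """Return a GET URL for a timeSeries.list request (hypothetical helper).

    Nothing is sent over the network; a real call also needs an
    OAuth 2.0 access token in the Authorization header.
    """
    params = {
        "filter": metric_filter,
        "interval.startTime": start_time,
        "interval.endTime": end_time,
    }
    # urlencode percent-encodes quotes, slashes, and colons in the values.
    return f"{BASE_URL}/{name}/timeSeries?{urlencode(params)}"


url = build_time_series_list_url(
    name="projects/example-project-id",
    metric_filter='metric.type="dataproc.googleapis.com/cluster/hdfs/storage_capacity"',
    start_time="2018-02-20T00:00:00.000-08:00",
    end_time="2018-02-27T11:54:00.000-08:00",
)
print(url)
```

The interval timestamps are RFC 3339 strings, so they can be passed through unchanged once they are percent-encoded.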
Building a custom Monitoring dashboard
You can build a custom Monitoring dashboard that displays charts of selected Cloud Dataproc cluster metrics.
Select + CREATE DASHBOARD from the Monitoring Dashboards Overview page. Provide a name for the dashboard, then click Add Chart in the upper-right menu to open the Add Chart window. Select "Cloud Dataproc Cluster" as the resource type. Select one or more metrics, then set the metric and chart properties. Then click Save to save the chart.
You can add additional charts to your dashboard. After you Save the dashboard, its title appears in the Monitoring Dashboards Overview page. Dashboard charts can be viewed, updated, and deleted from the dashboard display page.
Using Monitoring alerts
You can create a Monitoring alert that notifies you when a Dataproc cluster or job metric crosses a specified threshold, for example, when HDFS free capacity is low.
Creating an alert
Open Monitoring Alerting in the Cloud Console. Click + CREATE POLICY to open the Create new alerting policy form. Define an alert by adding alert conditions, policy triggers, notification channels, and documentation.
Select ADD CONDITION to open the alert condition form with the Metric tab selected. Fill in the fields to define an alert condition, then click ADD. The example alert condition triggers when Dataproc cluster HDFS capacity remains below the specified 930 GiB (binary GB) threshold (998,579,896,320 bytes) for 1 minute.
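
The byte value of the threshold follows from the binary definition of a GiB (2^30 bytes). The short sketch below reproduces that arithmetic and illustrates the "below the threshold for the whole window" idea on sample values. The function names and the sample list are hypothetical; Monitoring evaluates the real condition on the service side.

```python
GIB = 2**30  # bytes in one gibibyte (binary GB)


def gib_to_bytes(gib):
    """Convert a size in GiB to bytes."""
    return gib * GIB


def breaches_threshold(samples, threshold_bytes):
    """Return True when every sample in the window is below the threshold.

    An empty window never breaches. This is only an illustration of the
    condition's logic, not how Monitoring implements it.
    """
    return bool(samples) and all(value < threshold_bytes for value in samples)


THRESHOLD_BYTES = gib_to_bytes(930)
print(THRESHOLD_BYTES)  # 998579896320

# Hypothetical HDFS capacity samples (in bytes) covering a 1-minute window.
window = [gib_to_bytes(925), gib_to_bytes(929)]
print(breaches_threshold(window, THRESHOLD_BYTES))  # True
```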
After adding the alert condition, complete the alert policy by setting notification channels, policy triggers, documentation, and the alert policy name.
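
For readers who prefer to keep alert definitions in code, the same policy can be written down as data. The sketch below builds a Python dictionary shaped like a Monitoring API v3 `AlertPolicy` resource for the HDFS capacity example. The display names, documentation text, and empty notification channel list are placeholder assumptions, and the filter is an assumed equivalent of the console form; verify it against your own metric labels before use. The sketch only prints JSON and does not call the API.

```python
import json

# 930 GiB expressed in bytes (930 * 2^30 = 998,579,896,320).
THRESHOLD_VALUE = 930 * 2**30

# Assumed filter: the HDFS capacity metric on Dataproc cluster resources.
CONDITION_FILTER = (
    'metric.type="dataproc.googleapis.com/cluster/hdfs/storage_capacity" '
    'AND resource.type="cloud_dataproc_cluster"'
)

alert_policy = {
    "displayName": "Dataproc HDFS capacity low",  # placeholder policy name
    "combiner": "OR",  # policy trigger: any condition opens an incident
    "conditions": [
        {
            "displayName": "HDFS capacity below 930 GiB",  # placeholder
            "conditionThreshold": {
                "filter": CONDITION_FILTER,
                "comparison": "COMPARISON_LT",  # value is below the threshold
                "thresholdValue": THRESHOLD_VALUE,
                "duration": "60s",  # must hold for 1 minute
            },
        }
    ],
    "notificationChannels": [],  # add channel resource names here
    "documentation": {
        "content": "HDFS capacity on a Dataproc cluster is low.",  # placeholder
        "mimeType": "text/markdown",
    },
}

policy_json = json.dumps(alert_policy, indent=2)
print(policy_json)
```

Keeping the policy as plain data makes it easy to review the condition, trigger, notification channels, and documentation together before creating the policy.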
When an alert is triggered by a metric threshold condition, Monitoring creates an incident and a corresponding event. You can view incidents from the Monitoring Alerting page in the Cloud Console. If you defined a notification mechanism in the alert policy, such as an email or SMS notification, Monitoring also sends a notification of the incident.
- See the Cloud Monitoring documentation