Cloud Monitoring

Cloud Monitoring provides visibility into the performance, uptime, and overall health of cloud-powered applications. Google Cloud's operations suite collects and ingests metrics, events, and metadata from Dataproc clusters to generate insights via dashboards and charts.

Use Cloud Monitoring cluster metrics to monitor the performance and health of Dataproc clusters.

Cloud Monitoring cluster metrics

Dataproc cluster resource metrics are automatically enabled on Dataproc clusters. Use Monitoring to view these metrics.

You can access Monitoring from the Google Cloud Console or by using the Monitoring API.

Console

  1. After creating a cluster, go to Monitoring in the Cloud Console to view cluster monitoring data.

    When you first access Monitoring, it creates a Workspace and associates your Google Cloud project with that Workspace. If you've never used Monitoring, this process is automatic. If you have used Monitoring, then the Add your project to a Workspace dialog is displayed. To create a new Workspace, from the New Workspace list, select your Google Cloud project, and then click Add.

    After setting up the Workspace, the Monitoring console appears. At this point, you can install the Monitoring agent on VMs in your project as an additional set-up step. You don't need to install the agent on VMs in Dataproc clusters because this step is performed for you when you create a Dataproc cluster.

  2. Select Metrics Explorer. From the "Find resource type and metric" drop-down list, select the "Cloud Dataproc Cluster" resource (or type "cloud_dataproc_cluster" in the box).
  3. Click again in the input box, and then select a metric from the drop-down list, for example, "YARN memory size". Hovering over the metric name displays information about the metric.
  4. You can select filters, group by metric labels, perform aggregations, and select chart viewing options (see the Monitoring documentation).

API

You can use the Monitoring timeSeries.list API to capture and list metrics defined by a filter expression. Use the Try this API template on the API page to send an API request and display the response.

Example: A templated request that uses the following Monitoring timeSeries.list parameters returns a JSON response containing the matching time series:

  • name: projects/example-project-id
  • filter: metric.type="dataproc.googleapis.com/cluster/hdfs/storage_capacity"
  • interval.endTime: 2018-02-27T11:54:00.000-08:00
  • interval.startTime: 2018-02-20T00:00:00.000-08:00
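
As a rough sketch, the parameters listed above can be assembled into a timeSeries.list request URL. The helper name `build_time_series_list_url` is hypothetical, and the sketch only builds the URL; it does not send the request, which would also need an authorized credential (not covered here). It assumes the v3 REST endpoint of the Monitoring API.

```python
from urllib.parse import urlencode

# Assumed base URL of the Monitoring API v3 REST endpoint.
API_ROOT = "https://monitoring.googleapis.com/v3"


def build_time_series_list_url(name, metric_filter, start_time, end_time):
    """Return a timeSeries.list URL for the given project and filter.

    Hypothetical helper for illustration; it only formats the query string.
    """
    params = {
        "filter": metric_filter,
        "interval.startTime": start_time,
        "interval.endTime": end_time,
    }
    # urlencode percent-escapes the quotes, slashes, and colons in the values.
    return f"{API_ROOT}/{name}/timeSeries?{urlencode(params)}"


url = build_time_series_list_url(
    name="projects/example-project-id",
    metric_filter='metric.type="dataproc.googleapis.com/cluster/hdfs/storage_capacity"',
    start_time="2018-02-20T00:00:00.000-08:00",
    end_time="2018-02-27T11:54:00.000-08:00",
)
print(url)
```

The interval timestamps are passed as RFC 3339 strings, matching the format shown in the parameter list.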

Building a custom Monitoring dashboard

You can build a custom Monitoring dashboard that displays charts of selected Cloud Dataproc cluster metrics.

  1. Select + CREATE DASHBOARD from the Monitoring Dashboards Overview page. Provide a name for the dashboard, then click Add Chart in the upper-right menu to open the Add Chart window. Select "Cloud Dataproc Cluster" as the resource type. Select one or more metrics, set the metric and chart properties, and then Save the chart.

  2. You can add additional charts to your dashboard. After you Save the dashboard, its title appears in the Monitoring Dashboards Overview page. Dashboard charts can be viewed, updated, and deleted from the dashboard display page.

Using Monitoring alerts

You can create a Monitoring alert that notifies you when a Dataproc cluster or job metric crosses a specified threshold, for example, when HDFS free capacity is low.

Creating an alert

  1. Open Monitoring Alerting in the Cloud Console. Click + CREATE POLICY to open the Create new alerting policy form. Define an alert by adding alert conditions, policy triggers, notification channels, and documentation.

  2. Select ADD CONDITION to open the alert condition form with the Metric tab selected. Fill in the fields to define an alert condition, then click ADD. For example, an alert condition can trigger when Dataproc cluster HDFS capacity stays below a threshold of 930 GiB (binary GB, or 998,579,896,320 bytes) for 1 minute.

  3. After adding the alert condition, complete the alert policy by setting notification channels, policy triggers, documentation, and the alert policy name.
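
The byte value in the example threshold follows from 1 GiB = 2^30 bytes. The sketch below checks that arithmetic and illustrates what "below the threshold for 1 minute" means on a handful of made-up samples. The function names and sample values are hypothetical, and the window check is only an illustration of the idea, not a description of how Monitoring evaluates alert conditions.

```python
GIB = 2**30  # bytes in one GiB (binary gigabyte)


def gib_to_bytes(gib):
    """Convert a size in GiB to bytes."""
    return gib * GIB


def stays_below(samples, threshold):
    """Return True if every sample in the window is below the threshold."""
    return len(samples) > 0 and all(value < threshold for value in samples)


threshold_bytes = gib_to_bytes(930)
print(f"{threshold_bytes:,}")  # 998,579,896,320

# Hypothetical capacity readings (in bytes) covering a 1-minute window.
window_low = [gib_to_bytes(900), gib_to_bytes(910), gib_to_bytes(925)]
window_mixed = [gib_to_bytes(900), gib_to_bytes(940)]

print(stays_below(window_low, threshold_bytes))    # True
print(stays_below(window_mixed, threshold_bytes))  # False
```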

Viewing alerts

When an alert is triggered by a metric threshold condition, Monitoring creates an incident and a corresponding event. You can view incidents from the Monitoring Alerting page in the Cloud Console. If you defined a notification mechanism in the alert policy, such as an email or SMS notification, Monitoring also sends a notification of the incident.

What's next