Custom metrics with OpenCensus

Stackdriver Monitoring automatically collects more than a thousand built-in metrics from a lengthy list of monitored resources. But those metrics cannot capture application-specific data or client-side system data. The built-in metrics can give you information on backend latency or disk usage, but they can't tell you how many background routines your application spawned.

Application-specific metrics are metrics that you define and collect to capture information the built-in Stackdriver Monitoring metrics cannot. You capture such metrics by using an API provided by a library to instrument your code, and then you send the metrics to a backend application like Stackdriver Monitoring.

In Stackdriver Monitoring, application-specific metrics are typically called “custom metrics” or “user-defined metrics”. The terms are interchangeable.

As far as Stackdriver Monitoring is concerned, custom metrics can be used like the built-in metrics. You can chart them, set alerts on them, and otherwise monitor them. The difference is that you define the metrics, write data to them, and can delete them. You can't do any of that with the built-in metrics.

There are many ways to capture custom metrics, including using the native Stackdriver Monitoring API. Stackdriver recommends that you use OpenCensus to instrument your code for collecting custom metrics.

What is OpenCensus?

OpenCensus is a free, open-source project whose libraries:

  • Provide vendor-agnostic support for the collection of metric and trace data across a variety of languages.
  • Can export the collected data to a variety of backend applications, including Stackdriver.

For the current list of supported languages, see Language Support. For the current list of backend applications for which exporters are available, see Exporters.

Why OpenCensus?

Although Stackdriver Monitoring provides an API that supports defining and collecting custom metrics, it is a low-level, proprietary API. OpenCensus provides a much more idiomatic API, along with an exporter that sends your metric data to Stackdriver Monitoring through the Monitoring API for you.

Additionally, OpenCensus is an open-source project. You can export the collected data using a vendor-agnostic library rather than a proprietary library.

OpenCensus also has good support for application tracing; see OpenCensus Tracing for a general overview. Stackdriver recommends using OpenCensus for trace instrumentation. You can use a single distribution of libraries to collect both metric and trace data from your services. For information about using OpenCensus with Stackdriver Trace, see Client Libraries for Trace.

Before you begin

To use Stackdriver Monitoring, you must have a GCP project with billing enabled. The project must also be associated with a Stackdriver Workspace. Stackdriver Monitoring uses Workspaces to organize monitored GCP projects.

If you do not have a GCP project, do the following:

  1. Sign in to your Google Account.

    If you don't already have one, sign up for a new account.

  2. Select or create a GCP project.

    Go to the Manage resources page

  3. Make sure that billing is enabled for your project.

    Learn how to enable billing

To associate your project with a Workspace, see Workspaces.

Custom metrics are a chargeable feature of Stackdriver Monitoring, so there could be costs associated with the ingestion of your metrics. For more information on pricing, see Stackdriver Pricing.

Installing OpenCensus

To use OpenCensus, you must make the metrics libraries and the Stackdriver exporter available to your application.


Java

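For Maven, add the following to the dependencies element in your pom.xml file. The ${opencensus.version} placeholder is a Maven property that you define with the OpenCensus release you want to use: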
<dependency>
  <groupId>io.opencensus</groupId>
  <artifactId>opencensus-api</artifactId>
  <version>${opencensus.version}</version>
</dependency>
<dependency>
  <groupId>io.opencensus</groupId>
  <artifactId>opencensus-impl</artifactId>
  <version>${opencensus.version}</version>
</dependency>
<dependency>
  <groupId>io.opencensus</groupId>
  <artifactId>opencensus-exporter-stats-stackdriver</artifactId>
  <version>${opencensus.version}</version>
</dependency>

Using OpenCensus for metrics

Instrumenting your code to use OpenCensus for metrics involves three general steps:

  1. Importing the OpenCensus stats and OpenCensus Stackdriver exporter packages.
  2. Initializing the Stackdriver exporter.
  3. Using the OpenCensus API to instrument your code.

A basic example

The following minimal program illustrates these steps. It runs a loop and collects latency measures, and when the loop finishes, it exports the stats to Stackdriver Monitoring and exits:

Java

import com.google.common.collect.Lists;

import io.opencensus.exporter.stats.stackdriver.StackdriverStatsExporter;
import io.opencensus.stats.Aggregation;
import io.opencensus.stats.BucketBoundaries;
import io.opencensus.stats.Measure.MeasureLong;
import io.opencensus.stats.Stats;
import io.opencensus.stats.StatsRecorder;
import io.opencensus.stats.View;
import io.opencensus.stats.View.Name;
import io.opencensus.stats.ViewManager;

import java.io.IOException;
import java.util.Collections;
import java.util.Random;
import java.util.concurrent.TimeUnit;

public class Quickstart {
  private static final int EXPORT_INTERVAL = 60;
  private static final MeasureLong LATENCY_MS = MeasureLong.create(
      "task_latency",
      "The task latency in milliseconds",
      "ms");
  // Latency in buckets:
  // [>=0ms, >=100ms, >=200ms, >=400ms, >=1s, >=2s, >=4s]
  private static final BucketBoundaries LATENCY_BOUNDARIES = BucketBoundaries.create(
      Lists.newArrayList(0d, 100d, 200d, 400d, 1000d, 2000d, 4000d));
  private static final StatsRecorder STATS_RECORDER = Stats.getStatsRecorder();

  public static void main(String[] args) throws IOException, InterruptedException {
    // Register the view. It is imperative that this step exists,
    // otherwise recorded metrics will be dropped and never exported.
    View view = View.create(
        Name.create("task_latency_distribution"),
        "The distribution of the task latencies.",
        LATENCY_MS,
        Aggregation.Distribution.create(LATENCY_BOUNDARIES),
        Collections.emptyList());

    ViewManager viewManager = Stats.getViewManager();
    viewManager.registerView(view);

    // Enable OpenCensus exporters to export metrics to Stackdriver Monitoring.
    // Exporters use Application Default Credentials to authenticate.
    // See https://developers.google.com/identity/protocols/application-default-credentials
    // for more details.
    StackdriverStatsExporter.createAndRegister();

    // Record 100 fake latency values between 0 and 5 seconds.
    Random rand = new Random();
    for (int i = 0; i < 100; i++) {
      long ms = (long) (TimeUnit.MILLISECONDS.convert(5, TimeUnit.SECONDS) * rand.nextDouble());
      System.out.println(String.format("Latency %d: %d", i, ms));
      STATS_RECORDER.newMeasureMap().put(LATENCY_MS, ms).record();
    }

    // The default export interval is 60 seconds. The thread with the StackdriverStatsExporter must
    // stay alive for at least one full interval after the last measurement is recorded. Otherwise,
    // measurements recorded after the last export risk being lost.

    System.out.println(String.format(
        "Sleeping %d seconds before shutdown to ensure all records are flushed.", EXPORT_INTERVAL));
    Thread.sleep(TimeUnit.MILLISECONDS.convert(EXPORT_INTERVAL, TimeUnit.SECONDS));
  }
}

When this metric data is exported to Stackdriver, you can use it like any other data.

The program creates an OpenCensus view called task_latency_distribution. This string becomes part of the name of the metric when it is exported to Stackdriver Monitoring. See Retrieving metric descriptors to see how the OpenCensus view is realized as a Stackdriver Monitoring metric descriptor.

You can therefore use the view name as a search string when selecting a metric to chart. For example, you can type it into the Find resource type and metric field in Metrics Explorer. The following screenshot shows the result:

Metrics from OpenCensus in Stackdriver Monitoring

Each bar in the heatmap represents one run of the program, and the colored components of each bar represent buckets in the latency distribution. See OpenCensus metrics in Stackdriver for more details about the data behind the chart.

OpenCensus documentation

OpenCensus provides the authoritative reference documentation for its metrics API and for the Stackdriver exporter. The following table provides links to these reference documents:

Language  API Reference Documentation  Exporter Documentation  Quickstart
Java      Java API                     Stats Exporter          Metrics

Mapping the models

The native Stackdriver Monitoring API for custom metrics is supported; using it is described in Using Custom Metrics. In fact, the OpenCensus exporter for Stackdriver uses this API for you.

Even if you don't need to know the specifics of using the Stackdriver Monitoring API, familiarity with its constructs and terminology is useful for understanding how Stackdriver Monitoring represents the metrics. This section provides some of that background.

Once your metrics are ingested into Stackdriver, they are stored within Stackdriver Monitoring constructs. You can, for example, retrieve the metric descriptor — a type from the Monitoring API — of a custom metric. See MetricDescriptor for more information. You encounter these metric descriptors, for example, when creating charts for your data.

Terminology and concepts

The constructs used by the OpenCensus API differ from those used by Stackdriver Monitoring, as does some of the terminology. Where Stackdriver Monitoring refers to “metrics”, OpenCensus sometimes refers to “stats”. For example, the component of OpenCensus that sends metric data to Stackdriver is called the “stats exporter for Stackdriver”.

See OpenCensus Metrics for an overview of the OpenCensus model for metrics.

The data models for OpenCensus stats and Stackdriver Monitoring metrics do not fall into a neat 1:1 mapping. Many of the same concepts exist in each, but they are not directly interchangeable.

  • An OpenCensus view is generally analogous to the MetricDescriptor in the Monitoring API. A view describes how to collect and aggregate individual measurements. All recorded measurements are broken down by tags.

  • An OpenCensus tag is a key-value pair. This corresponds generally to the LabelDescriptor in the Monitoring API. Tags allow you to capture contextual information that can be used to filter and group metrics.

  • An OpenCensus measure describes metric data to be recorded. An OpenCensus aggregation is a function applied to data used to summarize it. These are used in exporting to determine the MetricKind, ValueType, and unit reported in the Stackdriver metric descriptor.

  • An OpenCensus measurement is a data point collected for a measure. Measurements must be aggregated into views. Otherwise, the individual measurements are dropped. This construct is analogous to a Point in the Monitoring API. When measurements are aggregated in views, the aggregated data is stored as view data, analogous to a TimeSeries in the Monitoring API.
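
To make the relationship between measurements, aggregations, and view data more concrete, the following plain-Java sketch shows roughly what a distribution aggregation does with individual measurements. It is not the OpenCensus implementation: the class DistributionSketch and its methods are hypothetical names used only for illustration. The bucket convention is an assumption based on the explicit-bucket layout of the Monitoring API, in which N boundaries produce N+1 buckets and the first bucket holds values below the lowest boundary.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not part of the OpenCensus API. */
final class DistributionSketch {
  private final double[] bounds;      // ascending bucket boundaries
  private final long[] bucketCounts;  // bounds.length + 1 buckets
  private long count;
  private double mean;
  private double sumOfSquaredDeviation;

  DistributionSketch(double... bounds) {
    this.bounds = bounds.clone();
    this.bucketCounts = new long[bounds.length + 1];
  }

  /** Folds one measurement into the aggregated data. */
  void record(double value) {
    bucketCounts[bucketIndex(value)]++;
    // Welford's online update for the mean and the squared deviation.
    count++;
    double delta = value - mean;
    mean += delta / count;
    sumOfSquaredDeviation += delta * (value - mean);
  }

  /**
   * Bucket 0 holds values below bounds[0], bucket i holds values in
   * [bounds[i-1], bounds[i]), and the last bucket holds everything else.
   */
  int bucketIndex(double value) {
    int i = 0;
    while (i < bounds.length && value >= bounds[i]) {
      i++;
    }
    return i;
  }

  long count() { return count; }
  double mean() { return mean; }
  double sumOfSquaredDeviation() { return sumOfSquaredDeviation; }
  long[] bucketCounts() { return bucketCounts.clone(); }

  public static void main(String[] args) {
    // Same boundaries as LATENCY_BOUNDARIES in the basic example.
    DistributionSketch d =
        new DistributionSketch(0d, 100d, 200d, 400d, 1000d, 2000d, 4000d);
    for (double ms : new double[] {50d, 150d, 150d, 4500d}) {
      d.record(ms);
    }
    System.out.println(d.count() + " measurements, mean " + d.mean());
    System.out.println(Arrays.toString(d.bucketCounts()));
  }
}
```

In this model, the individual values passed to record() play the role of measurements, and the counters held by the object play the role of view data. Nothing retains the individual values once they are aggregated, which is why measurements that are not aggregated into a view are lost.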

OpenCensus metrics in Stackdriver

You can examine the exported metrics in Stackdriver Monitoring. The screenshot in A basic example was taken from Metrics Explorer. If you have run the sample program, you can use Metrics Explorer to look at your data:

Go to the Metrics Explorer page

You can supply the name of the OpenCensus view when specifying the metric to restrict the search. See Selecting metrics for more information.

Retrieving metric descriptors

You can retrieve the metric data using the Monitoring API directly. To retrieve the metric data, you need to know the Stackdriver names to which the OpenCensus metrics were exported.

One way to get this information is to retrieve the metric descriptors that were created by the exporter and find the value of the type field. This value incorporates the name of the OpenCensus view from which it was exported. For details on metric descriptors, see MetricDescriptor.

You can see the metric descriptors created for the exported metrics by using the API Explorer (Try this API) widget on the reference page for the metricDescriptors.list method. To retrieve the metrics descriptors for the OpenCensus metrics using this tool:

  1. Enter the name of your project in the name field: projects/[PROJECT_ID]. This document uses a project with the ID a-gcp-project.

  2. Enter a filter in the filter field. The name of the OpenCensus view becomes part of the metric name, so you can use that name to restrict the listing by providing a filter like this:

    metric.type=has_substring("task_latency_distribution")

    There are a lot of metric descriptors in any project. Filtering on a substring from the OpenCensus view's name eliminates most of them.

  3. Click the Execute button.
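
If you prefer to script this lookup rather than use the API Explorer, the same request can be expressed as a REST URL for the metricDescriptors.list method. The following sketch only builds that URL and URL-encodes the filter; it does not send the request or handle authentication. The class and method names are hypothetical, the project ID is the example ID used in this document, and the Charset overload of URLEncoder.encode assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; builds, but does not send, a metricDescriptors.list request. */
final class DescriptorListUrl {
  private static final String ENDPOINT = "https://monitoring.googleapis.com/v3/";

  /** Returns the list URL for descriptors whose type contains the given view name. */
  static String forView(String projectId, String viewName) {
    String filter = "metric.type=has_substring(\"" + viewName + "\")";
    return ENDPOINT + "projects/" + projectId + "/metricDescriptors?filter="
        + URLEncoder.encode(filter, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    System.out.println(forView("a-gcp-project", "task_latency_distribution"));
  }
}
```

An actual call to this URL would also need valid credentials for the project, which this sketch leaves out.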

The following shows the returned metric descriptor:

{
  "metricDescriptors": [
    {
      "name": "projects/a-gcp-project/metricDescriptors/custom.googleapis.com/opencensus/task_latency_distribution",
      "labels": [
        {
          "key": "opencensus_task",
          "description": "Opencensus task identifier"
        }
      ],
      "metricKind": "CUMULATIVE",
      "valueType": "DISTRIBUTION",
      "unit": "ms",
      "description": "The distribution of the task latencies",
      "displayName": "OpenCensus/task_latency_distribution",
      "type": "custom.googleapis.com/opencensus/task_latency_distribution"
    }
  ]
}

This line in the metric descriptor tells you the name of the metric type in Stackdriver Monitoring:

"type": "custom.googleapis.com/opencensus/task_latency_distribution"

With this information, you can then manually retrieve the data associated with this metric type. This is also the data that appears on a chart for this metric.
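
The descriptor above suggests a simple naming pattern: the exporter prefixes the view name with custom.googleapis.com/opencensus/ to form the metric type. The following sketch encodes that pattern so you can derive the metric type, the descriptor name, and a matching filter from a view name. The prefix is taken from the example output in this document and is assumed to be the exporter default; an exporter that is configured differently might produce other names. ViewNames is a hypothetical helper, not part of any library.

```java
/** Hypothetical helper that mirrors the naming seen in the example descriptor. */
final class ViewNames {
  // Assumption: prefix copied from the "type" field of the example descriptor.
  static final String ASSUMED_PREFIX = "custom.googleapis.com/opencensus/";

  /** Metric type under which a view with this name is expected to appear. */
  static String metricType(String viewName) {
    return ASSUMED_PREFIX + viewName;
  }

  /** Value of the descriptor's "name" field for the given project and view. */
  static String descriptorName(String projectId, String viewName) {
    return "projects/" + projectId + "/metricDescriptors/" + metricType(viewName);
  }

  /** Filter expression that selects exactly this metric type. */
  static String typeFilter(String viewName) {
    return "metric.type=\"" + metricType(viewName) + "\"";
  }

  public static void main(String[] args) {
    String view = "task_latency_distribution";
    System.out.println(metricType(view));
    System.out.println(descriptorName("a-gcp-project", view));
    System.out.println(typeFilter(view));
  }
}
```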

Retrieving metric data

To manually retrieve time-series data from a metric type, you can use the Try this API tool on the reference page for the timeSeries.list method:

  1. Enter the name of your project in the name field: projects/[PROJECT_ID]
  2. Enter a filter in the filter field for the desired metric type: metric.type="custom.googleapis.com/opencensus/task_latency_distribution"
    • The key, metric.type, is a field in a type embedded in a time series. See TimeSeries for details.
    • The value is the type value extracted from the metric descriptor in Retrieving metric descriptors.
  3. Enter time boundaries for the retrieval by specifying values for these fields:
    • interval.endTime as a timestamp, for example: 2018-10-11T15:48:38-04:00
    • interval.startTime (must be earlier than interval.endTime)
  4. Click the Execute button.
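
The steps above can also be sketched as a REST URL for the timeSeries.list method. The following example builds the URL from a metric type, an end time, and a lookback duration, which guarantees that interval.startTime is earlier than interval.endTime. It does not send the request or handle authentication. The class name, method name, and the lookback parameter are hypothetical choices made for this illustration, and the Charset overload of URLEncoder.encode assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.time.OffsetDateTime;

/** Hypothetical helper; builds, but does not send, a timeSeries.list request. */
final class TimeSeriesListUrl {
  private static final String ENDPOINT = "https://monitoring.googleapis.com/v3/";

  static String build(
      String projectId, String metricType, OffsetDateTime end, Duration lookback) {
    if (lookback.isZero() || lookback.isNegative()) {
      throw new IllegalArgumentException(
          "lookback must be positive so that startTime is earlier than endTime");
    }
    OffsetDateTime start = end.minus(lookback);
    String filter = "metric.type=\"" + metricType + "\"";
    // Timestamps are normalized to UTC before they are encoded.
    return ENDPOINT + "projects/" + projectId + "/timeSeries"
        + "?filter=" + encode(filter)
        + "&interval.startTime=" + encode(start.toInstant().toString())
        + "&interval.endTime=" + encode(end.toInstant().toString());
  }

  private static String encode(String s) {
    return URLEncoder.encode(s, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    OffsetDateTime end = OffsetDateTime.parse("2018-10-11T15:48:38-04:00");
    System.out.println(build(
        "a-gcp-project",
        "custom.googleapis.com/opencensus/task_latency_distribution",
        end,
        Duration.ofHours(1)));
  }
}
```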

The following shows the result of one such retrieval:

{
  "timeSeries": [
    {
      "metric": {
        "labels": {
          "opencensus_task": "java-21368@docbuild"
        },
        "type": "custom.googleapis.com/opencensus/task_latency_distribution"
      },
      "resource": {
        "type": "global",
        "labels": {
          "project_id": "a-gcp-project"
        }
      },
      "metricKind": "CUMULATIVE",
      "valueType": "DISTRIBUTION",
      "points": [
        {
          "interval": {
            "startTime": "2018-10-21T18:02:28.171270Z",
            "endTime": "2018-10-21T18:02:28.171366Z"
          },
          "value": {
            "distributionValue": {
              "count": "60",
              "mean": 2501.9301065269069,
              "sumOfSquaredDeviation": 116606963.10749318,
              "bucketOptions": {
                "explicitBuckets": {
                  "bounds": [
                    0,
                    100,
                    200,
                    400,
                    1000,
                    2000,
                    4000
                  ]
                }
              },
              "bucketCounts": [
                "0",
                "0",
                "2",
                "3",
                "4",
                "15",
                "25",
                "11"
              ]
            }
          }
        }
      ]
    },
    [ ... data from additional program runs deleted ...]
  ]
}

The data returned here includes:

  • Information about the monitored resource on which the data was collected.
  • Description of the kind of metric and the type of the values.
  • The actual data points collected within the time interval requested.
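
When you read a distributionValue like the one above, it can help to check how its fields relate to each other. The following sketch, using the values from the example response, checks that the bucket counts add up to the count field and that there are no more buckets than the bounds allow, and it derives a standard deviation from sumOfSquaredDeviation. DistributionCheck is a hypothetical helper. The sketch assumes that N bounds define N+1 buckets and that sumOfSquaredDeviation is the sum of squared differences from the mean, so dividing by the count gives a population variance.

```java
import java.util.stream.LongStream;

/** Hypothetical helper for sanity-checking a returned distribution value. */
final class DistributionCheck {

  /** True when the bucket counts fit the bounds and add up to the total count. */
  static boolean isConsistent(long count, double[] bounds, long[] bucketCounts) {
    // N bounds define N + 1 buckets: one underflow, N - 1 finite, one overflow.
    return bucketCounts.length <= bounds.length + 1
        && LongStream.of(bucketCounts).sum() == count;
  }

  /** Population standard deviation; assumes count is greater than zero. */
  static double stdDev(long count, double sumOfSquaredDeviation) {
    return Math.sqrt(sumOfSquaredDeviation / count);
  }

  public static void main(String[] args) {
    // Values copied from the example response in this document.
    long count = 60;
    double[] bounds = {0, 100, 200, 400, 1000, 2000, 4000};
    long[] bucketCounts = {0, 0, 2, 3, 4, 15, 25, 11};
    System.out.println("consistent: " + isConsistent(count, bounds, bucketCounts));
    System.out.println("std dev (ms): " + stdDev(count, 116606963.10749318));
  }
}
```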