Chart distribution metrics

This document describes how you can create and interpret a chart that displays metric data of the `Distribution` value type. This value type is used by services when the individual measurements are too numerous to collect, but statistical information, such as averages or percentiles, about those measurements is valuable. For example, when an application relies on HTTP traffic, you can use a distribution-valued metric that captures HTTP response latency to evaluate how quickly HTTP requests complete.

To illustrate how a histogram is created, consider a service that measures the HTTP latency of requests and that reports this data by using a metric with a distribution-value type. The data is reported every minute. The service defines ranges of values for the metric, called buckets, and records the count of measured values that fall into each bucket. For example, when an HTTP request completes, the service increments the count in the bucket whose range includes the request's latency value. These counts create a histogram of values for that minute.

Assume that the latencies measured in a one-minute interval are 5, 1, 3, 5, 6, 10, and 14. If the buckets are [0, 4), [4, 8), [8, 12), and [12, 16), then the histogram of this data is [2, 3, 1, 1]. The following table shows how individual measurements affect the count for each bucket:

| Bucket | Latency measurements | Number of values in the bucket |
|---|---|---|
| [12, 16) | 14 | 1 |
| [8, 12) | 10 | 1 |
| [4, 8) | 5, 5, 6 | 3 |
| [0, 4) | 1, 3 | 2 |
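The bucketing described above can be sketched in a few lines of Python. This is a minimal illustration, not the service's implementation: the function name `build_histogram` and the `BOUNDS` list are hypothetical, and the sketch rejects out-of-range values instead of modeling underflow or overflow buckets.

```python
from bisect import bisect_right

# Bucket boundaries from the example: [0, 4), [4, 8), [8, 12), [12, 16).
BOUNDS = [0, 4, 8, 12, 16]


def build_histogram(measurements, bounds=BOUNDS):
    """Count how many measurements fall into each half-open bucket."""
    counts = [0] * (len(bounds) - 1)
    for value in measurements:
        if not bounds[0] <= value < bounds[-1]:
            raise ValueError(f"{value} is outside the bucket range")
        # bisect_right returns the index of the first bound greater than
        # value, so subtracting 1 gives the bucket whose lower bound is
        # less than or equal to value.
        counts[bisect_right(bounds, value) - 1] += 1
    return counts


latencies = [5, 1, 3, 5, 6, 10, 14]
histogram = build_histogram(latencies)
print(histogram)  # [2, 3, 1, 1]
```

The counts are listed from the lowest bucket to the highest, which matches the order used for histograms in this document.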

When this data is written to the time series, a `Point` object is created. For metrics with a distribution value, that object includes the histogram of values. For this sampling period, the `Point` contains [2, 3, 1, 1]. The individual measurements aren't written to the time series.

The following table illustrates a sequence of histograms. Each column in the table represents the latency data for a one-minute period:

| Bucket | Histogram for 1:00 | Histogram for 1:01 | Histogram for 1:02 | Histogram for 1:03 |
|---|---|---|---|---|
| [12, 16) | 1 | 6 | 0 | 1 |
| [8, 12) | 1 | 0 | 2 | 2 |
| [4, 8) | 3 | 1 | 1 | 8 |
| [0, 4) | 2 | 6 | 10 | 3 |

Heatmap charts

Heatmap charts are designed to display a single time series with distribution values. For these charts, the X-axis represents time, the Y-axis represents the buckets, and color represents the value. A brighter color indicates a higher value. For example, dark areas of the heatmap indicate lower bucket counts than yellow or white areas.

The following figure is one representation of a heatmap for the previous example:

In the previous figure, the heatmap uses black to represent the smallest bucket count, 0, and yellow to represent the largest bucket count, 10. Reds and oranges represent values between these two extremes.
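To make the mapping from bucket counts to color concrete, the following sketch scales each count in the example to an intensity between 0.0 (the darkest color) and 1.0 (the brightest). The linear scaling and the names `HISTOGRAMS` and `heatmap_intensities` are assumptions made for illustration; the color scale that a real chart applies might differ.

```python
# Histograms from the example, one per minute. Counts are ordered from
# the lowest bucket to the highest: [0, 4), [4, 8), [8, 12), [12, 16).
HISTOGRAMS = {
    "1:00": [2, 3, 1, 1],
    "1:01": [6, 1, 0, 6],
    "1:02": [10, 1, 2, 0],
    "1:03": [3, 8, 2, 1],
}


def heatmap_intensities(histograms):
    """Return rows of intensities in [0.0, 1.0], highest bucket first."""
    columns = list(histograms.values())
    largest = max(max(column) for column in columns)
    rows = []
    # Walk the buckets from highest to lowest so that the first row is
    # the top of the Y-axis, as in a heatmap.
    for bucket in reversed(range(len(columns[0]))):
        rows.append(
            [column[bucket] / largest if largest else 0.0 for column in columns]
        )
    return rows


grid = heatmap_intensities(HISTOGRAMS)
for row in grid:
    print(" ".join(f"{cell:.1f}" for cell in row))
```

Each printed row corresponds to one bucket and each column to one minute, so the grid has the same layout as the heatmap.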

Because heatmap charts can display only a single time series, when you have multiple time series, you must set the aggregation options to combine them into a single time series. For example, to use Metrics Explorer to create a heatmap chart that shows the sum of the time series, do the following:

1. In the Cloud Console, select Monitoring.
2. In the navigation pane, select Metrics Explorer.
3. Select a distribution-valued metric and a resource. For example, select the RTT Latencies metric and the VM instance resource.
4. In the Metrics Explorer toolbar, click Line chart, and then select Heatmap.
5. Use the configuration pane to combine the time series into a single time series:

• Ensure that the Group by field is empty.
• Select sum as the Aggregator.

Line and bar charts

Line charts, stacked bar charts, and stacked line charts, which are designed to display scalar data, can't display distribution values. To display a metric with a distribution value with one of these chart types, you must convert the histogram values into scalar values. For example, you could compute the sum of the values in the histogram or you could select a percentile.

For example, each row in the following table includes a timestamp, a histogram, and a sum of histogram values:

| Time | Histogram | Sum of histogram values |
|---|---|---|
| 1:00 | [2, 3, 1, 1] | 7 |
| 1:01 | [6, 1, 0, 6] | 13 |
| 1:02 | [10, 1, 2, 0] | 13 |
| 1:03 | [3, 8, 2, 1] | 14 |

You can display the sum of histogram values from the preceding table with an X-Y plot.
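As a small sketch of this conversion, the following Python reduces each histogram to the sum of its bucket counts, which yields one scalar per timestamp. The variable names are hypothetical.

```python
# (timestamp, histogram) pairs from the example.
distribution_series = [
    ("1:00", [2, 3, 1, 1]),
    ("1:01", [6, 1, 0, 6]),
    ("1:02", [10, 1, 2, 0]),
    ("1:03", [3, 8, 2, 1]),
]

# Summing the bucket counts gives the number of measurements in each minute.
scalar_series = [
    (timestamp, sum(histogram)) for timestamp, histogram in distribution_series
]

for timestamp, total in scalar_series:
    print(timestamp, total)
```

The resulting (timestamp, total) pairs are the points that a line chart would plot.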

For a metric that stores HTTP latency information, the sum is a meaningful measure, because it indirectly represents the number of completed HTTP requests. The data from the preceding table shows that the rate of HTTP request completion is low but relatively constant.

Line charts display only time series with scalar values. To display a distribution-valued metric on a line chart, use the aggregation fields to convert the distribution values into scalar values. For example, to use Metrics Explorer to display the 99th percentile of a distribution-valued metric, do the following:

1. In the Cloud Console, select Monitoring.
2. In the navigation pane, select Metrics Explorer.
3. Select a distribution-valued metric and a resource. For example, select the RTT Latencies metric and the VM instance resource.
4. Ensure that the Metrics Explorer toolbar shows Line chart.
5. In the configuration pane, select 99th percentile for the Aggregator.

Aggregation and distribution metrics

Aggregation is the process of regularizing points within a time series and of combining multiple time series. Aggregation is the same for distribution-valued metrics as it is for metrics that have a value type of integer or double. However, the chart type enforces some requirements on the choices used for aligning and grouping time series.

Heatmap charts

Heatmap charts display one distribution-valued time series. When you have multiple time series, you must use aligners and grouping functions to create a single time series.

Select a sum or delta aligner when a chart displays a heatmap. These functions combine, at the bucket level, all samples for a single time series that are in the same alignment period, and the result is a distribution value. For example, if two adjacent samples of a time series are [2, 3, 1, 1] and [2, 5, 4, 1], then the sum aligner produces [4, 8, 5, 2].
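The following sketch shows the idea behind a sum aligner: histograms whose timestamps fall in the same alignment period are added bucket by bucket. It assumes timestamps in seconds and periods that start at multiples of the period length. The function name `sum_align` is hypothetical, and the way Monitoring defines period boundaries might differ.

```python
def sum_align(points, period_seconds):
    """Combine (timestamp, histogram) points that share an alignment period.

    Returns a dict that maps each period's start time to the bucket-level
    sum of the histograms in that period.
    """
    aligned = {}
    for timestamp, histogram in points:
        period_start = timestamp - timestamp % period_seconds
        if period_start in aligned:
            aligned[period_start] = [
                a + b for a, b in zip(aligned[period_start], histogram)
            ]
        else:
            aligned[period_start] = list(histogram)
    return aligned


# Two adjacent one-minute samples, aligned to a two-minute period.
points = [(0, [2, 3, 1, 1]), (60, [2, 5, 4, 1])]
print(sum_align(points, 120))  # {0: [4, 8, 5, 2]}
```

The output is still a histogram, which is why this kind of aligner is suitable for a heatmap.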

The grouping function defines how different time series are combined. This function is sometimes called an aggregator or a reducer. For heatmaps, set the grouping function to the sum function. The sum function adds the values of the same buckets across all histograms, resulting in a new histogram. For example, the sum of the value [2, 3, 1, 1] from timeseries-A and the value [1, 5, 2, 2] from timeseries-B is [3, 8, 3, 3].
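As a sketch of the sum grouping function, the following Python adds the counts of matching buckets across histograms that come from different time series. The function name `reduce_sum` is hypothetical, and the sketch assumes that all histograms use the same bucket layout.

```python
def reduce_sum(*histograms):
    """Add the counts of matching buckets across several histograms."""
    if len({len(histogram) for histogram in histograms}) != 1:
        raise ValueError("histograms must have the same number of buckets")
    # zip(*histograms) groups the counts that belong to the same bucket.
    return [sum(counts) for counts in zip(*histograms)]


timeseries_a = [2, 3, 1, 1]
timeseries_b = [1, 5, 2, 2]
print(reduce_sum(timeseries_a, timeseries_b))  # [3, 8, 3, 3]
```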

Line charts

Line charts display only scalar-valued time series. To display a distribution-valued metric on a line chart, use the aligner or the grouping function to convert the distribution values into scalar values:

• Percentile aligners convert a distribution value into a scalar value. With these aligners, grouping time series is optional.

• Sum and delta aligners don't convert a distribution value into a scalar value. When you use these aligners, select a grouping function that converts distribution values into scalar values.
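To show how a percentile can turn a histogram into a single number, the following sketch estimates a percentile by linear interpolation within the bucket that contains the target rank. The interpolation method and the function name `estimate_percentile` are assumptions for illustration; the calculation that Monitoring performs might differ.

```python
def estimate_percentile(counts, bounds, percentile):
    """Estimate a percentile from bucket counts by linear interpolation.

    counts[i] is the number of values in [bounds[i], bounds[i + 1]).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("the histogram is empty")
    target = percentile / 100 * total  # rank of the requested percentile
    cumulative = 0
    for i, count in enumerate(counts):
        if count and cumulative + count >= target:
            lower, upper = bounds[i], bounds[i + 1]
            # Assume that values are spread evenly inside the bucket.
            fraction = (target - cumulative) / count
            return lower + fraction * (upper - lower)
        cumulative += count
    return bounds[-1]


bounds = [0, 4, 8, 12, 16]
print(estimate_percentile([2, 3, 1, 1], bounds, 50))  # 6.0
print(estimate_percentile([2, 3, 1, 1], bounds, 99))
```

Because the individual measurements aren't stored, any percentile computed from a histogram is an estimate whose precision depends on the bucket widths.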

For example, to configure a line chart on a dashboard to display the 99th percentile of every time series for a distribution-valued metric, do the following:

1. In the Cloud Console, select Monitoring.
2. In the navigation pane, select Dashboards, then select the dashboard that you want to view or edit.
3. If Editing isn't shown, then click Viewing and select Switch to Editing mode.
4. Add a line chart to your dashboard by selecting the line-chart widget from the Chart library.
5. Modify the line chart configuration to display a distribution-valued metric for a specific resource. For example, select the RTT Latencies metric and the VM instance resource.
6. Configure the chart to use a percentile aligner:

• Basic tab: Clear Grouped and select 99th percentile.
• Advanced tab: Select Percentile in the preprocessing step and use the menu to select the 99th percentile. Also, ensure the group-by field is empty and the group-by function is set to none.

The resulting chart can display multiple lines, one for each time series.
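If you configure charts programmatically, these settings roughly correspond to an aggregation object in the Cloud Monitoring API. The following sketch builds such an object as a Python dictionary and prints it as JSON. The 60-second alignment period is an assumption, and the exact structure that a dashboard definition expects around this object isn't shown here.

```python
import json

# A per-series percentile aligner with no cross-series reducer, so each
# time series produces its own line. The alignment period is an assumed
# value chosen for illustration.
aggregation = {
    "alignmentPeriod": "60s",
    "perSeriesAligner": "ALIGN_PERCENTILE_99",
    "crossSeriesReducer": "REDUCE_NONE",
    "groupByFields": [],
}

print(json.dumps(aggregation, indent=2))
```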

For another example, suppose you want to display a single time series that is the 99th percentile of the time series for a distribution-valued metric. To configure this chart, replace the final step in the previous sequence with the following steps, which specify a sum aligner and set a grouping function: