Charting distribution metrics

This document describes how you can create and interpret a chart that displays metric data of the Distribution value type. This value type is used by services when the individual measurements are too numerous to collect, but statistical information, such as averages or percentiles, about those measurements is valuable. For example, if your application relies on HTTP traffic, then you can use a distribution-valued metric that captures HTTP response latency to evaluate how quickly HTTP requests complete.

To illustrate how a histogram is created, consider a service that measures the HTTP latency of requests and that reports this data by using a metric with a distribution-value type. The data is reported every minute. The service defines ranges of values for the metric, called buckets, and records the count of measured values that fall into each bucket. For example, when an HTTP request completes, the service increments the count in the bucket whose range includes the request's latency value. These counts create a histogram of values for that minute.

Assume that the latencies measured in a one-minute interval are 5, 1, 3, 5, 6, 10, and 14. If the buckets are [0, 4), [4, 8), [8, 12), and [12, 16), then the histogram of this data is [2, 3, 1, 1]. The following table shows how individual measurements affect the count for each bucket:

Bucket      Latency measurements    Number of values in the bucket
[12, 16)    14                      1
[8, 12)     10                      1
[4, 8)      5, 5, 6                 3
[0, 4)      1, 3                    2

When this data is written to the time series, a Point object is created. For metrics with a distribution value, that object includes the histogram of values. For this sampling period, the Point contains [2, 3, 1, 1]. The individual measurements aren't written to the time series.
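
The bucketing described above can be sketched in a few lines of Python. This is only an illustration of the counting step, not how any particular service implements it: the `build_histogram` function and the `BOUNDS` list are hypothetical names, and rejecting out-of-range values is a simplifying assumption.

```python
from bisect import bisect_right

# Boundaries of the buckets [0, 4), [4, 8), [8, 12), and [12, 16).
BOUNDS = [0, 4, 8, 12, 16]

def build_histogram(measurements, bounds=BOUNDS):
    """Count how many measurements fall into each half-open bucket [lo, hi)."""
    counts = [0] * (len(bounds) - 1)
    for value in measurements:
        if not bounds[0] <= value < bounds[-1]:
            # Assumption for this sketch: out-of-range values are rejected.
            raise ValueError(f"{value} is outside the bucket range")
        # bisect_right returns the index of the first boundary greater than
        # value, so subtracting 1 gives the bucket that contains the value.
        counts[bisect_right(bounds, value) - 1] += 1
    return counts

histogram = build_histogram([5, 1, 3, 5, 6, 10, 14])
print(histogram)  # [2, 3, 1, 1]
```

Only the returned list of counts corresponds to what the Point stores; the input measurements are discarded.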

The following table illustrates a sequence of histograms. Each column in the table represents the latency data for a one-minute period:

Bucket      Histogram for 1:00    Histogram for 1:01    Histogram for 1:02    Histogram for 1:03
[12, 16)    1                     6                     0                     1
[8, 12)     1                     0                     2                     2
[4, 8)      3                     1                     1                     8
[0, 4)      2                     6                     10                    3

Heatmap charts

Heatmap charts are designed to display a single time series with distribution values. For these charts, the X-axis represents time, the Y-axis represents the buckets, and color represents the value. A brighter color indicates a higher value. For example, dark areas of the heatmap indicate lower bucket counts than yellow or white areas.

The following figure is one representation of a heatmap for the previous example:

Heatmap chart for the example.

In the previous figure, the heatmap uses black to represent the smallest bucket count, 0, and yellow to represent the largest bucket count, 10. Reds and oranges represent values between these two extremes.
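
As a rough sketch of how the per-minute histograms map onto a heatmap grid, the following Python snippet transposes the example data into one row per bucket and finds the extremes of the color scale. The `heatmap_rows` helper and the variable names are hypothetical, and the text output only stands in for a rendered chart.

```python
# One histogram per minute, ordered from the lowest bucket to the highest.
points = {
    "1:00": [2, 3, 1, 1],
    "1:01": [6, 1, 0, 6],
    "1:02": [10, 1, 2, 0],
    "1:03": [3, 8, 2, 1],
}
bucket_labels = ["[0, 4)", "[4, 8)", "[8, 12)", "[12, 16)"]

def heatmap_rows(points):
    """Transpose per-minute histograms into one row of counts per bucket."""
    return [list(row) for row in zip(*points.values())]

rows = heatmap_rows(points)
low = min(min(row) for row in rows)    # count drawn with the darkest color
high = max(max(row) for row in rows)   # count drawn with the brightest color

# Print the highest bucket first so the layout matches the Y-axis of a chart.
for label, row in reversed(list(zip(bucket_labels, rows))):
    print(f"{label:>9}", row)
print("color scale:", low, "to", high)
```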

Because heatmap charts can represent only one time series, you must set the aggregation options to combine all time-series data into a single time series. For example, if you are using Metrics Explorer, the configuration pane contains Group by and Aggregator fields. To combine all time-series data into a single time series, use the following settings:

  • Ensure that the Group by field is empty.
  • Select sum for the Aggregator.

Line and bar charts

Line charts, stacked bar charts, and stacked line charts can't display distribution values. If you have a metric with a distribution value and want to display it using one of these chart types, then you must convert the histogram into a single numerical value. For example, you could compute the sum of the values in the histogram or you could select a percentile.

For example, each row in the following table includes a timestamp, a histogram, and a sum of histogram values:

Time    Histogram         Sum of histogram values
1:00    [2, 3, 1, 1]      7
1:01    [6, 1, 0, 6]      13
1:02    [10, 1, 2, 0]     13
1:03    [3, 8, 2, 1]      14

You can display the sums of histogram values from the preceding table with an X-Y plot.

If the metric stores HTTP latency information, then the sum is a meaningful measure, because it indirectly represents the number of completed HTTP requests. The data from the preceding table shows that the rate of HTTP request completion is low but relatively constant:

Line chart for the example.
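
The conversion from a histogram to a single plottable number is simple arithmetic. A minimal sketch, using the example histograms and hypothetical variable names:

```python
# Histograms from the example, keyed by the minute they describe.
histograms = {
    "1:00": [2, 3, 1, 1],
    "1:01": [6, 1, 0, 6],
    "1:02": [10, 1, 2, 0],
    "1:03": [3, 8, 2, 1],
}

# Summing the bucket counts gives the number of measurements in that minute,
# which is one Y value per timestamp on a line chart.
totals = {time: sum(counts) for time, counts in histograms.items()}
print(totals)  # {'1:00': 7, '1:01': 13, '1:02': 13, '1:03': 14}
```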

Aggregation and distribution metrics

Aggregation is the process of regularizing points within a time series and of combining multiple time series. Aggregation works the same way for distribution-valued metrics as it does for metrics that have a value type of integer or double. However, the chart type imposes some requirements on the choices used for aligning and grouping time series.

Heatmap charts

Heatmap charts can only display a single distribution-valued time series. When you have multiple time series, you must use aligners and grouping functions to create a single time series.

For the aligner, you can select either sum or delta. These functions combine, at the bucket level, all samples for a single time series that are in the same alignment period, and the result is a distribution value. For example, if two adjacent samples of a time series are [2, 3, 1, 1] and [2, 5, 4, 1], then the sum aligner produces [4, 8, 5, 2].
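
The bucket-level combination can be sketched as follows. This is a simplified model of alignment, not the service's implementation: the `align_sum` function is hypothetical, timestamps are plain seconds, each sample is assigned to the period that starts at or before it, and the third sample is invented to show a second alignment period.

```python
def align_sum(samples, period_seconds):
    """Group (timestamp, histogram) samples by alignment period and sum each bucket."""
    aligned = {}
    for timestamp, histogram in samples:
        # Assumption: a sample belongs to the period that starts at or before it.
        period_start = timestamp - timestamp % period_seconds
        if period_start in aligned:
            aligned[period_start] = [
                a + b for a, b in zip(aligned[period_start], histogram)
            ]
        else:
            aligned[period_start] = list(histogram)
    return aligned

# The first two samples are the adjacent samples from the text.
samples = [(0, [2, 3, 1, 1]), (60, [2, 5, 4, 1]), (120, [1, 0, 0, 2])]
print(align_sum(samples, period_seconds=120))
# {0: [4, 8, 5, 2], 120: [1, 0, 0, 2]}
```

Each aligned value is still a list of bucket counts, which is why the result remains a distribution value.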

The grouping function defines how different time series are combined. This function is sometimes called an aggregator or a reducer. For heatmap charts, this function must be set to sum. The sum function adds together the values of the same buckets for the different time series. The result is a distribution value. For example, if the value of timeseries-A is [2, 3, 1, 1], and the value of timeseries-B is [1, 5, 2, 2], then the sum is [3, 8, 3, 3].
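
A minimal sketch of the sum grouping function, assuming that every time series uses the same bucket boundaries. The `reduce_sum` name is hypothetical:

```python
def reduce_sum(*series_values):
    """Add the counts of matching buckets across several time series."""
    lengths = {len(values) for values in series_values}
    if len(lengths) != 1:
        # Assumption for this sketch: all series share one bucket layout.
        raise ValueError("all distributions must use the same buckets")
    return [sum(bucket_counts) for bucket_counts in zip(*series_values)]

timeseries_a = [2, 3, 1, 1]
timeseries_b = [1, 5, 2, 2]
print(reduce_sum(timeseries_a, timeseries_b))  # [3, 8, 3, 3]
```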

Line charts

The aligner and the grouping functions must be selected such that, after aggregation is complete, the distribution values are converted into numerical values. You can convert a distribution value into a numeric value with the aligner or with the grouping function.

  • If you select a percentile for the aligner, then during the alignment stage of aggregation, each distribution value is converted into a numerical value. Grouping time series is optional.

    For example, assume that you're using the Google Cloud Console to configure a line chart on a dashboard and that you want to display the 99th percentile of every time series:

    • If you use the Basic tab, clear Grouped and select 99th percentile.
    • If you use the Advanced tab, select Percentile in the preprocessing step and use the menu to select the 99th percentile. Also, ensure the group-by field is empty and the group-by function is set to none.

    With this configuration, the chart can display multiple lines, one for each time series.

  • If you select sum or delta as the aligner, then select a grouping function that converts a distribution value into a numeric value. The result of the sum or delta aligners is a distribution value.

    For example, assume that you're using the Google Cloud Console to configure a line chart on a dashboard and that you want to display the 99th percentile of all time-series data:

    1. Select the Advanced tab and then select No preprocessing step.
    2. Set the Alignment function to sum and ensure the group-by field is empty.
    3. Set the Group by function to 99th percentile.

    With this configuration, the chart displays a single line.
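
The difference between the two configurations can be illustrated with a small Python sketch. The `percentile_from_histogram` function is hypothetical, and it assumes that values are spread evenly within each bucket; the estimate that the service computes may use a different method. The bucket boundaries and the two series reuse the values from the earlier examples.

```python
def percentile_from_histogram(counts, bounds, percentile):
    """Estimate a percentile, assuming values are spread evenly within each bucket."""
    total = sum(counts)
    if total == 0:
        raise ValueError("the histogram is empty")
    target = total * percentile / 100
    seen = 0
    for index, count in enumerate(counts):
        if count and seen + count >= target:
            lower, upper = bounds[index], bounds[index + 1]
            # Interpolate linearly between the bounds of the bucket.
            return lower + (upper - lower) * (target - seen) / count
        seen += count
    return bounds[-1]

bounds = [0, 4, 8, 12, 16]
series = {"timeseries-A": [2, 3, 1, 1], "timeseries-B": [1, 5, 2, 2]}

# Percentile as the aligner: one number, and therefore one line, per time series.
per_series = {
    name: percentile_from_histogram(counts, bounds, 99)
    for name, counts in series.items()
}

# Sum as the aligner and percentile as the grouping function: the histograms
# are combined first, so the chart shows one number for all of the data.
combined_counts = [sum(bucket) for bucket in zip(*series.values())]
combined = percentile_from_histogram(combined_counts, bounds, 99)

print(per_series)
print(combined)
```

In both paths the output is a plain number, which is what a line chart requires.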

What's next

For information about how to determine the bucket model for a metric and how to interpret percentiles, see Percentiles and distribution-valued metrics.