This document describes how to understand percentiles and the histogram model for metric data with a Distribution value type.
A distribution metric defines ranges of values, called buckets, and records
the count of measured values that fall into each bucket. Distribution metrics
don't report the individual measured values; they report a histogram of counts
in buckets. This value type is used by services when the individual
measurements are too numerous to collect, but statistical information,
such as averages or percentiles, about those measurements is valuable.
The next section of this page uses a synthetic example to show how percentiles are determined. The example shows that the percentile values depend on the number of buckets, the width of the buckets, the distribution of the measurements, and the total count of samples. The percentile values don't depend on the actual measured values because those values aren't available in the histogram.
Example with synthetic data
Consider an Exponential bucket model with a scale of 1, a growth factor of 2, and 10 finite buckets. This histogram contains 12 buckets: the 10 finite buckets, 1 bucket that only specifies an upper bound, and 1 bucket that only specifies a lower bound. For this example, the finite bucket with index n+1 is twice as wide as the finite bucket with index n.
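As a minimal sketch of this bucket layout, the following Python snippet builds the 11 boundaries that separate the 12 buckets and counts sample values into them. The function names are hypothetical, and the convention that a value equal to a boundary belongs to the higher bucket is an assumption made for illustration.

```python
from bisect import bisect_right

def bucket_boundaries(scale, growth_factor, num_finite_buckets):
    """Return the boundaries that separate the buckets, smallest first."""
    return [scale * growth_factor ** i for i in range(num_finite_buckets + 1)]

def build_histogram(samples, boundaries):
    """Count samples per bucket; index 0 is the bucket with only an upper
    bound and the last index is the bucket with only a lower bound."""
    counts = [0] * (len(boundaries) + 1)
    for value in samples:
        # Assumption: a value equal to a boundary belongs to the higher bucket.
        counts[bisect_right(boundaries, value)] += 1
    return counts

boundaries = bucket_boundaries(scale=1, growth_factor=2, num_finite_buckets=10)
counts = build_histogram([200], boundaries)
# Width of each finite bucket; every width is twice the previous one.
widths = [up - low for low, up in zip(boundaries, boundaries[1:])]
```

A single measurement of 200 lands in bucket number 8, whose bounds are 128 and 256. The histogram keeps only that count, not the value 200 itself.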
The following examples show that the width of the bucket determines the maximum error between the computed percentile and the measurements. They also show that the number of samples in a histogram is important. For example, if the number of samples is less than 20, then the 95th and 99th percentiles are always in the same bucket.
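One way to see the effect of the sample count is to treat each of n samples as covering an equal share of the 0 to 100 percentile range, so that the largest sample covers everything above 100*(n-1)/n. The sketch below, with a hypothetical helper name, checks for which sample counts that share starts below the 95th percentile.

```python
def top_sample_range_start(n):
    """Percentile at which the range covered by the largest of n samples begins."""
    return 100 * (n - 1) / n

# When this start is below 95, the largest sample's range contains both the
# 95th and 99th percentiles, so both are interpolated in that sample's bucket.
shared = [n for n in range(1, 101) if top_sample_range_start(n) < 95]
```

Under this assumption, `shared` contains the sample counts 1 through 19. With 20 samples, the range of the largest sample starts exactly at 95.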
Case 1: The total number of samples is 1.
When there is a single measurement, the three percentile values differ, but they are only the 50th, 95th, and 99th percentile positions within the same bucket. The error between the estimate and the actual measurement can't be determined because the measurement isn't known.
For example, assume that the histogram of measurements is as shown in the following table:
Bucket number | Lower bound | Upper bound | Count | Percentile range |
---|---|---|---|---|
0 | | 1 | 0 | 0 |
1 | 1 | 2 | 0 | 0 |
2 | 2 | 4 | 0 | 0 |
3 | 4 | 8 | 0 | 0 |
4 | 8 | 16 | 0 | 0 |
5 | 16 | 32 | 0 | 0 |
6 | 32 | 64 | 0 | 0 |
7 | 64 | 128 | 0 | 0 |
8 | 128 | 256 | 1 | 0 - 100 |
9 | 256 | 512 | 0 | 0 |
10 | 512 | 1024 | 0 | 0 |
11 | 1024 | | 0 | 0 |
To compute the 50th percentile, do the following:
- Use the bucket counts to determine the bucket that contains the 50th percentile. In this example, bucket number 8 contains the 50th percentile.
- Compute the estimate using the following rule:
pth percentile = bucket_low + (bucket_up - bucket_low)*(p - p_low)/(p_up - p_low)
In the previous expression, p_low and p_up are the lower and upper bounds of the percentile range for the bucket. Similarly, bucket_low and bucket_up are the lower and upper bounds of the bucket. The values for p_low and p_up depend on how the counts are distributed between the different buckets.
For example, the 50th percentile is computed as:
50th percentile = 128 + (256-128)*(50-0)/(100-0) = 128 + 128 * 50 / 100 = 128 + 64 = 192
To compute the 95th percentile, replace 50 with 95 in the previous expression. For this example where there is exactly one sample, the percentiles are as follows:
Percentile | Bucket number | Value |
---|---|---|
50th | 8 | 192 |
95th | 8 | 249.6 |
99th | 8 | 254.7 |
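The interpolation rule and the values in this table can be reproduced with a few lines of Python. This is a minimal sketch; the function name is hypothetical, and the argument names follow the symbols used in the rule.

```python
def interpolate_percentile(p, bucket_low, bucket_up, p_low, p_up):
    """Linearly interpolate the p-th percentile inside one bucket."""
    return bucket_low + (bucket_up - bucket_low) * (p - p_low) / (p_up - p_low)

# Single sample in bucket 8: bounds 128 and 256, percentile range 0 to 100.
estimates = {p: interpolate_percentile(p, 128, 256, 0, 100) for p in (50, 95, 99)}
```

Rounded to one decimal place, the three estimates are 192, 249.6, and 254.7.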
The error between each estimate and the actual measurement is bounded by the width of the bucket, but it can't be determined exactly because the measurement isn't known.
Case 2: The total number of samples is 10.
When there are 10 samples, the 50th percentile might be in a different bucket than the 95th and 99th percentiles. However, there aren't enough measurements to allow the 95th and 99th percentiles to be in different buckets.
For example, assume that the histogram of measurements is as shown in the following table:
Bucket number | Lower bound | Upper bound | Count | Percentile range |
---|---|---|---|---|
0 | | 1 | 4 | 0 - 40 |
1 | 1 | 2 | 2 | 40 - 60 |
2 | 2 | 4 | 1 | 60 - 70 |
3 | 4 | 8 | 1 | 70 - 80 |
4 | 8 | 16 | 1 | 80 - 90 |
5 | 16 | 32 | 0 | 0 |
6 | 32 | 64 | 0 | 0 |
7 | 64 | 128 | 0 | 0 |
8 | 128 | 256 | 1 | 90 - 100 |
9 | 256 | 512 | 0 | 0 |
10 | 512 | 1024 | 0 | 0 |
11 | 1024 | | 0 | 0 |
You can use the procedure described previously to compute the 50th, 95th, and 99th percentiles. For example, the 50th percentile, which is in bucket number 1, is computed as follows:
50th percentile = 1 + (2-1)*(50-40)/(60-40) = 1 + (1 * 10 / 20) = 1 + 0.5 = 1.5
Similarly, the 95th percentile is computed as follows:
95th percentile = 128 + (256-128)*(95-90)/(100-90) = 128 + 128 * 5 / 10 = 128 + 64 = 192
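The two steps, finding the bucket from the cumulative counts and then interpolating inside it, can be sketched together as follows. The function name and the use of `None` for a missing bound are assumptions for illustration. The sketch also assumes that a percentile equal to the top of a bucket's range is attributed to that bucket, and it doesn't attempt to interpolate in a bucket that has only one bound.

```python
def percentile_from_counts(p, bounds, counts):
    """Estimate the p-th percentile from per-bucket (low, up) bounds and counts."""
    total = sum(counts)
    seen = 0
    for (low, up), count in zip(bounds, counts):
        if count == 0:
            continue
        # Percentile range covered by this bucket.
        p_low = 100 * seen / total
        seen += count
        p_up = 100 * seen / total
        if p <= p_up:
            if low is None or up is None:
                # Assumption: no interpolation in a bucket with a single bound.
                raise ValueError("percentile falls in an unbounded bucket")
            return low + (up - low) * (p - p_low) / (p_up - p_low)
    raise ValueError("no bucket contains the requested percentile")

# Buckets 0 to 11 of the example; None marks a missing bound.
edges = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
bounds = [(None, 1)] + list(zip(edges, edges[1:])) + [(1024, None)]
counts = [4, 2, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0]
p50, p95, p99 = (percentile_from_counts(p, bounds, counts) for p in (50, 95, 99))
```

With these counts, the 50th percentile is found in bucket number 1, and the 95th and 99th percentiles are both found in bucket number 8.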
Each row in the following table lists a percentile, the corresponding bucket, the computed value, and the maximum error between that value and a measurement in the bucket:
Percentile | Bucket number | Value | Maximum error |
---|---|---|---|
50th | 1 | 1.5 | 0.5 |
95th | 8 | 192 | 64 |
99th | 8 | 243.2 | 115.2 |
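The maximum error can be read as the largest distance between the estimate and either bound of its bucket. The following is a minimal sketch under that interpretation; the helper name is hypothetical.

```python
def max_error(estimate, bucket_low, bucket_up):
    """Largest possible distance between the estimate and a value in the bucket."""
    return max(estimate - bucket_low, bucket_up - estimate)

errors = {
    "50th": max_error(1.5, 1, 2),        # bucket number 1
    "95th": max_error(192, 128, 256),    # bucket number 8
    "99th": max_error(243.2, 128, 256),  # bucket number 8
}
```

An estimate at the midpoint of a bucket has the smallest possible maximum error, half the bucket width. An estimate near one bound can be off by almost the full width.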
In this example and in the previous example, the 95th percentile is in bucket number 8; however, the percentile computation is different. The difference is due to how the samples are distributed. In the first example, all samples are in the same bucket, while in the most recent example, the samples are in different buckets.
Example with real data
This section contains an example that illustrates how you can determine the bucket model used by a particular metric. This section also illustrates how you can evaluate the potential error in the computed percentile values.
Identify the bucket model
To determine the buckets used for a metric over a specific time interval, call the Cloud Monitoring API's projects.timeSeries/list method.
For example, to identify the bucket model for a metric, do the following:
- Go to the projects.timeSeries/list web page.
- In APIs Explorer, enter the filter that specifies the metric, a start time, and an end time.
For example, to get information about the metric that stores API request latencies, enter the following:
metric.type="serviceruntime.googleapis.com/api/request_latencies" resource.type="consumed_api"
In this example, the filter field specifies a metric type and a resource type. For more information about these filters, see Monitoring filters.
- Click Enter.
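Outside of APIs Explorer, the same request can be expressed as a REST URL. The following sketch only builds the URL and doesn't send it. The project ID is hypothetical, the time interval is chosen for illustration, and a real request also needs authorization, which is omitted here.

```python
from urllib.parse import urlencode

def list_time_series_url(project_id, metric_filter, start_time, end_time):
    """Build the GET URL for projects.timeSeries.list without any aggregation."""
    query = urlencode({
        "filter": metric_filter,
        "interval.startTime": start_time,
        "interval.endTime": end_time,
    })
    return f"https://monitoring.googleapis.com/v3/projects/{project_id}/timeSeries?{query}"

url = list_time_series_url(
    project_id="my-project",  # hypothetical project ID
    metric_filter=(
        'metric.type="serviceruntime.googleapis.com/api/request_latencies" '
        'resource.type="consumed_api"'
    ),
    start_time="2020-11-03T15:05:00Z",  # illustrative interval
    end_time="2020-11-03T15:06:00Z",
)
```

Because no aligner is specified, a request like this one returns the raw distribution values, including the bucket model.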
The following is the list API response for a distribution-valued metric that is available on one Google Cloud project:
{
  "timeSeries": [
    {
      "metric": {...},
      "resource": {...},
      "metricKind": "DELTA",
      "valueType": "DISTRIBUTION",
      "points": [
        {
          "interval": {
            "startTime": "2020-11-03T15:05:00Z",
            "endTime": "2020-11-03T15:06:00Z"
          },
          "value": {
            "distributionValue": {
              "count": "3",
              "mean": 25.889,
              "bucketOptions": {
                "exponentialBuckets": {
                  "numFiniteBuckets": 66,
                  "growthFactor": 1.4,
                  "scale": 1
                }
              },
              "bucketCounts": [ "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "3" ]
            }
          }
        },
In the API response, the value field describes the data stored in the points array. The count and mean fields report that for the specified time interval there were 3 measurements and their average value was 25.889. The bucketOptions field shows that the exponential model is configured to have 66 finite buckets, a scale of 1, and a growth factor of 1.4.
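A short sketch of reading these fields in Python follows. The JSON literal is reduced to the distributionValue fields discussed here, and the variable names are hypothetical.

```python
import json

# The distributionValue object from the response, reduced to the fields used here.
distribution = json.loads("""
{
  "count": "3",
  "mean": 25.889,
  "bucketOptions": {
    "exponentialBuckets": {"numFiniteBuckets": 66, "growthFactor": 1.4, "scale": 1}
  },
  "bucketCounts": ["0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "3"]
}
""")

options = distribution["bucketOptions"]["exponentialBuckets"]
# The finite buckets plus the two buckets that have only one bound.
total_buckets = options["numFiniteBuckets"] + 2
# Counts are encoded as strings; in this response the list stops after the
# last non-zero count.
counts = [int(c) for c in distribution["bucketCounts"]]
occupied = {index: count for index, count in enumerate(counts) if count > 0}
```

Here `occupied` shows that all 3 measurements are in the bucket with index 10.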
To compute the lower and upper bounds for the bucket with index n, use the following rules:
- Lower bound (1 ≤ n < N) = scale * (growth factor)^(n-1)
- Upper bound (0 ≤ n < N-1) = scale * (growth factor)^n
In the previous expressions, N is the total number of buckets, including the two buckets that have only one bound.
The buckets for this metric, along with the midpoint of each bucket, are shown in the following table:
nth bucket | Lower bound | Upper bound | Midpoint |
---|---|---|---|
0 | | 1 | Not applicable |
1 | 1 | 1.40 | 1.20 |
2 | 1.40 | 1.96 | 1.68 |
... | | | |
9 | 14.76 | 20.66 | 17.71 |
10 | 20.66 | 28.93 | 24.79 |
11 | 28.93 | 40.50 | 34.71 |
... | | | |
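The rows of this table follow directly from the bound rules. The sketch below applies them for a single bucket index; the function name is hypothetical, and `None` stands in for a bound or midpoint that doesn't apply.

```python
def exponential_bucket(n, scale, growth_factor, num_finite_buckets):
    """Return (lower, upper, midpoint) for bucket n; None where a value doesn't apply."""
    total = num_finite_buckets + 2  # N, the total number of buckets
    if not 0 <= n < total:
        raise IndexError("bucket index out of range")
    lower = scale * growth_factor ** (n - 1) if n >= 1 else None
    upper = scale * growth_factor ** n if n < total - 1 else None
    if lower is None or upper is None:
        midpoint = None
    else:
        midpoint = (lower + upper) / 2
    return lower, upper, midpoint

rows = {
    n: exponential_bucket(n, scale=1, growth_factor=1.4, num_finite_buckets=66)
    for n in (0, 1, 2, 9, 10, 11)
}
```

Rounded to two decimal places, `rows[10]` gives the bounds 20.66 and 28.93 with a midpoint of 24.79.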
Verify the percentile computations
Now that the bucket configuration is known, you can predict the 50th, 95th, and 99th percentile values for any set of measurements. For example, if there is one sample and it is in bucket number 10, then the 50th percentile value is 24.79, which is the midpoint of that bucket.
To retrieve the 50th, 95th, and 99th percentile values of the metric, you can use the API method projects.timeSeries/list, and include an alignment period and aligner. In this example, the following settings were selected:
- Aligner: ALIGN_PERCENTILE_50, ALIGN_PERCENTILE_95, or ALIGN_PERCENTILE_99
- Alignment Period: 60 s
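When the method is called as a REST request, these settings become query parameters. The sketch below builds only the aggregation part of the query string, which would be combined with the filter and interval parameters. The helper name and the dictionary are hypothetical conveniences.

```python
from urllib.parse import urlencode

# Aligners named on this page, keyed by the percentile they return.
PERCENTILE_ALIGNERS = {
    50: "ALIGN_PERCENTILE_50",
    95: "ALIGN_PERCENTILE_95",
    99: "ALIGN_PERCENTILE_99",
}

def aggregation_query(percentile, alignment_period_seconds=60):
    """Return the aggregation part of a projects.timeSeries.list query string."""
    return urlencode({
        "aggregation.alignmentPeriod": f"{alignment_period_seconds}s",
        "aggregation.perSeriesAligner": PERCENTILE_ALIGNERS[percentile],
    })

queries = {p: aggregation_query(p) for p in PERCENTILE_ALIGNERS}
```

Each request uses one aligner, so retrieving all three percentiles takes three requests.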
For the ALIGN_PERCENTILE_50 selection, each value in the time series is the 50th percentile of a bucket:
{
  "timeSeries": [
    {
      "metric": {...},
      "resource": {...},
      "metricKind": "GAUGE",
      "valueType": "DOUBLE",
      "points": [
        {
          "interval": {
            "startTime": "2020-11-03T15:06:36Z",
            "endTime": "2020-11-03T15:06:36Z"
          },
          "value": {
            "doubleValue": 24.793256140799986
          }
        },
        {
          "interval": {
            "startTime": "2020-11-03T15:05:36Z",
            "endTime": "2020-11-03T15:05:36Z"
          },
          "value": {
            "doubleValue": 34.710558597119977
          }
        },
        {
          "interval": {
            "startTime": "2020-11-03T15:04:36Z",
            "endTime": "2020-11-03T15:04:36Z"
          },
          "value": {
            "doubleValue": 24.793256140799986
          }
        }
      ]
    },
For two of the samples, the 50th percentile is in bucket 10; for the other sample, it is in bucket 11.
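To check which bucket a returned value belongs to, the bound rule can be inverted with a logarithm. This is a sketch with a hypothetical function name. It assumes the value is at least the scale and below the top finite bound, and it ignores floating-point rounding for values that sit exactly on a bucket boundary.

```python
import math

def bucket_index_for(value, scale, growth_factor):
    """Return the index of the finite exponential bucket that contains value."""
    # Assumption: scale <= value, so the value is not in the bucket with index 0.
    return math.floor(math.log(value / scale, growth_factor)) + 1

# doubleValue fields from the response, newest point first.
aligned_values = [24.793256140799986, 34.710558597119977, 24.793256140799986]
buckets = [bucket_index_for(v, scale=1, growth_factor=1.4) for v in aligned_values]
```

The resulting indexes are 10, 11, and 10, and each returned value matches the midpoint of its bucket.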
The following table shows the results of executing the projects.timeSeries/list method with different aligners. The first row corresponds to the case where the aligner isn't specified. When you don't specify an aligner, the bucket model and mean values are returned. The next three rows list the data returned when the aligner is set to ALIGN_PERCENTILE_50, ALIGN_PERCENTILE_95, and ALIGN_PERCENTILE_99:
Statistic | Sample @ 15:06 | Sample @ 15:05 | Sample @ 15:04 |
---|---|---|---|
mean | 25.889 | 33.7435 | Not available. |
50th percentile | 24.79 | 34.71 | 24.79 |
95th percentile | 28.51 | 39.91 | 28.51 |
99th percentile | 28.84 | 40.37 | 28.84 |
As the two examples with synthetic data illustrate, the values of the percentiles depend on how the samples are distributed. When all samples are in the same bucket, then the 50th percentile is the midpoint of that bucket. However, when samples are in different buckets, that distribution affects the estimates.
To determine if the 50th percentile is a reasonable estimate of the mean, you can compare the mean value to the 50th percentile. The mean value is returned with the bucket details.
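One simple way to make this comparison is a relative difference. The sketch below uses the mean and 50th percentile values from the table for the two samples that have both. The helper name is hypothetical, and what counts as a reasonable difference is left to the reader.

```python
def relative_difference(mean, median_estimate):
    """Return |mean - estimate| as a fraction of the mean."""
    return abs(mean - median_estimate) / abs(mean)

# (mean, 50th percentile) pairs from the table for the 15:06 and 15:05 samples.
pairs = {"15:06": (25.889, 24.79), "15:05": (33.7435, 34.71)}
differences = {label: relative_difference(*pair) for label, pair in pairs.items()}
```

For these two samples, the 50th percentile differs from the mean by roughly 4% and 3%.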
What's next
For information about how to visualize distribution-valued metrics, see About distribution-valued metrics.