Retention and latency of metric data

This page describes how long Cloud Monitoring retains your metric data and the latency between when the data is collected and when it becomes visible to you.

Quotas and limits provides additional information about limits on metric data.

Retention of metric data

Cloud Monitoring acquires metric data and holds it in the time series of metric types for a period of time. This period varies with the metric type; see Data retention for details.

At the end of that period, Cloud Monitoring deletes the expired data points.

When all the points in a time series have expired, Cloud Monitoring deletes the time series. Deleted time series don't appear in Cloud Monitoring charts or in results from the Monitoring API.
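The expiry behavior described above can be illustrated with a minimal sketch. It is not how Cloud Monitoring is implemented; the `drop_expired` helper and the 6-week retention value are assumptions chosen for illustration, and the actual period for a metric type is given in Data retention.

```python
from datetime import datetime, timedelta, timezone


def drop_expired(points, retention, now):
    """Return the (timestamp, value) points still inside the retention period."""
    cutoff = now - retention
    return [(ts, value) for ts, value in points if ts >= cutoff]


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
retention = timedelta(weeks=6)  # hypothetical value, for illustration only

points = [
    (now - timedelta(weeks=8), 1.0),  # older than the retention period
    (now - timedelta(weeks=1), 2.0),  # still retained
]

remaining = drop_expired(points, retention, now)

# A time series with no remaining points is treated as deleted.
series_exists = len(remaining) > 0
```

In this sketch, a series whose every point is older than the cutoff ends up with an empty list, which corresponds to the time series no longer appearing in charts or API results.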

Latency of metric data

Latency refers to the delay between when Cloud Monitoring samples a metric and when the metric data point becomes visible as time series data. The latency depends on whether the metric comes from a Google Cloud service or is a user-defined metric:

Google Cloud metrics: The Google Cloud metrics list includes the metric types from Google Cloud services. Many of the descriptions in that list include a statement like the following: “Sampled every 60 seconds. After sampling, data is not visible for up to 240 seconds.”

The values in the statement vary for specific metrics. The example statement means that Cloud Monitoring collects one measurement each minute (the sampling interval), but because some of these metrics receive additional processing before they are exposed, it can take additional time (the latency) before you can retrieve the data for this metric. In this example, the latency can be up to 4 minutes. As a result, when a data point for this metric first becomes visible, its collection timestamp might already be up to 4 minutes old. This latency doesn't apply to user-defined metrics.

User-defined metrics: If you are writing data to user-defined metrics, including custom metrics, OpenTelemetry-collected metrics, application metrics collected by the Ops Agent, and Prometheus metrics, then data from these metrics is typically visible and queryable within 3 to 7 seconds, excluding network latency.
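The example statement for Google Cloud metrics can be worked through as simple arithmetic. The following sketch extracts the two values from the statement text and computes the latest time by which a sample should be visible. The `parse_statement` helper and the sample timestamp are hypothetical, and the pattern assumes the exact wording of the example statement.

```python
import re
from datetime import datetime, timedelta, timezone

STATEMENT = (
    "Sampled every 60 seconds. "
    "After sampling, data is not visible for up to 240 seconds."
)


def parse_statement(text):
    """Extract (sampling interval, maximum latency) from a metric description."""
    match = re.search(
        r"Sampled every (\d+) seconds\. "
        r"After sampling, data is not visible for up to (\d+) seconds\.",
        text,
    )
    if match is None:
        raise ValueError("statement does not match the expected wording")
    interval = timedelta(seconds=int(match.group(1)))
    latency = timedelta(seconds=int(match.group(2)))
    return interval, latency


sampling_interval, max_latency = parse_statement(STATEMENT)

# Hypothetical collection time of one sample.
sampled_at = datetime(2024, 6, 1, 12, 0, 0, tzinfo=timezone.utc)

# The point should be retrievable no later than this time.
visible_by = sampled_at + max_latency
```

Here the sampling interval is 60 seconds and the maximum latency is 240 seconds, so a point sampled at 12:00:00 might not be visible until 12:04:00.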

In some situations, you might need to adjust how you use a metric with latency. For example:

When using client libraries to retrieve metric data, you might need to use an offset in the query interval to account for latency.
When using a metric to drive resource management, such as autoscaling, the latency of the metric can affect the responsiveness of the autoscaling. For example, some Pub/Sub metrics have latencies that range from 2 to 4 minutes.
When using alerting policies, be aware that latency can affect incident creation time for metric-based alerting policies. For example, if a monitored metric has a latency of up to 180 seconds, then Cloud Monitoring won't create an incident for up to 180 seconds after the metric violates the threshold of the alerting policy condition. Cloud Monitoring automatically accounts for the latency, if any, of the underlying metric when evaluating alerting policies.
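The query-interval offset mentioned in the first item above could be computed as in the following sketch. The `offset_interval` helper is hypothetical, and the 240-second latency and 5-minute window are assumed values; use the latency stated in the description of the metric you are reading.

```python
from datetime import datetime, timedelta, timezone


def offset_interval(now, window, latency):
    """Return (start, end) for a query window shifted back by the metric latency."""
    end = now - latency    # newest time for which data should already be visible
    start = end - window   # keep the requested window length
    return start, end


now = datetime(2024, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
start, end = offset_interval(
    now,
    window=timedelta(minutes=5),
    latency=timedelta(seconds=240),  # assumed latency for illustration
)
```

The resulting bounds, 11:51:00 to 11:56:00 in this example, could then be supplied as the start and end of the time interval in a time series query, so that the window doesn't cover the most recent period in which data might not yet be visible.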