This page provides troubleshooting information for common scenarios when using logs-based metrics in Cloud Logging.
Metric is missing logs data
There are several possible reasons for missing data in logs-based metrics:
New log entries might not match your metric's filter. A logs-based metric gets data from matching log entries that are received after the metric is created. Logging doesn't backfill the metric from previous log entries.
New log entries might not contain the correct field, or the data might not be in the correct format for extraction by your distribution metric. Check that your field names and regular expressions are correct.
Your metric counts might be delayed. Even though countable log entries appear in the Logs Explorer, it may take up to 10 minutes to update the logs-based metrics in Cloud Monitoring.
The log entries that are displayed might be counted late or might not be counted at all, because they are time-stamped too far in the past or future. If the timestamp of a log entry is more than 24 hours in the past or more than 10 minutes in the future when Cloud Logging receives the entry, then the log entry won't be counted in the logs-based metric.
The number of late-arriving entries is recorded for each log in the system logs-based metric
Example: A log entry matching a logs-based metric arrives late. It has a timestamp of 2:30 PM on February 20, 2020 and a receiveTimestamp of 2:45 PM on February 21, 2020. Because the entry arrived more than 24 hours after its timestamp, it won't be counted in the logs-based metric.
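The timestamp window described above can be sketched as a small check. This is an illustration only: the function name is hypothetical, and treating the 24-hour and 10-minute limits as inclusive is an assumption, not documented behavior.

```python
from datetime import datetime, timedelta, timezone

MAX_PAST = timedelta(hours=24)      # entries older than this are not counted
MAX_FUTURE = timedelta(minutes=10)  # entries further ahead than this are not counted


def is_countable(timestamp: datetime, receive_timestamp: datetime) -> bool:
    """Return True if the timestamp falls inside the counting window.

    The window is measured relative to receive_timestamp. Inclusive
    boundaries are an assumption made for this sketch.
    """
    age = receive_timestamp - timestamp
    return -MAX_FUTURE <= age <= MAX_PAST


# The example above: the entry arrives 24 hours and 15 minutes late.
ts = datetime(2020, 2, 20, 14, 30, tzinfo=timezone.utc)
received = datetime(2020, 2, 21, 14, 45, tzinfo=timezone.utc)
print(is_countable(ts, received))  # False
```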
Resource type is "undefined" in Cloud Monitoring
Some Cloud Logging monitored-resource types do not map directly to Cloud Monitoring monitored-resource types. For example, when you first create either an alert or chart from a logs-based metric, you might see that the resource type is "undefined".
The monitored-resource type maps either to global or a different monitored-resource type in Cloud Monitoring. See the Mappings for Logging-only resources to determine which monitored-resource type you need to choose.
False-positive alerts or alerts that aren't triggered
You might get false-positive alerts, or alerts that aren't triggered, from logs-based metrics because the alignment period for the alert is too short. A too-short alignment period commonly causes problems when an alert uses less-than logic, or when the alert is based on a percentile condition for a distribution metric.
False-positive alerts can occur because log entries can be sent to Logging late. For example, the log fields timestamp and receiveTimestamp can differ by minutes in some cases. Also, when Logging ingests logs, there is an inherent delay between when the log entries are generated and when Logging receives them. This means that Logging might not have the total count for a particular log entry until some later point in time after the log entries were generated. This is why an alert that uses less-than logic, or that is based on a percentile condition for a distribution metric, can produce a false positive: not all the log entries have been accounted for.
However, logs-based metrics are always eventually consistent. They are eventually consistent because a log entry that matches a logs-based metric can be sent to Logging with a timestamp that is significantly older or newer than the log's receiveTimestamp. This means that the logs-based metric can receive log entries with older timestamps after existing log entries with the same timestamp have already been received by Logging. Thus, the metric value must be updated.
To guarantee that alerts are accurate even for on-time data, alert policies for logs-based metrics should use alert conditions with alignment periods of at least two minutes. For log entries that are sent to Logging with delays measured in minutes, an alignment period of ten minutes is recommended to balance timeliness and accuracy.
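As a rough sketch of the alignment-period guidance, the following helper builds a threshold condition in the JSON shape used by Cloud Monitoring alert policies, as we understand it, and refuses alignment periods shorter than two minutes. The helper name, the metric name request_count, the resource type, and the threshold value are all hypothetical placeholders chosen for illustration.

```python
import json

MIN_ALIGNMENT_SECONDS = 120      # two-minute minimum from the guidance above
DELAYED_ALIGNMENT_SECONDS = 600  # ten-minute recommendation for delayed entries


def threshold_condition(metric_type: str, resource_type: str,
                        alignment_seconds: int) -> dict:
    """Build a less-than threshold condition for a logs-based metric.

    Raises ValueError if the alignment period is below the two-minute minimum.
    """
    if alignment_seconds < MIN_ALIGNMENT_SECONDS:
        raise ValueError(
            f"alignment period {alignment_seconds}s is shorter than "
            f"{MIN_ALIGNMENT_SECONDS}s"
        )
    return {
        "displayName": "Low request count",  # placeholder display name
        "conditionThreshold": {
            "filter": (
                f'metric.type="{metric_type}" '
                f'AND resource.type="{resource_type}"'
            ),
            "comparison": "COMPARISON_LT",   # the "less than" case
            "thresholdValue": 1,             # placeholder threshold
            "duration": "0s",
            "aggregations": [
                {
                    "alignmentPeriod": f"{alignment_seconds}s",
                    "perSeriesAligner": "ALIGN_SUM",
                }
            ],
        },
    }


condition = threshold_condition(
    "logging.googleapis.com/user/request_count",  # hypothetical metric
    "gce_instance",                                # hypothetical resource type
    DELAYED_ALIGNMENT_SECONDS,
)
print(json.dumps(condition, indent=2))
```

Validating the period before building the condition keeps a too-short value from reaching an alert policy unnoticed.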
Metric has too many time series
The number of time series in a metric depends on the number of different combinations of label values. The number of time series is called the cardinality of the metric, and it must not exceed 30,000.
Because a time series can be generated for every combination of label values, one or more labels with a high number of values can easily push a metric past 30,000 time series. You want to avoid high-cardinality metrics.
As the cardinality of a metric increases, the metric can get throttled and some data points might not be written to the metric. Charts that display the metric can be slow to load due to the large number of time series that the chart has to process. You might also incur costs for API calls to query time series data; review Cloud Monitoring costs for details.
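The way label combinations multiply can be illustrated with a short worst-case estimate. The label names and value counts below are made up for the example, and the product is an upper bound: the actual cardinality depends on which combinations really occur in your logs.

```python
from math import prod

CARDINALITY_LIMIT = 30_000  # limit stated above


def worst_case_cardinality(label_value_counts: dict[str, int]) -> int:
    """Upper bound on time series: the product of distinct values per label."""
    return prod(label_value_counts.values())


# Hypothetical labels with a bounded number of values.
bounded = {"status_code": 10, "region": 30}
print(worst_case_cardinality(bounded))  # 300

# Adding one high-cardinality label pushes the bound far past the limit.
with_user_id = {**bounded, "user_id": 5_000}
estimate = worst_case_cardinality(with_user_id)
print(estimate, estimate > CARDINALITY_LIMIT)  # 1500000 True
```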
To avoid creating high-cardinality metrics:
Check that your label fields and extractor regular expressions match values that have a limited cardinality.
Avoid extracting text messages that can change, without bounds, as label values.
Avoid extracting numerical values with unbounded cardinality.
Only extract values from labels of known cardinality; for instance, status codes with a set of known values.
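One way to sanity-check an extractor before using it as a label is to run it over a sample of log messages and count the distinct values it produces. The sketch below uses Python's re module as a stand-in; regular expression syntax in Logging may differ from Python's in places, and the sample messages and patterns are invented for illustration.

```python
import re


def distinct_label_values(pattern: str, messages: list[str]) -> set[str]:
    """Collect the distinct values captured by the first group of pattern."""
    values = set()
    for message in messages:
        match = re.search(pattern, message)
        if match:
            values.add(match.group(1))
    return values


# Hypothetical sample of log messages.
samples = [
    "GET /users/1841 returned 200",
    "GET /users/9920 returned 200",
    "GET /users/3307 returned 404",
]

# Status codes come from a small, known set of values.
status_codes = distinct_label_values(r"returned (\d{3})", samples)
print(sorted(status_codes))  # ['200', '404']

# User IDs grow with the number of users, so they make a poor label.
user_ids = distinct_label_values(r"/users/(\d+)", samples)
print(len(user_ids))  # 3
```

If the count of distinct values keeps growing as the sample grows, the extracted field is likely unbounded.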
These system logs-based metrics can help you measure the effect that adding or removing labels has on the cardinality of your metric:
When you inspect these metrics, you can further filter your results by metric name. For details, go to Selecting metrics: filtering.
Metric name is invalid
When you create a counter or distribution metric, choose a metric name that is unique among the logs-based metrics in your project.
Metric-name strings must not exceed 100 characters and can include only the following characters:
The special characters
The forward slash character / denotes a hierarchy of pieces within the metric name and cannot be the first character of the name.
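A few of the naming rules above can be checked before you attempt to create a metric. The following sketch covers only the length limit, the leading-slash rule, and uniqueness within a project; it does not check the allowed character set. The function name and the sample metric names are hypothetical.

```python
MAX_NAME_LENGTH = 100  # limit stated above


def metric_name_problems(name: str, existing_names: set[str]) -> list[str]:
    """Return a list of rule violations; an empty list means none were found.

    Only length, leading slash, and uniqueness are checked here.
    """
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"name exceeds {MAX_NAME_LENGTH} characters")
    if name.startswith("/"):
        problems.append("name must not start with a forward slash")
    if name in existing_names:
        problems.append("name is already used by another logs-based metric")
    return problems


existing = {"checkout/errors"}  # hypothetical metrics already in the project
print(metric_name_problems("checkout/latency", existing))  # []
print(metric_name_problems("/checkout/errors", existing))
```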
Label values are truncated
Values for user-defined labels must not exceed 1,024 bytes.
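Because the limit is expressed in bytes rather than characters, a value with non-ASCII text can exceed it sooner than its character count suggests. The sketch below assumes that label values are measured as UTF-8 encoded bytes, which is an assumption made for illustration; the function names are hypothetical.

```python
MAX_LABEL_BYTES = 1024  # limit stated above


def label_value_bytes(value: str) -> int:
    """Size of the value in bytes, assuming UTF-8 encoding."""
    return len(value.encode("utf-8"))


def exceeds_label_limit(value: str) -> bool:
    """Return True if the value is larger than the label value limit."""
    return label_value_bytes(value) > MAX_LABEL_BYTES


ascii_value = "a" * 1024          # 1,024 characters, 1,024 bytes
accented_value = "\u00e9" * 600   # 600 characters, 2 bytes each in UTF-8

print(label_value_bytes(ascii_value), exceeds_label_limit(ascii_value))        # 1024 False
print(label_value_bytes(accented_value), exceeds_label_limit(accented_value))  # 1200 True
```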
No logs-based metrics are available for creation
Logs-based metrics apply only to a single Google Cloud project. You cannot create them for logs buckets or for other Google Cloud resources such as Cloud Billing accounts or organizations.