Troubleshoot log-based metrics

This page provides troubleshooting information for common scenarios when using log-based metrics in Cloud Logging.

Cannot view or create metrics

Log-based metrics apply only to a single Google Cloud project or to a Logging bucket within a Google Cloud project. You can't create log-based metrics for other Google Cloud resources such as billing accounts or organizations. Log-based metrics are computed for logs only in the Google Cloud project or bucket in which they're received.

To create metrics, you need the correct Identity and Access Management permissions. For details, see Access control with IAM: Log-based metrics.

Metric is missing logs data

There are several possible reasons for missing data in log-based metrics:

New log entries might not match your metric's filter. A log-based metric gets data from matching log entries that are received after the metric is created. Logging doesn't backfill the metric from previous log entries.
New log entries might not contain the correct field, or the data might not be in the correct format for extraction by your distribution metric. Check that your field names and regular expressions are correct.
Your metric counts might be delayed. Even though countable log entries appear in the Logs Explorer, it may take up to 10 minutes to update the log-based metrics in Cloud Monitoring.
The log entries that are displayed might be counted late or might not be counted at all, because they are time-stamped too far in the past or future. If a log entry is received by Cloud Logging more than 24 hours in the past or 10 minutes in the future, then the log entry won't be counted in the log-based metric.

The number of late-arriving entries is recorded for each log in the system log-based metric logging.googleapis.com/logs_based_metrics_error_count.

Example: A log entry matching a log-based metric arrives late. It has a timestamp of 2:30 PM on February 20, 2020 and a receivedTimestamp of 2:45 PM on February 21, 2020. This entry won't be counted in the log-based metric.
The log-based metric was created after the arrival of log entries that the metric might count. Log-based metrics evaluate log entries as they're stored in log buckets; these metrics don't evaluate log entries stored in Logging.

Resource type is "undefined" in Cloud Monitoring

Some Cloud Logging monitored-resource types don't map directly to Cloud Monitoring monitored-resource types. For example, when you first create either an alerting policy or chart from a log-based metric, you might see that the resource type is "undefined".

The resource type is undefined.

The monitored-resource type maps either to global or a different monitored-resource type in Cloud Monitoring. See the Mappings for Logging-only resources to determine which monitored-resource type you need to choose.

Incidents aren't created or are false-positive

You could get false-positive incidents or situations where Monitoring doesn't create incidents from log-based metrics because the alignment period for the alerting policy is too short. You might encounter false positives in the following scenarios:

When an alerting policy uses less than logic.
When an alerting policy is based on a percentile condition for a distribution metric.

False-positive incidents can occur because log entries can be sent to Logging late. For example, the log fields timestamp and receiveTimestamp can have a delta of minutes in some cases. Also, when Logging stored logs in log buckets, there is an inherent delay between when the log entries are generated and when Logging receives them. This means that Logging might not have the total count for a particular log entry until some later point in time after the log entries were generated. This is why an alerting policy using less than logic or based on a percentile condition for a distribution metric can produce a false-positive alert: not all the log entries have been accounted for yet.

However, log-based metrics are always eventually consistent. Log-based metrics are eventually consistent because a log entry that matches a log-based metric can be sent to Logging with a timestamp that is significantly older or newer than the log's receiveTimestamp.

This means that the log-based metric can receive log entries with older timestamps after existing log entries with the same timestamp have already been received by Logging. Thus, the metric value must be updated.

For notifications to remain accurate even for on-time data, alerting policies for log-based metrics should use conditions with alignment periods of at least two minutes. For log entries that are sent to Logging with delays measured in minutes, an alignment period of ten minutes is recommended to balance timeliness and accuracy.

Metric has too many time series

The number of time series in a metric depends on the number of different combinations of label values. The number of time series is called the cardinality of the metric, and it must not exceed 30,000.

Because you can generate a time series for every combination of label values, if you have one or more labels with high number of values, it isn't difficult to exceed 30,000 time series. You want to avoid high-cardinality metrics.

As the cardinality of a metric increases, the metric can get throttled and some data points might not be written to the metric. Charts that display the metric can be slow to load due to the large number of time series that the chart has to process. You might also incur costs for API calls to query time series data; review Cloud Monitoring costs for details.

To avoid creating high cardinality metrics:

Check that your label fields and extractor regular expressions match values that have a limited cardinality.
Avoid extracting text messages that can change, without bounds, as label values.
Avoid extracting numerical values with unbounded cardinality.
Only extract values from labels of known cardinality; for instance, status codes with a set of known values.

These system log-based metrics can help you measure the effect that adding or removing labels has on the cardinality of your metric:

When you inspect these metrics, you can further filter your results by metric name. For details, see Selecting metrics: filtering.

Metric name is invalid

When you create a counter or distribution metric, choose a metric name that is unique among the log-based metrics in your Google Cloud project.

Metric-name strings must not exceed 100 characters and can include only the following characters:

A-Z
a-z
0-9
The special characters _-.,+!*',()%\/.

The forward slash character / denotes a hierarchy of pieces within the metric name and cannot be the first character of the name.

Label values are truncated

Values for user-defined labels must not exceed 1,024 bytes.

Cannot delete a custom log metric

You attempt to delete a custom log-based metric by using the Google Cloud console. The delete request fails and the deletion dialog displays the error message There is an unknown error while executing this operation.

To resolve this problem, try the following:

Refresh the Log-based metrics page in the Google Cloud console. The error message might be shown due to an internal timing issue.
Identify and delete any alerting policies that monitor the log-based metric. After you ensure that the log-based metric isn't monitored by an alerting policy, delete the log-based metric. Log-based metrics that are monitored by an alerting policy can't be deleted.