Create metric-absence alerting policies

This document describes how to use the Google Cloud console to create an alerting policy that sends notifications when a monitored time series has no data for a specific period of time.

Metric-absence conditions require at least one successful measurement — one that retrieves data — within the maximum period of time after the policy was installed or modified. This time period is called the trigger absence time. The maximum configurable trigger absence time is 23.5 hours.

For example, suppose you set the trigger absence time in a metric-absence policy to 30 minutes. The condition won't be met when the subsystem that writes metric data has never written a data point. The subsystem needs to output at least one data point and then fail to output additional data points for 30 minutes.
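
The rule in the preceding example can be sketched in a few lines of Python. This is a simplified illustration, not how Monitoring evaluates conditions internally: the function name `absence_condition_met` and its parameters are hypothetical, and the sketch ignores the limit on how long after policy creation the first data point may arrive.

```python
from datetime import datetime, timedelta
from typing import Optional


def absence_condition_met(
    last_point_time: Optional[datetime],
    now: datetime,
    trigger_absence: timedelta,
) -> bool:
    """Return True when data was seen at least once and has then been
    absent for the whole trigger absence time (hypothetical helper)."""
    if last_point_time is None:
        # The subsystem has never written a data point, so the
        # condition can't be met.
        return False
    return now - last_point_time >= trigger_absence


now = datetime(2024, 1, 1, 12, 0)
window = timedelta(minutes=30)

never_wrote = absence_condition_met(None, now, window)
recent_data = absence_condition_met(now - timedelta(minutes=10), now, window)
stale_data = absence_condition_met(now - timedelta(minutes=45), now, window)
print(never_wrote, recent_data, stale_data)
```

In this sketch only the third case meets the condition: a data point was written, and then nothing arrived for more than 30 minutes.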

This content does not apply to log-based alerting policies. For information about log-based alerting policies, which notify you when a particular message appears in your logs, see Monitoring your logs.

This document doesn't describe the following:

How to be notified when the values of a metric are more than, or less than, a threshold. For more information, see Create metric-threshold alerting policies.

How to be notified based on the predicted value of a metric. For more information, see Create forecasted metric-value alerting policies.

How to create an alerting policy by using the Cloud Monitoring API. For more information, see Create alerting policies by using the API.

How to create an alerting policy whose condition includes a Monitoring Query Language (MQL) query. These policies can use a static or dynamic threshold. For more information, see the following documents:
- Alerting policies with MQL.
- Create dynamic severity levels using MQL.

This feature is supported only for Google Cloud projects. For App Hub configurations, select the App Hub host project or management project.

Before you begin

To get the permissions that you need to create and modify alerting policies by using the Google Cloud console, ask your administrator to grant you the Monitoring Editor (roles/monitoring.editor) IAM role on your project. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

For more information about Cloud Monitoring roles, see Control access with Identity and Access Management.
Ensure that you're familiar with the general concepts of alerting policies. For information about these topics, see Alerting overview.
Configure the notification channels that you want to use to receive any notifications. For redundancy purposes, we recommend that you create multiple types of notification channels. For more information, see Create and manage notification channels.

Create alerting policy

To create an alerting policy that sends notifications when a monitored time series has no data for a specific trigger absence time, do the following:

In the Google Cloud console, go to the Alerting page.

If you use the search bar to find this page, then select the result whose subheading is Monitoring.
In the toolbar of the Google Cloud console, select your Google Cloud project. For App Hub configurations, select the App Hub host project or management project.
Select Create policy.
Select the time series to be monitored:
1. Click Select a metric, navigate through the menus to select a resource type and metric type, and then click Apply.
  
  The Select a metric menu contains features that help you find the metric types available:
  - To find a specific metric type, use the Filter bar. For example, if you enter util, then you restrict the menu to show entries that include util. Entries are shown when they pass a case-insensitive "contains" test.
  You can monitor any built-in metric or any user-defined metric.
2. Optional: To monitor a subset of the time series that match the metric and resource types you selected in the previous step, click Add filter. In the filter dialog, select the label by which to filter, a comparator, and then the filter value. For example, the filter zone =~ ^us.*.a$ uses a regular expression to match all time-series data whose zone name starts with us and ends with a. For more information, see Filter the selected time series.
3. Optional: To change how the points in a time series are aligned, in the Transform data section, set the Rolling window and Rolling window function fields.
  
  If you are monitoring a log-based metric, then we recommend that the Rolling window menu is set to at least 10 minutes.
  
  These fields specify how the points that are recorded in a window are combined. For example, assume that the window is 15 minutes and the window function is max. The aligned point is the maximum value of all points in the most recent 15 minutes. For more information, see Alignment: within-series regularization.
4. Optional: Combine time series when you want to reduce the number of time series monitored by a policy, or when you want to monitor only a collection of time series. For example, instead of monitoring the CPU utilization of each VM instance, you might want to compute the average of the CPU utilization for all VMs in a zone, and then monitor that average. By default, time series aren't combined. For general information, see Reduction: combining time series.
  
  To combine all time series, do the following:
  1. In the Across time series section, click Expand.
  2. Set the Time series aggregation field to a value other than none. For example, to display the average value of the time series, select mean.
  3. Ensure that the Time series group by field is empty.
  To combine, or group, time series by label values, do the following:
  1. In the Across time series section, click Expand.
  2. Set the Time series aggregation field to a value other than none.
  3. In the Time series group by field, select the labels by which to group.
  For example, if you group by the zone label and then set the aggregation field to a value of mean, then the chart displays one time series for each zone for which there is data. The time series shown for a specific zone is the average of all time series with that zone.
  
  Note: To delete a grouping, clear the Time series group by field and set the Time series aggregation field to none.
5. Click Next.
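
The alignment and grouping steps above can be illustrated with a small Python sketch. It assumes a 15-minute rolling window with the max function, followed by a mean aggregation grouped by the zone label. The helper names, VM names, and sample values are hypothetical, and the sketch only illustrates the arithmetic, not how Monitoring stores or processes data.

```python
from collections import defaultdict
from statistics import mean


def align_max(points, window_end, window_minutes):
    """Combine the points recorded in the most recent window into one
    aligned point by taking their maximum. points is a list of
    (minute, value) pairs; returns None when the window has no data."""
    in_window = [
        value
        for minute, value in points
        if window_end - window_minutes < minute <= window_end
    ]
    return max(in_window) if in_window else None


def mean_by_zone(aligned):
    """Group aligned (zone, value) pairs by zone and average each group."""
    groups = defaultdict(list)
    for zone, value in aligned:
        if value is not None:
            groups[zone].append(value)
    return {zone: mean(values) for zone, values in groups.items()}


# Hypothetical CPU utilization samples keyed by (VM name, zone).
series = {
    ("vm-1", "us-central1-a"): [(1, 0.25), (8, 0.5), (14, 0.375)],
    ("vm-2", "us-central1-a"): [(3, 0.75), (12, 0.25)],
    ("vm-3", "europe-north1-a"): [(5, 0.125), (20, 0.875)],
}

aligned = [
    (zone, align_max(points, window_end=15, window_minutes=15))
    for (_, zone), points in series.items()
]
by_zone = mean_by_zone(aligned)
print(by_zone)
```

Three per-VM time series become two grouped time series, one for each zone that has data.
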
Configure the condition trigger:
1. Select Metric absence for the type of condition.
2. Optional: Update the Alert trigger menu, which has the following values:
  - Any time series violates: Default setting. Any time series with absent data for the entire trigger absence time causes the condition to be met.
  - Percent of time series violates: A percentage of time series must have absent data for the entire trigger absence time before the condition is met. For example, you could be notified when 50% of the monitored time series don't have data for the entire trigger absence time.
  - Number of time series violates: A specific number of time series must have absent data for the entire trigger absence time before the condition is met. For example, you could be notified when 32 of the monitored time series don't have data for the entire trigger absence time.
  - All time series violate: All time series must have absent data for the entire trigger absence time before the condition is met.
  For information about the intervals that Monitoring uses to align and measure time series data, see Alignment periods and retest windows.
3. Specify how long metric data must be absent before Monitoring notifies you by using the Trigger absence time field.
4. Click Next.
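
The four Alert trigger options above can be compared with a short Python sketch. The function `condition_met` and its trigger names are hypothetical. Treating the percentage and number thresholds as "at least" comparisons is an assumption made for illustration.

```python
def condition_met(absent_flags, trigger="any", threshold=None):
    """absent_flags holds one bool per monitored time series: True when
    that series had no data for the entire trigger absence time."""
    total = len(absent_flags)
    absent = sum(absent_flags)
    if total == 0:
        return False
    if trigger == "any":
        return absent >= 1
    if trigger == "percent":
        # Assumption: the threshold is inclusive.
        return 100 * absent / total >= threshold
    if trigger == "number":
        # Assumption: the threshold is inclusive.
        return absent >= threshold
    if trigger == "all":
        return absent == total
    raise ValueError(f"unknown trigger: {trigger}")


# Two of four hypothetical time series have absent data.
flags = [True, True, False, False]
results = {
    "any": condition_met(flags, "any"),
    "percent": condition_met(flags, "percent", threshold=50),
    "number": condition_met(flags, "number", threshold=3),
    "all": condition_met(flags, "all"),
}
print(results)
```

With these sample flags, the "any" and 50% triggers are met, while the "number" trigger with a threshold of 3 and the "all" trigger are not.
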
Optional: Create an alerting policy with multiple conditions.

Most policies monitor a single metric type. For example, a policy might monitor the number of bytes written to a VM instance. When you want to monitor multiple metric types, create a policy with multiple conditions. Each condition monitors one metric type. After you create the conditions, you specify how the conditions are combined. For information, see Policies with multiple conditions.

To create an alerting policy with multiple conditions, do the following:
1. For each additional condition, click Add alert condition and then configure that condition.
2. Click Next and configure how conditions are combined.
3. Click Next to advance to the notifications and documentation setup.
Configure the notification and add user labels:
1. Expand the Notifications and name menu and select your notification channels. For redundancy purposes, we recommend that you add multiple types of notification channels to an alerting policy. For more information, see Manage notification channels.
2. Optional: To use a custom subject line in your notification instead of the default, update the Notification subject line field.
3. Optional: To be notified when an incident is closed, select Notify on incident closure. By default, when you create an alerting policy with the Google Cloud console, a notification is sent only when an incident is created.
4. Optional: To change how long Monitoring waits before closing an incident after data stops arriving, select an option from the Incident autoclose duration menu. By default, when data stops arriving, Monitoring waits seven days before closing an open incident.
5. Optional: To associate your alerting policy with an App Hub application, in the Application labels section, select an application and either a service or workload. Incidents and notifications display these labels.
6. Optional: Select an option from the Policy severity level menu. Incidents and notifications display the severity level.
7. Optional: To add custom labels to the alerting policy, in the Policy user labels section, do the following:
  1. Click Add label, and in the Key field enter a name for the label. Label names must start with a lowercase letter, and they can contain lowercase letters, numerals, underscores, and dashes. For example, enter severity.
  2. Click Value and enter a value for your label. Label values can contain lowercase letters, numerals, underscores, and dashes. For example, enter critical.
  For information about how you can use policy labels to help you manage your notifications, see Annotate incidents with labels.
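
The character rules for policy user labels described above can be checked before you type them into the console. The following Python sketch is only an illustration: the function name `is_valid_label` is hypothetical, it covers only the character rules stated above, and it assumes that a label value can't be empty.

```python
import re

# Keys start with a lowercase letter, followed by lowercase letters,
# numerals, underscores, or dashes.
KEY_PATTERN = re.compile(r"[a-z][a-z0-9_-]*")
# Values use the same character set. Requiring at least one character
# is an assumption of this sketch.
VALUE_PATTERN = re.compile(r"[a-z0-9_-]+")


def is_valid_label(key: str, value: str) -> bool:
    """Return True when both the key and the value follow the rules."""
    return bool(KEY_PATTERN.fullmatch(key)) and bool(VALUE_PATTERN.fullmatch(value))


print(is_valid_label("severity", "critical"))
print(is_valid_label("Severity", "critical"))
print(is_valid_label("team", "On Call"))
```

The first label passes. The second fails because the key starts with an uppercase letter, and the third fails because the value contains uppercase letters and a space.
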
Optional: In the Documentation section, enter any content that you want included with the notification.

To format your documentation, you can use plain text, Markdown, and variables. You can also include links to help users debug the incident, such as links to internal playbooks, Google Cloud dashboards, and external pages. For example, the following documentation template describes a CPU utilization incident for a gce_instance resource and includes several variables to reference the alerting policy and condition REST resources. The documentation template then directs readers to external pages to help with debugging.

When notifications are created, Monitoring replaces the documentation variables with their values. The values replace the variables only in notifications. The preview pane and other places in the Google Cloud console show only the Markdown formatting.
```
## CPU utilization exceeded

### Summary

The ${metric.display_name} of the ${resource.type}
${resource.label.instance_id} in the project ${resource.project} has
exceeded 90% for over 15 minutes.

### Additional resource information

Condition resource name: ${condition.name}  
Alerting policy resource name: ${policy.name}  

### Troubleshooting and Debug References

Repository with debug scripts: example.com  
Internal troubleshooting guide: example.com  
${resource.type} dashboard: example.com
```
For more information, see Annotate notifications with user-defined documentation and Using channel controls.
Click Alert name and enter a name for the alerting policy.
Click Create policy.
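
To see how the documentation variables in the template from the procedure above are filled in, you can mimic the substitution locally. The following Python sketch is not how Monitoring renders notifications. The `render` helper and the sample values are hypothetical, and leaving unknown variables unchanged is an assumption of the sketch.

```python
import re


def render(template: str, values: dict) -> str:
    """Replace each ${name} with its value. Variables that have no
    value are left unchanged (an assumption of this sketch)."""

    def replace(match):
        return str(values.get(match.group(1), match.group(0)))

    return re.sub(r"\$\{([^}]+)\}", replace, template)


template = (
    "The ${metric.display_name} of the ${resource.type} "
    "${resource.label.instance_id} in the project ${resource.project} has "
    "exceeded 90% for over 15 minutes."
)

# Hypothetical values for illustration only.
values = {
    "metric.display_name": "CPU utilization",
    "resource.type": "gce_instance",
    "resource.label.instance_id": "1234567890",
    "resource.project": "my-project",
}

rendered = render(template, values)
print(rendered)
```

A helper like this can be useful for proofreading a template before you paste it into the Documentation section, because the preview pane shows only the Markdown formatting.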

Filter the selected time series

Filters ensure that only time series that meet some set of criteria are monitored. When you apply filters, there are fewer time series to evaluate, which can improve the performance of the alert, and you might reduce the number of lines on the chart. You can also reduce the amount of data being monitored by applying aggregation.

A filter is composed of a label, a comparator, and a value. For example, to match all time series whose zone label starts with "us-central1", you could use the filter zone=~"us-central1.*", which uses a regular expression to perform the comparison.

When you filter by the project ID or the resource container, you must use the equals operator (=). When you filter by other labels, you can use any supported comparator. Typically, you can filter by metric and resource labels, and by resource group.

When you supply multiple filtering criteria, only the time series that meet all criteria are monitored.

To add a filter, click Add filter, complete the dialog, and then click Done. In the dialog, you use the Filter field to select the criterion by which to filter, select the comparison operator, and then select or enter the value. The drop-down menu lists only values that appeared during the last week, but you can enter any value. Each row in the following table lists a comparison operator, its meaning, and an example:

| Operator | Meaning | Example |
| --- | --- | --- |
| `=` | Equality | `resource.labels.zone = "us-central1-a"` |
| `!=` | Inequality | `resource.labels.zone != "us-central1-a"` |
| `=~` | Regular expression equality | `resource.labels.zone = monitoring.regex.full_match("^us.*")` |
| `!=~` | Regular expression inequality | `resource.labels.zone != monitoring.regex.full_match("^us.*")` |
| `starts_with` | Value starts with | `resource.labels.zone = starts_with("us")` |
| `ends_with` | Value ends with | `resource.labels.zone = ends_with("b")` |
| `has_substring` | Value contains | `resource.labels.zone = has_substring("east")` |
| `one_of` | Value is one of | `resource.labels.zone = one_of("asia-east1-b", "europe-north1-a")` |
| `!starts_with` | Value doesn't start with | `resource.labels.zone != starts_with("us")` |
| `!ends_with` | Value doesn't end with | `resource.labels.zone != ends_with("b")` |
| `!has_substring` | Value doesn't contain | `resource.labels.zone != has_substring("east")` |
| `!one_of` | Value isn't one of | `resource.labels.zone != one_of("asia-east1-b", "europe-north1-a")` |
