Select and configure metrics

This document describes the fields that you set when you are configuring the condition of an alerting policy. Typically, you create an alerting policy when you want to be notified when time-series data, such as the CPU usage of a virtual machine, satisfies certain conditions. This content does not apply to log-based alerting policies. For information about log-based alerting policies, which notify you when a particular message appears in your logs, see Monitoring your logs.

Select the data to display

To specify the metrics to display when creating an alerting policy, you specify values for a metric and a resource type:

  • The metric field identifies the measurements to be collected from a monitored resource. It includes a description of what is being measured and how the measurements are interpreted. Metric is a short form of metric type. For conceptual information, see Metric types.

  • The resource type field specifies from which resource the metric data is captured. The resource type is sometimes called the monitored resource type or the resource. For conceptual information, see Monitored resources.

Monitoring has many predefined metric types and monitored resources available, and you can create custom metrics as well:

  • For information on predefined metric types, see the Metrics list. Each document lists metrics by the type of service. For example, the Google Cloud metrics page contains a series of tables, one for each Google Cloud service.

  • For information about monitored resource types available, see Monitored resource list.

  • For information on defining your own metrics, see Using custom metrics.

Cloud Monitoring is refreshing the interface to create an alerting policy. Select the tab that corresponds to the interface that you are using.

Legacy interface

To select a metric, use the Find resource type and metric field to choose one resource type and one metric type. You can specify them in either order. To begin, click in the field. This brings up one or two lists, based on any prior selections. The lists are indicated by headers, Resource types and Metrics, as seen in the following screenshot:

Search lists for selecting metrics and resources.

You can select an entry in two ways:

  • By selecting entries from the lists.

  • By entering a Monitoring filter. A Monitoring filter is an expression that Monitoring uses to identify the time series to be monitored. The following Monitoring filter results in the chart displaying the count of log entries for all Google Cloud virtual machine instances in the us-east1-b zone:

    metric.type="logging.googleapis.com/log_entry_count"
    resource.type="gce_instance" resource.label."zone"="us-east1-b"
    

    To enter a Monitoring filter, do the following:

    1. Next to Find resource type and metric, click Help.
    2. Click Direct filter mode in the help pane.

      When Direct filter mode is enabled, the Find resource type and metric option is replaced with an editable text box labeled Resource type, metric, and filter:

      Direct filter mode is displayed.

      If you made selections for a resource type, a metric, or a filter prior to selecting Direct filter mode, then those settings are used to prepopulate the Resource type, metric, and filter text box.

    3. Enter a Monitoring filter in the Resource type, metric, and filter text box. Your filter must include a metric type and a resource type. You can also include label filters. For the filter grammar, see Monitoring filters.

      For example, to display the log entries for all Google Cloud VM instances in the us-east1-b zone, enter the following:

      metric.type="logging.googleapis.com/log_entry_count"
      resource.type="gce_instance" resource.label."zone"="us-east1-b"
      

      If you've used direct filter mode to configure charts or alerting policies and no data is available, then the chart displays an error message. The exact error message depends on the filter you entered. For example, a typical message is Chart definition invalid. You might also see the message No data is available for the selected time frame.
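
The filter entered in step 3 is a sequence of space-separated comparisons, so it can be assembled programmatically. The following is a minimal sketch, not part of any Google Cloud library: the helper name build_monitoring_filter and its parameters are hypothetical, and the output simply reproduces the example filter shown above.

```python
def build_monitoring_filter(metric_type, resource_type, resource_labels=None):
    """Assemble a Monitoring filter string (hypothetical helper for illustration)."""
    # A filter must name a metric type and a resource type.
    parts = [
        f'metric.type="{metric_type}"',
        f'resource.type="{resource_type}"',
    ]
    # Optional label filters narrow the selection further.
    for label, value in (resource_labels or {}).items():
        parts.append(f'resource.label."{label}"="{value}"')
    return " ".join(parts)


log_count_filter = build_monitoring_filter(
    "logging.googleapis.com/log_entry_count",
    "gce_instance",
    {"zone": "us-east1-b"},
)
print(log_count_filter)
```

The printed string can be pasted into the Resource type, metric, and filter text box.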

Hovering over an item on either list brings up a tooltip that displays the information in the item's descriptor. For information on descriptors for metric types or monitored resources, see the metrics list or the monitored resource list.

When at least one resource type and metric pair is selected, the chart shows all the available time series, and additional items appear below the specified metric on the Metric tab. The following screenshot shows the Metric tab after a metric has been specified:

Display additional selection options.

Preview interface

To configure the condition of an alerting policy, you can use the Cloud Monitoring API or the Google Cloud Console. If you choose to use the Cloud Console, then select how to specify the time series to be monitored:

  • Basic mode

    Use basic mode when you want to configure a condition that monitors a metric for a specific resource and you don't want to use MQL. By default, the menus only list metrics for which data has been received. A toggle is provided to let you list all Google Cloud metrics.

    After you select the resource and metric, the next step is to specify the filters.

  • MQL mode

    Use MQL mode when you want to use MQL to describe the condition or if you want to monitor a ratio of metrics.

    For information about how to use MQL, see Using the Monitoring Query Language.

    The next step is to configure the condition trigger.

  • Direct filter mode

    Use direct filter mode when you are interested in any of the following:

    • Monitoring service level objectives (SLO).
    • Configuring an alert for custom metrics for which you don't yet have data.
    • Monitoring the count of processes running on virtual machines (VMs).
    • Verifying syntax for a filter statement to be included in an API command.

    When you use direct filter mode, you select the time series by entering a Monitoring filter. For example, the following Monitoring filter results in the chart displaying a count of processes whose name includes nginx:

    select_process_count("monitoring.regex.full_match(\".*nginx.*\")")
    resource.type="gce_instance"
    

    The next filter selects the Disk write bytes time series for Compute Engine VMs that are located in the us-central1-a zone:

    metric.type="compute.googleapis.com/instance/disk/write_bytes_count"
    resource.type="gce_instance"
    resource.label."zone"="us-central1-a"
    

    For information about syntax, see Monitoring filters.

    After you specify a Monitoring filter, the next step is to specify the data transformation options.
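
A filter that has been checked in direct filter mode can then be reused in an API request. The sketch below shows roughly where the disk-write filter would sit in the JSON body of an alerting policy condition. The field names (conditionThreshold, filter, comparison, thresholdValue, duration) are the ones I understand the Cloud Monitoring API to use, so confirm them against the API reference. The display name, threshold, and duration are made-up values.

```python
import json

# The filter from the example above, joined into a single string.
write_bytes_filter = " ".join([
    'metric.type="compute.googleapis.com/instance/disk/write_bytes_count"',
    'resource.type="gce_instance"',
    'resource.label."zone"="us-central1-a"',
])

# Assumed shape of a threshold condition; the numeric values are hypothetical.
condition = {
    "displayName": "Disk write bytes in us-central1-a (example)",
    "conditionThreshold": {
        "filter": write_bytes_filter,
        "comparison": "COMPARISON_GT",
        "thresholdValue": 1000000,  # hypothetical threshold
        "duration": "300s",         # hypothetical duration
    },
}

print(json.dumps(condition, indent=2))
```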

The remainder of this page uses the terminology used by the menu-driven interface of the Cloud Console. However, the conceptual information is applicable to all approaches that you can use to create an alerting policy.

Filter the selected data

Legacy interface

You can reduce the amount of data being monitored by specifying filter criteria or by applying aggregation. Filters ensure that only time series that meet some set of criteria are used. If you apply filters, then there are fewer time series to evaluate and that can improve the performance of the alert.

If you supply multiple filtering criteria, then the corresponding chart shows only the time series that meet all criteria, a logical AND.

In the Google Cloud Console, to add a filter, click the Filter field. This opens a panel containing lists of criteria by which you can filter. For example, you can filter by resource group, by name, by resource label, by zone, and by metric label.

The following screenshot shows the known filter-by labels for a specific metric:

Lists of pre-populated filter labels.

You can select from the lists or type to find matches. Additionally, you can create filters for data that has not yet appeared; such filter criteria won't appear on the selection list, but you can manually specify filters that you know will be valid in the future.

After you choose a label on which to filter, you have to specify the rest of the filter: a value or range of values and a comparison.

For example, the following screenshot shows a filter on the zone resource label. The Filter field supports a pair of comparison operators for equality, = and =~, and a pair for inequality, != and !=~. The second operator in each pair takes a regular expression as its value. Simple equality, =, is the default.

List of filter comparators.

Below the list of comparison operators is a list of the available values. The following screenshot shows the names of zones in the project:

Example of some pre-populated filter values.

For the Value field, you can select one of the items on the drop-down list, or you can enter an expression that matches multiple items:

  • If you use a direct comparison, = or !=, you can create a filter string like starts_with. For example, the filter string starts_with("us-central") matches any us-central zone:

    Example of using a filter string.

    See Monitoring filters for more information on filter strings.

  • If you select =~ or !=~, then enter a RE2 regular expression as the value. For example, the regular expression us-central1-.* matches any us-central1 zones:

    Example of filtering with regexps.

    The regular expression ^us.*.a$ matches any US zone that ends with “a”:

    Example of filtering for zones by using a regexp.

You can specify multiple filter criteria, and you can use the same label multiple times. This lets you specify a filter for a range of values. To add additional filters, click Add a filter near the bottom of the filter field. Currently, all filter criteria must be met; they constitute a logical AND. For example, you can use both starts_with and ends_with filter strings to show only “a” zones in the US:

Example using multiple filters.

With a zone=starts_with("asia-east1") or zone=~"asia-east1.*" filter in place, only the time series with data from one of the asia-east1 zones are displayed:

Displaying a filtered time series.
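
The zone examples in this section can be checked offline with a short script. This is only an approximation: it uses Python's re module rather than RE2, it assumes that =~ must match the whole value, and the zone list is an illustrative sample rather than the zones of any real project.

```python
import re

# Illustrative sample of zone names (an assumption, not real project data).
zones = [
    "us-central1-a", "us-central1-b", "us-east1-a",
    "asia-east1-a", "asia-east1-b", "europe-north1-a",
]


def regex_equal(pattern, values):
    """Emulate =~ by keeping values that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [value for value in values if compiled.fullmatch(value)]


def starts_with(prefix, values):
    """Emulate the starts_with filter string."""
    return [value for value in values if value.startswith(prefix)]


central = regex_equal(r"us-central1-.*", zones)
us_a = regex_equal(r"^us.*.a$", zones)
# Two criteria on the same label, combined as a logical AND.
us_a_and = [zone for zone in starts_with("us", zones) if zone.endswith("a")]
asia_prefix = starts_with("asia-east1", zones)
asia_regex = regex_equal(r"asia-east1.*", zones)

print(central, us_a, us_a_and, asia_prefix, asia_regex, sep="\n")
```

For this sample, the regular expression ^us.*.a$ and the starts_with plus ends_with pair select the same zones, and the two asia-east1 filters also agree.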

Preview interface

You can reduce the amount of data being monitored by specifying filter criteria or by applying aggregation. Filters ensure that only time series that meet some set of criteria are used. If you apply filters, then there are fewer time series to evaluate and that can improve the performance of the alert.

If you supply multiple filtering criteria, then the corresponding chart shows only the time series that meet all criteria, a logical AND.

To add a filter, click Add filter, complete the dialog, and then click Done. In the dialog, you use the Filter field to select the criterion by which to filter. For example, you can filter by resource group, by name, by resource label, by zone, and by metric label. After you select the filter criterion, complete the filter by selecting the comparison operator and the value. Each row in the following table lists a comparison operator, its meaning, and an example:

Operator        Meaning                              Example
=               Equality                             resource.labels.zone = "us-central1-a"
!=              Inequality                           resource.labels.zone != "us-central1-a"
=~              Regular expression (RE2) equality    monitoring.regex.full_match("^us.*")
!=~             Regular expression (RE2) inequality  monitoring.regex.full_match("^us.*")
starts_with     Value starts with                    resource.labels.zone = starts_with("us")
ends_with       Value ends with                      resource.labels.zone = ends_with("b")
has_substring   Value contains                       resource.labels.zone = has_substring("east")
one_of          One of                               resource.labels.zone = one_of("asia-east1-b", "europe-north1-a")
!starts_with    Value doesn't start with             resource.labels.zone != starts_with("us")
!ends_with      Value doesn't end with               resource.labels.zone != ends_with("b")
!has_substring  Value doesn't contain                resource.labels.zone != has_substring("east")
!one_of         Value isn't one of                   resource.labels.zone != one_of("asia-east1-b", "europe-north1-a")
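
To make the operator semantics concrete, the following sketch evaluates filter criteria against a label dictionary and combines them with a logical AND. It is a local approximation rather than the service's implementation. The OPERATORS table and matches function are hypothetical names, regular expressions use Python's re module instead of RE2, and the sample zones are illustrative.

```python
import re

# Hypothetical emulation of the comparison operators in the table above.
OPERATORS = {
    "=": lambda actual, expected: actual == expected,
    "!=": lambda actual, expected: actual != expected,
    "=~": lambda actual, pattern: re.fullmatch(pattern, actual) is not None,
    "!=~": lambda actual, pattern: re.fullmatch(pattern, actual) is None,
    "starts_with": lambda actual, prefix: actual.startswith(prefix),
    "ends_with": lambda actual, suffix: actual.endswith(suffix),
    "has_substring": lambda actual, part: part in actual,
    "one_of": lambda actual, options: actual in options,
}


def matches(labels, criteria):
    """Return True only if every (label, operator, value) criterion holds."""
    for label, operator, value in criteria:
        # "!starts_with" and similar names negate the named operator.
        negate = operator.startswith("!") and operator not in OPERATORS
        name = operator[1:] if negate else operator
        result = OPERATORS[name](labels[label], value)
        if result == negate:
            return False
    return True


series = [
    {"zone": "us-central1-a"}, {"zone": "us-east1-b"},
    {"zone": "asia-east1-b"}, {"zone": "europe-north1-a"},
]

us_not_b = [s["zone"] for s in series
            if matches(s, [("zone", "starts_with", "us"), ("zone", "!ends_with", "b")])]
listed = [s["zone"] for s in series
          if matches(s, [("zone", "one_of", ("asia-east1-b", "europe-north1-a"))])]
not_us = [s["zone"] for s in series if matches(s, [("zone", "!=~", "^us.*")])]

print(us_not_b, listed, not_us, sep="\n")
```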

Transform data

After the time series are selected, the next steps are to specify how each time series is processed, also known as alignment, and how the aligned time series are combined. If you use the Legacy interface or use the Cloud Monitoring API, you use the aggregation fields to specify how time series are transformed.

The remainder of this page briefly describes these options. For a detailed explanation, see Manipulating time series.
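
For readers who use the Cloud Monitoring API, the sketch below shows how the alignment and combining options described on this page might appear in an aggregation object. The field names and enum values (alignmentPeriod, perSeriesAligner, crossSeriesReducer, groupByFields, ALIGN_MEAN, REDUCE_SUM) are the ones I understand the API to use, so verify them against the Aligner and Reducer entries in the API reference. The specific values are arbitrary examples.

```python
import json

# Assumed shape of an aggregation object; the values are illustrative only.
aggregation = {
    "alignmentPeriod": "300s",                 # five-minute look-back interval
    "perSeriesAligner": "ALIGN_MEAN",          # function applied within each time series
    "crossSeriesReducer": "REDUCE_SUM",        # function applied across time series
    "groupByFields": ["resource.label.zone"],  # labels that are kept when combining
}

print(json.dumps(aggregation, indent=2))
```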

Align time series

Alignment is the process of converting a time series received by Monitoring into a new time series with data points spaced by a fixed length of time. The process of alignment consists of the following steps:

  1. Dividing a time series into a set of fixed-length intervals.
  2. Collecting all data points received in each interval and applying a function to combine those data points together. For example, you can select this function to compute the average of all samples.
  3. Associating a timestamp with the value computed in the previous step, and then adding the pair to the aligned time series.

For a general discussion of alignment, see Alignment: within-series regularization.
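
The three alignment steps can be sketched in a few lines. This is an illustration under stated assumptions, not how Monitoring implements alignment: timestamps are plain seconds, each interval is labeled with its end time, and a point that falls exactly on a boundary is assigned to the interval that starts there.

```python
from statistics import mean


def align(points, period_s, aligner=mean):
    """Align (timestamp_seconds, value) points onto fixed-length intervals."""
    buckets = {}
    for timestamp, value in points:
        # Step 1: find the fixed-length interval that contains the point.
        interval_end = (timestamp // period_s + 1) * period_s
        buckets.setdefault(interval_end, []).append(value)
    # Steps 2 and 3: combine each interval's points and pair the result
    # with the interval's timestamp.
    return [(end, aligner(values)) for end, values in sorted(buckets.items())]


raw = [(5, 1.0), (20, 3.0), (65, 4.0), (110, 6.0), (130, 10.0)]
aligned = align(raw, period_s=60)
print(aligned)  # [(60, 2.0), (120, 5.0), (180, 10.0)]
```

Passing a different function, such as max or sum, as the aligner changes how the points in each interval are combined.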

Legacy interface

When you create a condition on an alerting policy, you must specify the alignment parameters. If you use the Google Cloud Console, then default values for these parameters are provided:

  • Period: The period is a look-back interval from a particular point in time. For example, if the period is five minutes, then at 1:00 PM, the samples received between 12:55 PM and 1:00 PM are to be aligned. At 1:01 PM, the samples received between 12:56 PM and 1:01 PM are to be aligned. In the context of alerting policies, the alignment period can be viewed as a sliding window that looks to the past. For a more involved discussion about this field, see The alignment period and the duration.

    To view the remaining aggregation options, click Show advanced options.

  • Aligner: The aligner field specifies the function used to combine all the data points in an alignment period. For more information on the available aligners, see Aligner in the API reference. Some aligners both align the data and convert it from one metric kind or type to another. For a detailed explanation, see Kinds, types, and conversions.

Preview interface

When you create a condition on an alerting policy, you must specify the alignment parameters. If you use the Google Cloud Console, then default values for these parameters are provided:

  • Rolling window: This field is a look-back interval from a particular point in time. For example, if this value is five minutes, then at 1:00 PM, the samples received between 12:55 PM and 1:00 PM are to be aligned. At 1:01 PM, the samples received between 12:56 PM and 1:01 PM are to be aligned. In the context of alerting policies, the alignment period can be viewed as a sliding window that looks to the past. For a more involved discussion about this field, see The alignment period and the duration.

  • Rolling window function: The field specifies the function used to combine all the data points in the look-back interval. In the Cloud Monitoring API, this field is called an aligner. For more information on the available functions, see Aligner in the API reference. Some of the aligner functions both align the data and convert it from one metric kind or type to another. For a detailed explanation, see Kinds, types, and conversions.
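
The look-back behavior in the 1:00 PM and 1:01 PM example can be sketched as follows. The function name rolling_window_value, the sample data, and the date are hypothetical. Whether the window boundaries are inclusive or exclusive is an assumption here, with the start excluded and the evaluation time included.

```python
from datetime import datetime, timedelta
from statistics import mean


def rolling_window_value(samples, evaluation_time, window, function=mean):
    """Combine the samples received in the look-back window before evaluation_time."""
    start = evaluation_time - window
    in_window = [value for ts, value in samples if start < ts <= evaluation_time]
    return function(in_window) if in_window else None


day = datetime(2024, 1, 1)  # arbitrary date for the example
samples = [
    (day.replace(hour=12, minute=54, second=30), 1.0),
    (day.replace(hour=12, minute=55, second=30), 2.0),
    (day.replace(hour=12, minute=58), 4.0),
    (day.replace(hour=13, minute=0, second=30), 8.0),
]
five_minutes = timedelta(minutes=5)

# At 1:00 PM the window covers 12:55 PM to 1:00 PM.
at_1300 = rolling_window_value(samples, day.replace(hour=13), five_minutes)
# At 1:01 PM the window has slid forward to 12:56 PM through 1:01 PM.
at_1301 = rolling_window_value(samples, day.replace(hour=13, minute=1), five_minutes)
print(at_1300, at_1301)  # 3.0 6.0
```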

Combine time series

You can reduce the amount of data returned for a metric by combining different time series. To combine multiple time series, you typically specify a grouping and a function. Grouping is done by label values. The function defines how all time-series data within a group are combined into a new time series.
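
As a rough illustration of grouping plus a combining function, the sketch below groups already-aligned time series by label value and sums them point by point. The combine function, the label names, and the data are hypothetical, and the sketch assumes that every series has values at the same timestamps.

```python
from collections import defaultdict


def combine(series_list, group_by=(), aggregator=sum):
    """Combine aligned time series that share the same values for the group_by labels."""
    groups = defaultdict(list)
    for series in series_list:
        key = tuple(series["labels"].get(label) for label in group_by)
        groups[key].append(series["points"])
    # Apply the aggregator across the group's members at each timestamp.
    return {
        key: [aggregator(values) for values in zip(*members)]
        for key, members in groups.items()
    }


series_list = [
    {"labels": {"version": "v1", "zone": "us-central1-a"}, "points": [1, 2, 3]},
    {"labels": {"version": "v1", "zone": "us-east1-b"}, "points": [10, 20, 30]},
    {"labels": {"version": "v2", "zone": "us-central1-a"}, "points": [100, 200, 300]},
]

by_version = combine(series_list, group_by=("version",))
everything = combine(series_list)  # no grouping: a single combined time series
print(by_version)   # {('v1',): [11, 22, 33], ('v2',): [100, 200, 300]}
print(everything)   # {(): [111, 222, 333]}
```

With no grouping labels, every series falls into the same group, so the function yields a single combined time series.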

Legacy interface

To add a grouping, click the text in the Group by text box, and then make a selection from the menu. The menu is constructed dynamically based on the time-series data for the resource and metric you selected. Grouping and filtering use the same set of labels.

When you add the first label, the following occurs:

  • An Aggregator is selected. The type of data being displayed determines the default aggregator; however, you can change this function.
  • The aggregator determines how the time series that have the same label value are combined into a single time series.
  • The chart displays one time series for each value of the label listed in the Group by text box.

If you group by multiple labels, then the aggregator combines those time series that have the same value for the specified labels.

If you don't specify a grouping option and do specify an aggregator, then that function is applied to all selected time series and results in a single time series.

The following screenshot shows a grouping by user_labels.version with the aggregator set to the default value of sum:

Example of grouping setting.

This selection results in one time series for each value of the label user_labels.version. The data points in each time series are computed from the sum of all the values for individual time series for a specific version:

Showing time series grouped by user_labels.version.

Preview interface

To access the options to combine time series, click Show more in the Across time series section.

To combine time series by label value, click the text Time series group by and make a selection from the menu. The menu is constructed dynamically based on the time series you selected.

When you add the first label, the following occurs:

  • An error is displayed because the Time series aggregation field is set to none. To resolve the error, select a function that is used to combine the time series with the same label value.

  • The chart displays one time series for each value of the label listed in the Time series group by field.

If you don't specify a grouping option and do specify an aggregation function, then that function is applied to the selected time series and results in a single time series.

You can group by multiple labels. When you have multiple grouping options, the aggregator is applied to the set of time series that have the same values for the selected labels.

The resulting chart displays one time series for each combination of label values. The order in which you specify the labels doesn't matter.

For example, the following screenshot illustrates grouping by user_labels.version and system_labels.machine_image:

Showing time series grouped by version and machine image.

As illustrated, if you group by both the labels, you get one time series for each pair of values. The fact that you get a time series for each combination of labels means that this technique can easily create more data than you can usefully put on a single chart.
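
The growth in the number of charted time series can be estimated by counting the distinct combinations of label values. In this sketch the group_keys helper and the label values (v1, v2, image-a, image-b) are made up for illustration.

```python
def group_keys(series_labels, group_by):
    """Return the distinct combinations of values for the group_by labels."""
    return {tuple(labels[name] for name in group_by) for labels in series_labels}


VERSION = "user_labels.version"
IMAGE = "system_labels.machine_image"

# Hypothetical label sets, one per selected time series.
series_labels = [
    {VERSION: "v1", IMAGE: "image-a"},
    {VERSION: "v1", IMAGE: "image-b"},
    {VERSION: "v2", IMAGE: "image-a"},
    {VERSION: "v2", IMAGE: "image-a"},
]

by_version = group_keys(series_labels, [VERSION])
by_both = group_keys(series_labels, [VERSION, IMAGE])
by_both_reversed = group_keys(series_labels, [IMAGE, VERSION])

print(len(by_version), len(by_both), len(by_both_reversed))  # 2 3 3
```

Reversing the label order changes only the order of values inside each key, not the number of resulting time series.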

When you specify grouping or if you select an aggregator, the charted time series only contains required labels, such as the project identifier, and the labels specified by the grouping.

To remove a group-by condition, you must:

  1. Delete the group-by labels.
  2. Set the aggregator to none.

Secondary aggregation

Legacy interface

When you have multiple time series that already represent aggregations, you can reduce all the time series on the chart to a single time series by choosing a Secondary Aggregator. For example, if you group data by zone, your chart shows one time series for each zone. To create a chart with a single time series, use the secondary aggregation fields.

Preview interface

If you have multiple time series displayed after the Primary data transform and if you want the alerting policy to monitor a single time series, then use the Secondary data transform fields.
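
A second aggregation step can be pictured as applying another function across the grouped time series. The sketch below is illustrative only: it assumes the per-zone series are already aligned to the same timestamps, and the zone names, values, and helper name are hypothetical.

```python
def secondary_aggregate(grouped, aggregator=max):
    """Reduce several grouped, aligned time series to one time series."""
    return [aggregator(values) for values in zip(*grouped.values())]


# Result of a first aggregation: one time series per zone (made-up values).
per_zone = {
    "us-central1-a": [3, 5, 4],
    "us-east1-b": [7, 2, 6],
}

highest = secondary_aggregate(per_zone, aggregator=max)
total = secondary_aggregate(per_zone, aggregator=sum)
print(highest)  # [7, 5, 6]
print(total)    # [10, 7, 10]
```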