Specifying conditions for alerting policies

This page describes how to specify conditions for alerting policies.

The conditions for an alerting policy define what is monitored and when to trigger an alert. For example, suppose you want to define an alerting policy that emails you if the CPU utilization of a Compute Engine VM instance is above 80% for more than 3 minutes. You use the conditions dialog to specify that you want to monitor the CPU utilization of a Compute Engine VM instance, and that you want an alerting policy to trigger when that utilization is above 80% for 3 minutes.

Before you begin

To open the Conditions pane for a new alerting policy, do the following:

  1. In the Cloud Console, select Monitoring.

  2. Select Alerting.

  3. Click Create policy.

  4. Click Add Condition in the Create new alerting policy window.

Title

Each condition must contain a title. As you complete the fields in the conditions dialog, the title field is automatically populated. You can change the auto-populated content to something more meaningful.

Type of Condition

The conditions dialog lets you select the type of condition that you are adding. While all conditions include a configuration that defines when an alert occurs, each type of condition has unique fields:

  • A metric condition is defined by a resource type and a metric.
  • An uptime-check condition is defined by a resource type and an uptime check.
  • A process-health condition is defined by a resource type and a series of filters.

You can also create conditions using the text-based Monitoring Query Language (MQL). For information on using MQL to create conditions, see Creating MQL alerting policies.

Select the type of condition to add to the alerting policy.

Target

After you select the type of condition, you use the fields in the Target pane to define values for the condition's fields. For example, if you select a metric condition, the Target pane includes a list of resource types and metrics.

When you select a target for any type of alerting policy, you are selecting a set of time series that must stay within some constraint. These time series are plotted on the chart for the condition. For more information on time series, see Metrics, time series, and resources.

Adding a metric target

A metric target is defined by a resource type and a metric. For example, you might select Compute Engine VM Instance and CPU load (15m) as the resource type and metric, respectively. To add a metric condition, do the following:

  1. Ensure the Metric tab is selected.

  2. Click the Find resource type and metric field to bring up a drop-down list of available resource types and metrics.

  3. Either enter text into the Find resource type and metric field, or select the resource type that you want to monitor from the menu.

  4. To choose a metric, scroll through the menu and make a selection. Alternatively, filter the menu options by entering a partial service name or metric name. For more information, see Selecting metrics.

After you select the resource type and metric, the page expands to display a chart and to provide fine-grained control for your alerting condition. See Configuring a target metric for details on the new options.

You can't create a condition based on the ratio of two metrics through the UI, but you can create such policies using the API. See Metric ratio for a sample policy.
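
As a rough illustration of what a ratio condition might look like when expressed for the API, the following Python sketch builds one as a dictionary. The field names (conditionThreshold, denominatorFilter, denominatorAggregations, and so on) follow the Cloud Monitoring API's AlertPolicy resource as understood here and should be checked against the API reference. The two metric types, the threshold, and the periods are hypothetical placeholders, and the aligner choice assumes both metrics are counters.

```python
import json

# Hypothetical metric types; substitute the metrics you actually want to compare.
NUMERATOR = 'metric.type = "custom.googleapis.com/example/error_count" AND resource.type = "gce_instance"'
DENOMINATOR = 'metric.type = "custom.googleapis.com/example/request_count" AND resource.type = "gce_instance"'

# Assumption: both metrics are counters, so ALIGN_DELTA and REDUCE_SUM make sense.
aggregation = {
    "alignmentPeriod": "300s",
    "perSeriesAligner": "ALIGN_DELTA",
    "crossSeriesReducer": "REDUCE_SUM",
}

ratio_condition = {
    "displayName": "Error ratio above 10% (illustrative)",
    "conditionThreshold": {
        "filter": NUMERATOR,
        "aggregations": [dict(aggregation)],
        "denominatorFilter": DENOMINATOR,
        "denominatorAggregations": [dict(aggregation)],
        "comparison": "COMPARISON_GT",
        "thresholdValue": 0.1,   # numerator / denominator
        "duration": "300s",
        "trigger": {"count": 1},
    },
}

print(json.dumps(ratio_condition, indent=2))
```

The sketch only constructs the request body; sending it to the API, and the rest of the policy (notification channels, combiner), are out of scope here.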

Adding an uptime-check target

To create an alerting policy for an uptime check, open the Uptime details pane for that uptime check and click Add alert policy. For details, see Alerting on uptime checks.

Adding a process-health target

A process-health target is defined by a resource type and a series of filters. You can configure this policy to create an alert if the number of processes that match a specific pattern rises above, or falls below, a threshold for a specified duration. To add a process-health condition, do the following:

  1. Ensure the Process health tab is selected.
  2. In the Resource Type fields, complete the following steps:

    • From the drop-down list, select a single resource, a group of resources, or all resources.
    • From the drop-down list, select the resource type you want to monitor. For example, you might select GCE VM Instance. The UI provides the list of available resource types for your system.
  3. For the Command Line, Command, and User filters, select the fields to identify the processes that you want to monitor. In these filters, you can select the string-match operator and specify the query.

    • The string-match operators are: Equals, Contains, Starts with, Ends with, and Regex. The operations are case sensitive.
    • The syntax of the query depends on the operation choice. You can use wildcard operators in queries. For example, the wildcard * matches any process.

    The results of the three filters are combined using the following rules:

    • If you don't specify the query value for any of the filters, then all processes are counted.

    • If you enter a query for one filter, only processes that match the filter are counted.

    • If you enter command-line and command queries, processes that match either filter are counted. Note that command lines are truncated after 1024 characters, so text in a command line beyond that limit can't be matched.

    • If you enter a user query, processes that match the user filter and the command-line-or-command filter are counted.

Example

As an example, to count the processes with nginx in their name that are owned by root on all Compute Engine VM instances in a project, configure the Target pane as follows:

  • In the Resource type menu, select All, and for the other menu, select Compute Engine VM Instance.
  • In the Command Line menu, select Contains, and for the field, enter nginx.
  • Leave the Command field empty.
  • In the User menu, select Equals, and for the field, enter root.
  • Click Apply.

Figure: Target pane configured to count nginx processes owned by root.

In the preceding figure, the graph shows an alerting threshold of one process and data for two instances. Neither instance is running enough processes to trigger an alerting policy.
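
The way the three filters combine, and the nginx example above, can be modeled locally with a short Python sketch. This is not how Monitoring implements the filters; it is a hypothetical model of the stated rules. The names Process, OPERATORS, and is_counted are made up for illustration, Regex is assumed to match anywhere in the text, and the wildcard syntax mentioned earlier is not modeled.

```python
import re
from dataclasses import dataclass

CMDLINE_LIMIT = 1024  # command lines are truncated after this many characters

# Hypothetical model of the string-match operators; all are case sensitive.
OPERATORS = {
    "Equals": lambda text, query: text == query,
    "Contains": lambda text, query: query in text,
    "Starts with": lambda text, query: text.startswith(query),
    "Ends with": lambda text, query: text.endswith(query),
    # Assumption: a regex may match anywhere in the text.
    "Regex": lambda text, query: re.search(query, text) is not None,
}


@dataclass
class Process:
    command_line: str
    command: str
    user: str


def matches(text, flt):
    """Apply one (operator, query) filter to a string."""
    operator, query = flt
    return OPERATORS[operator](text, query)


def is_counted(proc, command_line=None, command=None, user=None):
    """Return True if the process is counted. Each filter is (operator, query) or None."""
    checks = []
    if command_line is not None:
        checks.append(matches(proc.command_line[:CMDLINE_LIMIT], command_line))
    if command is not None:
        checks.append(matches(proc.command, command))
    # Command-line and command filters are OR-ed; with neither set, everything passes.
    command_ok = any(checks) if checks else True
    # The user filter, when set, is AND-ed with the command-line-or-command result.
    user_ok = matches(proc.user, user) if user is not None else True
    return command_ok and user_ok


# Sample processes (invented data) and the nginx/root configuration from the example.
procs = [
    Process("nginx: master process /usr/sbin/nginx", "nginx", "root"),
    Process("nginx: worker process", "nginx", "www-data"),
    Process("/usr/sbin/sshd -D", "sshd", "root"),
]
count = sum(
    is_counted(p, command_line=("Contains", "nginx"), user=("Equals", "root"))
    for p in procs
)
print(count)  # 1: only the master process matches both filters
```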

Configuration

After specifying the target, you use the Configuration region to define when the alerting policy triggers. The Configuration region defines which time series can cause the alerting policy to trigger and what it means for a time series to violate the condition.

The Condition triggers if menu lets you select the subset of the targets that must violate the condition:

  • Any time series violates
  • Percent of time series violates
  • Number of time series violates
  • All time series violate
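
One way to read these four options is as rules over how many of the targeted time series are currently violating the condition. The following Python sketch illustrates that reading. The function name and mode strings are hypothetical, and the "at least" interpretation of the percent and number options is an assumption rather than something this page states.

```python
def condition_triggers(violations, mode, value=None):
    """Decide whether a condition triggers.

    violations: list of booleans, one per time series (True = violating).
    mode: "any", "percent", "number", or "all" (hypothetical names).
    value: the percent or count for the "percent" and "number" modes.
    """
    total = len(violations)
    violating = sum(violations)
    if total == 0:
        return False  # assumption: nothing to evaluate, nothing triggers
    if mode == "any":
        return violating >= 1
    if mode == "all":
        return violating == total
    if mode == "number":
        return violating >= value  # assumption: "at least N" series
    if mode == "percent":
        return 100.0 * violating / total >= value  # assumption: "at least P percent"
    raise ValueError(f"unknown mode: {mode}")


series = [True, False, False, False]  # one of four time series is violating
print(condition_triggers(series, "any"))          # True
print(condition_triggers(series, "percent", 50))  # False (25% is below 50%)
print(condition_triggers(series, "number", 1))    # True
print(condition_triggers(series, "all"))          # False
```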

The Condition menu defines the comparator:

  • Is above
  • Is below
  • Increases by
  • Decreases by
  • Is absent

For example, to configure an alerting policy to trigger if any time series is above 50 for 3 minutes, do the following:

  • In the Condition triggers if menu, select Any time series violates.
  • In the Condition menu, select Is above.
  • In the Threshold field, enter 50.
  • In the For menu, select 3 minutes.

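
For readers who manage policies through the Cloud Monitoring API rather than the console, the example above corresponds roughly to a threshold condition. The following Python sketch builds such a condition as a dictionary. The field names follow the API's AlertPolicy resource as understood here and should be verified against the API reference. The metric type is a hypothetical placeholder, because the example doesn't name a metric.

```python
import json

# Hypothetical placeholder metric; the example above doesn't specify one.
FILTER = (
    'metric.type = "custom.googleapis.com/example_metric" '
    'AND resource.type = "gce_instance"'
)

condition = {
    "displayName": "Any time series above 50 for 3 minutes",
    "conditionThreshold": {
        "filter": FILTER,
        "comparison": "COMPARISON_GT",  # Condition: Is above
        "thresholdValue": 50,           # Threshold: 50
        "duration": "180s",             # For: 3 minutes
        "trigger": {"count": 1},        # Any time series violates
    },
}

print(json.dumps(condition, indent=2))
```

Note that some metrics report fractions rather than percentages, so a console threshold and an API threshold for the same metric might use different scales; check the metric's unit before copying a value.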

Finish defining the condition

To complete the definition of your condition and to return to the alerting policy dialog, click Add.