Specifying conditions for alerting policies

Using the Google Cloud Console

This page describes how to specify conditions for alerting policies.

The conditions for an alerting policy define what is monitored and when to trigger an alert. For example, suppose you want to define an alerting policy that emails you if the CPU utilization of a Compute Engine VM instance is above 80% for more than 3 minutes. You use the conditions dialog to specify that you want to monitor the CPU utilization of a Compute Engine VM instance, and that you want an alerting policy to trigger when that utilization is above 80% for 3 minutes.
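For readers who later manage policies through the Cloud Monitoring API instead of the console, the example above can be sketched as a plain Python dictionary shaped like an API alerting policy. This is an illustration only. The display names are hypothetical, and the field names and metric type reflect our understanding of the API's alerting-policy resource, not anything stated on this page. Email delivery is configured through notification channels, which the sketch omits.

```python
import json

# Hypothetical sketch of the CPU example as an API-style alerting policy.
# Field names are assumptions based on the AlertPolicy resource; verify them
# against the API reference before use.
cpu_policy = {
    "displayName": "High CPU utilization (example)",  # hypothetical name
    "combiner": "OR",
    "conditions": [
        {
            "displayName": "VM CPU utilization above 80% for 3 minutes",
            "conditionThreshold": {
                # What is monitored: CPU utilization of Compute Engine VMs.
                "filter": (
                    'metric.type="compute.googleapis.com/instance/cpu/utilization" '
                    'AND resource.type="gce_instance"'
                ),
                # When to trigger: above 0.8 (that is, 80%) for 180 seconds.
                "comparison": "COMPARISON_GT",
                "thresholdValue": 0.8,
                "duration": "180s",
            },
        }
    ],
}

print(json.dumps(cpu_policy, indent=2))
```

The two halves of a condition are visible here: the filter says what is monitored, and the comparison, threshold, and duration say when the policy triggers.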

Before you begin

To open the Conditions pane for a new alerting policy, do the following:

  1. In the Cloud Console, select Monitoring.

  2. Select Alerting.

  3. Click Add policy.

  4. Click Add Condition in the Create new alerting policy window.

Title

Each condition must contain a title. As you complete the fields in the conditions dialog, the title field is automatically populated. You can change the auto-populated content to something more meaningful.

Type of Condition

The conditions dialog lets you select the type of condition that you are adding. While all conditions include a configuration that defines when an alert occurs, each type of condition has unique fields:

  • A metric condition is defined by a resource type and a metric.
  • An uptime-check condition is defined by a resource type and an uptime check.
  • A process-health condition is defined by a resource type and a series of filters.

Select the type of condition to add to the alerting policy.

Target

After you select the type of condition, you use the fields in the Target pane to define values for the condition's fields. For example, if you select a metric condition, the Target pane includes a list of resource types and metrics.

When you select a target for any type of alerting policy, you are selecting a set of time series that must stay within some constraint. These time series are plotted on the chart for the condition. For more information on time series, see Metrics, time series, and resources.

Adding a metric target

A metric target is defined by a resource type and a metric. For example, you might select Compute Engine VM Instance and CPU load (15m) as the resource type and metric, respectively. To add a metric condition, do the following:

  1. Click the Metric tab.

  2. Click the Find resource type and metric field to bring up a drop-down list of available resource types and metrics, and then select the resource type that you want to monitor:

    Select resource type

  3. To choose a metric, scroll through the menu and make a selection. Alternatively, filter the menu by entering part of a service name or metric name. For more information, see Selecting metrics.

After you select the resource type and metric, this page expands to display a chart and to provide fine-grained control for your alerting condition. See Configuring a target metric for details on the new options.

You can't create a condition based on the ratio of two metrics through the UI, but you can create such policies using the API. See Metric ratio for a sample policy.

Adding an uptime-check target

To create an alerting policy for an uptime check, open the Uptime details pane for that check and click Add alert policy. For details, see Alerting on uptime checks.

Adding a process-health target

A process-health target is defined by a resource type and a series of filters. You can configure this policy to trigger an incident if the number of processes that match a specific pattern rises above, or falls below, a threshold during a duration window. To add a process-health condition, do the following:

  1. Click the Process health tab.
  2. In the Resource Type fields, complete the following steps:

    • From the drop-down list, select a single resource, a group of resources, or all resources.
    • From the drop-down list, select the resource type you want to monitor. For example, you might select Compute Engine VM Instance. The UI provides the list of available resource types for your system.
  3. For the Command Line, Command, and User filters, select the fields to identify the processes that you want to monitor. In these filters, you can select the string-match operator and specify the query.

    • The string-match operators are: Equals, Contains, Starts with, Ends with, and Regex. The operations are case sensitive.
    • The syntax of the query depends on the operation choice. You can use wildcard operators in queries. For example, the wildcard * matches any process.

    The results of the three filters are combined using the following rules:

    • If you don't specify the query value for any of the filters, then all processes are counted.

    • If you enter a query for one filter, only processes that match the filter are counted.

    • If you enter command-line and command queries, processes that match either filter are counted. Note that command lines are truncated after 1024 characters, so text in a command line beyond that limit can't be matched against.

    • If you enter a user query, processes that match the user filter and the command-line-or-command filter are counted.
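The combination rules above can be sketched in Python as follows. This is a minimal model for reasoning about which processes are counted, not the service's implementation. The `Process`, `Query`, and `is_counted` names are hypothetical, and treating Regex as a partial match is an assumption.

```python
import re
from dataclasses import dataclass

CMDLINE_LIMIT = 1024  # command lines are truncated after this many characters


@dataclass
class Process:
    command_line: str
    command: str
    user: str


@dataclass
class Query:
    operator: str  # "equals", "contains", "starts_with", "ends_with", "regex"
    value: str

    def matches(self, text):
        # All operators are case sensitive.
        if self.operator == "equals":
            return text == self.value
        if self.operator == "contains":
            return self.value in text
        if self.operator == "starts_with":
            return text.startswith(self.value)
        if self.operator == "ends_with":
            return text.endswith(self.value)
        if self.operator == "regex":
            # Assumption: a partial match is enough.
            return re.search(self.value, text) is not None
        raise ValueError(f"unknown operator: {self.operator}")


def is_counted(proc, command_line=None, command=None, user=None):
    # Command-line and command filters are combined with OR.
    name_checks = []
    if command_line is not None:
        truncated = proc.command_line[:CMDLINE_LIMIT]
        name_checks.append(command_line.matches(truncated))
    if command is not None:
        name_checks.append(command.matches(proc.command))
    name_ok = any(name_checks) if name_checks else True
    # The user filter is combined with that result using AND.
    user_ok = user.matches(proc.user) if user is not None else True
    return name_ok and user_ok


procs = [
    Process("nginx: master process /usr/sbin/nginx", "nginx", "root"),
    Process("nginx: worker process", "nginx", "www-data"),
    Process("/usr/sbin/sshd -D", "sshd", "root"),
]
count = sum(
    is_counted(p, command_line=Query("contains", "nginx"), user=Query("equals", "root"))
    for p in procs
)
print(count)  # 1
```

With no filters supplied, `is_counted` returns True for every process, which corresponds to the first rule in the list.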

Example

As an example, to count the number of processes with nginx in their name, that are owned by root, on all Compute Engine VM instances in a project, you can configure the Target pane as follows:

  • In the Resource type menu, select All, and for the other menu, select Compute Engine VM Instance.
  • In the Command Line menu, select Contains, and for the field, enter nginx.
  • Leave the Command field empty.
  • In the User menu, select Equals, and for the field, enter root.

Show root nginx

In the preceding figure, the graph shows an alerting threshold of one process and data for two instances. Neither instance is running enough processes to trigger an alerting policy.

Configuration

After specifying the target, you use the Configuration region to define when the alerting policy triggers. The configuration region defines which time series can cause an alerting policy to trigger and when these time series aren't meeting the policy.

The Condition triggers if menu lets you select the subset of the targets that must violate the condition:

  • Any time series violates
  • Percent of time series violates
  • Number of time series violates
  • All time series violate

The Condition menu defines the comparator:

  • Is above
  • Is below
  • Increases by
  • Decreases by
  • Is absent

For example, to configure an alerting policy to trigger if any time series is above 50 for 3 minutes, do the following:

  • In the Condition triggers if menu, select Any time series violates.
  • In the Condition menu, select Is above.
  • In the Threshold field, enter 50.
  • In the For menu, select 3 minutes.

    Configuring the target metric
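The Condition triggers if options can be read as rules over the set of time series in the target. The sketch below models only that subset selection and leaves out the duration. The function name and mode strings are hypothetical, and whether the Number and Percent options use "at least" or "more than" is an assumption.

```python
def condition_triggers(violations, mode, amount=None):
    """Decide whether a condition triggers.

    violations: one boolean per time series, True if that series violates.
    mode: "any", "all", "number", or "percent" (hypothetical names).
    amount: the count or percentage required for "number" and "percent".
    """
    total = len(violations)
    violating = sum(violations)
    if total == 0:
        return False
    if mode == "any":
        return violating >= 1
    if mode == "all":
        return violating == total
    if mode == "number":
        return violating >= amount
    if mode == "percent":
        return 100 * violating / total >= amount
    raise ValueError(f"unknown mode: {mode}")


# Latest values for three hypothetical time series, comparator "Is above 50".
latest = {"vm-1": 42, "vm-2": 57, "vm-3": 61}
violations = [value > 50 for value in latest.values()]

print(condition_triggers(violations, "any"))          # True
print(condition_triggers(violations, "all"))          # False
print(condition_triggers(violations, "number", 2))    # True
print(condition_triggers(violations, "percent", 75))  # False
```

Two of the three series are above 50, so the Any and Number (2) rules are met, while All and Percent (75) are not.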

Finish defining the condition

To complete the definition of your condition and to return to the alerting policy dialog, click Add.

Using the Stackdriver Monitoring console

This page describes how to specify conditions for alerting policies.

The conditions for an alerting policy define what is monitored and when to trigger an alert. For example, suppose you want to define an alerting policy that emails you if the CPU utilization of a Compute Engine VM instance is above 80% for more than 3 minutes. You use the conditions dialog to specify that you want to monitor the CPU utilization of a Compute Engine VM instance, and that you want an alerting policy to trigger when that utilization is above 80% for 3 minutes.

Before you begin

To open the Conditions pane, go to the Monitoring console:

  1. From the Google Cloud navigation menu, select Monitoring.

  2. From the Monitoring console, select Alerting > Create a policy.

  3. Click Add Condition.

Title

The Title field is a required field. As you complete the fields in the conditions dialog, the title field is automatically populated. You can change the auto-populated content to something more meaningful.

Type of Condition

The conditions dialog lets you select the type of condition that you are adding. While all conditions include a configuration that defines when an alert occurs, each type of condition has unique fields:

  • A metric condition is defined by a resource type and a metric.
  • An uptime-check condition is defined by a resource type and an uptime check.
  • A process-health condition is defined by a resource type and a series of filters.

In the tab header, use the arrows to scroll and then click the type of condition you wish to add:

Add or edit a condition

Target

After you select the type of condition, you use the fields in the Target pane to define values for the condition's fields. For example, if you select a metric condition, the Target pane includes a list of resource types and metrics.

When you select a target for any type of alerting policy, you are selecting a set of time series that must stay within some constraint. These time series are plotted on the chart for the condition. For more information on time series, see Metrics, time series, and resources.

Adding a metric target

A metric target is defined by a resource type and a metric. For example, you might select Compute Engine VM Instance and CPU load (15m) as the resource type and metric, respectively. To add a metric condition, do the following:

  1. Click the Metric tab.

  2. Click the Find resource type and metric field to bring up a drop-down list of available resource types and metrics, and then select the resource type that you want to monitor:

    Select resource type

  3. After you select the resource type, the list displays only metrics for that resource type. Only metrics where data is available are listed. Scroll through the Metrics options and select the specific metric that you want your policy to monitor:

    Select metric

After you select the resource type and metric, this page expands to display a chart and to provide fine-grained control for your alerting condition. See Configuring a target metric for details on the new options.

You can't create a condition based on the ratio of two metrics through the UI, but you can create such policies using the API. See Metric ratio for a sample policy.

Adding an uptime-check target

We recommend creating an alerting policy for an uptime check. In this case, the condition fields in the alerting policy are populated for you. See Alerting on uptime checks for details.

Adding a process-health target

A process-health target is defined by a resource type and a series of filters. You can configure this policy to trigger an incident if the number of processes that match a specific pattern rises above, or falls below, a threshold during a duration window. To add a process-health condition, do the following:

  1. Click the Process health tab.
  2. In the Resource Type fields, complete the following steps:

    • From the drop-down list, select a single resource, a group of resources, or all resources.
    • From the drop-down list, select the resource type you want to monitor. For example, you might select Compute Engine VM Instance. The UI provides the list of available resource types for your system.
  3. For the Command Line, Command, and User filters, select the fields to identify the processes that you want to monitor. In these filters, you can select the string-match operator and specify the query.

    • The string-match operators are: Equals, Contains, Starts with, Ends with, and Regex. The operations are case sensitive.
    • The syntax of the query depends on the operation choice. You can use wildcard operators in queries. For example, the wildcard * matches any process.

    The results of the three filters are combined using the following rules:

    • If you don't specify the query value for any of the filters, then all processes are counted.

    • If you enter a query for one filter, only processes that match the filter are counted.

    • If you enter command-line and command queries, processes that match either filter are counted. Note that command lines are truncated after 1024 characters, so text in a command line beyond that limit can't be matched against.

    • If you enter a user query, processes that match the user filter and the command-line-or-command filter are counted.

Example

As an example, to count the number of processes with nginx in their name, that are owned by root, on all Compute Engine VM instances in a project, you can configure the Target pane as follows:

  • In the Resource type drop-down list, select All, and for the other drop-down list, select Compute Engine VM Instance.
  • In the Command Line drop-down list, select Contains, and for the field, enter nginx.
  • Leave the Command field empty.
  • In the User drop-down list, select Equals, and for the field, enter root.

Show root nginx

In the preceding figure, the graph shows an alerting threshold of one process and data for two instances. One instance has no processes that meet the filter conditions, and the other instance has two processes that meet the filter conditions.

Configuration

After specifying the target, you have to indicate what constitutes a violation of the constraints on the target.

You use the Configuration region to define when the alerting policy triggers. The configuration region defines which time series can cause an alert to trigger and when these time series aren't meeting the policy.

For example, to configure an alerting policy to trigger if any time series is above 50 for 3 minutes, do the following:

  • In the Condition triggers if drop-down list, select Any time series violates.
  • In the Condition drop-down list, select Is above.
  • In the Threshold field, enter 50.
  • In the For drop-down list, select 3 minutes.

    Configuring the target metric

Additional options

In addition to the configuration options described in the preceding example, you can specify different subsets of the time series that can trigger the alert and different criteria for violation.

The Condition triggers if drop-down list lets you select the subset of the targets that must violate the condition: all time series or a subset of time series. The list of options includes the following:

  • Any time series violates
  • Percent of time series violates
  • Number of time series violates
  • All time series violate

The Condition drop-down list includes the following choices:

  • Is above
  • Is below
  • Increases by
  • Decreases by
  • Is absent

In the preceding example, the constraint is violated if a single time series is in violation. For the criteria for a violation, the Condition fields are set to is above and 50, and the duration is three minutes. So, this alerting policy is triggered if any time series in the target set goes above 50 and stays there for three minutes.
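The "goes above 50 and stays there for three minutes" requirement can be sketched for a single time series as follows. This is a simplified model, not the service's evaluation logic. The function name is hypothetical, points are assumed to arrive at regular intervals, and the window is assumed to include both endpoints.

```python
def violates_for_duration(points, threshold, duration_s):
    """Return True if every point in the trailing window is above threshold.

    points: list of (timestamp_seconds, value) tuples sorted by time.
    """
    if not points:
        return False
    end = points[-1][0]
    start = end - duration_s
    if points[0][0] > start:
        return False  # not enough history to cover the whole window
    window = [value for ts, value in points if ts >= start]
    return all(value > threshold for value in window)


# One sample per minute; the last three minutes stay above 50.
steady = [(0, 40), (60, 55), (120, 60), (180, 58), (240, 52)]
# Same series, but it dips to 45 inside the window.
dipped = [(0, 40), (60, 55), (120, 45), (180, 58), (240, 52)]

print(violates_for_duration(steady, threshold=50, duration_s=180))  # True
print(violates_for_duration(dipped, threshold=50, duration_s=180))  # False
```

A single dip below the threshold inside the window is enough to stop that series from counting as a violation.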

Finish defining the condition

To complete the definition of your condition and return to the alerting policy dialog, click Save.
