This page describes how to specify conditions for metric-based alerting policies. This content does not apply to log-based alerting policies. For information about log-based alerting policies, which notify you when a particular message appears in your logs, see Monitoring your logs.
The conditions for an alerting policy define what is monitored and when to trigger an alert. For example, suppose you want to define an alerting policy that emails you if the CPU utilization of a Compute Engine VM instance is above 80% for more than 3 minutes. You use the conditions dialog to specify that you want to monitor the CPU utilization of a Compute Engine VM instance, and that you want an alerting policy to trigger when that utilization is above 80% for 3 minutes.
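As a rough illustration, the example condition above could be written as a request body for the Cloud Monitoring API. This is a minimal sketch, not the output of the conditions dialog: the display names are invented for illustration, the threshold is written as 0.8 on the assumption that CPU utilization is reported as a fraction between 0 and 1, and the email notification channel is omitted.

```python
import json

# Sketch of the example condition as an alerting policy body.
# Assumption: display names are made up for this example.
cpu_condition = {
    "displayName": "VM CPU utilization above 80% for 3 minutes",
    "conditionThreshold": {
        # What is monitored: CPU utilization of Compute Engine VM instances.
        "filter": (
            'resource.type = "gce_instance" AND '
            'metric.type = "compute.googleapis.com/instance/cpu/utilization"'
        ),
        # When to trigger: value above 0.8 (80%) for 180 seconds (3 minutes).
        "comparison": "COMPARISON_GT",
        "thresholdValue": 0.8,
        "duration": "180s",
    },
}

alert_policy = {
    "displayName": "High CPU utilization",
    "combiner": "OR",
    "conditions": [cpu_condition],
}

print(json.dumps(alert_policy, indent=2))
```

The two halves of the condition map onto the two things the dialog asks for: the filter names what is monitored, and the comparison, threshold, and duration say when the policy triggers.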
Before you begin
To open the Conditions pane for a new alerting policy, do the following:
In the Cloud Console, select Monitoring.
Click Create policy.
Click Add Condition in the Create new alerting policy window.
Each condition must contain a title. As you complete the fields in the conditions dialog, the title field is automatically populated. You can change the auto-populated content to something more meaningful.
Type of Condition
The conditions dialog lets you select the type of condition that you are adding. While all conditions include a configuration that defines when an alert occurs, each type of condition has unique fields:
- A metric condition is defined by a resource type and a metric.
- An uptime-check condition is defined by a resource type and an uptime check.
- A process-health condition is defined by a resource type and a series of filters.
You can also create conditions using the text-based Monitoring Query Language (MQL). For information on using MQL to create conditions, see Creating MQL alerting policies.
Select the type of condition to add to the alerting policy.
After you select the type of condition, you use the fields in the Target pane to define values for the condition's fields. For example, if you select a metric condition, the Target pane includes a list of resource types and metrics.
When you select a target for any type of alerting policy, you are selecting a set of time series that must stay within some constraint. These time series are plotted on the chart for the condition. For more information on time series, see Metrics, time series, and resources.
Adding a metric target
A metric target is defined by a resource type and a metric. For example, you might select Compute Engine VM Instance and CPU load (15m) as the resource type and metric, respectively. To add a metric condition, do the following:
Ensure the Metric tab is selected.
Click the Find resource type and metric field to bring up a drop-down list of available resource types and metrics.
You can either enter text into the Find resource type and metric field or select the resource type that you want to monitor from the menu.
To choose a metric, scroll through the menu and make a selection. Another option is to filter the menu options by entering a partial service name or the metric name. For more information see Selecting metrics.
After you select the resource type and metric, this page expands to display a chart and to provide fine-grained control for your alerting condition. See Configuring a target metric for details on the new options. For additional information:
- See Using custom metrics for details on how to create your own custom metrics.
- See Overview of logs-based metrics for details on how to create metrics based on the content of log entries.
- See Sample policies for alerting policy samples and for representing alerting policies in JSON format.
You can't create a condition based on the ratio of two metrics through the UI, but you can create such policies using the API. See Metric ratio for a sample policy.
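A ratio condition expressed through the API might look roughly like the following. This is a sketch under stated assumptions, not the sample policy referred to above: both metric types are hypothetical custom counters, and the alignment settings assume they are delta or cumulative metrics. The `denominatorFilter` and `denominatorAggregations` fields are what turn a threshold condition into a ratio.

```python
import json

# Assumption: both metric types below are hypothetical custom counters.
aggregation = {
    "alignmentPeriod": "300s",
    "perSeriesAligner": "ALIGN_DELTA",   # assumes delta or cumulative metrics
    "crossSeriesReducer": "REDUCE_SUM",
}

ratio_condition = {
    "displayName": "Error ratio above 5%",
    "conditionThreshold": {
        # Numerator: errors.
        "filter": (
            'resource.type = "gce_instance" AND '
            'metric.type = "custom.googleapis.com/requests/error_count"'
        ),
        "aggregations": [aggregation],
        # Denominator: all requests, aligned the same way as the numerator.
        "denominatorFilter": (
            'resource.type = "gce_instance" AND '
            'metric.type = "custom.googleapis.com/requests/total_count"'
        ),
        "denominatorAggregations": [aggregation],
        "comparison": "COMPARISON_GT",
        "thresholdValue": 0.05,
        "duration": "0s",
    },
}

print(json.dumps(ratio_condition, indent=2))
```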
Adding an uptime-check target
To create an alerting policy for an uptime check, go to the details pane of the uptime check and click Add alert policy in the Uptime details pane. For details, see Alerting on uptime checks.
Adding a process-health target
A process-health target is defined by a resource type and a series of filters. You can configure this policy to create an alert if the number of processes that match a specific pattern rises above, or falls below, a threshold during a duration window.
To add a process-health condition, do the following:
- Ensure the Process health tab is selected.
In the Resource Type fields, complete the following steps:
- From the drop-down list, select a single resource, a group of resources, or all resources.
- From the drop-down list, select the resource type you want to monitor. For example, you might select GCE VM Instance. The UI provides the list of available resource types for your system.
For the Command Line, Command, and User filters, select the fields to identify the processes that you want to monitor. In these filters, you can select the string-match operator and specify the query.
- The string-match operators include Equals, Contains, Ends with, and Regex. The operations are case sensitive.
- The syntax of the query depends on the operation choice. You can use wildcard operators in queries. For example, the wildcard * matches any process.
The results of the three filters are combined using the following rules:
- If you don't specify the query value for any of the filters, then all processes are counted.
- If you enter a query for one filter, only processes that match the filter are counted.
- If you enter command-line and command queries, processes that match either filter are counted. Note that command lines are truncated after 1024 characters, so text in a command line beyond that limit can't be matched against.
- If you enter a user query, processes that match the user filter and the command-line-or-command filter are counted.
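The combination rules can be modeled in a few lines of Python. This is an illustrative sketch only, not how Cloud Monitoring implements the filters: the `Process` class, the `matches` and `is_counted` helpers, and the sample processes are hypothetical, the Regex operator is assumed to search anywhere in the value, and wildcard handling is left out.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

MAX_COMMAND_LINE = 1024  # command lines are truncated after 1024 characters

# A filter is an (operator, query) pair, or None when the query is left empty.
Filter = Optional[Tuple[str, str]]


@dataclass
class Process:
    command_line: str
    command: str
    user: str


def matches(value: str, flt: Tuple[str, str]) -> bool:
    """Apply one case-sensitive string-match operator."""
    operator, query = flt
    if operator == "Equals":
        return value == query
    if operator == "Contains":
        return query in value
    if operator == "Ends with":
        return value.endswith(query)
    if operator == "Regex":
        # Assumption: the pattern may match anywhere in the value.
        return re.search(query, value) is not None
    raise ValueError(f"unsupported operator: {operator}")


def is_counted(proc: Process, command_line: Filter = None,
               command: Filter = None, user: Filter = None) -> bool:
    """Return True if the process is counted under the combination rules."""
    name_checks = []
    if command_line is not None:
        truncated = proc.command_line[:MAX_COMMAND_LINE]
        name_checks.append(matches(truncated, command_line))
    if command is not None:
        name_checks.append(matches(proc.command, command))
    # Command-line and command filters are OR-ed; an empty pair matches all.
    name_ok = any(name_checks) if name_checks else True
    # The user filter, when present, is AND-ed with the result above.
    user_ok = matches(proc.user, user) if user is not None else True
    return name_ok and user_ok


# Hypothetical sample data.
processes = [
    Process("nginx: master process /usr/sbin/nginx", "nginx", "root"),
    Process("nginx: worker process", "nginx", "www-data"),
    Process("/usr/sbin/sshd -D", "sshd", "root"),
]
count = sum(
    is_counted(p, command_line=("Contains", "nginx"), user=("Equals", "root"))
    for p in processes
)
print(count)
```

With no filters supplied, `is_counted` returns True for every process, which corresponds to the first rule.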
Processes that are monitored
Not all processes running in your system can be monitored by a process-health condition. This condition selects processes to be monitored by using a regular expression that is applied to the command line that invoked the process. When the command line field isn't available, the process can't be monitored.
One way to determine if a process can be monitored by a process-health condition is to view the output of the Linux ps command:

    ps aux | grep nfs
    USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
    root      1598  0.0  0.0      0     0 ?        S<   Oct25   0:00 [nfsd4]
    root      1639  0.0  0.0      0     0 ?        S    Oct25   2:33 [nfsd]
    root      1640  0.0  0.0      0     0 ?        S    Oct25   2:36 [nfsd]

If a COMMAND entry is wrapped with square brackets, for example [nfsd], the command-line information for the process isn't available, so you can't use Cloud Monitoring to monitor the process.
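The square-bracket check can be scripted. The following is a minimal sketch that assumes the standard eleven-column layout of `ps aux` output. The `has_command_line` helper is a hypothetical name, and the nginx line is made-up sample data rather than output from the source.

```python
def has_command_line(ps_line: str) -> bool:
    """Return True if the COMMAND column of a `ps aux` row is not bracketed."""
    # `ps aux` prints ten columns before COMMAND. Splitting at most ten times
    # keeps spaces inside the command itself intact.
    fields = ps_line.split(None, 10)
    if len(fields) < 11:
        raise ValueError("not a complete `ps aux` row")
    command = fields[10].strip()
    return not (command.startswith("[") and command.endswith("]"))


rows = [
    "root 1598 0.0 0.0 0 0 ? S< Oct25 0:00 [nfsd4]",
    "root 1639 0.0 0.0 0 0 ? S Oct25 2:33 [nfsd]",
    # Hypothetical row for a process that does expose its command line.
    "root 2201 0.0 0.1 55180 1696 ? Ss Oct25 0:00 nginx: master process /usr/sbin/nginx",
]
for row in rows:
    status = "can be monitored" if has_command_line(row) else "cannot be monitored"
    print(row.split(None, 10)[10], "->", status)
```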
As an example, to count the number of processes with nginx in their name that are owned by root, on all Compute Engine VM instances in a project, you can configure the Target pane as follows:
- In the Resource type menu, select All, and for the other menu, select Compute Engine VM Instance.
- In the Command Line menu, select Contains, and for the field, enter nginx.
- Leave the Command field empty.
- In the User menu, select Equals, and for the field, enter root.
- Click Apply.
In the preceding figure, the graph shows an alerting threshold of one process and data for two instances. Neither instance is running enough processes to trigger an alerting policy.
After specifying the target, you use the Configuration region to define when the alerting policy triggers. The Configuration region defines which time series can cause an alerting policy to trigger and when those time series are considered to violate the condition.
The Condition triggers if menu lets you select the subset of the targets that must violate the condition:
- Any time series violates
- Percent of time series violates
- Number of time series violates
- All time series violate
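The four options can be read as different ways of counting violating time series. The sketch below is a hypothetical model, not the service's implementation: the `condition_triggers` helper and its mode names are invented, and it assumes the percent and number options trigger when the count reaches the configured value.

```python
from typing import Sequence


def condition_triggers(violations: Sequence[bool], mode: str,
                       value: float = 0) -> bool:
    """Decide whether enough time series violate the condition.

    violations: one flag per time series, True if that series violates.
    mode: "any", "percent", "number", or "all".
    value: the percentage or count for the "percent" and "number" modes.
    """
    total = len(violations)
    violating = sum(violations)
    if total == 0:
        return False
    if mode == "any":
        return violating >= 1
    if mode == "all":
        return violating == total
    if mode == "number":
        # Assumption: reaching the configured count is enough to trigger.
        return violating >= value
    if mode == "percent":
        # Assumption: reaching the configured percentage is enough to trigger.
        return 100 * violating / total >= value
    raise ValueError(f"unknown mode: {mode}")


# One of three time series violates the condition.
flags = [True, False, False]
for mode, value in [("any", 0), ("percent", 50), ("number", 2), ("all", 0)]:
    print(mode, condition_triggers(flags, mode, value))
```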
The Condition menu defines the comparator:
- Is above
- Is below
- Increases by
- Decreases by
- Is absent
For example, to configure an alerting policy to trigger if any time series is above 50 for 3 minutes, do the following:
- In the Condition triggers if menu, select Any time series violates.
- In the Condition menu, select Is above.
- In the Threshold field, enter 50.
- In the For menu, select 3 minutes.
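To make the "above 50 for 3 minutes" rule concrete, the following sketch checks a single time series for a continuous run of values above a threshold. It is an approximation for illustration: the `is_above_for` helper and the sample points are hypothetical, and the run is assumed to reset whenever a value is at or below the threshold.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def is_above_for(points: List[Tuple[datetime, float]], threshold: float,
                 duration: timedelta) -> bool:
    """Return True if values stay above threshold continuously for duration."""
    run_start = None
    for timestamp, value in sorted(points):
        if value > threshold:
            if run_start is None:
                run_start = timestamp  # first violating point of a new run
            if timestamp - run_start >= duration:
                return True
        else:
            run_start = None  # run broken, start over
    return False


start = datetime(2024, 1, 1, 12, 0)


def series(values):
    """Build one point per minute from a list of values (sample data)."""
    return [(start + timedelta(minutes=i), v) for i, v in enumerate(values)]


sustained = series([40, 55, 60, 52, 58, 45])    # above 50 from minute 1 to 4
interrupted = series([40, 55, 60, 45, 58, 61])  # run broken at minute 3

print(is_above_for(sustained, 50, timedelta(minutes=3)))
print(is_above_for(interrupted, 50, timedelta(minutes=3)))
```

Combined with the Any time series violates option, the policy in the example would trigger as soon as this check passes for at least one time series.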
Finish defining the condition
To complete the definition of your condition and to return to the alerting policy dialog, click Add.