This document describes how to use the Google Cloud console to create an alerting policy that monitors the number of processes running on your virtual machines (VMs) that meet conditions you specify. This type of alerting policy is sometimes called a process-health alerting policy. For example, you can count the number of processes started by the root user. You can also count the number of processes whose invocation command contained a specific string. An alerting policy can notify you when the number of processes is more than, or less than, a threshold. For information about which processes can be monitored, see Processes that are monitored.
This content does not apply to log-based alerting policies. For information about log-based alerting policies, which notify you when a particular message appears in your logs, see Monitoring your logs.
Before you begin
- To get the permissions that you need to create and modify alerting policies by using the Google Cloud console, ask your administrator to grant you the Monitoring Editor (roles/monitoring.editor) IAM role on your project. For more information about granting roles, see Manage access to projects, folders, and organizations. You might also be able to get the required permissions through custom roles or other predefined roles.
For more information about Cloud Monitoring roles, see Control access with Identity and Access Management.
Ensure that you're familiar with the general concepts of alerting policies. For information about these topics, see Alerting overview.
Configure the notification channels that you want to use to receive any alerts. For redundancy purposes, we also recommend that you create multiple types of notification channels. For information about these steps, see Create and manage notification channels.
Ensure that you've installed the Ops Agent on the VMs that you want to monitor. For more information, see Google Cloud Observability agents.
Create alerting policy
If you use the Cloud Monitoring API to create an alerting policy that monitors the count of processes running on a VM, then the filter expression must specify a time series selector. For an example of a JSON file that specifies this selector, see Process-health policy.
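As a rough illustration of what such a JSON file might contain, the following Python sketch builds a policy body with a single metric-threshold condition and serializes it. The field names (displayName, combiner, conditions, conditionThreshold, filter, comparison, thresholdValue, duration) are written from memory of the AlertPolicy resource and should be checked against the Process-health policy example. The display names, the threshold of 1, the 300-second duration, and the helper name build_process_health_policy are hypothetical values chosen for this sketch.

```python
import json

# Filter from this document: counts processes whose command line matches ".*nginx.*".
FILTER = r'select_process_count("monitoring.regex.full_match(\".*nginx.*\")") resource.type="gce_instance"'


def build_process_health_policy(filter_expr, threshold=1, duration_s=300):
    """Return a dict shaped like an alerting policy with one threshold condition.

    Assumption: the keys below mirror the AlertPolicy JSON representation.
    Verify them against the API reference before use.
    """
    return {
        "displayName": "Process health (example)",  # hypothetical name
        "combiner": "OR",
        "conditions": [
            {
                "displayName": "Process count below threshold",  # hypothetical name
                "conditionThreshold": {
                    "filter": filter_expr,
                    # Trigger when the count is less than the threshold.
                    "comparison": "COMPARISON_LT",
                    "thresholdValue": threshold,
                    "duration": f"{duration_s}s",
                },
            }
        ],
    }


policy = build_process_health_policy(FILTER)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The sketch only produces the JSON text. It doesn't call the API, so sending the policy to Cloud Monitoring is left to whichever client or tool you normally use.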
To create an alerting policy that monitors the count of processes running on a VM, do the following:
- In the Google Cloud console, go to the Alerting page.
If you use the search bar to find this page, then select the result whose subheading is Monitoring.
- Select Create policy.
Select the help icon (?) on the Select metric section header, and then select Direct filter mode in the tooltip.
Enter a Monitoring filter.
For example, to count the processes whose command line includes nginx and that are running on Compute Engine VM instances, enter the following:
select_process_count("monitoring.regex.full_match(\".*nginx.*\")") resource.type="gce_instance"
For syntax information, see the following resources:
- For filters used to count processes running on virtual machines, see Process-health filters.
- For general syntax, see Monitoring filters.
Complete the alerting policy. You must configure the condition trigger, notifications, documentation, and policy name, and then click Create policy.
For more information, see Create metric-threshold alerting policies.
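The example filter in the preceding steps nests one quoted string inside another, which is easy to mistype. The following Python sketch shows one way to assemble that string. The function name process_count_filter is hypothetical, and the sketch deliberately rejects regular expressions that contain quotes or backslashes, because this document doesn't describe how those characters should be escaped.

```python
def process_count_filter(command_regex, resource_type="gce_instance"):
    """Build a process-count filter string like the example in this document.

    Assumption: the regex contains no double quotes or backslashes, so only
    the quotes around the regex itself need to be escaped.
    """
    if '"' in command_regex or "\\" in command_regex:
        raise ValueError("this sketch does not handle quotes or backslashes")
    # Inner expression, applied to the command line that invoked the process.
    inner = f'monitoring.regex.full_match("{command_regex}")'
    # Escape the inner quotes so that the expression can sit inside the outer string.
    escaped = inner.replace('"', '\\"')
    return f'select_process_count("{escaped}") resource.type="{resource_type}"'


example = process_count_filter(".*nginx.*")
print(example)
```

You can paste the printed string into the Monitoring filter field in Direct filter mode.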
Processes that are monitored
Not all processes running in your system can be monitored by a process-health condition. This condition selects processes to be monitored by using a regular expression that is applied to the command line that invoked the process. When the command line field isn't available, the process can't be monitored.
One way to determine whether a process can be monitored by a process-health condition is to look at the active processes. For example, on a Linux system, you can use the ps command:
ps aux | grep nfs
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1598 0.0 0.0 0 0 ? S< Oct25 0:00 [nfsd4]
root 1639 0.0 0.0 0 0 ? S Oct25 2:33 [nfsd]
root 1640 0.0 0.0 0 0 ? S Oct25 2:36 [nfsd]
When a COMMAND entry is wrapped in square brackets, for example [nfsd], the command-line information for the process isn't available. In this situation, you can't use Cloud Monitoring to monitor the process.
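The bracket check can also be scripted. The following Python sketch parses ps aux style output and separates processes that have command-line information from those that don't. It is a heuristic that assumes the eleven-column layout shown in the preceding example. The function names are hypothetical, and the final nginx row in the sample data is invented for illustration and isn't part of the original output.

```python
SAMPLE_PS_OUTPUT = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1598 0.0 0.0 0 0 ? S< Oct25 0:00 [nfsd4]
root 1639 0.0 0.0 0 0 ? S Oct25 2:33 [nfsd]
root 1640 0.0 0.0 0 0 ? S Oct25 2:36 [nfsd]
root 2101 0.0 0.1 55180 1520 ? Ss Oct25 0:00 nginx: master process /usr/sbin/nginx
"""


def has_command_line(command):
    """Return False when ps shows the command wrapped in square brackets."""
    command = command.strip()
    return not (command.startswith("[") and command.endswith("]"))


def classify_processes(ps_output):
    """Split ps rows into (monitorable, unmonitorable) lists of (pid, command)."""
    monitorable, unmonitorable = [], []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # The first ten columns are single tokens; COMMAND is the remainder.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # skip blank or malformed rows
        pid, command = int(fields[1]), fields[10]
        target = monitorable if has_command_line(command) else unmonitorable
        target.append((pid, command))
    return monitorable, unmonitorable


monitorable, unmonitorable = classify_processes(SAMPLE_PS_OUTPUT)
print("Can be monitored:", monitorable)
print("Can't be monitored:", unmonitorable)
```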