An event occurs when the conditions for an alerting policy are violated. When an event occurs, Cloud Monitoring opens an incident. To view a list of incidents and events, do the following:
In the Cloud Console toolbar, click Navigation menu, and then select Monitoring.
In the Monitoring navigation pane, select Alerting.
In the Alerting window, the Summary pane lists the number of incidents while the Incidents pane displays the 10 most recent incidents. Each incident is in one of three states:
- Open
- Acknowledged
- Closed
Click See all incidents to open the Incidents window.
If an incident is open, then the policy's set of conditions is currently being met or there is no data to indicate that the condition is no longer met.
If a policy specifies multiple conditions, incidents are opened depending on how the conditions are combined. See Combining conditions for more information.
If an incident is acknowledged, then the incident is open and has been marked as acknowledged. Marking an incident as acknowledged indicates that the incident is being investigated.
To mark an incident as acknowledged, go to the incident details and click Acknowledge incident. You must have the Monitoring Editor role, roles/monitoring.editor, to acknowledge incidents; for more information, see Access control: Predefined roles.
After an incident is created, it is closed automatically for either of two reasons:
- Data is received that indicates the conditions are no longer met.
- No data is received for 7 days.
For example, assume you have an alerting policy that is configured to generate an incident if the HTTP latency is above 2 seconds for 10 consecutive minutes, and that an incident was created. If the next measurement of the HTTP latency is equal to or below 2 seconds, then the incident is closed. Similarly, if no data at all is received for 7 days, then the incident is closed.
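The two automatic-close rules above can be sketched as a small predicate. This is only an illustration of the logic described in the text, not Cloud Monitoring's implementation: the function name `should_auto_close`, its parameters, and the choice to treat exactly 7 days without data as expired are all assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional

# From the text: an incident is closed if no data is received for 7 days.
NO_DATA_TIMEOUT = timedelta(days=7)


def should_auto_close(
    threshold_s: float,
    latest_latency_s: Optional[float],
    last_data_at: datetime,
    now: datetime,
) -> bool:
    """Return True if an open incident would be closed automatically.

    latest_latency_s is the newest measurement received since the incident
    was opened, or None if nothing new has arrived (hypothetical model).
    last_data_at is the time the most recent data point was received.
    """
    # Rule 2: no data at all for 7 days (boundary handling is an assumption).
    if now - last_data_at >= NO_DATA_TIMEOUT:
        return True
    # Rule 1: data shows the condition is no longer met,
    # that is, the latency is equal to or below the threshold.
    return latest_latency_s is not None and latest_latency_s <= threshold_s
```

With the 2-second threshold from the example, a new measurement of 2.0 seconds closes the incident, a measurement of 3.0 seconds keeps it open, and silence for 7 days closes it regardless of the last value.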
To mark an incident as closed, in the Incidents window, identify the incident, click More, and select Resolve. Marking an incident as closed doesn't fix the underlying cause of the incident. That is, if the condition that generated the incident is true on the next alerting cycle, a new incident is generated.
Viewing and filtering incidents
The Incidents window, by default, displays open and acknowledged incidents. To view closed incidents, click Show closed incidents.
To control which incidents you see, add filters. To add a filter, do the following:
Click Filter table and then select a filtering attribute:
- Alerting policy name
- Metric type
- Resource type
Based on the attribute you select, a second menu opens and displays a partial list of options. If you enter a value on the filter bar, the list is narrowed to the options that contain the text you entered.
For example, to filter on the metric container.googleapis.com/container/cpu/usage_time, select the Metric type attribute. If you enter usage_time, you might see the following options in the secondary menu:
- agent.googleapis.com/cpu/usage_time
- compute.googleapis.com/guest/container/cpu/usage_time
- container.googleapis.com/container/cpu/usage_time
If you add multiple filters, an incident is displayed only if it satisfies all filters.
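The filtering behavior described above has two parts: typed text narrows the option list to entries that contain it, and multiple filters are combined so that an incident must satisfy all of them. A minimal sketch of both, assuming incidents are represented as plain dictionaries and that matching is case-sensitive (the function names and the dictionary keys are hypothetical):

```python
from typing import Dict, List


def matching_options(options: List[str], typed: str) -> List[str]:
    """Keep only the options that contain the typed text."""
    return [option for option in options if typed in option]


def visible_incidents(
    incidents: List[Dict[str, str]], filters: Dict[str, str]
) -> List[Dict[str, str]]:
    """Keep only the incidents that satisfy every filter (logical AND)."""
    return [
        incident
        for incident in incidents
        if all(incident.get(attr) == value for attr, value in filters.items())
    ]


# Illustrative data; the policy names and the second metric entry are made up.
options = [
    "agent.googleapis.com/cpu/usage_time",
    "container.googleapis.com/container/cpu/usage_time",
    "agent.googleapis.com/memory/bytes_used",
]
incidents = [
    {"policy": "High CPU", "metric_type": options[1], "resource_type": "gke_container"},
    {"policy": "High CPU", "metric_type": options[0], "resource_type": "gce_instance"},
]

narrowed = matching_options(options, "usage_time")
shown = visible_incidents(
    incidents, {"policy": "High CPU", "resource_type": "gce_instance"}
)
```

Here `narrowed` holds the two `usage_time` metrics, and `shown` holds only the second incident, because the first one fails the resource type filter.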
The Events pane of the Alerting dashboard displays the most recent events and includes a graphical indicator for each event.
To view an event's details, click the event name. The details window includes when the incident was opened, the duration, and the status.
To view all events, click See all events. This opens the Events window, which lists all events.
- To page through the events, use the Forward and Backward buttons.
- To filter the events, click Show filters. You use the filter dialog to select the types of activities, the resources, and the name. If you leave a field at the default value, then this field isn't considered.
For example, to show all activities that are open, select Opened in the Activity types menu, and leave all other fields at the default value.
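The rule that a field left at its default value isn't considered can be modeled by treating `None` as "no constraint". The sketch below is an assumption-laden illustration: the `Event` class, its field names, and the use of exact matching for each field are hypothetical and aren't taken from the Cloud Monitoring UI or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    name: str
    activity_type: str  # for example "Opened" (other values are illustrative)
    resource: str


def filter_events(
    events: List[Event],
    activity_type: Optional[str] = None,
    resource: Optional[str] = None,
    name: Optional[str] = None,
) -> List[Event]:
    """Return the events that match every field that was actually set.

    A field left at its default (None) is ignored, mirroring the filter dialog.
    """
    def keep(event: Event) -> bool:
        return (
            (activity_type is None or event.activity_type == activity_type)
            and (resource is None or event.resource == resource)
            and (name is None or event.name == name)
        )

    return [event for event in events if keep(event)]


# Illustrative data only.
events = [
    Event("High latency", "Opened", "instance-1"),
    Event("High latency", "Closed", "instance-1"),
    Event("Disk full", "Opened", "instance-2"),
]

# Select Opened in the activity type field and leave the other fields at the default.
opened = filter_events(events, activity_type="Opened")
```

With this data, `opened` contains the first and third events; calling `filter_events(events)` with every field at its default returns all three.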
The graphical indicators identify the following types of messages:
- Cloud event message.
- Database backup, configuration, or maintenance message.
- Violation acknowledged (blue), closed (green), or opened (red) message.
- Instance was migrated or pre-empted message. Kubernetes setup failure, not ready, or disk space limitation message.