Incidents and events

An event occurs when the conditions for an alerting policy are violated. When an event occurs, Cloud Monitoring opens an incident. To view a list of incidents and events, do the following:

  1. In the Cloud Console toolbar, click Navigation menu, and then select Monitoring:

    Go to Monitoring

  2. In the Monitoring navigation pane, select Alerting.

Incidents

In the Alerting window, the Summary pane lists the number of incidents, and the Incidents pane displays the 10 most recent incidents. Each incident is in one of three states:

  •  Open
  •  Acknowledged
  •  Closed

Click See all incidents to open the Incidents window.

Open incidents

If an incident is open, then the policy's set of conditions is currently being met or there is no data to indicate that the condition is no longer met.

If a policy specifies multiple conditions, incidents are opened depending on how the conditions are combined. See Combining conditions for more information.

Acknowledged incidents

If an incident is acknowledged, then the incident is open and has been marked as acknowledged. Marking an incident as acknowledged indicates that the incident is being investigated.

To mark an incident as acknowledged, go to the incident details and click Acknowledge incident. You must have the Monitoring Editor role, roles/monitoring.editor, to acknowledge incidents; for more information, see Access control: Predefined roles.
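The relationship between the three states can be sketched as a small data model. The Python below is illustrative only: the `Incident` class, its fields, and the policy name are hypothetical and are not part of Cloud Monitoring or its API. It shows that an acknowledged incident still counts as open; treating a closed incident as closed regardless of the acknowledged flag is an assumption of this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class IncidentState(Enum):
    OPEN = "Open"
    ACKNOWLEDGED = "Acknowledged"
    CLOSED = "Closed"


@dataclass
class Incident:
    """Hypothetical model of an incident; not a Cloud Monitoring API type."""
    policy_name: str
    acknowledged: bool = False
    closed: bool = False

    @property
    def state(self) -> IncidentState:
        # Assumption: once closed, the acknowledged flag no longer matters.
        if self.closed:
            return IncidentState.CLOSED
        if self.acknowledged:
            return IncidentState.ACKNOWLEDGED
        return IncidentState.OPEN

    @property
    def is_open(self) -> bool:
        # An acknowledged incident is still an open incident.
        return not self.closed


incident = Incident(policy_name="example-policy")
incident.acknowledged = True
print(incident.state.value, incident.is_open)  # Acknowledged True
```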

Closed incidents

After an incident is created, it is closed automatically in either of the following cases:

  • Data is received that indicates the conditions are no longer met.
  • No data is received for 7 days.

For example, assume you have an alerting policy that is configured to generate an incident if the HTTP latency is above 2 seconds for 10 consecutive minutes, and that an incident was created. If the next measurement of the HTTP latency is equal to or below 2 seconds, then the incident is closed. Similarly, if no data at all is received for 7 days, then the incident is closed.
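The two automatic-closure rules in this example can be written out as a short check. This is a minimal sketch, not how Cloud Monitoring evaluates incidents: the function `should_auto_close` and its parameters are hypothetical, and treating exactly 7 days without data as "no data for 7 days" is an assumption.

```python
from datetime import datetime, timedelta

NO_DATA_LIMIT = timedelta(days=7)  # from the text: no data is received for 7 days


def should_auto_close(
    latest_value: float,
    last_data_time: datetime,
    now: datetime,
    threshold: float = 2.0,
) -> bool:
    """Hypothetical helper that applies both closure rules to a latency incident.

    latest_value is the most recent HTTP latency measurement, in seconds,
    and last_data_time is when that measurement was received.
    """
    # Rule 1: data was received that shows the condition is no longer met.
    if latest_value <= threshold:
        return True
    # Rule 2: no data at all has been received for 7 days (boundary is assumed).
    return now - last_data_time >= NO_DATA_LIMIT


now = datetime(2024, 1, 10, 12, 0)
print(should_auto_close(1.8, now - timedelta(minutes=1), now))  # True
print(should_auto_close(2.5, now - timedelta(minutes=1), now))  # False
print(should_auto_close(2.5, now - timedelta(days=8), now))     # True
```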

To mark an incident as closed, in the Incidents window, identify the incident, click More, and select Resolve. Marking an incident as closed doesn't fix the underlying cause of the incident. That is, if the condition that generated the incident is true on the next alerting cycle, a new incident is generated.

Viewing and filtering incidents

The Incidents window, by default, displays open and acknowledged incidents. To view closed incidents, click Show closed incidents.

To control which incidents you see, add filters. To add a filter, do the following:

  1. Click Filter table and then select a filtering attribute:

    • State
    • Alerting policy name
    • Metric type
    • Resource type
  2. Based on the attribute you select, a second menu opens and displays a partial list of options. If you enter a value on the filter bar, the list is narrowed to the options that contain the text you entered.

    For example, to filter on the metric container.googleapis.com/container/cpu/usage_time, select the Metric type attribute. If you enter usage_time, you might see the following options in the secondary menu:

    agent.googleapis.com/cpu/usage_time
    compute.googleapis.com/guest/container/cpu/usage_time
    container.googleapis.com/container/cpu/usage_time

If you add multiple filters, an incident is displayed only if it satisfies all filters.
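The filtering behavior described above amounts to two rules: typed text narrows the option list to entries that contain it, and multiple filters are combined with a logical AND. The following sketch models both. The function names, the dictionary keys, and the sample incidents are hypothetical, and the case-sensitive substring match is an assumption rather than documented behavior.

```python
from typing import Dict, List


def narrow_options(options: List[str], text: str) -> List[str]:
    """Keep only the options that contain the entered text."""
    return [option for option in options if text in option]


def matches_all(incident: Dict[str, str], filters: Dict[str, str]) -> bool:
    """An incident is displayed only if it satisfies every filter."""
    return all(incident.get(attr) == value for attr, value in filters.items())


metric_types = [
    "agent.googleapis.com/cpu/usage_time",
    "compute.googleapis.com/guest/container/cpu/usage_time",
    "container.googleapis.com/container/cpu/usage_time",
    "compute.googleapis.com/instance/uptime",
]
# Only the three usage_time metrics remain.
print(narrow_options(metric_types, "usage_time"))

container_cpu = "container.googleapis.com/container/cpu/usage_time"
incidents = [
    {"state": "Open", "metric_type": container_cpu},
    {"state": "Acknowledged", "metric_type": container_cpu},
    {"state": "Open", "metric_type": "agent.googleapis.com/cpu/usage_time"},
]
filters = {"state": "Open", "metric_type": container_cpu}
shown = [incident for incident in incidents if matches_all(incident, filters)]
print(len(shown))  # 1
```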

Inspecting events

The Events pane of the Alerting dashboard displays the most recent events and includes a graphical indicator:

[Figure: part of an events listing.]

  • To view an event's details, click the event name. The details window includes when the incident was opened, the duration, and the status.

  • To view all events, click See all events. This opens the Events window, which lists all events.

    • To page through the events, use the Forward and Backward buttons.
    • To filter the events, click Show filters. Use the filter dialog to select the types of activities, the resources, and the name. If you leave a field at its default value, then that field isn't used for filtering.

      [Figure: the event filter dialog.]

      For example, to show all activities that are open, select Opened in the Activity types menu, and leave all other fields at the default value.
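The rule that a field left at its default value isn't considered can be sketched as a filter whose unset fields match everything. This is an illustration only: the `Event` and `EventFilter` classes, the use of `None` to stand for the default value, and the sample events are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    activity_type: str
    resource: str
    name: str


@dataclass
class EventFilter:
    # None stands in for "left at the default value" in the filter dialog.
    activity_type: Optional[str] = None
    resource: Optional[str] = None
    name: Optional[str] = None

    def matches(self, event: Event) -> bool:
        checks = [
            (self.activity_type, event.activity_type),
            (self.resource, event.resource),
            (self.name, event.name),
        ]
        # A field at its default (None) is skipped; the others must all match.
        return all(wanted is None or wanted == actual for wanted, actual in checks)


events = [
    Event("Opened", "instance-1", "High latency"),
    Event("Closed", "instance-1", "High latency"),
    Event("Opened", "instance-2", "Disk almost full"),
]
opened_only = EventFilter(activity_type="Opened")
print([e.name for e in events if opened_only.matches(e)])
# ['High latency', 'Disk almost full']
```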

The following table describes the graphical indicators:

Indicator                                                   Meaning
Maintenance message icon.                                   Maintenance message.
Cloud account added message.                                Cloud event message.
Database backup, config, or maintenance message.            Database backup, configuration, or maintenance message.
Violation acknowledged, closed, or opened message.          Violation acknowledged (blue), closed (green), or opened (red) message.
Instance migration or pre-emption, or Kubernetes message.   Instance was migrated or pre-empted message. Kubernetes setup failure, not ready, or disk space limitation message.

What's next

  • To create and manage alerting policies with the Cloud Monitoring API or from the command line, see Using the API.
  • For a detailed conceptual treatment of alerting policies, see Alerting policies in depth.