Introduction to alerting

Alerting gives you timely awareness of problems in your cloud applications so you can resolve them quickly.

To create an alerting policy, you must describe the circumstances under which you want to be alerted and how you want to be notified. This page provides an overview of alerting policies and the concepts behind them.

For a more hands-on introduction, follow the steps in one of these quickstarts:

For an alerting policy that monitors usage and alerts you when you approach the threshold for billing, see Controlling your costs.

How does alerting work?

You can create and manage alerting policies with the Google Cloud Console, the Cloud Monitoring API, and the Cloud SDK.

Each policy specifies the following:

  • Conditions that identify an unhealthy state for a resource or a group of resources. The conditions for an alerting policy are continuously monitored. You cannot configure the conditions to be monitored only for certain time periods.

  • Optional notifications sent through email, SMS, or other channels to let your support team know a resource is unhealthy.

  • Optional documentation that can be included in some types of notifications to help your support team resolve the issue.
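The three parts of a policy listed above can be pictured as a small data structure. The following is only an illustrative sketch: the class names, field names, metric name, channel string, and documentation text are all hypothetical and do not reflect the Cloud Monitoring API schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Condition:
    """One condition: a metric compared against a threshold for a duration."""
    metric: str             # hypothetical metric name
    threshold: float        # value the metric is compared against
    duration_seconds: int   # how long the threshold must be exceeded


@dataclass
class AlertingPolicy:
    """Illustrative model only; not the Cloud Monitoring API schema."""
    name: str
    conditions: List[Condition]                                      # required
    notification_channels: List[str] = field(default_factory=list)  # optional
    documentation: Optional[str] = None                              # optional

    def will_notify(self) -> bool:
        """Notifications go out only if at least one channel is configured."""
        return bool(self.notification_channels)


# Hypothetical policy with all three parts filled in.
policy = AlertingPolicy(
    name="High HTTP latency",
    conditions=[Condition("http_response_latency_seconds", 2.0, 300)],
    notification_channels=["email:support-team"],
    documentation="Check the load on the web server VM first.",
)
print(policy.will_notify())
```

In this sketch, a policy with no channels still has conditions to evaluate; it just has nobody to notify, which mirrors the "optional" wording in the list.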

When events trigger conditions in one of your alerting policies, Cloud Monitoring creates and displays an incident in the Google Cloud Console. If you set up notifications, Cloud Monitoring also sends notifications to people or third-party notification services. Responders can acknowledge receipt of the notification, but the incident remains open until resources are no longer in an unhealthy state.

For more information on these concepts, see Alerting policies in depth.

Example

You deploy a web application onto a Compute Engine VM instance that's running a LAMP stack. While you know that HTTP response latency may fluctuate as normal demand rises and falls, if your users start to experience high latency for a significant period of time, you want to take action.

To be notified when your users experience high latency, create the following alerting policy:

If HTTP response latency is higher than two seconds,
and if this condition lasts longer than five minutes,
open an incident and send email to your support team.

Your web app turns out to be more popular than you expected and the response latency grows beyond two seconds. Here's how your alerting policy responds:

  1. Cloud Monitoring opens an incident and sends email after five consecutive minutes of HTTP latency higher than two seconds.

  2. The support team receives the email, signs into the Google Cloud Console, and acknowledges receipt of the notification.

  3. Following the documentation in the notification email, the team is able to address the cause of the latency. Within a few minutes, HTTP response latency drops back below two seconds.

  4. As soon as Cloud Monitoring measures HTTP latency below two seconds, the policy's condition is no longer true (even a single measurement of lower latency breaks the "consecutive five minutes" requirement).

    Cloud Monitoring closes the incident and resets the five-minute timer. If latency again rises above two seconds and stays there for five consecutive minutes, the policy opens a new incident.
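The open-and-close behavior in this example can be approximated with a short simulation. This is a minimal sketch under stated assumptions, not how Cloud Monitoring is implemented: it assumes exactly one latency measurement per minute, treats five consecutive measurements above the threshold as "five consecutive minutes", and uses a hypothetical function name (`simulate`) and made-up sample data.

```python
from typing import Iterable, List, Tuple

THRESHOLD_SECONDS = 2.0   # latency threshold from the example policy
DURATION_MINUTES = 5      # how long the threshold must be exceeded


def simulate(latencies: Iterable[float]) -> List[Tuple[int, str]]:
    """Return (minute, event) pairs for one latency sample per minute.

    Assumption: one measurement per minute, minutes numbered from 1.
    """
    events = []
    consecutive = 0         # measurements in a row above the threshold
    incident_open = False
    for minute, latency in enumerate(latencies, start=1):
        if latency > THRESHOLD_SECONDS:
            consecutive += 1
            if consecutive >= DURATION_MINUTES and not incident_open:
                incident_open = True
                events.append((minute, "open"))
        else:
            # One measurement that is not above the threshold resets the timer
            # and closes any open incident.
            consecutive = 0
            if incident_open:
                incident_open = False
                events.append((minute, "close"))
    return events


# Made-up samples: latency climbs at minute 2 and recovers at minute 8.
samples = [1.0, 2.5, 2.6, 3.0, 2.8, 2.7, 2.9, 1.5, 2.2, 2.4]
print(simulate(samples))  # [(6, 'open'), (8, 'close')]
```

The last two samples are above the threshold again, but no new incident opens because the reset timer has not yet reached five consecutive minutes.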

Pricing and limits

There are no costs associated with using alerting policies or uptime checks, but the following limits apply:

Category                                            Value
Uptime checks per Workspace                         100
Alerting policies per Workspace                     500
Conditions per alerting policy                      6
Notification channels per alerting policy           16
Notification channels per Workspace                 4000
Simultaneously open incidents per alerting policy   5000
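If you generate policies from scripts, it can help to check planned counts against these limits before you create anything. The following sketch hard-codes the values from the table above; the dictionary keys and the `exceeded_limits` helper are hypothetical names chosen for illustration, and the limits themselves may change over time.

```python
# Limits copied from the table above; key names are hypothetical.
LIMITS = {
    "uptime_checks_per_workspace": 100,
    "alerting_policies_per_workspace": 500,
    "conditions_per_policy": 6,
    "notification_channels_per_policy": 16,
    "notification_channels_per_workspace": 4000,
    "open_incidents_per_policy": 5000,
}


def exceeded_limits(usage: dict) -> list:
    """Return the sorted names of limits that the given counts exceed.

    Raises KeyError if usage contains a name that is not in LIMITS.
    """
    return sorted(name for name, count in usage.items() if count > LIMITS[name])


# A planned policy with 7 conditions and 16 notification channels.
planned = {"conditions_per_policy": 7, "notification_channels_per_policy": 16}
print(exceeded_limits(planned))  # ['conditions_per_policy']
```

A count equal to the limit is treated as allowed here, so 16 notification channels on one policy passes while 7 conditions does not.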

What's next

  • To create and manage alerting policies with the graphical user interface, see Using the console.
  • To create and manage alerting policies with the Cloud Monitoring API or from the command line, see Using the API.
  • For a detailed conceptual treatment of alerting policies, see Alerting policies in depth.
  • For information on the currently available notification channels, see Notification options.
  • For an assortment of alerting policies, see Sample policies.