Alerting gives you timely awareness of problems in your cloud applications so you can resolve them quickly.
To create an alerting policy, you must describe the circumstances under which you want to be alerted and how you want to be notified. This page provides an overview of alerting policies and the concepts behind them.
For a more hands-on introduction, follow the steps in one of these quickstarts:
For a policy that monitors Stackdriver usage and alerts you when you approach the threshold for billing, see Alerting on Stackdriver usage.
How does alerting work?
Each policy specifies the following:
Conditions that identify an unhealthy state for a resource or a group of resources. The conditions in an alerting policy are monitored continuously; you cannot configure them to be monitored only during certain time periods.
Optional notifications sent through email, SMS, or other channels to let your support team know a resource is unhealthy.
Optional documentation that can be included in some types of notifications to help your support team resolve the issue.
When events trigger conditions in one of your alerting policies, Stackdriver Monitoring creates and displays an incident in the Stackdriver Monitoring console. If you set up notifications, Stackdriver Monitoring also sends notifications to people or third-party notification services. Responders can acknowledge receipt of the notification, but the incident remains open until resources are no longer in an unhealthy state.
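In the Stackdriver Monitoring API, these pieces map onto fields of the `AlertPolicy` resource: conditions, notification channels, and documentation. As a rough sketch (the project and channel identifiers are placeholders, and the condition body is elided), a policy has this shape:

```json
{
  "displayName": "Example policy",
  "combiner": "OR",
  "conditions": [
    {
      "displayName": "Condition that identifies the unhealthy state"
    }
  ],
  "notificationChannels": [
    "projects/[PROJECT_ID]/notificationChannels/[CHANNEL_ID]"
  ],
  "documentation": {
    "content": "Steps your support team should take to resolve the issue.",
    "mimeType": "text/markdown"
  }
}
```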
For more information on these concepts, see Alerting Policies in Depth.
Example
You deploy a web application onto a Compute Engine VM instance that's running a LAMP stack. While you know that HTTP response latency may fluctuate as normal demand rises and falls, if your users start to experience high latency for a significant period of time, you want to take action.
To be notified when your users experience high latency, create the following alerting policy:
If HTTP response latency is higher than two seconds,
and if this condition lasts longer than five minutes,
open an incident and send email to your support team.
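Expressed in the Monitoring API's JSON form, this policy might look like the following sketch. The metric type shown (`custom.googleapis.com/http/response_latency`) is a hypothetical custom metric; substitute the latency metric your application actually writes. "Higher than two seconds for five minutes" becomes a threshold of 2000 ms with a `duration` of `300s`:

```json
{
  "displayName": "HTTP response latency",
  "combiner": "OR",
  "conditions": [
    {
      "displayName": "Latency above two seconds for five minutes",
      "conditionThreshold": {
        "filter": "metric.type=\"custom.googleapis.com/http/response_latency\" AND resource.type=\"gce_instance\"",
        "comparison": "COMPARISON_GT",
        "thresholdValue": 2000,
        "duration": "300s"
      }
    }
  ],
  "notificationChannels": [
    "projects/[PROJECT_ID]/notificationChannels/[EMAIL_CHANNEL_ID]"
  ]
}
```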
Your web app turns out to be more popular than you expected and the response latency grows beyond two seconds. Here's how your alerting policy responds:
Stackdriver Monitoring opens an incident and sends email after five consecutive minutes of HTTP latency higher than two seconds.
The support team receives the email, signs into the Stackdriver Monitoring console, and acknowledges receipt of the notification.
Following the documentation in the notification email, the team is able to address the cause of the latency. Within a few minutes, HTTP response latency drops back below two seconds.
As soon as Stackdriver Monitoring measures HTTP latency below two seconds, the policy's condition is no longer true (even a single measurement of lower latency breaks the "consecutive five minutes" requirement).
Stackdriver Monitoring closes the incident and resets the five-minute timer. If latency again stays above two seconds for five consecutive minutes, the policy opens a new incident.
What's next
- To create and manage alerting policies with the graphical user interface, see Using the console.
- To create and manage alerting policies with the Stackdriver Monitoring API or from the command line, see Using the API.
- For a detailed conceptual treatment of alerting policies, see Alerting policies in depth.
- For information on the currently available notification channels, see Notification options.
- For an assortment of alerting policies in JSON, see Sample policies.
Pricing and limits
There are no costs associated with using alerting policies or uptime checks, but the following limits apply:
| Category | Limit |
|---|---|
| Uptime checks per Workspace | 100 |
| Alerting policies per Workspace | 500 |
| Conditions per alerting policy | 6 |
| Notification channels per alerting policy | 16 |
| Notification channels per Workspace | 4000 |
| Simultaneously open incidents per alerting policy | 5000 |