Alerting gives you timely awareness of problems in your cloud applications so you can resolve them quickly.
To create an alerting policy, you must describe the circumstances under which you want to be alerted and how you want to be notified. This page provides an overview of alerting policies and the concepts behind them.
For a more hands-on introduction, follow the steps in one of these quickstarts:
For information on creating an alerting policy to monitor your Stackdriver Logging ingestion volume, see Alerting on logs usage.
How does alerting work?
Each policy specifies the following:
Conditions that identify an unhealthy state for a resource or a group of resources.
Optional notifications sent through email, SMS, or other channels to let your support team know a resource is unhealthy.
Optional documentation that can be included in some types of notifications to help your support team resolve the issue.
When events trigger conditions in one of your alerting policies, Stackdriver Monitoring creates and displays an incident in the Stackdriver Monitoring console. If you set up notifications, Stackdriver Monitoring also sends notifications to people or third-party notification services. Responders can acknowledge receipt of the notification, but the incident remains open until resources are no longer in an unhealthy state.
For more information on these concepts, see Alerting Policies in Depth.
You deploy a web application onto a Compute Engine VM instance that's running a LAMP stack. While you know that HTTP response latency may fluctuate as normal demand rises and falls, if your users start to experience high latency for a significant period of time, you want to take action.
To be notified when your users experience high latency, create the following alerting policy:
If HTTP response latency is higher than two seconds,
and if this condition lasts longer than five minutes,
open an incident and send email to your support team.
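As a rough illustration, the three parts of this policy (condition, notification, documentation) can be modeled as plain data. The sketch below is not the Stackdriver Monitoring API; the class names, field names, metric label, and email address are all hypothetical and chosen only to mirror the example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Condition:
    """One condition that identifies an unhealthy state (hypothetical model)."""
    metric: str               # hypothetical label, not a real metric type
    threshold_seconds: float  # the condition is met above this value
    duration_seconds: int     # how long the condition must hold


@dataclass
class AlertingPolicy:
    """Conditions plus optional notifications and documentation."""
    name: str
    conditions: List[Condition]
    notification_emails: List[str] = field(default_factory=list)
    documentation: str = ""


# The example policy: latency above two seconds for longer than five minutes.
latency_policy = AlertingPolicy(
    name="High HTTP response latency",
    conditions=[
        Condition(
            metric="http_response_latency",  # hypothetical label
            threshold_seconds=2.0,
            duration_seconds=5 * 60,
        )
    ],
    notification_emails=["support@example.com"],  # placeholder address
    documentation="Steps for diagnosing high latency on the web application.",
)
```

Keeping notifications and documentation optional in the model reflects that a policy needs only its conditions to open an incident.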
Your web app turns out to be more popular than you expected and the response latency grows beyond two seconds. Here's how your alerting policy responds:
Stackdriver Monitoring opens an incident and sends email after five consecutive minutes of HTTP latency higher than two seconds.
The support team receives the email, signs into the Stackdriver Monitoring console, and acknowledges receipt of the notification.
Following the documentation in the notification email, the team is able to address the cause of the latency. Within a few minutes, HTTP response latency drops back below two seconds.
As soon as Stackdriver Monitoring measures HTTP latency below two seconds, the policy's condition is no longer true (even a single measurement of lower latency breaks the "consecutive five minutes" requirement).
Stackdriver Monitoring closes the incident and resets the five-minute timer. If latency later stays above two seconds for another five consecutive minutes, the policy opens a new incident.
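The open-and-close behavior described in this walkthrough can be sketched as a small simulation. This is a simplified model, not how Stackdriver Monitoring is implemented: it assumes one latency measurement per minute, treats "higher than two seconds" as a strict comparison, and counts five consecutive samples as five consecutive minutes. The function name `simulate_incidents` is hypothetical.

```python
from typing import List, Sequence, Tuple


def simulate_incidents(
    samples: Sequence[float],
    threshold: float = 2.0,
    duration_minutes: int = 5,
) -> List[Tuple[int, str]]:
    """Return (minute, event) pairs, where event is "open" or "close".

    Assumes one latency sample (in seconds) per minute, starting at minute 1.
    """
    events: List[Tuple[int, str]] = []
    consecutive = 0        # consecutive samples above the threshold
    incident_open = False

    for minute, latency in enumerate(samples, start=1):
        if latency > threshold:
            consecutive += 1
            # Open an incident once the condition has held long enough.
            if not incident_open and consecutive >= duration_minutes:
                incident_open = True
                events.append((minute, "open"))
        else:
            # A single lower measurement resets the timer and closes
            # any open incident.
            consecutive = 0
            if incident_open:
                incident_open = False
                events.append((minute, "close"))
    return events


latencies = [1.0, 2.5, 2.5, 2.5, 2.5, 2.5, 3.0, 1.5, 2.5, 2.5, 1.0]
print(simulate_incidents(latencies))  # [(6, 'open'), (8, 'close')]
```

In this run, the two high samples at minutes 9 and 10 do not open a new incident, because the measurement at minute 11 resets the count before five consecutive high samples accumulate.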
- To create and manage alerting policies with the graphical user interface, see Using the Console.
- To create and manage alerting policies with the Stackdriver Monitoring API or from the command line, see Using the API.
- For a detailed conceptual treatment of alerting policies, see Alerting Policies in Depth.
- For information on the currently available notification channels, see Notification Options.
- For an assortment of alerting policies in JSON, see Sample Policies.
The types of resources you can monitor in your policies depend on your Stackdriver account. For information on pricing and quotas associated with alerting policies, see Pricing.
The following limits apply to alerting policies and uptime checks:
|Limit|Value|
|---|---|
|Uptime checks per Stackdriver account|100|
|Alerting policies per Stackdriver account|500|
|Conditions per alerting policy|6|
|Notification channels per alerting policy|16|
|Notification channels per Stackdriver account|4000|
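If you generate policies programmatically, it may help to check them against the per-policy limits before submitting them. The following is a minimal sketch under that assumption; the constant and function names are hypothetical, and the values are copied from the limits listed above.

```python
from typing import List

# Per-policy limits taken from the list above.
MAX_CONDITIONS_PER_POLICY = 6
MAX_CHANNELS_PER_POLICY = 16


def check_policy_limits(num_conditions: int, num_channels: int) -> List[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    problems: List[str] = []
    if num_conditions > MAX_CONDITIONS_PER_POLICY:
        problems.append(
            f"{num_conditions} conditions exceeds the limit of "
            f"{MAX_CONDITIONS_PER_POLICY} per alerting policy"
        )
    if num_channels > MAX_CHANNELS_PER_POLICY:
        problems.append(
            f"{num_channels} notification channels exceeds the limit of "
            f"{MAX_CHANNELS_PER_POLICY} per alerting policy"
        )
    return problems


for problem in check_policy_limits(num_conditions=7, num_channels=3):
    print(problem)
```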