To determine, from the perspective of a user or external system, if resources are available and behaving as expected, use uptime checks. An uptime check is a request sent to a resource to see whether it responds.
Components of an uptime check
An uptime check consists of the following components:
- The uptime-check configuration you create. Google Cloud uses this configuration to schedule recurring requests from Google Cloud regions to the target that the configuration specifies. You can also configure public uptime checks to send ICMP pings as part of the check.
- The request-execution system, provided by Google Cloud, that manages the following:
  - Execution of configured checks
  - Validation of results
  - Writing the results to uptime-check metrics
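To make the configuration component more concrete, the following Python sketch builds a dictionary shaped like an uptime-check configuration. It is illustrative only. The field names (`displayName`, `monitoredResource`, `httpCheck`, `pingConfig`, `selectedRegions`, and so on) are assumed to mirror the `UptimeCheckConfig` resource of the Cloud Monitoring API and should be confirmed against the API reference. The `build_uptime_check_config` helper, the `example.com` host, and the `my-project` project ID are hypothetical.

```python
import json


def build_uptime_check_config(display_name, host, project_id, path="/",
                              regions=("USA", "EUROPE", "ASIA_PACIFIC"),
                              pings_count=0):
    """Build a dict shaped like an uptime-check configuration (illustrative)."""
    # Assumed field names; verify them against the Cloud Monitoring API reference.
    http_check = {"path": path, "port": 443, "useSsl": True}
    if pings_count:
        # Optional ICMP pings sent as part of a public uptime check.
        http_check["pingConfig"] = {"pingsCount": pings_count}
    return {
        "displayName": display_name,
        "monitoredResource": {
            "type": "uptime_url",
            "labels": {"project_id": project_id, "host": host},
        },
        "httpCheck": http_check,
        "period": "60s",    # time between requests
        "timeout": "10s",   # how long a checker waits for a response
        "selectedRegions": list(regions),
    }


# Hypothetical values, used only to show the resulting structure.
config = build_uptime_check_config(
    "example-check", "example.com", "my-project", pings_count=2
)
print(json.dumps(config, indent=2))
```

The sketch only assembles data. It doesn't call any Google Cloud service, so creating the check would still require the console, the API, or a client library.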
After you create an uptime check, you can use an alerting policy to monitor the uptime check and notify you if the resource fails to respond.
Types of uptime checks
There are two types of uptime checks:
- Public uptime checks issue requests from multiple locations throughout the world to publicly available URLs or Google Cloud resources.
- Private uptime checks issue requests to internal IP addresses of Google Cloud resources.
The words public and private describe the resources being checked.
The requests made on behalf of uptime checks originate from checkers that reside in several Google Cloud regions. The uptime-check region `USA` includes the `USA_OREGON`, `USA_IOWA`, and `USA_VIRGINIA` regions. Each of the `USA_*` regions has one checker, and `USA` includes all three. The other uptime-check regions, `EUROPE`, `SOUTH_AMERICA`, and `ASIA_PACIFIC`, each have one checker.
When you create an uptime check, you must specify at least three checkers, or
you can request that all uptime checkers issue requests. An uptime check
succeeds if the resource responds and any requirements of the uptime-check
configuration are met.
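The checker counts and the three-checker minimum described above can be sketched as a small validation helper. This is a minimal Python sketch, not part of any Cloud Monitoring library: the names `expand_regions` and `has_enough_checkers` are hypothetical, and the sketch assumes that selecting `USA` together with one of the `USA_*` regions doesn't count the same checker twice.

```python
# Checkers behind each uptime-check region, as described in this section.
USA_SUBREGIONS = {"USA_OREGON", "USA_IOWA", "USA_VIRGINIA"}
SINGLE_CHECKER_REGIONS = USA_SUBREGIONS | {"EUROPE", "SOUTH_AMERICA", "ASIA_PACIFIC"}
MIN_CHECKERS = 3


def expand_regions(regions):
    """Return the set of individual checkers covered by the selected regions."""
    checkers = set()
    for region in regions:
        if region == "USA":
            # USA is shorthand for all three USA_* checkers.
            checkers |= USA_SUBREGIONS
        elif region in SINGLE_CHECKER_REGIONS:
            checkers.add(region)
        else:
            raise ValueError(f"unknown uptime-check region: {region}")
    return checkers


def has_enough_checkers(regions):
    """Check the selection against the three-checker minimum."""
    return len(expand_regions(regions)) >= MIN_CHECKERS
```

For example, `has_enough_checkers(["USA"])` is true because `USA` alone covers three checkers, while `has_enough_checkers(["EUROPE", "ASIA_PACIFIC"])` is false because that selection covers only two.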
You can view the results of uptime checks on the Uptime checks page of the Google Cloud console.
If you've configured a public uptime check to send pings and the check fails, then the results of those pings are written to Cloud Logging. For more information, see Use ICMP pings.
You can also monitor the availability of a resource by creating an alerting policy that creates an incident when the uptime check fails. The alerting policy can be configured to notify you by email or through a different channel, and that notification can include details about the resource that failed to respond.
The results of uptime checks are written to Cloud Monitoring metrics, and alerting policies monitor these metrics. For more information on these metrics, see the `uptime_check` entries in the monitoring metrics table.
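As an illustration of how an alerting-policy condition might select these metrics, the following Python sketch assembles a Monitoring filter string for a single uptime check. It rests on assumptions that should be verified against the metrics table: that the metric type is `monitoring.googleapis.com/uptime_check/check_passed`, that the check is identified by a `check_id` metric label, and that the monitored resource type is `uptime_url`. The `uptime_check_filter` helper and the `example-check` ID are hypothetical.

```python
# Filter-length limit for metric-threshold conditions, from the limits table
# under "Pricing and limits".
MAX_FILTER_LENGTH = 2048


def uptime_check_filter(check_id, resource_type="uptime_url"):
    """Build a Monitoring filter selecting the pass/fail series of one check."""
    parts = [
        # Assumed metric type and label names; confirm them in the metrics table.
        'metric.type="monitoring.googleapis.com/uptime_check/check_passed"',
        f'metric.labels.check_id="{check_id}"',
        f'resource.type="{resource_type}"',
    ]
    flt = " AND ".join(parts)
    if len(flt) > MAX_FILTER_LENGTH:
        raise ValueError("filter exceeds the 2,048-character limit")
    return flt


print(uptime_check_filter("example-check"))
```

The helper returns only a string. Using it in an alerting policy would still require supplying it as the condition's filter through the console, the API, or a client library.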
Pricing and limits
For information about the pricing of uptime checks, see Cloud Monitoring pricing summary.
Because uptime checks are designed to simulate the perspective of users in different parts of the world, Cloud Monitoring doesn't guarantee that the data in the uptime-check request is kept in a specific geographic location. Customers who have set up Assured Workloads because they have data-residency or Impact Level 4 (IL4) requirements shouldn't use uptime checks.
The following limits apply to your use of uptime checks and alerting policies:
Category | Value | Policy type¹ |
---|---|---|
Alerting policies (sum of metric and log) per metrics scope² | 500 | Metric, Log |
Conditions per alerting policy | 6 | Metric |
Maximum time period that a metric-absence condition evaluates³ | 1 day | Metric |
Maximum time period that a metric-threshold condition evaluates³ | 23 hours 30 minutes | Metric |
Maximum length of the filter used in a metric-threshold condition | 2,048 Unicode characters | Metric |
Maximum number of time series monitored by a forecast condition | 64 | Metric |
Minimum forecast window | 1 hour (3,600 seconds) | Metric |
Maximum forecast window | 7 days (604,800 seconds) | Metric |
Notification channels per alerting policy | 16 | Metric, Log |
Maximum rate of notifications | 1 notification every 5 minutes for each log-based alert | Log |
Maximum number of notifications | 20 notifications a day for each log-based alert | Log |
Maximum number of simultaneously open incidents per alerting policy | 1,000 | Metric |
Period after which an incident with no new data is automatically closed | 7 days | Metric |
Maximum duration of an incident if not manually closed | 7 days | Log |
Retention of closed incidents | 13 months | Not applicable |
Retention of open incidents | Indefinite | Not applicable |
Notification channels per metrics scope | 4,000 | Not applicable |
Maximum number of alerting policies per snooze | 16 | Metric, Log |
Retention of a snooze | 13 months | Not applicable |
Uptime checks per metrics scope⁴ | 100 | Not applicable |
Maximum number of ICMP pings per public uptime check | 3 | Not applicable |
² Apigee and Apigee hybrid are deeply integrated with Cloud Monitoring. The alerting limit for all Apigee subscription levels (Standard, Enterprise, and Enterprise Plus) is the same as for Cloud Monitoring: 500 per metrics scope.
³ The maximum time period that a condition evaluates is the sum of the alignment period and the duration window values. For example, if the alignment period is set to 15 hours and the duration window is set to 15 hours, then 30 hours of data is required to evaluate the condition.
⁴ This limit applies to the number of uptime-check configurations. Each uptime-check configuration includes the time interval between tests of the status of the specified resource. See Managing uptime checks for more information.
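The sum described in footnote 3 can be checked with simple arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, and the limit is the metric-threshold value from the limits table.

```python
from datetime import timedelta

# Limit from the table: maximum period a metric-threshold condition evaluates.
MAX_THRESHOLD_EVALUATION = timedelta(hours=23, minutes=30)


def evaluation_period(alignment_period, duration_window):
    """The period a condition evaluates: alignment period plus duration window."""
    return alignment_period + duration_window


def fits_threshold_limit(alignment_period, duration_window):
    """Check the combined period against the metric-threshold limit."""
    return evaluation_period(alignment_period, duration_window) <= MAX_THRESHOLD_EVALUATION


# The footnote's example: 15 hours + 15 hours requires 30 hours of data.
total = evaluation_period(timedelta(hours=15), timedelta(hours=15))
print(total == timedelta(hours=30))
```

With these values, `fits_threshold_limit(timedelta(hours=15), timedelta(hours=15))` is false, because 30 hours exceeds the limit of 23 hours 30 minutes for a metric-threshold condition.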
What's next
- Create public uptime checks
- Create private uptime checks
- Create alerts for uptime checks
- List uptime-check server IP addresses