Uptime checks overview

Use uptime checks to determine, from the perspective of a user or external system, whether resources are available and behaving as expected. An uptime check is a request sent to a resource to see whether it responds.

Components of an uptime check

An uptime check consists of the following components:

  • The uptime-check configuration that you create. This configuration schedules continuous requests from Google Cloud regions to the target that it specifies. You can also configure public uptime checks to send ICMP pings as part of the check.
  • The request-execution system, provided by Google Cloud, that manages the following:
    • Execution of configured checks
    • Validation of results
    • Writing the results to uptime-check metrics

After you create an uptime check, you can use an alerting policy to monitor the uptime check and notify you if the resource fails to respond.

Types of uptime checks

There are two types of uptime checks:

  • Public uptime checks issue requests from multiple locations throughout the world to publicly available URLs or Google Cloud resources.
  • Private uptime checks issue requests to internal IP addresses of Google Cloud resources.

The words public and private describe the resources being checked.

The requests made on behalf of uptime checks originate from checkers that reside in several Google Cloud regions. The uptime-check region USA includes the USA_OREGON, USA_IOWA, and USA_VIRGINIA regions. Each of the USA_* regions has one checker, so USA provides three checkers. The other uptime-check regions, EUROPE, SOUTH_AMERICA, and ASIA_PACIFIC, each have one checker. When you create an uptime check, you must either specify at least three checkers or request that all uptime checkers issue requests.

An uptime check succeeds if the resource responds and any requirements of the uptime-check configuration are met.
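The region rules above can be sketched as a small validation helper. This is only an illustration under stated assumptions: the region names and checker counts come from the paragraph above, while the function names `count_checkers` and `has_enough_checkers` are hypothetical and are not part of any Google Cloud library.

```python
# Checkers per uptime-check region, as described in the text above.
CHECKERS_PER_REGION = {
    "USA": 3,  # aggregate of the three USA_* regions
    "USA_OREGON": 1,
    "USA_IOWA": 1,
    "USA_VIRGINIA": 1,
    "EUROPE": 1,
    "SOUTH_AMERICA": 1,
    "ASIA_PACIFIC": 1,
}
USA_SUBREGIONS = {"USA_OREGON", "USA_IOWA", "USA_VIRGINIA"}
MIN_CHECKERS = 3  # minimum number of checkers stated in the text


def count_checkers(regions):
    """Return how many distinct checkers the selected regions cover."""
    selected = set(regions)
    unknown = selected.difference(CHECKERS_PER_REGION)
    if unknown:
        raise ValueError(f"unknown uptime-check regions: {sorted(unknown)}")
    if "USA" in selected:
        # USA already includes every USA_* checker, so avoid double counting.
        selected -= USA_SUBREGIONS
    return sum(CHECKERS_PER_REGION[region] for region in selected)


def has_enough_checkers(regions):
    """Return True if the selection meets the three-checker minimum."""
    return count_checkers(regions) >= MIN_CHECKERS
```

For example, `has_enough_checkers(["USA"])` is true because USA alone provides three checkers, while `has_enough_checkers(["EUROPE", "ASIA_PACIFIC"])` is false because that selection covers only two.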

You can view the results of uptime checks on the Uptime checks page of the Google Cloud console. The following screenshot shows an example of the Uptime checks page:

[Screenshot: Sample uptime checks overview.]

If you've configured a public uptime check to send pings and the check fails, then the results of those pings are written to Cloud Logging. For more information, see Use ICMP pings.

You can also monitor the availability of a resource by creating an alerting policy that creates an incident when the uptime check fails. The alerting policy can be configured to notify you by email or through a different channel, and that notification can include details about the resource that failed to respond.

The results of uptime checks are written to Cloud Monitoring metrics, and alerting policies monitor these metrics. For more information on these metrics, see the uptime_check entries in the monitoring metrics table.
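As a rough illustration of how an alerting-policy condition might select an uptime-check metric, the following sketch builds a Monitoring filter string. The metric type `monitoring.googleapis.com/uptime_check/check_passed`, the `check_id` label, and the `uptime_url` resource type are assumptions to verify against the monitoring metrics table; the helper name and the check ID are hypothetical. The length check reflects the 2,048-character filter limit listed under Pricing and limits.

```python
# Assumed metric type; confirm it against the uptime_check entries
# in the monitoring metrics table before relying on it.
UPTIME_PASSED_METRIC = "monitoring.googleapis.com/uptime_check/check_passed"

# Maximum filter length for a metric-threshold condition (Unicode characters).
MAX_FILTER_LENGTH = 2048


def build_uptime_filter(check_id, resource_type="uptime_url"):
    """Build a filter string that selects one uptime check's results.

    Both arguments are illustrative; the label and resource-type names
    are assumptions, not values confirmed by this page.
    """
    filter_text = (
        f'metric.type="{UPTIME_PASSED_METRIC}" '
        f'AND metric.label.check_id="{check_id}" '
        f'AND resource.type="{resource_type}"'
    )
    # len() counts Unicode code points, which matches the limit's unit.
    if len(filter_text) > MAX_FILTER_LENGTH:
        raise ValueError("filter exceeds the metric-threshold filter limit")
    return filter_text


# Hypothetical check ID, used only for demonstration.
example_filter = build_uptime_filter("my-uptime-check")
```

A string like `example_filter` would typically be supplied as the filter of a metric-threshold condition; the rest of the alerting policy (threshold, duration, notification channels) is outside the scope of this sketch.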

Pricing and limits

For information about the pricing of uptime checks, see Cloud Monitoring pricing summary.

Because uptime checks are designed to simulate the perspective of users in different parts of the world, Cloud Monitoring doesn't guarantee that the data in the uptime check request is kept in a specific geographic location. Customers who have set up Assured Workloads because they have data-residency or Impact Level 4 (IL4) requirements shouldn't use uptime checks.

The following limits apply to your use of uptime checks and alerting policies:

Category | Value | Policy type [1]
Alerting policies (sum of metric and log) per metrics scope [2] | 500 | Metric, Log
Conditions per alerting policy | 6 | Metric
Maximum time period that a metric-absence condition evaluates [3] | 1 day | Metric
Maximum time period that a metric-threshold condition evaluates [3] | 23 hours 30 minutes | Metric
Maximum length of the filter used in a metric-threshold condition | 2,048 Unicode characters | Metric
Maximum number of time series monitored by a forecast condition | 64 | Metric
Minimum forecast window | 1 hour (3,600 seconds) | Metric
Maximum forecast window | 7 days (604,800 seconds) | Metric
Notification channels per alerting policy | 16 | Metric, Log
Maximum rate of notifications | 1 notification every 5 minutes for each log-based alert | Log
Maximum number of notifications | 20 notifications a day for each log-based alert | Log
Maximum number of simultaneously open incidents per alerting policy | 1,000 | Metric
Period after which an incident with no new data is automatically closed | 7 days | Metric
Maximum duration of an incident if not manually closed | 7 days | Log
Retention of closed incidents | 13 months | Not applicable
Retention of open incidents | Indefinite | Not applicable
Notification channels per metrics scope | 4,000 | Not applicable
Maximum number of alerting policies per snooze | 16 | Metric, Log
Retention of a snooze | 13 months | Not applicable
Uptime checks per metrics scope [4] | 100 | Not applicable
Maximum number of ICMP pings per public uptime check | 3 | Not applicable

[1] Metric: an alerting policy based on metric data. Log: an alerting policy based on log messages (log-based alerts).
[2] Apigee and Apigee hybrid are deeply integrated with Cloud Monitoring. The alerting limit for all Apigee subscription levels (Standard, Enterprise, and Enterprise Plus) is the same as for Cloud Monitoring: 500 per metrics scope.
[3] The maximum time period that a condition evaluates is the sum of the alignment period and the duration window values. For example, if the alignment period is set to 15 hours, and the duration window is set to 15 hours, then 30 hours of data is required to evaluate the condition.
[4] This limit applies to the number of uptime-check configurations. Each uptime-check configuration includes the time interval between tests of the status of the specified resource. See Managing uptime checks for more information.
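The evaluation-period rule in the third footnote (alignment period plus duration window) can be worked through with a short sketch. The limit values are taken from the table above; the function names are hypothetical, and treating each limit as an inclusive upper bound is an assumption.

```python
from datetime import timedelta

# Maximum evaluation periods, taken from the limits table above.
MAX_EVALUATION_PERIOD = {
    "metric-absence": timedelta(days=1),
    "metric-threshold": timedelta(hours=23, minutes=30),
}


def evaluation_period(alignment_period, duration_window):
    """Return the time span of data a condition needs to be evaluated."""
    return alignment_period + duration_window


def within_limit(condition_type, alignment_period, duration_window):
    """Return True if the condition fits within its maximum period.

    Assumes the limit is inclusive; the source does not say either way.
    """
    needed = evaluation_period(alignment_period, duration_window)
    return needed <= MAX_EVALUATION_PERIOD[condition_type]
```

With the footnote's values, `evaluation_period(timedelta(hours=15), timedelta(hours=15))` is 30 hours, which is longer than both limits in the table, so `within_limit` returns false for either condition type.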

What's next