Uptime checks verify the availability of your service from locations around the world. You can use them as part of alerting policies or you can use the uptime checks overview to review their status.
This guide explains how to monitor, create, edit, and delete uptime checks.
Service Tiers: Uptime checks are available to both Basic Tier and Premium Tier Stackdriver accounts, but uptime checks in Premium Tier accounts have some added information. For more information about service tiers, see Stackdriver Pricing.
Firewalls: If the resource you are checking is not publicly available, you must configure the resource's firewall to permit incoming traffic from the uptime-check servers. See Getting IP addresses to download a list of the IP addresses.
To create an uptime check, do one of the following:
Quickstart: The Stackdriver Monitoring Quickstart shows you how to configure an uptime check for a Google Compute Engine VM instance.
New resources: In some circumstances, Stackdriver invites you to create an uptime check for a new resource such as an App Engine version. Stackdriver prepopulates the forms needed to create the check.
Create an uptime check: To create your own uptime check, see Creating uptime checks.
Reviewing uptime checks
The uptime checks overview for your project is in the Stackdriver Monitoring Console at Uptime Checks > Uptime Checks Overview:
For each uptime check, you can see the last check result from each location.
Clicking any of your uptime checks brings you to a dashboard with more detail.
Premium Tier Stackdriver accounts display Uptime for each uptime check.
Uptime is a percentage calculated as (S / T) × 100, where S is the number of successful check responses and T is the total number of check responses, from all locations. For group checks, the values of S and T are summed across all current group members.
For example, over a 25-minute period, an uptime check with the default configuration would get 25 requests from each of 6 locations, for a total of 150 requests. If the dashboard reports an 83.3% uptime, that means 25 of the 150 requests failed.
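The uptime arithmetic above can be reproduced with a short Python sketch. The function names `uptime_percent` and `group_uptime_percent` are illustrative and not part of any Stackdriver API; the sketch only assumes the 100 × S / T relationship implied by the worked example.

```python
from typing import Iterable, Tuple


def uptime_percent(successes: int, total: int) -> float:
    """Return uptime as a percentage: 100 * S / T."""
    if total <= 0:
        raise ValueError("total must be a positive number of responses")
    if not 0 <= successes <= total:
        raise ValueError("successes must be between 0 and total")
    return 100.0 * successes / total


def group_uptime_percent(members: Iterable[Tuple[int, int]]) -> float:
    """Sum S and T over (successes, total) pairs, then compute the percentage."""
    pairs = list(members)
    total_successes = sum(s for s, _ in pairs)
    total_responses = sum(t for _, t in pairs)
    return uptime_percent(total_successes, total_responses)


# Worked example from the text: 25 minutes, 6 locations, one request per
# location per minute, and 25 failed requests.
total_requests = 25 * 6  # 150
failed_requests = 25
print(round(uptime_percent(total_requests - failed_requests, total_requests), 1))  # 83.3
```

For a group check, pass one `(successes, total)` pair per group member to `group_uptime_percent`; the counts are summed before the percentage is taken, rather than averaging the per-member percentages.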
Creating uptime checks
This section explains how to create and configure uptime checks.
In the Stackdriver Monitoring Console, go to Uptime Checks for your Cloud Platform Console project:
Click Add Uptime Check in the top-right area of the page. You see the New Uptime Check panel:
To create a new uptime check, fill in the New Uptime Check form. You must specify the following basic options for all uptime checks. The example values given with each option configure an HTTP check of a URL resource.
Title: Enter a name for your check. For example, enter Example.com uptime check.
Check Type: Choose a protocol: HTTP, HTTPS, or TCP. For example, choose HTTP.
Resource type: Choose one of the following resource types. For example, choose URL:
- App Engine: Google App Engine applications (modules).
- Elastic Load Balancer: Elastic load balancers.
- Instance: Compute Engine or AWS EC2 instances.
- URL: Any hostname and path.
Complete the connection information, depending on your check type and resource type:
Applies to (App Engine, ELB, or Instance): You can apply your uptime check to a single resource or to a group of resources, such as "All instances." If you choose a single resource, pick one from your existing resources as listed in the menu.
Module (App Engine): Specify your application module.
Hostname (All but App Engine): Specify your service's host name. For example, enter
Path (HTTP, HTTPS): Enter a path within your host or resource or use the default path. For example, leave this field blank.
Port (TCP): Choose a port for the connection.
Response content contains the text (TCP): Fill in a string whose presence in the check response indicates success.
Check every: Choose 1, 5, 10, or 15 minutes. For example, choose five minutes, so that each geographic location attempts to reach your service every five minutes. Using the default six locations, your service would see an average of 1.2 requests per minute (6 requests every 5 minutes).
Advanced options (HTTP, HTTPS, TCP): These settings are optional and are described in the following section. If you don't want to specify advanced options, you can skip ahead to Test and save.
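To estimate the load that the Check every setting places on your service, the request-rate arithmetic can be sketched as follows. The helper name `requests_per_minute` is hypothetical, and the sketch assumes every active location sends exactly one request per interval.

```python
ALLOWED_INTERVALS_MINUTES = (1, 5, 10, 15)  # the "Check every" choices


def requests_per_minute(locations: int, interval_minutes: int) -> float:
    """Average requests per minute when each location checks once per interval."""
    if interval_minutes not in ALLOWED_INTERVALS_MINUTES:
        raise ValueError("interval must be 1, 5, 10, or 15 minutes")
    if locations < 1:
        raise ValueError("at least one location is required")
    return locations / interval_minutes


# Six locations checking every five minutes, as in the example above.
print(requests_per_minute(6, 5))  # 1.2
```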
Advanced options vary by check type:
HTTP Host Header: Fill this in to check virtual hosts.
Port: Specify a port number.
Response content contains the text: Fill in a string whose presence in the check response indicates success. The HTTP status code is also used to determine success and failure.
Locations: Select the geographic regions from which requests are sent to your service. Select enough regions so that at least three locations are active. New checker locations in selected regions automatically send requests to the configured destinations. To always send requests from all available locations, select "Global," the default.
Custom Headers: Supply custom headers, and encrypt them if necessary. Encryption causes the headers' values to be hidden in the form. Use encryption for headers related to authentication that you do not want to be seen by other members of your team.
Healthcheck: Specify a timeout, from 1 to 60 seconds. Uptime checks that do not get a response within this period are considered failures.
Authentication: Provide a single username and password.
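The way the timeout, the HTTP status code, and the response-content option combine can be illustrated with a minimal sketch. This is not Stackdriver's implementation: `CheckConfig` and `is_success` are hypothetical names, the 10-second default is a placeholder, and the sketch assumes that only 2xx status codes count as success, which this guide does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckConfig:
    timeout_seconds: float = 10.0        # placeholder; must be 1 to 60 seconds
    expected_text: Optional[str] = None  # "Response content contains the text"


def is_success(config: CheckConfig, status_code: int, body: str,
               elapsed_seconds: float) -> bool:
    """Decide whether one check response counts as a success."""
    if not 1 <= config.timeout_seconds <= 60:
        raise ValueError("timeout must be between 1 and 60 seconds")
    # No response within the timeout is a failure.
    if elapsed_seconds > config.timeout_seconds:
        return False
    # ASSUMPTION: treat 2xx as success; the real rule is not documented here.
    if not 200 <= status_code < 300:
        return False
    # If a content string is configured, it must appear in the response.
    if config.expected_text is not None and config.expected_text not in body:
        return False
    return True


config = CheckConfig(timeout_seconds=10, expected_text="OK")
print(is_success(config, 200, "status: OK", 0.4))
```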
Test and save
When you are finished specifying the basic and advanced options:
Click Test. If your uptime check fails, correct your configuration. Some possible causes of failure:
- Connection Error - Refused: If you are using the default HTTP connection type, check that you have a web server installed that is responding to HTTP requests. This can happen on a new instance if you have not installed a web server; see the Quickstart. If you use an HTTPS connection type, you might have to perform additional configuration steps. For firewall issues, see Getting IP addresses.
- Name or service not found: The host name might be incorrect.
- 403 Forbidden: The service is returning an error code to the uptime checker. For example, the default Apache web server configuration returns this code under Amazon Linux, but it returns code 200 (Success) under some other Linux versions. See the LAMP tutorial for Amazon Linux or your web server's documentation.
- 404 Not found: The path might be incorrect.
- 408 Request timeout, or no response: The port number might be incorrect, the service might not be running, or the service might be inaccessible. Check that your firewall allows traffic from the uptime servers; see Getting IP addresses. The timeout limit is specified in the Healthcheck part of Advanced Options.
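When a test fails, it can help to reproduce the connection from a machine you control and map the error onto the causes listed above. The following Python sketch uses only the standard `socket` module; the function names `classify_failure` and `probe_tcp` are hypothetical. Because the probe runs from your own machine rather than from the uptime-check servers, firewall behavior may differ from what the real check sees.

```python
import socket


def classify_failure(exc: BaseException) -> str:
    """Map a connection exception onto the failure causes listed above."""
    if isinstance(exc, ConnectionRefusedError):
        return "Connection Error - Refused: is a server listening on this port?"
    if isinstance(exc, socket.gaierror):
        return "Name or service not found: the host name might be incorrect."
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "Request timeout: check the port, the service, and the firewall."
    return "Unclassified failure: {!r}".format(exc)


def probe_tcp(host: str, port: int, timeout_seconds: float = 10.0) -> str:
    """Attempt one TCP connection and describe the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return "ok"
    except OSError as exc:  # covers refused, DNS, and timeout errors
        return classify_failure(exc)


if __name__ == "__main__":
    # Replace host and port with the values from your uptime check.
    print(probe_tcp("localhost", 80, timeout_seconds=3))
```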
There can be a delay of up to 25 minutes before the uptime check results start to flow into Stackdriver Monitoring. During that time, the uptime check dashboard reports the status as "no data available."
Editing uptime checks
To edit an uptime check:
In the Uptime Checks Overview, click Edit in the menu on the right side of your uptime check's summary.
Alternatively, in the dashboard for your uptime check, choose Edit Uptime Check from the menu at the top-right of the page.
Change the fields as needed. Click Test to see that the check works. If the test fails, see Test and save for possible causes.
There can be a delay of up to 25 minutes before you see the new uptime check results. During that time, the results of the former uptime check are displayed in the dashboard and are used in alerting policies.
Removing uptime checks
To remove an uptime check:
If your uptime check is part of an alerting policy, decide if you want to remove the uptime check from the policy or remove the policy.
If you do not remove the uptime check from the policy, the policy ignores the missing uptime check. The policy does not create an incident for the missing check.
In the Uptime Checks Overview, select Delete from the menu on the right side of your uptime check.
Alternatively, in the detail page for your uptime check, choose Delete Uptime Check from the menu at the top-right of the page.
Getting IP addresses for uptime servers
If you are checking a service that is behind a firewall, you can configure your service's firewall to accept traffic from the current set of IP addresses used for uptime checking. To get the IP addresses:
Go to the Uptime Checks page for your project.
In the More menu next to Add Uptime Check, select Download Source IPs:
The downloaded text file typically contains about 20 IP addresses. Uptime checks can come from any of the IP addresses, but only one address from each geographic location is used for each time interval. The geographic locations are listed in the uptime checks dashboard, shown in the previous section. You can also use free, web-based services to identify the registered locations of the IP addresses you downloaded.
The IP addresses used by uptime checking might change, but typically not more than once per quarter and not without an announcement. Go to the Uptime Checks page to get the current set of addresses.
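Once you have downloaded the list, you will usually need to turn it into entries for your firewall's allowlist. The sketch below assumes the downloaded text file contains one IP address per line, which this guide does not confirm, and the function names are hypothetical. It validates each address with the standard `ipaddress` module and emits single-host CIDR ranges; the sample addresses come from the reserved documentation ranges and are not real uptime-check addresses.

```python
import ipaddress
from typing import List


def parse_source_ips(text: str) -> List[str]:
    """Return the unique, validated IP addresses found in the text."""
    addresses = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # ip_address() raises ValueError on malformed input.
        addresses.append(str(ipaddress.ip_address(line)))
    # Remove duplicates while keeping the original order.
    return list(dict.fromkeys(addresses))


def to_cidrs(addresses: List[str]) -> List[str]:
    """Convert each address to a single-host CIDR range (/32 or /128)."""
    return [str(ipaddress.ip_network(a)) for a in addresses]


sample = "192.0.2.10\n198.51.100.7\n\n192.0.2.10\n"
print(to_cidrs(parse_source_ips(sample)))
```

Because the addresses can change, rerun the download and regenerate the allowlist whenever a change is announced.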
Identifying uptime check traffic
You can identify requests from the uptime-check servers by the following information in your service's request logs:
- ip: The ip field contains one of the addresses used by the uptime-check servers. See Getting IP addresses.
- User-Agent: The User-Agent header value is always "GoogleStackdriverMonitoring-UptimeChecks(https://cloud.google.com/monitoring)". Specifying a "User-Agent" custom header results in a form validation error and prevents the check configuration from being saved.
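These two fields can be combined to filter uptime-check traffic out of your own request logs. The following sketch assumes each log entry has already been parsed into a dictionary with `ip` and `user_agent` keys; that layout, and the function name `is_uptime_check_request`, are illustrative assumptions rather than a documented log format. The sample addresses are from the reserved documentation ranges.

```python
from typing import Dict, Set

UPTIME_USER_AGENT = (
    "GoogleStackdriverMonitoring-UptimeChecks(https://cloud.google.com/monitoring)"
)


def is_uptime_check_request(entry: Dict[str, str], uptime_ips: Set[str]) -> bool:
    """True if the entry matches both the User-Agent and a known source IP."""
    return (
        entry.get("user_agent") == UPTIME_USER_AGENT
        and entry.get("ip") in uptime_ips
    )


# Hypothetical parsed log entries and downloaded source IPs.
uptime_ips = {"192.0.2.10", "198.51.100.7"}
entries = [
    {"ip": "192.0.2.10", "user_agent": UPTIME_USER_AGENT},
    {"ip": "203.0.113.5", "user_agent": "Mozilla/5.0"},
]
user_traffic = [e for e in entries if not is_uptime_check_request(e, uptime_ips)]
print(len(user_traffic))  # 1
```

Requiring both fields to match is a deliberate choice: the User-Agent value alone can be sent by any client, so the IP check guards against spoofed headers.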