This guide explains how to use uptime checks to test the availability of your services from locations around the world. The uptime checks page for your project is in the Stackdriver Monitoring Console at Alerting > Uptime Checks.
The Stackdriver Monitoring Quickstart shows you how to configure an uptime check for a Google Compute Engine VM instance.
Before you begin
If the resource or service you are checking is not publicly available, you must configure your firewall to permit incoming traffic from the uptime-check servers. See Getting IP addresses to download a list of the IP addresses.
Configuring uptime checks for new instances
When your project starts a new virtual machine instance, Stackdriver Monitoring might ask you to create an uptime check for that instance. If you agree, you see the New Uptime Check panel with its fields pre-populated for your instance. You might be able to create an uptime check simply by clicking Save.
Configuring uptime checks
This section explains how to configure uptime checks.
In the Stackdriver Monitoring Console, go to Alerting > Uptime Checks for your Cloud Platform Console project.
Click Add Uptime Check in the top-right area of the page. You see the New Uptime Check panel.
Fill in the following basic options. If you follow the "For example" suggestions, you can configure an uptime check for the example.com web site.
Title: Enter a name for your check. For example, enter Example.com uptime check.
Check Type: Choose a protocol: HTTP, HTTPS, or TCP. For example, choose HTTP.
Resource type: Choose one of the following resource types. For example, choose URL:
- App Engine: Google App Engine applications (modules).
- Elastic Load Balancer: Elastic load balancers.
- Instance: Compute Engine or AWS EC2 instances.
- URL: Any hostname and path.
Complete the connection information, depending on your check type and resource type:
Applies to (App Engine, ELB, or Instance): You can apply your uptime check to a single resource or to a group of resources, such as "All instances." If you choose a single resource, pick one from your existing resources as listed in the menu.
Module (App Engine): Specify your application module.
Hostname (All but App Engine): Specify your service's host name. For example, enter
Path (HTTP, HTTPS): Enter a path within your host or resource or use the default path. For example, leave this field blank.
Port (TCP): Choose a port for the connection.
Response content contains the text (TCP): Fill in a string whose presence in the check response indicates success.
Check every: 1, 5, 10, or 15 minutes. For example, choose five minutes. Each geographic location attempts to reach your service every five minutes. Using the default six locations, your service would see an average of 1.2 requests per minute.
Advanced options (HTTP, HTTPS, TCP): These options are shown in the following section, or you can skip ahead to Test and save.
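The request rate quoted under "Check every" follows from dividing the number of checker locations by the check period. The following is a minimal sketch of that arithmetic; the function name `requests_per_minute` is a hypothetical helper and not part of Stackdriver Monitoring.

```python
def requests_per_minute(locations: int, period_minutes: int) -> float:
    """Average request rate the service sees across all checker locations.

    Each location sends one request per period, so the combined rate is
    locations / period_minutes.
    """
    if locations <= 0 or period_minutes <= 0:
        raise ValueError("locations and period_minutes must be positive")
    return locations / period_minutes


# The guide's example: six default locations, one check every five minutes.
print(requests_per_minute(6, 5))  # 1.2
```

With the same six locations and a one-minute period, the service would see an average of six requests per minute.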
Advanced options vary by check type:
HTTP Host Header: Fill this in to check virtual hosts.
Port: Specify a port number.
Response content contains the text: Fill in a string whose presence in the check response indicates success. The HTTP status code is also used to determine success and failure.
Locations: Select the geographic regions from which the uptime check sends requests to your service. You must select enough regions so that at least three checker locations are active. New checker locations in the selected regions automatically send requests to the configured destinations. To always send requests from all available locations, select "Global," the default.
Custom Headers: Supply custom headers, and encrypt them if necessary. Encryption causes the headers' values to be hidden in the form. Use encryption for headers related to authentication that you do not want to be seen by other members of your team.
Healthcheck: Specify a timeout, from 1 to 60 seconds. Uptime checks that do not get a response within this period are considered failures.
Authentication: Provide a single username and password.
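The constraints stated in this guide (the "Check every" choices, the 1 to 60 second timeout, the minimum of three active locations, and the User-Agent restriction described under Identifying uptime check traffic) can be checked before you fill in the form. The sketch below is only an illustration: `UptimeCheckDraft` and `validate` are hypothetical names, the default timeout is a placeholder, and the console performs its own validation.

```python
from dataclasses import dataclass, field
from typing import Dict, List

ALLOWED_PERIODS_MINUTES = (1, 5, 10, 15)  # "Check every" choices from the guide
MIN_TIMEOUT_SECONDS, MAX_TIMEOUT_SECONDS = 1, 60  # Healthcheck timeout range
MIN_ACTIVE_LOCATIONS = 3  # "at least three locations are active"


@dataclass
class UptimeCheckDraft:
    """Hypothetical stand-in for the form fields; not a Stackdriver type."""
    period_minutes: int = 5
    timeout_seconds: int = 10  # placeholder value, not a documented default
    active_locations: int = 6
    custom_headers: Dict[str, str] = field(default_factory=dict)


def validate(draft: UptimeCheckDraft) -> List[str]:
    """Return a list of problems; an empty list means the draft looks valid."""
    problems = []
    if draft.period_minutes not in ALLOWED_PERIODS_MINUTES:
        problems.append("period must be 1, 5, 10, or 15 minutes")
    if not MIN_TIMEOUT_SECONDS <= draft.timeout_seconds <= MAX_TIMEOUT_SECONDS:
        problems.append("timeout must be between 1 and 60 seconds")
    if draft.active_locations < MIN_ACTIVE_LOCATIONS:
        problems.append("at least three checker locations must be active")
    # HTTP header names are case-insensitive, so compare in lower case.
    if any(name.lower() == "user-agent" for name in draft.custom_headers):
        problems.append("a custom User-Agent header is not allowed")
    return problems


print(validate(UptimeCheckDraft(timeout_seconds=90)))
```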
Test and save
When you are finished with the basic and advanced options:
Click Test. If your uptime check fails, correct your configuration. Some possible causes of failure:
- Connection Error - Refused: If you are using the default HTTP connection type, check that you have a web server installed that is responding to HTTP requests. This can happen on a new instance if you have not installed a web server; see the Quickstart. If you use an HTTPS connection type, you might have to perform additional configuration steps. For firewall issues, see Getting IP addresses.
- Name or service not found: The host name might be incorrect.
- 403 Forbidden: The service is returning an error code to the uptime checker. For example, the default Apache web server configuration returns this code under Amazon Linux, but it returns code 200 (Success) under some other Linux versions. See the LAMP tutorial for Amazon Linux or your web server's documentation.
- 404 Not found: The path might be incorrect.
- 408 Request timeout, or no response: The port number might be incorrect, the service might not be running, or the service might be inaccessible. Check that your firewall allows traffic from the uptime servers; see Getting IP addresses. The timeout limit is specified in the Healthcheck part of Advanced Options.
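If you want to reproduce a failure from your own machine before retesting, the causes listed above map fairly directly onto the errors that Python's standard library raises for an HTTP request. The following sketch assumes you make the request yourself with `urllib.request.urlopen`; the helper names `explain_status` and `explain_exception` are hypothetical, and the uptime checker's own error reporting may differ.

```python
import socket
import urllib.error
from typing import Optional


def explain_status(code: int) -> Optional[str]:
    """Map an HTTP status code to the likely cause named in this guide."""
    hints = {
        403: "The service returned an error code; check the web server configuration.",
        404: "The path might be incorrect.",
        408: "The port might be incorrect, or the service is not running or inaccessible.",
    }
    return hints.get(code)


def explain_exception(exc: BaseException) -> str:
    """Map a request exception to the likely cause named in this guide."""
    if isinstance(exc, urllib.error.HTTPError):
        return explain_status(exc.code) or "Unexpected HTTP status %d." % exc.code
    # urllib wraps low-level socket errors in URLError; unwrap to see the cause.
    reason = exc.reason if isinstance(exc, urllib.error.URLError) else exc
    if isinstance(reason, ConnectionRefusedError):
        return "Connection refused: check that a web server is installed and responding."
    if isinstance(reason, socket.gaierror):
        return "Name or service not found: the host name might be incorrect."
    if isinstance(reason, (socket.timeout, TimeoutError)):
        return "No response: check the port, the service, and your firewall rules."
    return "Unrecognized failure: %r" % (reason,)


print(explain_exception(urllib.error.URLError(ConnectionRefusedError())))
```

A request made from your own machine does not originate from the uptime-check servers, so a local success does not rule out a firewall problem.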
There can be a delay of up to 25 minutes before the uptime check results start to flow into Stackdriver Monitoring. During that time, the uptime check dashboard will report the status as "no data available."
To monitor your uptime checks, see Monitoring uptime checks.
To be notified when the uptime checks fail, create an alerting policy. There is a Create Alerting Policy selection for each uptime check in the right-side menus. You can create the alerting policy while you are waiting for the uptime check data to arrive in Stackdriver Monitoring. For an example of setting up an alerting policy, see the Quickstart.
Monitoring uptime checks
Your uptime checks are listed on the Uptime Checks page. For each uptime check, you can see the last check result from each checker location.
It is normal for the dashboard to report "no data available" after you create new uptime checks.
Editing uptime checks
To edit uptime checks:
In the Uptime Checks page for your project, click Edit at the right side of your uptime check.
Change the fields as needed. Click Test to see that the check works. If the test fails, see Test and save for possible causes.
There can be a delay of up to 25 minutes before you see the new uptime check results. During that time, the results of the former uptime check are displayed in the dashboard and are used in alerting policies.
Removing uptime checks
To remove uptime checks:
- If your uptime check is part of an alerting policy, decide whether you want to remove the uptime check from the policy or remove the policy itself. If you do neither, the alerting policy ignores the deleted uptime check.
In the Uptime Checks page for your project, select Delete from the More menu on the right side of your uptime check.
Getting IP addresses for uptime servers
If you are checking a service that is behind a firewall, you can configure your service's firewall to accept traffic from the current set of IP addresses used for uptime checking. To get the IP addresses:
Go to the Uptime Checks page for your project.
In the More menu next to Add Uptime Check, select Download Source IPs.
The downloaded text file typically contains about 20 IP addresses. Uptime checks can come from any of the IP addresses, but only one address from each geographic location is used for each time interval. The geographic locations are listed in the uptime checks dashboard, shown in the previous section. You can also use free, web-based services to identify the registered locations of the IP addresses you downloaded.
The IP addresses used by uptime checking might change, but typically not more than once per quarter and not without an announcement. Go to the Uptime Checks page to get the current set of addresses.
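Because the address list can change, it can help to compare a freshly downloaded file against the one your firewall rules were built from. The sketch below assumes the downloaded text file contains one IP address per line, which this guide does not specify; the function names are hypothetical, and the sample addresses are taken from ranges reserved for documentation.

```python
import ipaddress
from typing import Set, Tuple, Union

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def parse_source_ips(text: str) -> Set[IPAddress]:
    """Parse the downloaded file, assuming one address per line."""
    addresses = set()
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            addresses.add(ipaddress.ip_address(line))  # raises ValueError if malformed
    return addresses


def changed_addresses(
    old: Set[IPAddress], new: Set[IPAddress]
) -> Tuple[Set[IPAddress], Set[IPAddress]]:
    """Return (added, removed) so firewall rules can be updated to match."""
    return new - old, old - new


previous = parse_source_ips("192.0.2.1\n192.0.2.2\n")
current = parse_source_ips("192.0.2.2\n198.51.100.7\n")
added, removed = changed_addresses(previous, current)
print("allow:", sorted(map(str, added)), "revoke:", sorted(map(str, removed)))
```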
Identifying uptime check traffic
You can identify requests from the uptime-check servers by the following information in your service's request logs:
- ip: The ip field will contain one of the addresses used by the uptime-check servers. See Getting IP addresses.
- User-Agent: The User-Agent header value is always "GoogleStackdriverMonitoring-UptimeChecks(https://cloud.google.com/monitoring)". Specifying a "User-Agent" custom header will result in a form validation error and prevent the check configuration from being saved.
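The two signals above can be combined to filter uptime-check requests out of your own log analysis. This is a minimal sketch that assumes you have already extracted the client IP and User-Agent from each log record and that you hold the downloaded addresses as a set of strings; the function name and the sample addresses are hypothetical.

```python
from typing import Set

# The User-Agent value stated in this guide.
UPTIME_USER_AGENT = (
    "GoogleStackdriverMonitoring-UptimeChecks(https://cloud.google.com/monitoring)"
)


def is_uptime_check_request(ip: str, user_agent: str, uptime_ips: Set[str]) -> bool:
    """Return True only when both the User-Agent and the source IP match.

    Requiring both is a design choice: a User-Agent header alone can be set
    by any client, so the IP list acts as a second check.
    """
    return user_agent == UPTIME_USER_AGENT and ip in uptime_ips


# Sample addresses come from ranges reserved for documentation.
known_ips = {"192.0.2.1", "198.51.100.7"}
print(is_uptime_check_request("192.0.2.1", UPTIME_USER_AGENT, known_ips))
```

The comparison here is on plain strings, so the addresses in your logs need to be written in the same form as the entries in the downloaded list.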