Google Cloud provides health checking mechanisms that determine if backends — such as instance groups and network endpoint groups (NEGs) — properly respond to traffic. This document discusses health checking concepts specific to Google Cloud and its load balancers.
Google Cloud provides global and regional health check systems that connect to backends on a configurable, periodic basis. Each connection attempt is called a probe, and each health check system is called a prober. Google Cloud records the success or failure of each probe.
Health checks and load balancers work together. Based on a configurable number of sequential successful or failed probes, Google Cloud computes an overall health state for each backend in the load balancer. Backends that respond successfully for the configured number of times are considered healthy. Backends that fail to respond successfully for a separate number of times are unhealthy.
Google Cloud uses the overall health state of each backend to determine its eligibility for receiving new requests or connections. In addition to being able to configure probe frequency and health state thresholds, you can configure the criteria that define a successful probe. This document describes how health checks work in detail.
Google Cloud uses special routes not defined in your VPC network for health checks. For complete information on this, read Load balancer return paths.
Health check categories, protocols, and ports
Google Cloud organizes health checks by category and protocol.
There are two health check categories: health checks and legacy health checks. Each category supports a different set of protocols and a means for specifying the port used for health checking. The protocol and port determine how Google Cloud health check systems contact your backends. For example, you can create a health check that uses the HTTP protocol on TCP port 80, or you can create a health check that uses the TCP protocol for a named port configured on an instance group.
Most Google Cloud load balancers require non-legacy health checks, but Network Load Balancing requires legacy health checks that use the HTTP protocol. Refer to Selecting a health check for specific guidance on selecting the category and the protocol, and specifying the ports.
You cannot convert a legacy health check to a health check or vice versa.
The term health check does not refer to legacy health checks. Legacy health checks are explicitly called legacy health checks in this document.
Selecting a health check
Health checks must be compatible with the type of load balancer and the types of backends (instance groups or network endpoint groups) it uses. The three factors you must specify when you create a health check are:
- Category: health check or legacy health check, which must be compatible with the load balancer
- Protocol: defines what protocol the Google Cloud systems use to periodically probe your backends
- Port specification: defines which ports are used for the health check's protocol
The guide at the end of this section summarizes valid combinations of health check category, protocol, and port specification based on a given type of load balancer and backend type.
As used in this section, the term instance group refers to unmanaged instance groups, managed zonal instance groups, or managed regional instance groups.
Category and protocol
The type of load balancer and the types of backends that the load balancer uses determine the health check's category. Network Load Balancing requires legacy health checks that use the HTTP protocol. For all other load balancer types, use regular health checks.
You must select a protocol from the list of protocols supported by the health check's category. It's a best practice to use the same protocol as the load balancer itself; however, this is not a requirement, nor is it always possible. For example, network load balancers require legacy health checks, and they require that the legacy health checks use the HTTP protocol, despite the fact that Network Load Balancing supports TCP and UDP in general. For network load balancers, you must run an HTTP server on your VMs so that they can respond to health check probes.
The following table lists the health check categories and the protocols each category supports.
|Health check category||Supported protocols|
|Health check||• HTTP
• HTTP/2 (with TLS)
|Legacy health check||• HTTP
• HTTPS (Legacy HTTPS health checks are not supported for network load balancers and cannot be used with most other types of load balancers.)|
Category and port specification
In addition to a protocol, you must select a port specification for your health check. Health checks provide three port specification methods, and legacy health checks provide one method. Not all port specification methods are applicable to each type of load balancer. The type of load balancer and the types of backends it uses determine which port specification method you can use.
|Health check category||Port specification methods and meanings|
|Legacy health check||
Load balancer guide
This table shows supported category, scope, and port specification for each Google Cloud load balancer and backend type.
|Load balancer||Backend type||Health check category and scope||Port specification|
|Internal TCP/UDP Load Balancing 1||Instance groups||Health check (global)||
|Internal HTTP(S) Load Balancing||NEGs||Health check (regional)||
|Instance groups||Health check (regional)||
|Network Load Balancing||Instances in target pools||Legacy health check (global) using the HTTP protocol||Legacy health checks only support the port number
|NEGs||Health check (global)||
|Instance groups||Health check (global)||
--use-serving-port flag because internal backend services do not have an associated named port.
2 It is possible, but not recommended, to use a legacy health check for backend services associated with external HTTP(S) load balancers under the following circumstances:
- The backends are instance groups, not network endpoint groups.
- The backend VMs serve traffic using either
How health checks work
When you create a health check or create a legacy health check, you specify the following flags or accept their default values. Each health check or legacy health check you create is implemented by multiple probes. These flags control how frequently each Google Cloud health check probe evaluates instances in instance group backends or endpoints in NEG backends.
A health check's settings cannot be configured on a per-backend basis. Health checks are associated with a whole backend service, and legacy health checks are associated with either a whole target pool (for Network Load Balancing) or backend service (for certain external HTTP(S) Load Balancing configurations). Thus, the parameters for the probe are the same for all backends referenced by a given backend service or target pool.
|Configuration flag||Purpose||Default value|
||The check interval is the amount of time from the start of one probe issued by one prober to the start of the next probe issued by the same prober. Units are seconds.||If omitted, Google Cloud uses
||The timeout is the amount of time that Google Cloud will wait for a response to a probe. Its value must be less than or equal to the check interval. Units are seconds.||If omitted, Google Cloud uses
Probe IP ranges and firewall rules
For health checks to work, you must create ingress allow firewall rules so that traffic from Google Cloud probers can connect to your backends.
The following table shows the source IP ranges to allow, depending on the type of load balancer:
|Load balancer||Probe source IP ranges||Firewall rule example|
• Internal TCP/UDP Load Balancing
• Internal HTTP(S) Load Balancing
• External HTTP(S) Load Balancing
• SSL Proxy Load Balancing
• TCP Proxy Load Balancing
|Firewall rules for all load balancers except network load balancers|
|• Network Load Balancing||
|Firewall rules for network load balancers|
Importance of firewall rules
Google Cloud requires that you create the necessary ingress allow firewall rules to permit traffic from probers to your backends. As a best practice, limit these rules to just the protocols and ports that match those used by your health checks. For the source IP ranges, make sure to use the documented ranges.
If you do not have ingress allow firewall rules that permit the protocol, port, and source IP range used by your health check, the implied deny ingress firewall rule blocks inbound traffic from all sources. When probers can't contact your backends, the Google Cloud load balancer categorizes all of your backends as unhealthy.
The behavior when all backends are unhealthy depends on the type of load balancer:
An external HTTP(S) load balancer returns HTTP 502 responses to clients when all backends are unhealthy.
An internal HTTP(S) load balancer returns HTTP 503 responses to clients when all backends are unhealthy.
SSL proxy load balancers and TCP proxy load balancers time out when all backends are unhealthy.
A network load balancer attempts to distribute traffic to all backend VMs when they are all unhealthy as a means of last resort.
An internal TCP/UDP load balancer without failover configured distributes traffic to all backend VMs when they are all unhealthy as a means of last resort. You can disable this behavior if you enable failover.
Security considerations for probe IP ranges
Consider the following information when planning health checks and the necessary firewall rules:
The probe IP ranges belong to Google. Google Cloud uses special routes, outside of your VPC network but within Google's production network, to facilitate communication from probers.
Google uses the probe IP ranges exclusively to execute health check probes and to send traffic from Google Front Ends (GFEs) for external HTTP(S) load balancers, SSL proxy load balancers, and TCP proxy load balancers. If a packet is received from the internet, including the external IP address of a Compute Engine instance or a GKE node, and the packet's source IP address is within a probe IP range, Google drops the packet.
The probe IP ranges are a complete set of possible IP addresses used by Google Cloud probers. If you use tcpdump or a similar tool, you might not observe traffic from all IP addresses in all of the probe IP ranges. As a best practice, create ingress allow firewall rules for your chosen load balancer using all of the probe IP ranges as sources because Google Cloud can implement new probers automatically without notification.
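When you compare a packet capture against the ranges you have allowed, a short membership test can help. The following is a minimal sketch only: the CIDR blocks are reserved documentation ranges (RFC 5737) used as placeholders, not Google's probe ranges, and `is_allowed_source` is a hypothetical helper name. Substitute the documented probe ranges for your load balancer type.

```python
import ipaddress

# Placeholder CIDR blocks taken from the RFC 5737 documentation ranges.
# These are NOT Google's probe ranges; replace them with the documented ranges.
ALLOWED_SOURCE_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_allowed_source(source_ip: str) -> bool:
    """Return True if source_ip falls inside any allowed source range."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in ALLOWED_SOURCE_RANGES)
```

Because a capture might not show every address in every range, a check like this is useful for confirming that observed sources are covered, not for deciding which ranges to allow.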
Multiple probes and frequency
Google Cloud sends health check probes from multiple redundant systems called probers. Probers use specific source IP ranges. Google Cloud does not rely on just one prober to implement a health check — multiple probers simultaneously evaluate the instances in instance group backends or the endpoints in NEG backends. If one prober fails, Google Cloud continues to track backend health states.
The interval and timeout settings you configure for a health check are applied to each prober. As a result, for a given backend, software access logs and tcpdump show more frequent probes than your configured settings suggest.
This is expected behavior, and you cannot configure the number of probers that Google Cloud uses for health checks. However, you can estimate the probe frequency per backend service by considering the following factors:
Base frequency per backend service: Each health check has an associated check frequency, inversely proportional to the configured check interval: check frequency = 1 / (check interval).
When you associate a health check with a backend service, you establish a base frequency used by each prober for backends on that backend service.
Probe scale factor: The backend service's base frequency is multiplied by the number of simultaneous probers that Google Cloud uses. This number can vary, but is generally between 5 and 10.
Multiple forwarding rules for internal TCP/UDP load balancers: If you have configured multiple internal forwarding rules (each having a different IP address) pointing to the same regional internal backend service, Google Cloud uses multiple probers to check each IP address. The probe frequency per backend service is multiplied by the number of configured forwarding rules.
Multiple forwarding rules for network load balancers: If you have configured multiple forwarding rules that point to the same target pool, Google Cloud uses multiple probers to check each IP address. The probe frequency as seen by each instance in the target pool is multiplied by the number of configured forwarding rules.
Multiple target proxies for external HTTP(S) load balancers: If you have configured multiple target proxies that direct traffic to the same URL map for external HTTP(S) Load Balancing, Google Cloud uses multiple probers to check the IP address associated with each target proxy. The probe frequency per backend service is multiplied by the number of configured target proxies.
Multiple target proxies for SSL proxy load balancers and TCP proxy load balancers: If you have configured multiple target proxies that direct traffic to the same backend service for SSL Proxy Load Balancing or TCP Proxy Load Balancing, Google Cloud uses multiple probers to check the IP address associated with each target proxy. The probe frequency per backend service is multiplied by the number of configured target proxies.
Sum over backend services: If a backend (such as an instance group) is used by multiple backend services, the backend instances are contacted as frequently as the sum of frequencies for each backend service's health check.
With network endpoint group backends (NEGs), it's more difficult to determine the exact number of health check probes. For example, the same endpoint can be in multiple NEGs, where those NEGs don't necessarily have the same set of endpoints, and different endpoints can point to the same backend.
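For instance group backends, the factors above can be combined into a rough estimate. The following is a sketch under stated assumptions rather than a definitive calculation: the function names are hypothetical, the prober count is an input you have to guess (the text above gives 5 to 10 as the usual range), and `multiplier` stands for the number of forwarding rules or target proxies where that factor applies.

```python
def estimate_probe_frequency(check_interval_sec, prober_count, multiplier=1):
    """Estimate probes per second that one backend sees for one backend service.

    check_interval_sec: configured check interval, in seconds.
    prober_count: assumed number of simultaneous probers (an estimate).
    multiplier: number of forwarding rules or target proxies, where applicable.
    """
    if check_interval_sec <= 0:
        raise ValueError("check interval must be positive")
    base_frequency = 1.0 / check_interval_sec  # probes per second per prober
    return base_frequency * prober_count * multiplier


def total_probe_frequency(per_service_frequencies):
    """Sum the frequencies when one backend is used by several backend services."""
    return sum(per_service_frequencies)


# Example: 5-second interval, assumed 5 to 10 probers, one forwarding rule.
low = estimate_probe_frequency(5, prober_count=5)
high = estimate_probe_frequency(5, prober_count=10)
```

With those assumed inputs, the backend would see roughly one to two probes per second from that backend service.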
Destination for probe packets
The following table shows what network interface and destination IP addresses are used by health check probers, depending on the type of load balancer:
|Load balancer||Destination network interface||Destination IP address|
|Internal TCP/UDP Load Balancing||
The network interface of the instance located in the network specified
for the internal backend service. If not specified, the primary network
Refer to Backend services and network interfaces for more information.
|The IP address of the internal forwarding rule.
If multiple forwarding rules point to the same backend service, Google Cloud sends probes to each forwarding rule's IP address. This can result in an increase in the number of probes.
|• Network Load Balancing||Primary network interface (
||The IP address of the external forwarding rule.
If multiple forwarding rules point to the same target pool, Google Cloud sends probes to each forwarding rule's IP address. This can result in an increase in the number of probes.
||Primary network interface (
Success criteria for HTTP, HTTPS, and HTTP/2
When a health check uses the HTTP, HTTPS, or HTTP/2 protocol, each probe requires an HTTP 200 (OK) response code to be delivered before the probe timeout. In addition:
- You can configure Google Cloud probers to send HTTP requests to a specific request path. If you don't specify a request path, Google Cloud sends probe requests to the root path, /.
- If you configure a content-based health check by specifying an expected response string, each Google Cloud health check probe must find the expected string within the first 1,024 bytes of the actual response from your backends.
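The criteria in this list can be expressed as a single predicate. The following sketch is illustrative, not a description of how probers are implemented: `http_probe_succeeded` and its parameters are hypothetical names, and it assumes the caller passes in whatever portion of the response the prober inspects as `response_bytes`.

```python
from typing import Optional

MAX_INSPECTED_BYTES = 1024  # only the first 1,024 bytes are searched


def http_probe_succeeded(
    status_code: int,
    response_bytes: bytes,
    elapsed_sec: float,
    timeout_sec: float,
    expected_response: Optional[bytes] = None,
) -> bool:
    """Apply the HTTP, HTTPS, and HTTP/2 success criteria to one probe result."""
    if elapsed_sec > timeout_sec:
        return False  # the response did not arrive before the probe timeout
    if status_code != 200:
        return False  # only HTTP 200 (OK) counts as a success
    if expected_response is None:
        return True  # not a content-based health check
    # Content-based check: the string must appear in the inspected prefix.
    return expected_response in response_bytes[:MAX_INSPECTED_BYTES]
```

Note that a backend which returns the expected string only after the first 1,024 bytes fails the probe even though the status code is 200.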
The following combinations of request path and response string flags are available for health checks using HTTP, HTTPS, and HTTP/2 protocols:
|Configuration flag||Success Criteria|
||Specify the URL path to which Google Cloud sends health check probe requests.
If omitted, Google Cloud sends probe requests to the root path, /.
||The optional response flag allows you to configure a content-based health check. The expected response string must be less than or equal to 1,024 ASCII (single byte) characters. When configured, Google Cloud expects this string within the first 1,024 bytes of the response in addition to receiving an HTTP 200 (OK) response code.
Success criteria for SSL and TCP
Unless you specify an expected response string, probes for health checks using the SSL and TCP protocols are successful when both of the following base conditions are true:
- Each Google Cloud prober is able to successfully complete an SSL or TCP handshake before the configured probe timeout, and
- For TCP health checks, the TCP session is terminated gracefully by either your backend or the Google Cloud prober, or your backend sends a TCP RST (reset) packet while the TCP session to the prober is still established.
Be aware that if your backend sends a TCP RST (reset) packet to close a TCP session for a TCP health check, after the Google Cloud prober initiates a graceful TCP termination, the probe might be considered unsuccessful.
You can create a content-based health check if you provide a request string and an expected response string, each up to 1,024 ASCII (single byte) characters in length. When an expected response string is configured, Google Cloud considers a probe successful only if the base conditions are satisfied and the response string returned exactly matches the expected response string. The following combinations of request and response flags are available for health checks using the SSL and TCP protocols:
|Configuration flags||Success Criteria|
|Neither request nor response specified
Neither flag specified:
|Google Cloud considers the probe successful when the base conditions are satisfied.|
|Both request and response specified
Both flags specified:
|Google Cloud sends your configured request string and waits for the expected response string. Google Cloud considers the probe successful when the base conditions are satisfied and the response string returned exactly matches the expected response string.|
|Only response specified
Flags specified: only
|Google Cloud waits for the expected response string, and considers
the probe successful when the base conditions are satisfied and the
response string returned exactly matches the expected response
You should only use
|Only request specified
Flags specified: only
|Google Cloud sends your configured request string and considers the probe successful when the base conditions are satisfied. The response, if any, is not checked.|
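The four flag combinations in the table reduce to one rule: the response is compared only when an expected response string is configured. The following sketch illustrates that rule. `tcp_probe_succeeded` and its parameters are hypothetical names, and the base conditions are passed in as a single boolean rather than modeled.

```python
from typing import Optional


def tcp_probe_succeeded(
    base_conditions_met: bool,
    received: bytes,
    expected_response: Optional[bytes] = None,
) -> bool:
    """Apply the SSL and TCP success criteria to one probe result.

    base_conditions_met: the handshake completed before the timeout and the
        session ended in an acceptable way.
    received: bytes the backend sent back (ignored when nothing is expected).
    """
    if not base_conditions_met:
        return False
    if expected_response is None:
        return True  # neither flag, or request only: response is not checked
    return received == expected_response  # exact match, as described above
```

Whether a request string is sent does not change the outcome in this sketch, which mirrors the "only request specified" row: the request is sent, but the response is not checked.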
Google Cloud uses the following configuration flags and whether or not probes were successful to determine the overall health state of each backend being load balanced:
|Configuration flag||Purpose||Default value|
||The healthy threshold specifies the number of sequential successful probe results for a backend to be considered healthy.||If omitted, Google Cloud uses a threshold of
||The unhealthy threshold specifies the number of sequential failed probe results for a backend to be considered unhealthy.||If omitted, Google Cloud uses a threshold of
Google Cloud considers backends to be healthy once this healthy threshold has been met. Healthy backends are eligible to receive new connections.
Google Cloud considers backends to be unhealthy when the unhealthy threshold has been met. Unhealthy backends are not eligible to receive new connections; however, existing connections are not immediately terminated. Instead, the connection remains open until a timeout occurs or until traffic is dropped. The specific behavior differs depending on the type of load balancer that you're using.
Existing connections might fail to return responses, depending on the cause for failing the probe. An unhealthy backend can become healthy if it is able to meet the healthy threshold again.
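The threshold behavior described above can be sketched as a small state tracker. This is an illustration under assumptions, not Google Cloud's implementation: the class name is hypothetical, the backend is assumed to start out unhealthy, and the tracker models the stream of results from a single prober.

```python
class BackendHealth:
    """Track one backend's health state from sequential probe results."""

    def __init__(self, healthy_threshold: int, unhealthy_threshold: int,
                 healthy: bool = False):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = healthy  # assumed initial state; not stated in the text
        self._successes = 0     # current run of sequential successes
        self._failures = 0      # current run of sequential failures

    def record(self, probe_succeeded: bool) -> bool:
        """Record one probe result and return the resulting health state."""
        if probe_succeeded:
            self._successes += 1
            self._failures = 0
            if self._successes >= self.healthy_threshold:
                self.healthy = True
        else:
            self._failures += 1
            self._successes = 0
            if self._failures >= self.unhealthy_threshold:
                self.healthy = False
        return self.healthy
```

A single failed probe between successes resets the failure run, so with an unhealthy threshold of 2 an isolated failure does not change the state.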
Content-based health checks
A content-based health check is one whose success criteria depend on evaluation of an expected response string. Use a content-based health check to instruct Google Cloud health check probes to more completely validate your backend's response.
You configure an HTTP, HTTPS, or HTTP/2 content-based health check by specifying an expected response string, and, optionally, defining a request path. For more details, refer to Success criteria for HTTP, HTTPS, and HTTP/2.
You configure an SSL or TCP content-based health check by specifying an expected response string, and, optionally, a request string. For more details, refer to Success criteria for SSL and TCP.
Certificates and health checks
Google Cloud health check probers do not perform certificate validation, even for protocols that require that your backends use certificates (SSL, HTTPS, and HTTP/2). As examples:
- You can use self-signed certificates or certificates signed by any certificate authority (CA).
- Certificates that have expired or that are not yet valid are acceptable.
- The subjectAlternativeName attributes do not need to match a Host header or DNS PTR record.
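If you want to test a backend from your own client in a way that roughly matches this behavior, you can build a TLS context that skips certificate validation. The following sketch uses Python's standard `ssl` module. It only approximates the "no certificate validation" behavior for local testing and says nothing about how probers are implemented; `make_unverified_context` is a hypothetical helper name.

```python
import ssl


def make_unverified_context() -> ssl.SSLContext:
    """Build a client TLS context that skips certificate validation."""
    context = ssl.create_default_context()
    # check_hostname must be disabled before verify_mode can be CERT_NONE.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context
```

A context like this accepts self-signed, expired, and mismatched certificates, so use it only for testing health check endpoints, never for application traffic.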
Health checks using any protocol, but not legacy health checks, allow you to
set a proxy header by using the
Health checks using
HTTP2 protocols and legacy health
checks allow you to specify an HTTP Host header using the
Example health check
Suppose you set up a health check with the following settings:
- Interval: 30 seconds
- Timeout: 5 seconds
- Protocol: HTTP
- Unhealthy threshold: 2 (default)
- Healthy threshold: 2 (default)
With these settings, the health check behaves as follows:
- Multiple redundant systems are simultaneously configured with the health check parameters. Interval and timeout settings are applied to each system. For more information, see Multiple probes and frequency.
Each health check prober:
A backend is considered unhealthy when at least one health check probe system:
- Does not receive a response with status code 200 for two consecutive probes. For example, the connection might be refused, or there might be a connection or socket timeout.
- Receives two consecutive responses that don't match the protocol-specific success criteria.
A backend is considered healthy when at least one health check probe system receives two consecutive responses that match the protocol-specific success criteria.
In this example, each prober initiates a connection every 30 seconds. Thirty seconds elapse between a prober's connection attempts regardless of the duration of the timeout (whether or not the connection timed out). In other words, the timeout must always be less than or equal to the interval, and the timeout never increases the interval.
In this example, each prober's timing looks like this, in seconds:
- t=0: Start probe A.
- t=5: Stop probe A.
- t=30: Start probe B.
- t=35: Stop probe B.
- t=60: Start probe C.
- t=65: Stop probe C.
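The timeline above follows directly from the interval and timeout. As a minimal sketch (the function name is hypothetical), the start time and latest stop time of each probe from one prober can be computed like this:

```python
def probe_schedule(interval_sec: int, timeout_sec: int, count: int):
    """Return (start, latest_stop) times in seconds for one prober's probes."""
    if timeout_sec > interval_sec:
        raise ValueError("timeout must be less than or equal to the interval")
    return [
        (n * interval_sec, n * interval_sec + timeout_sec)
        for n in range(count)
    ]


schedule = probe_schedule(interval_sec=30, timeout_sec=5, count=3)
# schedule == [(0, 5), (30, 35), (60, 65)]
```

The stop times are upper bounds: a probe that receives its response early ends sooner, but the next probe still starts one full interval after the previous start.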
For information on configuring health checks, see Creating Health Checks.