Health checks overview

Google Cloud provides health checks to determine if backends respond to traffic. This document discusses health checking concepts for Google Cloud load balancers and Traffic Director.

Health checks connect to backends on a configurable, periodic basis. Each connection attempt is called a probe. Google Cloud records the success or failure of each probe.

Based on a configurable number of sequential successful or failed probes, an overall health state is computed for each backend. Backends that respond successfully for the configured number of times are considered healthy. Backends that fail to respond successfully for a separately configurable number of times are unhealthy.

The overall health state of each backend determines eligibility to receive new requests or connections. You can configure the criteria that define a successful probe. This is discussed in detail in the section How health checks work.

Health checks use special routes that aren't defined in your Virtual Private Cloud (VPC) network. For complete information, see Load balancer return paths.

Health check categories, protocols, and ports

Health checks have a category and a protocol. The two categories are health checks and legacy health checks, and their supported protocols are as follows:

  • Health checks: gRPC, HTTP, HTTPS, HTTP/2, SSL, and TCP.
  • Legacy health checks: HTTP and HTTPS.

The protocol and port determine how health check probes are done. For example, a health check can use the HTTP protocol on TCP port 80, or it can use the TCP protocol for a named port in an instance group.

You cannot convert a legacy health check to a health check, and you cannot convert a health check to a legacy health check.

Selecting a health check

Health checks must be compatible with the type of load balancer (or Traffic Director) and the backend types. The factors to consider when you select a health check are as follows:

  • Category: health check or legacy health check.
  • Protocol: protocol that Google Cloud uses to probe the backends.
  • Port specification: ports that Google Cloud uses with the protocol.

The load balancer guide describes the valid health check selections for each type of load balancer and backend. For a higher level summary, see the health checks features table.

Category and protocol

It's a best practice to use the same protocol as the load balancer; however, this is not a requirement, nor is it always possible.

For example, target pool-based network load balancers require legacy health checks, and they require that the legacy health checks use the HTTP protocol, even though target pool-based network load balancers support TCP or UDP. For target pool-based network load balancers, you must run an HTTP server on your virtual machine (VM) instances so that they can respond to health check probes.
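As a sketch, a legacy health check like the one required for target pools can be created with the `gcloud compute http-health-checks` command group. The name `my-legacy-hc` and the port are hypothetical values; your backend VMs must run an HTTP server on that port to answer the probes.

```shell
# Sketch: a legacy HTTP health check for a target pool-based network
# load balancer (hypothetical name my-legacy-hc).
gcloud compute http-health-checks create my-legacy-hc \
    --port=80 \
    --request-path=/
```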

For almost all other load balancer types, you must use regular, non-legacy health checks where the protocol matches the load balancer's backend service protocol.

Category and port specification

You must specify a port for your health check. Health checks have two port specification methods (--port and --use-serving-port). For legacy health checks, there is one method (--port).
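The two port specification methods might look like the following gcloud sketches. The names `my-hc-port` and `my-hc-serving` are hypothetical; adjust the protocol and port for your deployment.

```shell
# Method 1: probe a fixed, custom port number.
gcloud compute health-checks create http my-hc-port \
    --port=8080

# Method 2: probe each endpoint's serving port or the backend
# service's named port (not available for legacy health checks).
gcloud compute health-checks create http my-hc-serving \
    --use-serving-port
```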

Load balancer guide

The following list shows the supported health check category, scope, and port specification for each load balancer and backend type.

  • Internal TCP/UDP Load Balancing 1
    • Zonal NEGs: health check (global or regional)
      • Custom port number (--port)
    • Instance groups: health check (global or regional)
      • Custom port number (--port)
  • Internal HTTP(S) Load Balancing
    • Zonal NEGs: health check (regional)
      • Custom port number (--port)
      • Endpoint's port number (--use-serving-port)
    • Instance groups: health check (regional)
      • Custom port number (--port)
      • Backend service's named port (--use-serving-port)
  • Network Load Balancing 1
    • Instance groups: health check (regional)
      • Custom port number (--port)
    • Instances in target pools: legacy health check (global, with the HTTP protocol)
      • Port number (--port); legacy health checks support only this specification
  • Regional external HTTP(S) load balancer
    • Zonal NEGs: health check (regional)
      • Custom port number (--port)
      • Endpoint's port number (--use-serving-port)
    • Instance groups: health check (regional)
      • Custom port number (--port)
      • Backend service's named port (--use-serving-port)
  • External HTTP(S) load balancer 2, TCP Proxy Load Balancing, and SSL Proxy Load Balancing
    • Zonal NEGs: health check (global)
      • Custom port number (--port)
      • Endpoint's port number (--use-serving-port)
    • Instance groups: health check (global)
      • Custom port number (--port)
      • Backend service's named port (--use-serving-port)

1 You cannot use the --use-serving-port flag because backend services used with Internal TCP/UDP Load Balancing and Network Load Balancing don't subscribe to any named port.
2 For external HTTP(S) load balancers, it's possible but not recommended to use a legacy health check if both of the following are true:

  • The backends are instance groups, not zonal NEGs.
  • The backend VMs serve traffic that uses either HTTP or HTTPS protocols.

How health checks work

The following sections describe how health checks work.

Probes

When you create a health check or a legacy health check, you specify the following flags or accept their default values. Each health check or legacy health check that you create is implemented by multiple probes. These flags control how frequently each probe evaluates instances in instance groups or endpoints in zonal NEGs.

A health check's settings cannot be configured on a per-backend basis. Health checks are associated with an entire backend service. For target pool-based Network Load Balancing, a legacy HTTP health check is associated with the entire target pool. Thus, the parameters for the probe are the same for all backends referenced by a given backend service or target pool.

  • Check interval (check-interval): the amount of time from the start of one probe issued by one prober to the start of the next probe issued by the same prober. Units are seconds. Default: 5s (5 seconds).
  • Timeout (timeout): the amount of time that Google Cloud waits for a response to a probe. Its value must be less than or equal to the check interval. Units are seconds. Default: 5s (5 seconds).
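These flags can be set when you create the health check, as in the following sketch (the name `my-tcp-hc` and the values are hypothetical):

```shell
# Sketch: setting the probe interval and timeout explicitly.
gcloud compute health-checks create tcp my-tcp-hc \
    --port=80 \
    --check-interval=10s \
    --timeout=5s   # must be less than or equal to the check interval
```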

Probe IP ranges and firewall rules

For health checks to work, you must create ingress allow firewall rules so that traffic from Google Cloud probers can connect to your backends.

The following list shows the source IP ranges to allow:

  • Internal TCP/UDP Load Balancing, Internal HTTP(S) Load Balancing, External HTTP(S) Load Balancing (global and regional), SSL Proxy Load Balancing, TCP Proxy Load Balancing, and Traffic Director:
    • 35.191.0.0/16
    • 130.211.0.0/22
    For a configuration example, see Firewall rules for all products except network load balancers.
  • Network Load Balancing, for all backend types:
    • 35.191.0.0/16
    • 209.85.152.0/22
    • 209.85.204.0/22
    In addition, for target pool-based backends only:
    • 169.254.169.254 (metadata servers)
    For a configuration example, see Firewall rules for network load balancers.

Importance of firewall rules

Google Cloud requires that you create the necessary ingress allow firewall rules to permit traffic from probers to your backends. As a best practice, limit these rules to just the protocols and ports that match those used by your health checks. For the source IP ranges, make sure to use the documented probe IP ranges listed in the preceding section.
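A minimal sketch of such a rule follows. The rule name `allow-health-checks`, the `default` network, the target tag `lb-backend`, and the `tcp:80` port are hypothetical; scope the protocol and port to what your health check actually uses, and use the network load balancer ranges instead if that is your product.

```shell
# Sketch: ingress allow rule for prober ranges (all products except
# network load balancers), limited to the health check's port.
gcloud compute firewall-rules create allow-health-checks \
    --network=default \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:80 \
    --source-ranges=35.191.0.0/16,130.211.0.0/22 \
    --target-tags=lb-backend
```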

If you don't have ingress allow firewall rules that permit the health check, the implied deny rule blocks inbound traffic. When probers can't contact your backends, the load balancer considers your backends to be unhealthy. The behavior when all backends are unhealthy depends on the type of load balancer:

  • An external HTTP(S) load balancer returns HTTP 502 responses to clients when all backends are unhealthy.

  • Regional external HTTP(S) load balancers and internal HTTP(S) load balancers return HTTP 503 responses to clients when all backends are unhealthy.

  • SSL proxy load balancers and TCP proxy load balancers time out when all backends are unhealthy.

  • As a last resort, a network load balancer distributes traffic to all backend VMs when they are all unhealthy.

  • As a last resort, an internal TCP/UDP load balancer without failover configured distributes traffic to all backend VMs when they are all unhealthy. You can disable this behavior if you configure failover.

Security considerations for probe IP ranges

Consider the following information when planning health checks and the necessary firewall rules:

  • The probe IP ranges belong to Google. Google Cloud uses special routes outside of your VPC network but within Google's production network to facilitate communication from probers.

  • Google uses the probe IP ranges to send health check probes for external HTTP(S) load balancers, SSL proxy load balancers, and TCP proxy load balancers. If a packet is received from the internet and the packet's source IP address is within a probe IP range, Google drops the packet. This includes the external IP address of a Compute Engine instance or a Google Kubernetes Engine (GKE) node.

  • The probe IP ranges are a complete set of possible IP addresses used by Google Cloud probers. If you use tcpdump or a similar tool, you might not observe traffic from all IP addresses in all probe IP ranges. As a best practice, create ingress firewall rules that allow all of the probe IP ranges as sources. Google Cloud can implement new probers automatically without notification.

Multiple probes and frequency

Google Cloud sends health check probes from multiple redundant systems called probers. Probers use specific source IP ranges. Google Cloud does not rely on just one prober to implement a health check—multiple probers simultaneously evaluate the instances in instance group backends or the endpoints in zonal NEG backends. If one prober fails, Google Cloud continues to track backend health states.

The interval and timeout settings that you configure for a health check are applied to each prober. For a given backend, software access logs and tcpdump therefore show probes that are more frequent than your configured settings suggest.

This is expected behavior, and you cannot configure the number of probers that Google Cloud uses for health checks. However, you can estimate the effect of multiple simultaneous probes by considering the following factors.

  • To estimate the probe frequency per backend service, consider the following:

    • Base frequency per backend service. Each health check has an associated check frequency, inversely proportional to the configured check interval:

      1 / (check interval)

      When you associate a health check with a backend service, you establish a base frequency used by each prober for backends on that backend service.

    • Probe scale factor. The backend service's base frequency is multiplied by the number of simultaneous probers that Google Cloud uses. This number can vary, but is generally between 5 and 10.

  • Multiple forwarding rules for internal TCP/UDP load balancers. If you have configured multiple internal forwarding rules (each having a different IP address) pointing to the same regional internal backend service, Google Cloud uses multiple probers to check each IP address. The probe frequency per backend service is multiplied by the number of configured forwarding rules.

  • Multiple forwarding rules for network load balancers. If you have configured multiple forwarding rules that point to the same backend service or target pool, Google Cloud uses multiple probers to check each IP address. The probe frequency per backend VM is multiplied by the number of configured forwarding rules.

  • Multiple target proxies for external HTTP(S) load balancers. If you have multiple target proxies that direct traffic to the same URL map, Google Cloud uses multiple probers to check the IP address associated with each target proxy. The probe frequency per backend service is multiplied by the number of configured target proxies.

  • Multiple target proxies for SSL proxy load balancers and TCP proxy load balancers. If you have configured multiple target proxies that direct traffic to the same backend service, Google Cloud uses multiple probers to check the IP address associated with each target proxy. The probe frequency per backend service is multiplied by the number of configured target proxies.

  • Sum over backend services. If a backend is used by multiple backend services, the backend instances are contacted as frequently as the sum of frequencies for each backend service's health check.

    With zonal NEG backends, it's more difficult to determine the exact number of health check probes. For example, the same endpoint can be in multiple zonal NEGs. Those zonal NEGs don't necessarily have the same set of endpoints, and different endpoints can point to the same backend.
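The factors above can be combined into a rough estimate. The sketch below assumes the default 5-second check interval, the 5-to-10 prober range stated above, and a hypothetical two forwarding rules pointing at the same backend service:

```shell
# Rough estimate of the observed probe rate per backend.
# Assumptions: 5-10 probers (Google Cloud does not publish the exact
# count) and two forwarding rules (hypothetical).
CHECK_INTERVAL=5      # seconds (the default)
FORWARDING_RULES=2

BASE_PER_MIN=$(( 60 / CHECK_INTERVAL ))          # probes/min from one prober
LOW=$(( BASE_PER_MIN * 5 * FORWARDING_RULES ))   # with 5 probers
HIGH=$(( BASE_PER_MIN * 10 * FORWARDING_RULES )) # with 10 probers
echo "Expect roughly ${LOW}-${HIGH} probes per minute per backend"
```

With these assumptions, each backend sees roughly 120 to 240 probes per minute, far more than the 12 per minute the check interval alone would suggest.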

Destination for probe packets

The following list shows the network interface and destination IP addresses to which health check probers send packets, depending on the type of load balancer.

For network load balancers and internal TCP/UDP load balancers, the application must bind to the load balancer's IP address (or to any IP address, 0.0.0.0).

  • Internal TCP/UDP Load Balancing
    • Destination network interface: the network interface of the instance located in the network specified for the internal backend service. If not specified, the primary network interface (nic0) is used. For more information, see Backend services and network interfaces.
    • Destination IP address: the IP address of the internal forwarding rule. If multiple forwarding rules point to the same backend service, Google Cloud sends probes to each forwarding rule's IP address, which can increase the number of probes.
  • Network Load Balancing
    • Destination network interface: the primary network interface (nic0).
    • Destination IP address: the IP address of the external forwarding rule. If multiple forwarding rules point to the same backend service (for target pool-based Network Load Balancing, the same target pool), Google Cloud sends probes to each forwarding rule's IP address, which can increase the number of probes.
  • External HTTP(S) Load Balancing (global and regional), Internal HTTP(S) Load Balancing, SSL Proxy Load Balancing, and TCP Proxy Load Balancing
    • Destination network interface: the primary network interface (nic0).
    • Destination IP address:
      • For instance group backends, the primary internal IP address associated with the primary network interface (nic0) of each instance.
      • For zonal NEG backends, the IP address of the endpoint. This can be either a primary internal IP address or an alias IP range of the primary network interface, nic0, on the instance hosting the endpoint.

Success criteria for HTTP, HTTPS, and HTTP/2

When a health check uses the HTTP, HTTPS, or HTTP/2 protocol, each probe requires an HTTP 200 (OK) response code to be delivered before the probe timeout. In addition, you can do the following:

  • You can configure Google Cloud probers to send HTTP requests to a specific request path. If you don't specify a request path, / is used.

  • If you configure a content-based health check by specifying an expected response string, Google Cloud must find the expected string within the first 1,024 bytes of the HTTP response body.

The following combinations of request path and response string flags are available for health checks that use HTTP, HTTPS, and HTTP/2 protocols.

  • Request path (request-path): the URL path to which Google Cloud sends health check probe requests. If omitted, Google Cloud sends probe requests to the root path, /.
  • Response (response): the optional response flag configures a content-based health check. The expected response string must be 1,024 ASCII (single-byte) characters or fewer. When configured, Google Cloud expects this string within the first 1,024 bytes of the response, in addition to receiving an HTTP 200 (OK) status.
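Both flags can be combined, as in the following sketch. The name `my-http-hc`, the `/healthz` path, and the `OK` string are hypothetical values for illustration.

```shell
# Sketch: a content-based HTTP health check. Probes request /healthz
# and require "OK" within the first 1,024 bytes of the response body,
# in addition to an HTTP 200 (OK) status.
gcloud compute health-checks create http my-http-hc \
    --port=80 \
    --request-path=/healthz \
    --response="OK"
```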

Success criteria for SSL and TCP

Unless you specify an expected response string, probes for health checks that use the SSL and TCP protocols are successful when both of the following base conditions are true:

  • Each Google Cloud prober can successfully complete an SSL or TCP handshake before the configured probe timeout.
  • For TCP health checks, the TCP session is terminated gracefully either by:
    • The backend, or
    • The Google Cloud prober sending a TCP RST (reset) packet while the TCP session to the prober is still established

If the backend sends a TCP RST (reset) packet to close a TCP session for a TCP health check, the probe might be considered unsuccessful. This happens when the Google Cloud prober has already initiated a graceful TCP termination.

You can create a content-based health check if you provide a request string and an expected response string, each up to 1,024 ASCII (single byte) characters in length. When an expected response string is configured, Google Cloud considers a probe successful only if the base conditions are satisfied and the response string returned exactly matches the expected response string.

The following combinations of request and response flags are available for health checks that use the SSL and TCP protocols.

  • Neither flag specified (--request, --response): Google Cloud considers the probe successful when the base conditions are satisfied.
  • Both flags specified (--request, --response): Google Cloud sends your configured request string and waits for the expected response string. Google Cloud considers the probe successful when the base conditions are satisfied and the returned response string exactly matches the expected response string.
  • Only --response specified: Google Cloud waits for the expected response string, and considers the probe successful when the base conditions are satisfied and the returned response string exactly matches the expected response string. Use --response by itself only if your backends automatically send a response string as part of the TCP or SSL handshake.
  • Only --request specified: Google Cloud sends your configured request string and considers the probe successful when the base conditions are satisfied. The response, if any, is not checked.
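A content-based TCP health check with both flags might look like the following sketch. The name `my-tcp-hc`, the port, and the `ping`/`pong` strings are hypothetical.

```shell
# Sketch: a content-based TCP health check. After the TCP handshake,
# the prober sends "ping" and expects "pong" back.
gcloud compute health-checks create tcp my-tcp-hc \
    --port=9000 \
    --request="ping" \
    --response="pong"
```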

Success criteria for gRPC

If you are using gRPC health checks, make sure that the gRPC service sends the RPC response with the status OK and the status field set to SERVING or NOT_SERVING accordingly.

Note the following:

  • gRPC health checks are used only with gRPC applications and Traffic Director.
  • gRPC health checks don't support TLS.
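A gRPC health check can be sketched as follows. The name `my-grpc-hc`, the port, and the service name are hypothetical; when set, --grpc-service-name is passed in the grpc.health.v1 check request so that your server can report per-service status.

```shell
# Sketch: a gRPC health check against a backend implementing the
# grpc.health.v1 Health service.
gcloud compute health-checks create grpc my-grpc-hc \
    --port=50051 \
    --grpc-service-name=my-service
```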

Health state

Google Cloud uses the following configuration flags to determine the overall health state of each backend to which traffic is load balanced.

  • Healthy threshold (healthy-threshold): the number of sequential successful probe results required for a backend to be considered healthy. Default: a threshold of 2 probes.
  • Unhealthy threshold (unhealthy-threshold): the number of sequential failed probe results required for a backend to be considered unhealthy. Default: a threshold of 2 probes.

Google Cloud considers backends to be healthy after this healthy threshold has been met. Healthy backends are eligible to receive new connections.

Google Cloud considers backends to be unhealthy when the unhealthy threshold has been met. Unhealthy backends are not eligible to receive new connections; however, existing connections are not immediately terminated. Instead, the connection remains open until a timeout occurs or until traffic is dropped. The specific behavior differs depending on the type of load balancer that you're using.

Existing connections might fail to return responses, depending on the cause for failing the probe. An unhealthy backend can become healthy if it is able to meet the healthy threshold again.
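The thresholds can be adjusted on an existing health check, as in this sketch (the name `my-tcp-hc` and the values are hypothetical):

```shell
# Sketch: mark a backend unhealthy after 3 consecutive failed probes,
# and healthy again after 2 consecutive successful probes.
gcloud compute health-checks update tcp my-tcp-hc \
    --healthy-threshold=2 \
    --unhealthy-threshold=3
```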

Additional notes

Content-based health checks

A content-based health check is one whose success criteria depends on evaluation of an expected response string. Use a content-based health check to instruct Google Cloud health check probes to more completely validate your backend's response.

  • You configure an HTTP, HTTPS, or HTTP/2 content-based health check by specifying an expected response string, and optionally by defining a request path. For more details, see Success criteria for HTTP, HTTPS, and HTTP/2.

  • You configure an SSL or TCP content-based health check by specifying an expected response string, and optionally by defining a request string. For more details, see Success criteria for SSL and TCP.

Certificates and health checks

Google Cloud health check probers do not perform certificate validation, even for protocols that require that your backends use certificates (SSL, HTTPS, and HTTP/2)—for example:

  • You can use self-signed certificates or certificates signed by any certificate authority (CA).
  • Certificates that have expired or that are not yet valid are acceptable.
  • Neither the CN nor the subjectAlternativeName attributes need to match a Host header or DNS PTR record.

Headers

Health checks that use any protocol, but not legacy health checks, let you set a proxy header by using the --proxy-header flag.

Health checks that use HTTP, HTTPS, or HTTP/2 protocols and legacy health checks allow you to specify an HTTP Host header by using the --host flag.
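The following sketches show both header flags. The names `my-tcp-hc` and `my-http-hc`, the ports, and the host value are hypothetical.

```shell
# Sketch: send a PROXY protocol v1 header with each TCP probe.
gcloud compute health-checks create tcp my-tcp-hc \
    --port=443 \
    --proxy-header=PROXY_V1

# Sketch: set a custom HTTP Host header on probe requests.
gcloud compute health-checks create http my-http-hc \
    --port=80 \
    --host=app.example.com
```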

Example health check

Suppose you set up a health check with the following settings:

  • Interval: 30 seconds
  • Timeout: 5 seconds
  • Protocol: HTTP
  • Unhealthy threshold: 2 (default)
  • Healthy threshold: 2 (default)
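The settings above could be created with a command like the following sketch (the name `example-hc` is hypothetical; the thresholds shown are the defaults):

```shell
# Sketch: the example health check's settings as a gcloud command.
gcloud compute health-checks create http example-hc \
    --check-interval=30s \
    --timeout=5s \
    --healthy-threshold=2 \
    --unhealthy-threshold=2
```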

With these settings, the health check behaves as follows:

  1. Multiple redundant systems are simultaneously configured with the health check parameters. Interval and timeout settings are applied to each system. For more information, see Multiple probes and frequency.
  2. Each health check prober does the following:

    1. Initiates an HTTP connection from one of the source IP addresses to the backend instance every 30 seconds.
    2. Waits up to five seconds for an HTTP 200 (OK) response code (the success criteria for HTTP, HTTPS, and HTTP/2 protocols).
  3. A backend is considered unhealthy when at least one health check probe system does either of the following:

    1. Does not receive an HTTP 200 (OK) response code for two consecutive probes. For example, the connection might be refused, or there might be a connection or socket timeout.
    2. Receives two consecutive responses that don't match the protocol-specific success criteria.
  4. A backend is considered healthy when at least one health check probe system receives two consecutive responses that match the protocol-specific success criteria.

In this example, each prober initiates a connection every 30 seconds. Thirty seconds elapses between a prober's connection attempts regardless of the duration of the timeout (whether or not the connection timed out). In other words, the timeout must always be less than or equal to the interval, and the timeout never increases the interval.

In this example, each prober's timing looks like the following, in seconds:

  1. t=0: Start probe A.
  2. t=5: Stop probe A.
  3. t=30: Start probe B.
  4. t=35: Stop probe B.
  5. t=60: Start probe C.
  6. t=65: Stop probe C.
