Backend Services

An HTTP(S) Load Balancing backend service is a centralized service for managing backends, which in turn manage instances that handle user requests. You configure your load balancing service to route requests to your backend service. The backend service in turn knows which instances it can use, how much traffic they can handle, and how much traffic they are currently handling. In addition, the backend service monitors the health of its backends and does not send traffic to unhealthy instances.

Backend service components

A configured backend service contains one or more backends. Each backend service contains the following components:

  • Session affinity (optional). Normally, HTTP(S) Load Balancing uses a round-robin algorithm to distribute requests among available instances. This can be overridden with session affinity. Session affinity attempts to send all requests from the same client to the same virtual machine instance.
  • A Timeout setting. Set to 30 seconds by default, this is the amount of time the backend service will wait on the backend before considering the request a failure. This is a fixed timeout, not an idle timeout. If you require longer-lived connections, set this value appropriately.
  • A Health check. The health checker polls instances attached to the backend service at configured intervals. Instances that pass the health check are allowed to receive new requests. Unhealthy instances are not sent requests until they are healthy again.
  • One or more Backends. A backend contains the following components:
    • An Instance group containing virtual machine instances. The instance group may be a Managed Instance Group with or without Autoscaling or an Unmanaged Instance Group. A backend cannot be added to a backend service if it doesn't contain an instance group.
    • A balancing mode, which tells the load balancing system how to determine when the backend is at full usage. If all the backends for the backend service in a region are at full usage, new requests are automatically routed to the nearest region that can still handle requests. The balancing mode can be based on CPU utilization (Utilization) or requests per second (Rate).
    • A capacity setting. Capacity is an additional control that interacts with the balancing mode setting. For example, if you normally want your instances to operate at a maximum of 80% CPU utilization, you would set your balancing mode to 80% CPU utilization and your capacity to 100%. If you want to cut instance utilization in half, you could leave the balancing mode at 80% CPU utilization and set Capacity to 50%. To drain the backend service, set Capacity to 0% and leave the balancing mode as is.
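The balancing mode and capacity settings described above can be changed per backend with the gcloud command-line tool. The following is a sketch only; the backend service name, instance group name, and zone are placeholders:

```shell
# Set a backend to UTILIZATION mode with a target of 80% CPU and
# full (100%) capacity.
gcloud compute backend-services update-backend my-backend-service \
    --instance-group my-instance-group \
    --instance-group-zone us-central1-b \
    --balancing-mode UTILIZATION \
    --max-utilization 0.8 \
    --capacity-scaler 1.0 \
    --global

# To drain the backend, set the capacity scaler to 0 and leave the
# balancing mode unchanged.
gcloud compute backend-services update-backend my-backend-service \
    --instance-group my-instance-group \
    --instance-group-zone us-central1-b \
    --capacity-scaler 0.0 \
    --global
```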

A backend service may only have up to 500 endpoints (IP address and port pairs) in a given zone.

Changes to your backend services are not instantaneous. It can take several minutes for your changes to propagate throughout the network.

See the backend service API resource or the gcloud command-line tool user guide for descriptions of the properties that are available when working with backend services.

Protocol to the backends

When you configure a backend service for the HTTP(S) load balancer, you set the protocol that the backend service uses to communicate with the backends. You can choose HTTP, HTTPS, or HTTP/2. The load balancer uses only the protocol that you specify. The load balancer does not fall back to one of the other protocols if it is unable to negotiate a connection to the backend with the specified protocol.

Backend services and regions

HTTP(S) Load Balancing is a global service. You may have more than one backend service in a region, and you may assign backend services to more than one region, all serviced by the same global load balancer. Traffic is allocated to backend services as follows:

  1. When a user request comes in, the load balancing service determines the approximate origin of the request from the source IP address.
  2. The load balancing service knows the locations of the instances owned by the backend service, their overall capacity, and their overall current usage.
  3. If the closest instances to the user have available capacity, then the request is forwarded to that closest set of instances.
  4. Incoming requests to the given region are distributed evenly across all available backend services and instances in that region. However, at very small loads, the distribution may appear to be uneven.
  5. If there are no healthy instances with available capacity in a given region, the load balancer instead sends the request to the next closest region with available capacity.

Instance groups

Backend services and autoscaled managed instance groups

Autoscaled managed instance groups are useful if you need many machines all configured the same way, and you want to automatically add or remove instances based on need.

The autoscaling percentage works with the backend service balancing mode. For example, suppose you set the balancing mode to a CPU utilization of 80% and leave the capacity scaler at 100%, and you set the Target load balancing usage in the autoscaler to 80%. Whenever the CPU utilization of the group rises above 64% (80% of 80%), the autoscaler will instantiate new instances from the template until usage drops down to about 64%. If the overall usage drops below 64%, the autoscaler will remove instances until usage gets back to 64%.

New instances have a cooldown period before they are considered part of the group, so it's possible for traffic to exceed the backend service's 80% CPU utilization during that time, causing excess traffic to be routed to the next available backend service. Once the instances are available, new traffic will be routed to them. Also, if the number of instances reaches the maximum permitted by the autoscaler's settings, the autoscaler will stop adding instances no matter what the usage is. In this case, extra traffic will be load balanced to the next available region.

Configuring autoscaled managed instance groups

To configure autoscaled managed instance groups, perform the following steps:

  1. Create an instance template for your instance group.
  2. Create a managed instance group and assign the template to it.
  3. Turn on autoscaling based on load balancing serving capacity.
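The three steps above can be sketched with the gcloud command-line tool as follows; the resource names, zone, and sizes are placeholder assumptions:

```shell
# 1. Create an instance template for the group.
gcloud compute instance-templates create my-template \
    --machine-type n1-standard-1

# 2. Create a managed instance group from the template.
gcloud compute instance-groups managed create my-mig \
    --zone us-central1-b \
    --template my-template \
    --size 3

# 3. Turn on autoscaling based on load balancing serving capacity
#    (80% of the backend service's configured capacity).
gcloud compute instance-groups managed set-autoscaling my-mig \
    --zone us-central1-b \
    --max-num-replicas 10 \
    --target-load-balancing-utilization 0.8
```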

Restrictions and guidance for instance groups

Because Cloud Load Balancing offers a great deal of flexibility in how you configure load balancing, it is possible to create configurations that do not behave well. Keep the following restrictions and guidance in mind when creating instance groups for use with load balancing.

  • Do not put a virtual machine instance in more than one instance group.
  • Do not delete an instance group if it is being used by a backend.
  • Your configuration will be simpler if you do not add the same instance group to two different backends. If you do add the same instance group to two backends:
    • Both backends must use the same balancing mode, either UTILIZATION or RATE.
    • You can use maxRatePerInstance and maxRatePerGroup together. It is acceptable to set one backend to use maxRatePerInstance and the other to maxRatePerGroup.
    • If your instance group serves two or more ports for several backends respectively, you have to specify different port names in the instance group.
  • All instances in a managed or unmanaged instance group must be in the same VPC network and, if applicable, the same subnet.
  • If you are using a managed instance group with autoscaling, do not use the maxRate balancing mode in the backend service. You may use either the maxUtilization or maxRatePerInstance mode.
  • Do not make an autoscaled managed instance group the target of two different load balancers.
  • When resizing a managed instance group, the maximum size of the group should be smaller than or equal to the size of the subnet.

Session affinity

By default, HTTP(S) Load Balancing distributes requests evenly among available instances. However, some applications, such as stateful servers used by ads serving, games, or services with heavy internal caching, need multiple requests from a given user to be directed to the same instance. Session affinity makes this possible, identifying requests from a user by the client IP address or the value of a cookie, and directing such requests to a consistent instance as long as that instance is healthy and has capacity. Affinity can break if the instance becomes unhealthy or overloaded, so your system must not assume perfect affinity. Note that session affinity works best with the balancing mode set to requests per second (RPS).

HTTP(S) Load Balancing offers two types of session affinity: client IP affinity and generated cookie affinity. The following two sections discuss each type of session affinity.

Setting client IP affinity

Client IP affinity directs requests from the same client IP address to the same instance based on a hash of the IP address. This is simple and does not involve a user cookie. However, because of NATs, CDNs, and other internet routing technologies, sometimes requests from multiple independent users can look as if they come from the same client, causing many users to clump unnecessarily onto the same instances. In addition, clients who move from one network to another may change IP address, thus losing affinity.

Console


To set client IP affinity:

  1. In the Google Cloud Platform Console, go to the Backend configuration portion of the HTTP(S) load balancer page.
    Go to the Load balancing page
  2. Select the Edit pencil for your load balancer.
  3. Select Backend configuration.
  4. Select the Edit pencil for a Backend service.
  5. In the Edit backend service dialog box, select Client IP from the Session affinity drop-down menu.
    This action enables client IP session affinity. The Affinity cookie TTL field is grayed out as it has no meaning for client IP affinity.
  6. Click the Update button for the Backend service.
  7. Click the Update button for the load balancer.

gcloud


You can use the create command to set session affinity for a new backend service, or the update command to set it for an existing backend service. This example shows using it with the update command.

gcloud compute backend-services update [BACKEND_SERVICE_NAME] \
    --session-affinity client_ip

API


Consult the API reference for backend services.

When generated cookie affinity is set, the load balancer issues a cookie named GCLB on the first request and then directs each subsequent request that has the same cookie to the same instance. Cookie-based affinity allows the load balancer to distinguish different clients using the same IP address so it can spread those clients across the instances more evenly. Cookie-based affinity allows the load balancer to maintain instance affinity even when the client’s IP address changes.

The path of the cookie is always /, so if there are two backend services on the same hostname that enable cookie-based affinity, the two services are balanced by the same cookie.

The lifetime of the HTTP cookie generated by the load balancer is configurable. It can be set to 0 (default), which means the cookie is only a session cookie, or it can have a lifetime of 1 to 86400 seconds (24 hours).

Console


To set generated cookie affinity:

  1. In the Google Cloud Platform Console, you can modify Generated Cookie Affinity in the Backend configuration portion of the HTTP(S) load balancer page.
    Go to the Load balancing page
  2. Select the Edit pencil for your load balancer.
  3. Select Backend configuration.
  4. Select the Edit pencil for a Backend service.
  5. Select Generated cookie from the Session affinity drop-down menu to select Generated Cookie Affinity.
  6. In the Affinity cookie TTL field, set the cookie's lifetime in seconds.
  7. Click the Update button for the Backend service.
  8. Click the Update button for the load balancer.

gcloud


Turn on generated cookie affinity by setting --session-affinity to generated_cookie and setting --affinity-cookie-ttl to the cookie lifetime in seconds. You can use the create command to set it for a new backend service, or the update command to set it for an existing backend service. This example shows using it with the update command.

gcloud compute backend-services update [BACKEND_SERVICE_NAME] \
    --session-affinity generated_cookie \
    --affinity-cookie-ttl 86400

API


Consult the API reference for backend services.

Disabling session affinity

You can turn off session affinity by using the update command to set session affinity to none, or by using the edit command to set session affinity to none in a text editor. Either command can also be used to modify the cookie lifetime.

Console


To disable session affinity:

  1. In the Google Cloud Platform Console, you can disable session affinity in the Backend configuration portion of the HTTP(S) load balancer page.
    Go to the Load balancing page
  2. Select the Edit pencil for your load balancer.
  3. Select Backend configuration.
  4. Select the Edit pencil for a Backend service.
  5. Select None from the Session affinity drop-down menu to turn off session affinity.
  6. Click the Update button for the Backend service.
  7. Click the Update button for the load balancer.

gcloud


To disable session affinity, run the following command:

gcloud compute backend-services update [BACKEND_SERVICE_NAME] \
    --session-affinity none

OR

gcloud compute backend-services edit [BACKEND_SERVICE_NAME]

API


Consult the API reference for backend services.

Losing session affinity

Regardless of the type of affinity chosen, a client can lose affinity with the instance in the following scenarios.

  • The instance group runs out of capacity, and traffic has to be routed to a different zone. In this case, traffic from existing sessions may be sent to the new zone, breaking affinity. You can mitigate this by ensuring that your instance groups have enough capacity to handle all local users.
  • Autoscaling adds instances to, or removes instances from, the instance group. In either case, the backend service reallocates load, and the target may move. You can mitigate this by ensuring that the minimum number of instances provisioned by autoscaling is enough to handle expected load, then only using autoscaling for unexpected increases in load.
  • The target instance fails health checks. Affinity is lost as the session is moved to a healthy instance.
  • The balancing mode is set to CPU utilization, which may cause your computed capacities across zones to change, sending some traffic to another zone within the region. This is more likely at low traffic when computed capacity is less stable.

Configuring the timeout setting

For longer-lived connections to the backend service from the load balancer, configure a timeout setting longer than the 30-second default.

Console


To configure the timeout setting:

  1. In the Google Cloud Platform Console, you can modify the timeout setting in the Backend configuration portion of the HTTP(S) load balancer page.
    Go to the Load balancing page
  2. Select the Edit pencil for your load balancer.
  3. Select Backend configuration.
  4. Select the Edit pencil for the Backend service.
  5. On the line for Protocol, Port, and Timeout settings, select the Edit pencil.
  6. Enter a new Timeout Setting in seconds.
  7. Click the Update button for the Backend service.
  8. Click the Update button for the load balancer.

gcloud


To change the timeout setting with the gcloud command-line tool, use the gcloud compute backend-services update command. Append --help to the command for detailed information.

gcloud compute backend-services update [BACKEND_SERVICE] [--timeout=TIMEOUT]

API


Consult the API reference for backend services.

Named ports

A backend service sends traffic to its backends through a named port.

Named ports are key-value metadata representing the service name and the ports that the service is running on. The port name is mapped to one or more port numbers in each instance group. Named ports can be assigned to an instance group, which indicates that the service is available on all instances in the group. This information is used by the HTTP(S) Load Balancing service and TCP/SSL proxy.

Only one port name may be added to a backend service, and that name must exist as a service on all instance groups that are a part of the backend service. Each instance in an instance group has the same set of named ports.
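As a sketch, mapping a named port and pointing a backend service at it might look like the following; the service name, group name, zone, and port number are placeholders:

```shell
# Map the name "http" to port 8080 on all instances in the group.
gcloud compute instance-groups set-named-ports my-instance-group \
    --zone us-central1-b \
    --named-ports http:8080

# Tell the backend service to send traffic through that named port.
gcloud compute backend-services update my-backend-service \
    --port-name http \
    --global
```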

Health checks

Each backend service must have a Health Check associated with it. A health check runs continuously and its results help determine which instances should receive new requests.

Unhealthy instances do not receive new requests and continue to be polled. If an unhealthy instance passes a health check, it is deemed healthy and will begin receiving new connections.

Best practices for health checks

The best practice when configuring a health check is to check health and serve traffic on the same port. However, it is possible to perform health checks on one port while serving traffic on another. If you use two different ports, ensure that firewall rules and services running on instances are configured appropriately. If you run health checks and serve traffic on the same port, but decide to switch ports at some point, be sure to update both the backend service and the health check.

Backend services that do not have a valid global forwarding rule referencing them are not health checked and have no health status.

Creating health checks

You need to create a health check before you create a backend service. We recommend you create a health check for the same protocol as the traffic you are load balancing.

Console

In the console, you can create a health check when you create your backend service.

gcloud

For gcloud commands, see the Health Check page to create a health check.
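As a sketch, an HTTP health check with explicit thresholds might look like the following; the name and values are placeholder assumptions, and the Health Check page has the full set of options:

```shell
gcloud compute health-checks create http my-http-check \
    --port 80 \
    --request-path / \
    --check-interval 5s \
    --timeout 5s \
    --healthy-threshold 2 \
    --unhealthy-threshold 2
```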

API

For API commands, see the Health Check page to create a health check.

Creating a backend service

Console


To create a new backend service:

  1. In the Google Cloud Platform Console, you can create a backend service in the Backend configuration portion of the HTTP(S) load balancer page.
    Go to the Load balancing page
  2. Select the Edit pencil for your load balancer.
  3. Select Backend configuration.
  4. In the Create or select backend services & backend buckets drop-down menu, select Backend services, then select Create a backend service.
  5. Enter a Name for the backend service.
  6. Optionally, enter a Description.
  7. Select the Edit pencil on the line for Protocol, Port, and Timeout settings.
    • Choose a Protocol of http, https, or http2.
    • Enter a port name for the Named port.
    • Optionally, change the default Timeout setting.
  8. In the Backends dialog box, you can create one or more backends.
    1. In the New backend dialog box, select an existing Instance group from the drop-down menu.
    2. Enter one or more Port numbers, separated by commas, through which the backend will receive requests.
      • For a Protocol of http, this field is set to 80.
      • For a Protocol of https, this field is set to 443.
      • For a Protocol of http2, this field is set to 443.
    3. Set the percentage for Maximum CPU utilization.
    4. Optionally, set the Maximum RPS, leaving the field blank for unlimited. Set RPS Per instance or RPS Per group.
    5. Set the percentage for Capacity.
    6. Click the Done button.
  9. In the Health check drop-down menu, select an existing health check or Create another health check by completing the following steps:
    1. Set the Name.
    2. Optionally, set the Description.
    3. Set the Protocol and Port. Consult the Health checks section for best practices.
    4. Set the Proxy protocol.
    5. Optionally, set the values for Request and Response.
    6. Under Health Criteria, set the following items:
      1. Set the Check interval in seconds.
      2. Set the Timeout in seconds.
      3. Set the number of consecutive successes in Healthy threshold.
      4. Set the number of consecutive failures in Unhealthy threshold.
      5. Click Save and continue.
    7. Click Advanced Configurations to modify the Connection draining timeout.
    8. Check or uncheck the Enable Cloud Content Delivery Network box.
  10. Click the Update button for the Backend service.
  11. Click the Update button for the load balancer.

gcloud


To create a backend service using the gcloud command-line tool, consult the Cloud SDK documentation.
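As a sketch, creating a backend service might look like the following; the service name and port name are placeholders, and the command assumes a health check such as my-http-check already exists:

```shell
gcloud compute backend-services create my-backend-service \
    --protocol HTTP \
    --port-name http \
    --health-checks my-http-check \
    --timeout 30s \
    --global
```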

API


Consult the API reference for backend services.

Modifying a backend service

Console


To modify an existing backend service:

  1. In the Google Cloud Platform Console, you can edit a backend service in the Backend configuration portion of the HTTP(S) load balancer page.
    Go to the Load balancing page
  2. Select the Edit pencil for your load balancer.
  3. Select Backend configuration.
  4. Under Backend services, select the Edit pencil for a backend service. You can modify the following fields:
    1. Under Backends, add a new Backend or select the Edit pencil for an existing backend.
    2. Select the Edit pencil on the line for Protocol, Port, and Timeout settings.
    3. Select an existing Health Check or create a new one by following the previous health check creation steps.
    4. Modify the Session affinity and, when required, the Affinity cookie TTL.
    5. Click Advanced Configurations to modify the Connection draining timeout.
    6. Check or uncheck the Enable Cloud Content Delivery Network box.
    7. Click the Update button for the Backend service.
  5. Click the Update button for the load balancer.

gcloud


To modify a backend service using the gcloud command-line tool, consult the Cloud SDK documentation.
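As a sketch, a single update command can change several settings at once; the service name and values here are placeholders:

```shell
# Change the timeout and session affinity in one update.
gcloud compute backend-services update my-backend-service \
    --timeout 60s \
    --session-affinity generated_cookie \
    --affinity-cookie-ttl 3600 \
    --global
```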

API


To modify a backend service with the API, see the API docs.

Adding instance groups to a backend service

To define the instances that are included in a backend service, you must add a backend and assign an instance group to it. You must create the instance group before you add it to the backend.

When adding the instance group to the backend, you must also define certain parameters.

Console


To add an instance group to a backend service:

  1. In the Google Cloud Platform Console, you can add an instance group to a backend in the Backend configuration portion of the HTTP(S) load balancer page.
    Go to the Load balancing page
  2. Select the Edit pencil for your load balancer.
  3. Select Backend configuration.
  4. Select the Edit pencil for the Backend configuration.
  5. Select the Edit pencil for a Backend.
  6. In the Edit Backend dialog box, in the Instance group drop-down menu, select an Instance group.
  7. Click the Done button for Edit Backend.
  8. Click the Update button for the Backend service.
  9. Click the Update button for the load balancer.

gcloud


To add an instance group to a backend service using the gcloud command-line tool, consult the Cloud SDK documentation.
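As a sketch, attaching an existing instance group as a backend might look like the following; the names, zone, and rate are placeholder assumptions:

```shell
gcloud compute backend-services add-backend my-backend-service \
    --instance-group my-instance-group \
    --instance-group-zone us-central1-b \
    --balancing-mode RATE \
    --max-rate-per-instance 100 \
    --global
```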

API


To add an instance group to a backend service with the API, consult the API docs.

Viewing the results of a backend services health check

Once you have created your health checks and backend service, you can view the health check results.

Console


To view the result of a health check on a backend service:

  1. Go to the load balancing summary page. Go to the Load balancing page
  2. Click the name of a load balancer.
  3. Under Backend, for a Backend service, view the Healthy column in the Instance group table.

gcloud


To view the results of the latest health check with the gcloud command-line tool, use the backend-services get-health command.

gcloud compute backend-services get-health [BACKEND_SERVICE]

The command returns a healthState value for all instances in the specified backend service, with a value of either HEALTHY or UNHEALTHY:

  healthStatus:
    - healthState: UNHEALTHY
      instance: us-central1-b/instances/www-video1
    - healthState: HEALTHY
      instance: us-central1-b/instances/www-video2
  kind: compute#backendServiceGroupHealth
  

API


For API commands, consult the Health Check page.

User-defined request headers

User-defined request headers allow you to specify additional headers that the load balancer adds to requests. These headers can include information that the load balancer knows about the client connection, including the latency to the client, the geographic location of the client's IP address, and parameters of the TLS connection.

User-defined request headers are supported for backend services associated with HTTP(S) Load Balancers.

Note that Google Cloud Load Balancer adds certain headers by default to all HTTP(S) requests that it proxies to backends. For more information on this, see Target proxies.

How user-defined request headers work

To enable user-defined request headers, you specify a list of headers in a property of the Backend Service resource. The load balancer adds the headers to incoming HTTP requests.

You specify each header as a header-name:header-value string. The header must contain a colon separating the header name and header value. Header names have the following properties:

  • The header name and header value must be a valid HTTP header field definition (per RFC 7230 , with obsolete forms disallowed).
  • The header name must not be X-User-IP and it must not begin with X-Google or X-GFE.
  • A header name must not appear more than once in the list of added headers.

Header values have the following properties:

  • The header value can be blank.
  • The header value can include one or more variables, enclosed by curly braces, that expand to values that the load balancer provides. A list of variables allowed in the header value is in the next section.

For example, you might specify a header with two variable names, for the client region and client city:

X-Client-Geo-Location:{client_region},{client_city}

For clients located in Mountain View, California, the load balancer adds a header as follows:

X-Client-Geo-Location:US,Mountain View

Variables that can appear in the header value

The following variables can appear in request header values.

  • client_rtt_msec: Estimated round-trip transmission time between the load balancer and the HTTP(S) client, in milliseconds. This is the smoothed round-trip time (SRTT) parameter measured by the load balancer's TCP stack, per RFC 2988.
  • client_region: The country (or region) associated with the client's IP address. This is a Unicode CLDR region code, such as "US" or "FR". (For most countries, these codes correspond directly to ISO-3166-2 codes.)
  • client_region_subdivision: Subdivision, for example a province or state, of the country associated with the client's IP address. This is a Unicode CLDR subdivision ID, such as "USCA" or "CAON". (These Unicode codes are derived from the subdivisions defined by the ISO-3166-2 standard.)
  • client_city: Name of the city from which the request originated, for example, "Mountain View" for Mountain View, California. There is no canonical list of valid values for this variable. The city names may contain US-ASCII letters, numbers, spaces, and the following characters: !#$%&'*+-.^_`|~
  • client_city_lat_long: Latitude and longitude of the city from which the request originated, for example, "37.386051,-122.083851" for a request from Mountain View.
  • tls_sni_hostname: Server name indication (as defined in RFC 6066), if provided by the client during the TLS or QUIC handshake. The hostname is converted to lowercase and any trailing dot is removed.
  • tls_version: TLS version negotiated between client and load balancer during the SSL handshake. Possible values include "TLSv1", "TLSv1.1", "TLSv1.2", and, if the client connected using QUIC instead of TLS, "QUIC".
  • tls_cipher_suite: Cipher suite negotiated during the TLS handshake. The value is four hex digits defined by the IANA TLS Cipher Suite Registry, for example, "009C" for TLS_RSA_WITH_AES_128_GCM_SHA256. This value is empty for QUIC and for unencrypted client connections.

The load balancer expands variables to empty strings when it cannot determine their values, for example for geographic location variables when the IP address’s location is unknown, or for TLS parameters when TLS is not in use.

Geographic values (regions, subdivisions, and cities) are estimates based on the client’s IP address. From time to time, Google updates the data that provides these values in order to improve accuracy and to reflect geographic and political changes.

Headers added by the load balancer overwrite any existing headers that have the same name. Header names are case insensitive. When header names are passed to an HTTP/2 backend, the HTTP/2 protocol encodes header names as lower case.

In header values, leading whitespace and trailing whitespace are insignificant, and are not passed to the backend. To allow for curly braces in header values, the load balancer interprets two opening curly braces ({{) as a single opening brace ({), and two closing curly braces (}}) as a single closing brace (}).

Working with user-defined request headers

The following limitations apply to user-defined request headers:

  • You can specify a maximum of 16 custom request headers per backend service.
  • The total size of all user-defined request headers per backend service (name and value combined, and before variable expansion) cannot exceed 8 KB.

gcloud


To create a backend service with user-defined request headers:

gcloud beta compute backend-services create NAME \
  --protocol HTTPS \
  --https-health-check https-basic-check \
  --custom-request-header 'HEADER_NAME:[HEADER_VALUE]' \
  --custom-request-header 'HEADER_NAME:[HEADER_VALUE]'

To add user-defined request headers to an existing backend service:

gcloud beta compute backend-services update NAME \
  --custom-request-header 'HEADER_NAME:[HEADER_VALUE]' \
  --custom-request-header 'HEADER_NAME:[HEADER_VALUE]'

To remove all headers from a backend service:

gcloud beta compute backend-services update NAME \
  --no-custom-request-headers

API


Consult the API reference for backend services.

Troubleshooting issues with HTTP/2 to the backends

  • Invalid value for field resource.loadBalancingScheme: 'EXTERNAL'. Backend Service based Network Load Balancing is not yet supported.

This could happen if you create a backend service without selecting the global option. When you issue a gcloud command as follows, you are prompted to designate a region or designate the load balancer as global:

gcloud beta compute backend-services create service-test \
    --health-checks=hc-test \
    --project=test1 \
    --protocol=http2

For the following backend service:
 - [service-test]
choose a region or global:
 [1] global
 [2] region: asia-east1
 [3] region: asia-northeast1
 [4] region: asia-southeast1
 [5] region: australia-southeast1
 [6] region: europe-west1
 [7] region: europe-west2
 [8] region: europe-west3
 [9] region: southamerica-east1
 [10] region: us-central1
 [11] region: us-east1
 [12] region: us-east4
 [13] region: us-west1
Please enter your numeric choice:

For the L7 load balancer, the backend services must be global, so you must choose option 1 or issue the gcloud command with the --global option:

gcloud beta compute backend-services create service-test \
    --health-checks=hc-test \
    --project=test \
    --protocol=http2 \
    --global

  • Why do I get a 502 error?

Make sure that your backend is healthy and supports the HTTP/2 protocol. You can verify this by testing connectivity to the backend instance using HTTP/2. Ensure that the VM uses HTTP/2 spec-compliant cipher suites; for example, certain TLS 1.2 cipher suites are disallowed by HTTP/2. Refer to the TLS 1.2 Cipher Suite Black List.

After you verify that the VM uses the HTTP/2 protocol, make sure your firewall setup allows the health checker and load balancer to pass through.

If there are no problems with the firewall setup, ensure that the load balancer is configured to talk to the correct port on the VM.

HTTP/2 limitations

  • HTTP/2 between the load balancer and the backend can require significantly more TCP connections to the backend than HTTP(S). Connection pooling, an optimization that reduces the number of these connections with HTTP(S), is not currently available with HTTP/2.
  • HTTP/2 between the load balancer and the backend does not support:
    • server push
    • WebSockets
  • HTTP/2 to backends is not supported for Kubernetes Engine.

What's next

For related documentation and information on how backend services are used in load balancing, see the following:
