This guide describes how to troubleshoot configuration issues for a Google Cloud external HTTP(S) load balancer.
Overview
The types of issues discussed in this guide include the following:
- General connectivity issues
- Troubleshooting issues with HTTP/2 to the backends
- Custom origin and internet NEG issues
- Serverless NEG issues (beta)
Before you begin
Before investigating issues, familiarize yourself with the following pages.
For general connectivity:
For custom origins and internet NEGs:
- Overview of custom origins
- Overview of internet NEGs
- Setting up an HTTP(S) load balancer with a custom origin (internet NEG)
For serverless NEGs:
Troubleshooting general connectivity issues
Unexplained 502 errors
If 502 errors persist longer than a few minutes after you complete the load balancer configuration, it's likely that either:
- There's no firewall rule configured to allow health checks.
- The software on the backends isn't running.
To verify that health check traffic reaches your backend VMs, enable health check logging and search for successful log entries.
Load balanced traffic does not have the source address of the original client
Traffic from the load balancer to your instances has a source IP address in the ranges 130.211.0.0/22 and 35.191.0.0/16. When you view logs on your load-balanced instances, you do not see the source address of the original client. Instead, you see source addresses from these ranges.
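When reading backend logs, it can help to separate entries that came through the load balancer from other traffic. The following is a minimal sketch that uses only the two ranges named above; the function name `is_load_balancer_source` and the sample addresses are illustrative assumptions.

```python
import ipaddress

# Source ranges named in this guide for traffic from the load balancer.
LB_SOURCE_RANGES = [
    ipaddress.ip_network("130.211.0.0/22"),
    ipaddress.ip_network("35.191.0.0/16"),
]


def is_load_balancer_source(address: str) -> bool:
    """Return True if a logged source address falls inside one of the ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in LB_SOURCE_RANGES)


if __name__ == "__main__":
    # Hypothetical addresses taken from a backend access log.
    for logged in ["130.211.1.5", "35.191.200.1", "203.0.113.7"]:
        print(logged, is_load_balancer_source(logged))
```

An address inside either range indicates that the request was proxied by the load balancer, not that it originated there.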
Getting a permission error when trying to view an object in my Cloud Storage bucket
In order to serve objects through load balancing, the Cloud Storage objects must be publicly accessible. Make sure to update the permissions of the objects being served so they are publicly readable.
URL doesn't serve expected Cloud Storage object
The Cloud Storage object to serve is determined based on your URL map and the URL that you request. If the request path maps to a backend bucket in your URL map, the Cloud Storage object is determined by appending the full request path onto the Cloud Storage bucket that the URL map specifies.
For example, if you map /static/* to gs://[EXAMPLE_BUCKET], the request to https://<GCLB IP or Host>/static/path/to/content.jpg will try to serve gs://[EXAMPLE_BUCKET]/static/path/to/content.jpg. If that object doesn't exist, you will get the following error message instead of the object:
NoSuchKey
The specified key does not exist.
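To work out which object a request should resolve to, you can reproduce the mapping rule by hand. The sketch below assumes a simplified URL map that is just a dictionary of path prefixes to bucket names; `URL_MAP`, `expected_object`, and the host name are hypothetical and only illustrate that the full request path is appended to the bucket.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical, simplified URL map: path prefix -> backend bucket name.
URL_MAP = {"/static/": "EXAMPLE_BUCKET"}


def expected_object(request_url: str) -> Optional[str]:
    """Return the gs:// object a request would map to, or None if no rule matches."""
    path = urlsplit(request_url).path
    for prefix, bucket in URL_MAP.items():
        if path.startswith(prefix):
            # The full request path, including the matched prefix, is appended.
            return f"gs://{bucket}{path}"
    return None


if __name__ == "__main__":
    print(expected_object("https://lb.example.com/static/path/to/content.jpg"))
```

If the printed object name does not exist in the bucket, the request returns the NoSuchKey error shown above.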
Compression isn't working
HTTP(S) Load Balancing does not compress or decompress responses itself, but it can serve responses generated by your backend service that are compressed by using tools such as gzip or DEFLATE.
If responses served by HTTP(S) Load Balancing are not compressed but should be, check that the web server software running on your instances is configured to compress responses. By default, some web server software automatically disables compression for requests that include a Via header, which indicates that the request was forwarded by a proxy. Because it is a proxy, HTTP(S) Load Balancing adds a Via header to each request as required by the HTTP specification.
To enable compression, you may have to override your web server's default configuration to tell it to compress responses even if the request had a Via header.
To configure nginx backends to serve compressed responses proxied through an external HTTP(S) load balancer:
- Set the gzip_proxied directive appropriately (for example, to any).
- Set the gzip_vary directive to on.
To configure Apache backends to serve compressed responses proxied through an external HTTP(S) load balancer:
- Use the DEFLATE filter.
- Add Vary: Accept-Encoding to the response headers by using the mod_headers module.
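After changing the web server configuration, you can inspect the headers of a response fetched through the load balancer to confirm the effect. The following sketch assumes you have already captured the response headers as a dictionary (for example, from an HTTP client of your choice); `diagnose_compression` is a hypothetical helper, not part of any Google Cloud tool.

```python
from typing import Dict, List


def diagnose_compression(response_headers: Dict[str, str]) -> List[str]:
    """Return a list of findings; an empty list means both checks passed."""
    headers = {name.lower(): value for name, value in response_headers.items()}
    findings = []

    encoding = headers.get("content-encoding", "").lower()
    if encoding not in ("gzip", "deflate"):
        findings.append(
            "Response is not compressed; check how the backend treats "
            "requests that carry a Via header."
        )

    vary = [token.strip().lower() for token in headers.get("vary", "").split(",")]
    if "accept-encoding" not in vary:
        findings.append("Vary: Accept-Encoding is missing from the response.")

    return findings


if __name__ == "__main__":
    print(diagnose_compression({"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}))
    print(diagnose_compression({"Content-Type": "text/html"}))
```

The request that produced the headers must include an Accept-Encoding header, otherwise an uncompressed response is expected.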
Resolving HTTP 408 errors
With HTTP traffic, the maximum amount of time for the client to complete sending its request is equal to the backend service timeout. If you see HTTP 408 responses with the jsonPayload.statusDetails value client_timed_out, this means that there was insufficient progress while the request from the client was proxied or the response from the backend was proxied. If the problem is caused by clients that are experiencing performance issues, you can resolve this issue by increasing the backend service timeout.
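To judge how widespread the problem is before changing the timeout, you can count the matching entries in an export of the load balancer logs. This is a sketch under assumptions: it expects log entries already parsed into dictionaries, with the status code under `httpRequest.status` and the detail string under `jsonPayload.statusDetails`; verify both field names against your own log entries. `count_client_timeouts` and the sample entries are hypothetical.

```python
from typing import Iterable


def count_client_timeouts(entries: Iterable[dict]) -> int:
    """Count entries that are HTTP 408 responses marked client_timed_out."""
    count = 0
    for entry in entries:
        # Assumed entry layout; check it against a real exported log entry.
        status = entry.get("httpRequest", {}).get("status")
        details = entry.get("jsonPayload", {}).get("statusDetails")
        if status == 408 and details == "client_timed_out":
            count += 1
    return count


if __name__ == "__main__":
    sample = [
        {"httpRequest": {"status": 408}, "jsonPayload": {"statusDetails": "client_timed_out"}},
        {"httpRequest": {"status": 200}, "jsonPayload": {"statusDetails": "response_sent_by_backend"}},
    ]
    print(count_client_timeouts(sample))
```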
Troubleshooting issues with HTTP/2 to the backends
Invalid value for field resource.loadBalancingScheme: 'EXTERNAL'
This could happen if you create a backend service without selecting the global option. When you issue a gcloud command as follows, you are prompted to designate a region or designate the load balancer as global:
    gcloud beta compute backend-services create service-test \
        --health-checks=hc-test \
        --project=test1 \
        --protocol=http2
    For the following backend service:
     - [service-test]
    choose a region or global:
     [1] global
     [2] region: [REGION_A_NAME]
     [3] region: [REGION_B_NAME]
     ....
    Please enter your numeric choice:
For the HTTP(S) load balancer, the backend services must be global, so you must choose option 1 or issue the gcloud command with the --global option:
    gcloud beta compute backend-services create service-test \
        --health-checks=hc-test \
        --project=test \
        --protocol=http2 \
        --global
Unexplained 502 errors
Make sure that your backend instance is healthy and supports the HTTP/2 protocol. You can verify this by testing connectivity to the backend instance using HTTP/2. Ensure that the VM uses HTTP/2 spec-compliant cipher suites. For example, certain TLS 1.2 cipher suites are disallowed by HTTP/2. Refer to the TLS 1.2 Cipher Suite Black List.
After you verify that the VM uses the HTTP/2 protocol, make sure that your firewall setup allows traffic from the health checker and the load balancer to pass through.
If there are no problems with the firewall setup, ensure that the load balancer is configured to talk to the correct port on the VM.
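One way to test HTTP/2 connectivity is to open a TLS connection to the backend and see which protocol it negotiates. The sketch below assumes that the backend serves HTTP/2 over TLS and advertises it through ALPN; `negotiated_alpn` is a hypothetical helper built on the Python standard library. Certificate verification is turned off on purpose because the probe only inspects protocol negotiation, so do not reuse this context for real traffic.

```python
import socket
import ssl
from typing import Optional


def negotiated_alpn(host: str, port: int = 443, timeout: float = 5.0) -> Optional[str]:
    """Return the ALPN protocol the server selects ("h2" for HTTP/2), or None."""
    context = ssl.create_default_context()
    # Verification is disabled: this probe checks protocol support only.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    context.set_alpn_protocols(["h2", "http/1.1"])
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return tls.selected_alpn_protocol()
    except OSError:
        # Covers refused connections, timeouts, and TLS handshake failures.
        return None
```

A call such as `negotiated_alpn("10.0.0.2", 443)` (the address is a placeholder for your backend VM) that returns "h2" suggests the backend accepts HTTP/2. A result of "http/1.1" or None points to the backend configuration, the port, or the firewall.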
Troubleshooting custom origin and internet NEG issues
Traffic does not reach the endpoints
After you configure a service, the new endpoint becomes reachable through the external HTTP(S) load balancer when:
- The endpoint is attached to the internet NEG.
- The associated FQDN can be DNS resolved successfully (if you are using FQDN endpoint type).
- The endpoint is accessible over the internet.
If traffic cannot reach the endpoint, which results in a 502 error code for HTTP(S), check the following:
- Query the _cloud-eoips.googleusercontent.com DNS TXT record using a tool like dig or nslookup. Note the CIDRs (following ip4:) and ensure these ranges are allowed by your firewall or cloud Access Control List (ACL).
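Once you have the TXT record text from dig or nslookup, the CIDRs can be pulled out programmatically for comparison with your firewall rules. This is a minimal sketch: `extract_ip4_cidrs` is a hypothetical helper, and the sample record uses documentation-only ranges as placeholders, not the real values. The tokens surrounding the `ip4:` entries in the sample are also an assumption.

```python
import ipaddress
from typing import List


def extract_ip4_cidrs(txt_record: str) -> List[ipaddress.IPv4Network]:
    """Return every CIDR that follows an ip4: marker in the TXT record text."""
    cidrs = []
    # Quotes around the record are treated as whitespace before splitting.
    for token in txt_record.replace('"', " ").split():
        if token.startswith("ip4:"):
            cidrs.append(ipaddress.ip_network(token[len("ip4:"):]))
    return cidrs


if __name__ == "__main__":
    # Placeholder record; paste the real output of your DNS query here.
    sample = '"v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.0/24 ~all"'
    for network in extract_ip4_cidrs(sample):
        print(network)
```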
After configuring a custom origin, requests to the custom origin fail with a 5xx error
- Check Logging.
- Verify that the network endpoint group is configured with the correct IP:Port or FQDN:Port for your custom origin.
- If you are using FQDN, make sure that it is resolvable through Google Public DNS. You can verify that the FQDN is resolvable through Google Public DNS using these steps or the web interface directly.
- If you are accessing the HTTP(S) load balancer on its external IP only, and your origin web-server is expecting a hostname, ensure that you are sending a valid HTTP Host header to your backend by configuring a custom request header.
- If communicating to a backend over HTTPS or HTTP/2 (as set in the protocol field of the backend service) configured as an INTERNET_FQDN_PORT custom origin endpoint, ensure that your origin is presenting a valid TLS (SSL) certificate and that the configured FQDN matches a SAN (Subject Alternative Name) in the certificate's list of SANs. A valid certificate is defined as one that is signed by a public Certificate Authority and has not expired.
- When using INTERNET_FQDN_PORT custom origin endpoints, self-signed certificates are not accepted by the HTTPS load balancer, and are rejected.
- When using HTTPS or HTTP/2 with INTERNET_IP_PORT type endpoints, no SSL certificate validation or SAN check is performed. This means that self-signed certificates can be used. When using SSL, our recommendation is to use INTERNET_FQDN_PORT endpoints to make sure server certificates and SANs can be validated.
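If you have listed the SANs on your origin's certificate, you can check them against the configured FQDN before suspecting the load balancer. The sketch below is a simplified approximation of host name matching (exact match, or a wildcard covering a single leftmost label). It is not the load balancer's actual validation logic, and `fqdn_matches_san` and the sample names are hypothetical.

```python
from typing import Iterable


def fqdn_matches_san(fqdn: str, sans: Iterable[str]) -> bool:
    """Return True if the FQDN matches any SAN exactly or through a wildcard."""
    host = fqdn.rstrip(".").lower()
    for san in sans:
        pattern = san.rstrip(".").lower()
        if pattern == host:
            return True
        if pattern.startswith("*."):
            # A wildcard is assumed to cover exactly one leftmost label.
            _, _, parent = host.partition(".")
            if parent and parent == pattern[2:]:
                return True
    return False


if __name__ == "__main__":
    sans = ["example.com", "*.example.com"]
    print(fqdn_matches_san("origin.example.com", sans))
    print(fqdn_matches_san("a.b.example.com", sans))
```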
Responses from my custom origin are not cached by Cloud CDN
Ensure that:
- You have enabled Cloud CDN on the backend service containing the NEG that points to your custom origin by setting enableCDN to true.
- Responses served by your custom origin meet Cloud CDN caching requirements. For example, you are sending Cache-Control: public, max-age=3600 response headers from the origin.
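As a first check on the origin's response headers, you can test whether the Cache-Control value resembles the example above. This sketch covers only that one header and a few common directives; Cloud CDN's full caching requirements include more conditions than are checked here. `looks_cacheable` is a hypothetical helper.

```python
def looks_cacheable(cache_control: str) -> bool:
    """Rough check: public, a positive max-age, and no directive that forbids caching."""
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')

    if any(name in directives for name in ("private", "no-store", "no-cache")):
        return False

    max_age = directives.get("max-age", "")
    return "public" in directives and max_age.isdigit() and int(max_age) > 0


if __name__ == "__main__":
    print(looks_cacheable("public, max-age=3600"))
    print(looks_cacheable("private, max-age=3600"))
```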
Troubleshooting serverless NEG issues
Requests fail with a 404 error
Ensure that the underlying serverless resource (such as an App Engine, Cloud Functions, or Cloud Run (fully managed) service) is still running. If the serverless resource is deleted but the serverless NEG still exists, the external HTTP(S) load balancer continues to attempt to route requests to the nonexistent service. This results in a 404 response.
In general, an external HTTP(S) load balancer cannot detect if the underlying serverless resource is working as expected. This means that if your service in one region is returning errors but the overall Cloud Run (fully managed), Cloud Functions, or App Engine infrastructure in that region is operating normally, your external HTTP(S) load balancer will not automatically direct traffic away to other regions. Make sure you thoroughly test new versions of your services before routing user traffic to them.
Handling URL mask mismatches
If applying the configured URL mask to a user request URL doesn't result in a service name, or if it results in a service name that does not exist, the load balancer might handle these mismatches differently depending on the serverless compute platform in use.
Cloud Run (fully managed): In case of a URL mask mismatch, the load balancer returns an HTTP error 404 (Not Found).
Cloud Functions: In case of a URL mask mismatch, the load balancer returns an HTTP error 404 (Not Found).
App Engine: In case of a URL mask mismatch, App Engine uses dispatch.yaml and App Engine's default routing logic to determine which service to send the request to.