This guide describes how to troubleshoot configuration issues for a Google Cloud internal Application Load Balancer. Before following this guide, familiarize yourself with the following:
- Internal Application Load Balancer overview
- Proxy-only subnets
- Internal Application Load Balancer logging and monitoring
Troubleshoot common issues with Network Analyzer
Network Analyzer automatically monitors your VPC network configuration and detects both suboptimal configurations and misconfigurations. It identifies network failures, provides root cause information, and suggests possible resolutions. To learn about the different misconfiguration scenarios that are automatically detected by Network Analyzer, see Load balancer insights in the Network Analyzer documentation.
Network Analyzer is available in the Google Cloud console as a part of Network Intelligence Center.
Go to Network Analyzer
Backends have incompatible balancing modes
When creating a load balancer, you might see the error:
Validation failed for instance group INSTANCE_GROUP: backend services 1 and 2 point to the same instance group but the backends have incompatible balancing_mode. Values should be the same.
This happens when two backend services point to the same instance group but use incompatible balancing modes, for example when you use the same instance group as a backend for two different load balancers. For more information, see Balancing modes in the backend services overview.
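For example, a minimal gcloud sketch to inspect and align the balancing mode of the shared instance group (all names and values here are placeholders):

```
# Inspect the balancing mode each backend service uses for the instance group.
gcloud compute backend-services describe BACKEND_SERVICE_1 \
    --region=REGION \
    --format="yaml(backends)"

# If needed, switch the backend to a compatible balancing mode (RATE shown as
# one possible choice).
gcloud compute backend-services update-backend BACKEND_SERVICE_1 \
    --region=REGION \
    --instance-group=INSTANCE_GROUP \
    --instance-group-zone=ZONE \
    --balancing-mode=RATE \
    --max-rate-per-instance=100
```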
Load balanced traffic does not have the source address of the original client
This is expected behavior. An internal Application Load Balancer operates as an HTTP(S) reverse proxy (gateway). When a client program opens a connection to the IP address of an INTERNAL_MANAGED forwarding rule, the connection terminates at a proxy. The proxy processes the requests that arrive over that connection. For each request, the proxy selects a backend to receive the request based on the URL map and other factors. The proxy then sends the request to the selected backend. As a result, from the point of view of the backend, the source of an incoming packet is an IP address from the region's proxy-only subnet.
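To confirm which source addresses your backends see, you can list the proxy-only subnets in the load balancer's region; a sketch, assuming the subnet uses the REGIONAL_MANAGED_PROXY purpose (older deployments might use INTERNAL_HTTPS_LOAD_BALANCER):

```
# The primary range of the proxy-only subnet is the source range that backends
# see for proxied traffic.
gcloud compute networks subnets list \
    --filter="region:REGION AND purpose=REGIONAL_MANAGED_PROXY" \
    --format="table(name, region, ipCidrRange)"
```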
Requests are rejected by the load balancer
For each request, the proxy selects a backend to receive the request based on a path matcher in the load balancer's URL map. If the URL map doesn't have a path matcher defined for a request, it cannot select a backend service, so it returns an HTTP 404 (Not Found) response code.
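To review which host rules, path matchers, and default service the URL map defines, you can describe it; a sketch for a regional URL map with placeholder names:

```
# Show the URL map's routing configuration.
gcloud compute url-maps describe URL_MAP_NAME \
    --region=REGION \
    --format="yaml(defaultService, hostRules, pathMatchers)"
```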
Load balancer doesn't connect to backends
The firewalls protecting your backend servers must be configured to allow ingress traffic from the proxies in the proxy-only subnet range that you allocated in your internal Application Load Balancer's region.
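For example, a firewall rule along these lines admits proxy traffic; the source range shown is the commonly documented example proxy-only subnet range, and the rule name, network, ports, and target tag are placeholders:

```
# Allow proxy-to-backend traffic from the proxy-only subnet range.
gcloud compute firewall-rules create fw-allow-proxy-only-subnet \
    --network=NETWORK \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:80,tcp:443,tcp:8080 \
    --source-ranges=10.129.0.0/23 \
    --target-tags=load-balanced-backend
```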
The proxies connect to backends using the connection settings specified by the configuration of your backend service. If these values don't match the configuration of the server(s) running on your backends, the proxy cannot forward requests to the backends.
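To compare both sides, you can inspect the backend service's settings and the instance group's named ports; a sketch with placeholder names:

```
# Protocol, named port, and timeout that the proxies use.
gcloud compute backend-services describe BACKEND_SERVICE \
    --region=REGION \
    --format="yaml(protocol, portName, timeoutSec)"

# Port number that the instance group maps to that named port.
gcloud compute instance-groups get-named-ports INSTANCE_GROUP --zone=ZONE
```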
Health check probes can't reach the backends
To verify that health check traffic reaches your backend VMs, enable health check logging and search for successful log entries.
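For example, assuming a regional HTTP health check, logging can be enabled with gcloud (names are placeholders):

```
# Enable logging on the health check used by the backend service.
gcloud compute health-checks update http HEALTH_CHECK_NAME \
    --region=REGION \
    --enable-logging
```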
Clients cannot connect to the load balancer
The proxies listen for connections on the load balancer's IP address and port configured in the forwarding rule (for example, 10.1.2.3:80), using the protocol specified in the forwarding rule (HTTP or HTTPS). If your clients can't connect, ensure that they are using the correct address, port, and protocol.
Ensure that a firewall isn't blocking traffic between your client instances and the load balanced IP address.
Ensure that the clients are in the same region as the load balancer. The internal Application Load Balancer is a regional product, so all clients (and backends) must be in the same region as the load balancer resource.
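A quick way to test is to call the forwarding rule's address and port directly from a client VM in the same region and VPC network, for example using the address from the example above:

```
# Run from a client VM in the same region and VPC network as the load balancer.
curl -v http://10.1.2.3:80/
```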
Organizational policy restriction for Shared VPC
If you are using Shared VPC and you cannot create a new internal Application Load Balancer in a particular subnet, an organization policy might be the cause. In the organization policy, add the subnet to the list of allowed subnets, or contact your organization administrator. For more information, see constraints/compute.restrictSharedVpcSubnetworks.
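To check whether the constraint is enforced and which subnets it allows, someone with organization-level permissions can view the effective policy; a sketch with a placeholder organization ID:

```
# View the effective policy for the Shared VPC subnet constraint.
gcloud resource-manager org-policies describe \
    compute.restrictSharedVpcSubnetworks \
    --effective \
    --organization=ORG_ID
```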
Load balancer doesn't distribute traffic evenly across zones
You might observe an imbalance in your internal Application Load Balancer traffic across zones, especially when utilization of your backend capacity is low (less than 10%). This imbalance can increase overall latency because traffic is sent to only a few servers in one zone.
To even out the traffic distribution across zones, you can make the following configuration changes, as shown in the example after this list:
- Use the RATE balancing mode with a low max-rate-per-instance target capacity.
- Use the LocalityLbPolicy backend traffic policy with a load balancing algorithm of LEAST_REQUEST.
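A sketch of both changes with gcloud; the names are placeholders, and the max-rate-per-instance value should reflect your actual per-instance capacity:

```
# Switch the backend to RATE balancing mode with a low per-instance target.
gcloud compute backend-services update-backend BACKEND_SERVICE \
    --region=REGION \
    --instance-group=INSTANCE_GROUP \
    --instance-group-zone=ZONE \
    --balancing-mode=RATE \
    --max-rate-per-instance=10

# Use the LEAST_REQUEST locality load balancing policy.
gcloud compute backend-services update BACKEND_SERVICE \
    --region=REGION \
    --locality-lb-policy=LEAST_REQUEST
```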
Unexplained 5xx errors
For error conditions caused by a communications issue between the load balancer proxy and its backends, the load balancer generates an HTTP 5xx status code and returns that status code to the client. Not all HTTP 5xx errors are generated by the load balancer; for example, if a backend sends an HTTP 5xx response to the load balancer, the load balancer relays that response to its client. To determine whether an HTTP 5xx response was relayed from a backend or generated by the load balancer proxy, see the proxyStatus field of the load balancer logs.
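For example, you can pull recent 5xx log entries and their proxyStatus values with gcloud; this sketch assumes the internal Application Load Balancer's log resource type, internal_http_lb_rule:

```
# List recent 5xx responses with the proxyStatus field.
gcloud logging read \
    'resource.type="internal_http_lb_rule" AND httpRequest.status>=500' \
    --limit=20 \
    --format="table(timestamp, httpRequest.status, jsonPayload.proxyStatus)"
```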
Configuration changes to the internal Application Load Balancer, such as the addition or removal of a backend service, can result in a brief period of time during which users see the HTTP status code 503. While these configuration changes propagate to Envoys globally, you see log entries where the proxyStatus field matches the connection_refused log string.
If HTTP 5xx status codes persist longer than a few minutes after you complete the load balancer configuration, take the following steps to troubleshoot HTTP 5xx responses:
- Verify that there is a firewall rule configured to allow health checks (see the example after this list). In the absence of one, load balancer logs typically have a proxyStatus matching destination_unavailable, which indicates that the load balancer considers the backend to be unavailable.
- Verify that health check traffic reaches your backend VMs. To do this, enable health check logging and search for successful log entries. For new load balancers, the lack of successful health check log entries doesn't mean that health check traffic is not reaching your backends. It might mean that the backend's initial health state has not yet changed from UNHEALTHY to a different state. You see successful health check log entries only after the health check prober receives an HTTP 200 OK response from the backend.
- Verify that the keepalive configuration parameter for the HTTP server software running on the backend instance is not less than the keepalive timeout of the load balancer, which is fixed at 10 minutes (600 seconds) and is not configurable. The load balancer generates an HTTP 5xx status code when the connection to the backend closes unexpectedly while the HTTP request is being sent or before the complete HTTP response has been received. This can happen when the keepalive configuration parameter for the web server software running on the backend instance is less than the load balancer's fixed keepalive timeout. Ensure that the keepalive timeout for the HTTP server software on each backend is set to slightly more than 10 minutes; the recommended value is 620 seconds (see the check after this list).
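As a sketch of the firewall and keepalive checks above: the source ranges shown are Google's documented health check probe ranges, while the rule name, network, target tag, and the nginx configuration path are assumptions:

```
# Allow Google Cloud health check probes to reach the backends.
gcloud compute firewall-rules create fw-allow-health-check \
    --network=NETWORK \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp \
    --source-ranges=130.211.0.0/22,35.191.0.0/16 \
    --target-tags=load-balanced-backend

# On an nginx backend (as one example of HTTP server software), confirm that
# the keepalive timeout exceeds the load balancer's fixed 600 seconds, for
# example: keepalive_timeout 620s;
grep -ri keepalive_timeout /etc/nginx/
```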
Limitations
If you are having trouble using an internal Application Load Balancer with other Google Cloud networking features, note the current compatibility limitations.