Troubleshooting Internal TCP/UDP Load Balancing

This guide describes how to troubleshoot configuration issues for a Google Cloud internal TCP/UDP load balancer.


The types of issues discussed in this guide include the following:

  • General connectivity issues
  • Backend failover issues
  • Load balancer as next-hop issues

Before you begin

Before investigating issues, familiarize yourself with the documentation on general connectivity, failover, and next-hop configuration for internal TCP/UDP load balancers.

Troubleshooting general connectivity issues

  • Symptom: I can't connect to my internal TCP/UDP load balancer from a VM client in another region.
  • Reason: Global access isn't enabled.

If you can't connect to an internal TCP/UDP load balancer, check for the following issues:

  • Ensure that ingress allow firewall rules are defined to permit health checks to backend VMs.
  • Ensure that ingress allow firewall rules allow traffic to the backend VMs from clients.
  • Make sure that the client connecting to the load balancer is in the same region as the load balancer. If the client is in another region, make sure that global access is enabled.
  • To verify that health check traffic reaches your backend VMs, enable health check logging and search for successful log entries.
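The checks above can be sketched as a few gcloud helper functions. This is a minimal sketch, not a definitive procedure: the function names, the firewall rule name, and all arguments (network, target tag, port, forwarding rule, region, health check name) are hypothetical placeholders. The source ranges 130.211.0.0/22 and 35.191.0.0/16 are the Google Cloud health check probe ranges, and the logging helper assumes a global TCP health check.

```shell
# Hypothetical helpers; every name and argument here is a placeholder.

# Usage: allow_health_checks NETWORK TARGET_TAG PORT
# Creates an ingress allow rule so health check probes can reach tagged backend VMs.
allow_health_checks() {
  gcloud compute firewall-rules create "fw-allow-health-checks-$1" \
    --network="$1" \
    --direction=INGRESS \
    --action=allow \
    --rules="tcp:$3" \
    --source-ranges=130.211.0.0/22,35.191.0.0/16 \
    --target-tags="$2"
}

# Usage: enable_global_access FORWARDING_RULE REGION
# Lets clients in any region connect to the load balancer's forwarding rule.
enable_global_access() {
  gcloud compute forwarding-rules update "$1" \
    --region="$2" \
    --allow-global-access
}

# Usage: enable_health_check_logging HEALTH_CHECK
# Turns on health check logging (assumes a global TCP health check).
enable_health_check_logging() {
  gcloud compute health-checks update tcp "$1" --enable-logging
}
```

A rule that allows traffic from clients has the same shape as the first helper, with the client address ranges as the source ranges.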

  • Symptom: I can't connect to my service through my internal TCP/UDP load balancer, but I can connect directly to a backend VM.

  • Reason: The Guest environment is either not running or is unable to communicate with the metadata server.

Check for the following:

  • Ensure that the Guest environment is installed and running on the backend VM.
  • Ensure the firewall rules within the guest operating system of the backend VM (iptables, Windows Firewall) don't block access to the metadata server.
  • Ensure the service on the backend VM is listening on the IP address of the load balancer's forwarding rule.
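On a Linux backend VM, two of these checks might look like the sketch below. It assumes that the guest environment installs a local route for the forwarding rule's IP address and that the metadata server is reachable at metadata.google.internal. The function names are hypothetical, and the load balancer IP address is passed as an argument.

```shell
# Hypothetical helpers for a Linux backend VM.

# Usage: lb_route_present LB_IP
# Succeeds if the local routing table has an entry for the load balancer IP.
lb_route_present() {
  ip route show table local | grep -qFw -- "$1"
}

# Usage: metadata_reachable
# Succeeds if the metadata server answers within five seconds.
metadata_reachable() {
  curl -sf -m 5 -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/id" >/dev/null
}
```

If `lb_route_present` fails, the guest environment is a likely suspect. If `metadata_reachable` fails, inspect the guest operating system's firewall rules.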

Troubleshooting Shared VPC issues

If you are using Shared VPC and you cannot create a new internal TCP/UDP load balancer in a particular subnet, an organization policy might be the cause. In the organization policy, add the subnet to the list of allowed subnets or contact your organization administrator. For more information, refer to the constraints/compute.restrictSharedVpcSubnetworks constraint.
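To see which subnets the policy currently allows, you can inspect the effective policy for the project where you are creating the load balancer. The following is a sketch; the function name is hypothetical and the project ID is a placeholder argument.

```shell
# Usage: show_shared_vpc_subnet_policy PROJECT_ID
# Prints the effective restrictSharedVpcSubnetworks policy for the project.
show_shared_vpc_subnet_policy() {
  gcloud resource-manager org-policies describe \
    compute.restrictSharedVpcSubnetworks \
    --project="$1" \
    --effective
}
```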

Troubleshooting failover issues

If you've configured failover for an internal TCP/UDP load balancer, the following sections describe the issues that can occur.


  • Make sure that you've designated at least one failover backend.
  • Verify your failover policy settings:
    • Failover ratio
    • Dropping traffic when all backend VMs are unhealthy
    • Disabling connection draining on failover
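One way to review these settings is to describe the backend service and print only the failover-related fields. This is a sketch: the function name is hypothetical, and the backend service name and region are placeholder arguments.

```shell
# Usage: show_failover_settings BACKEND_SERVICE REGION
# Prints the failover policy and the backends (including each backend's failover flag).
show_failover_settings() {
  gcloud compute backend-services describe "$1" \
    --region="$2" \
    --format="yaml(failoverPolicy,backends)"
}
```

In the output, a failover backend is one whose entry under `backends` has `failover: true`. The `failoverPolicy` block holds the failover ratio, the drop-traffic setting, and the connection draining setting.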

Issues with managed instance groups and failover

  • Symptom: The active pool is changing back and forth (flapping) between the primary and failover backends.
  • Possible reason: Using managed instance groups with autoscaling and failover might cause the active pool to repeatedly fail over and fail back between the primary and failover backends. Google Cloud doesn't prevent you from configuring failover with managed instance groups, because your deployment might benefit from this setup.

Disable connection draining restriction for failover groups

Disabling connection draining only works if the backend service is set up with protocol TCP.

The following error message appears if you create a backend service with protocol UDP while connection draining on failover is disabled:

gcloud compute backend-services create my-failover-bs \
  --global-health-checks \
  --load-balancing-scheme internal \
  --health-checks my-tcp-health-check \
  --region us-central1 \
  --no-connection-drain-on-failover \
  --failover-ratio 0.5 \
  --protocol UDP
ERROR: (gcloud.compute.backend-services.create) Invalid value for
[--protocol]: can only specify --connection-drain-on-failover if the protocol is

Traffic is sent to unexpected backend VMs

First check the following: If the client VM is also a backend VM of the load balancer, it's expected behavior that connections sent to the IP address of the load balancer's forwarding rule are always answered by the backend VM itself. For more information, refer to testing connections from a single client and sending requests from load balanced VMs.

If the client VM is not a backend VM of the load balancer:

  • For requests from a single client, refer to testing connections from a single client so that you understand the limitations of this method.

  • Ensure that you have configured ingress allow firewall rules to allow health checks.

  • For a failover configuration, make sure that you understand how membership in the active pool works, and when Google Cloud performs failover and failback. Inspect your load balancer's configuration:

    • Use the Cloud Console to check for the number of healthy backend VMs in each backend instance group. The Cloud Console also shows you which VMs are in the active pool.

    • Make sure that your load balancer's failover ratio is set appropriately. For example, if you have ten primary VMs and a failover ratio set to 0.2, this means Google Cloud performs a failover when fewer than two (10 × 0.2 = 2) primary VMs are healthy. A failover ratio of 0.0 has a special meaning: Google Cloud performs a failover when no primary VMs are healthy.
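The health check and the failover-ratio arithmetic described above can be sketched as follows. The function names are hypothetical. `should_fail_over` only models the rule as stated in this guide (fail over when fewer than total × ratio primary VMs are healthy, with 0.0 meaning fail over only when no primary VMs are healthy). It is not Google Cloud's implementation.

```shell
# Usage: show_backend_health BACKEND_SERVICE REGION
# Prints the health state that the load balancer reports for each backend VM.
show_backend_health() {
  gcloud compute backend-services get-health "$1" --region="$2"
}

# Usage: should_fail_over HEALTHY_PRIMARY TOTAL_PRIMARY FAILOVER_RATIO
# Exit status 0: the stated rule predicts a failover.
# Exit status 1: the primary backends stay active.
# Exit status 2: there are no primary VMs to evaluate.
should_fail_over() {
  awk -v healthy="$1" -v total="$2" -v ratio="$3" 'BEGIN {
    if (total <= 0) exit 2
    # A ratio of 0.0 is the special case: fail over only when nothing is healthy.
    if (ratio == 0) exit (healthy == 0 ? 0 : 1)
    # Otherwise fail over when the healthy fraction drops below the ratio.
    exit (healthy / total < ratio ? 0 : 1)
  }'
}
```

With ten primary VMs and a ratio of 0.2, `should_fail_over 1 10 0.2` succeeds (failover expected), while `should_fail_over 2 10 0.2` returns 1.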

Existing connections are terminated during failover or failback

Edit your backend service's failover policy. Ensure that connection draining on failover is enabled.
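With gcloud, that change might look like the sketch below. The function name is hypothetical, and the backend service name and region are placeholder arguments. As noted earlier in this guide, disabling connection draining on failover is only supported when the backend service protocol is TCP.

```shell
# Usage: enable_connection_drain_on_failover BACKEND_SERVICE REGION
# Re-enables connection draining on failover and failback for the backend service.
enable_connection_drain_on_failover() {
  gcloud compute backend-services update "$1" \
    --region="$2" \
    --connection-drain-on-failover
}
```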

Troubleshooting load balancer as next hop

When you set an internal TCP/UDP load balancer to be a next hop of a custom static route, the following issues might occur:


  • If you can't ping your backend, keep in mind that a route with the internal TCP/UDP load balancer set to be the next hop is only supported for TCP and UDP traffic. Other protocol packets, such as ICMP, are ignored by the load balancer. For more information, see TCP, UDP, and other protocol traffic.

  • When using an internal TCP/UDP load balancer as a next hop for a custom static route, all TCP and UDP traffic is delivered to the load balancer's healthy backend VMs, regardless of the protocol configured for the load balancer's internal backend service, and regardless of the port or ports configured on the load balancer's internal forwarding rule.

  • Ensure that you have created ingress allow firewall rules that correctly identify sources of traffic that should be delivered to backend VMs via the custom static route's next hop. Packets that arrive on backend VMs preserve their source IP addresses, even when delivered by way of a custom static route.

Invalid value for destination range

The destination range of a custom static route can't exactly match or be more specific than any subnet route in your VPC network. If you try to create such a route, you receive the following error message:

Invalid value for field 'resource.destRange': [ROUTE_DESTINATION].
[ROUTE_DESTINATION] hides the address space of the network .... Cannot change
the routing of packets destined for the network.

  • You cannot create a custom static route with a destination that exactly matches or is more specific (with a longer mask) than a subnet route. Refer to applicability and order for further information.

  • If packets go to an unexpected destination, remove other routes in your VPC network with more specific destinations. Review the routing order to understand Google Cloud route selection.
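To compare a planned destination range against existing subnet ranges and routes, you can list both. This is a sketch; the function names are hypothetical, and the network name is a placeholder argument.

```shell
# Usage: list_subnet_ranges NETWORK
# Lists the primary IP range of each subnet in the VPC network.
list_subnet_ranges() {
  gcloud compute networks subnets list \
    --network="$1" \
    --format="table(name,region.basename(),ipCidrRange)"
}

# Usage: list_routes
# Lists routes in the project with their destinations, priorities, and ILB next hops.
list_routes() {
  gcloud compute routes list \
    --format="table(name,network.basename(),destRange,priority,nextHopIlb)"
}
```

A custom static route whose destination equals or falls inside one of the listed subnet ranges is rejected. A listed route with a longer mask than yours takes precedence for the addresses it covers.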

Network tags are not supported

You cannot assign a network tag to a custom static route when the next hop is an internal TCP/UDP load balancer. For example, the following gcloud command produces the error message listed below:

$ gcloud compute routes create example-route \
--destination-range= \
--next-hop-ilb=internal-lb-forwarding-rule \

ERROR: (gcloud.compute.routes.create) Could not fetch resource:
 - Invalid value for field 'resource.tags': ''. Tag is not supported for routes
 with next hop ilb.
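For contrast, a route that omits network tags might be created as in the sketch below. The function name is hypothetical, and the route name, network, destination range, forwarding rule, and region are all placeholder arguments.

```shell
# Usage: create_ilb_next_hop_route ROUTE NETWORK DEST_RANGE FORWARDING_RULE REGION
# Creates a custom static route whose next hop is the load balancer; no --tags flag is passed.
create_ilb_next_hop_route() {
  gcloud compute routes create "$1" \
    --network="$2" \
    --destination-range="$3" \
    --next-hop-ilb="$4" \
    --next-hop-ilb-region="$5"
}
```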
