Troubleshooting Internal TCP/UDP Load Balancing

This guide describes how to troubleshoot configuration issues for a Google Cloud Platform internal TCP/UDP load balancer.

Overview

The types of issues discussed in this guide include the following:

  • General connectivity issues
  • Backend failover issues (beta)
  • Load balancer as next-hop issues (beta)

Before you begin

Before investigating issues, familiarize yourself with the following pages.

For general connectivity:

For failover:

For next hop:

Troubleshooting general connectivity issues

  • Symptom: I can't connect to my internal TCP/UDP load balancer from a VM client in another region.
  • Reason: Internal TCP/UDP load balancers are regional. They're accessible only from clients in their own region.

If you can't connect to an internal TCP/UDP load balancer, check for the following issues:

  • Ensure that ingress allow firewall rules are defined to permit health checks to backend VMs.
  • Ensure that ingress allow firewall rules allow traffic to the backend VMs from clients.
  • Make sure that the client connecting to the load balancer is in the same region as the load balancer.
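
The same-region check in the last item can be scripted. The following is a minimal Python sketch, assuming that zone names consist of the region name followed by a hyphen and a zone suffix (for example, us-central1-a is in us-central1). The function names and sample values are hypothetical and are not part of any Google Cloud tool.

```python
def region_of_zone(zone: str) -> str:
    """Return the region part of a zone name, e.g. 'us-central1-a' -> 'us-central1'."""
    region, sep, suffix = zone.rpartition("-")
    if not sep or not region or not suffix:
        raise ValueError(f"not a zone name: {zone!r}")
    return region


def client_in_lb_region(client_zone: str, lb_region: str) -> bool:
    """True if the client VM's zone belongs to the load balancer's region."""
    return region_of_zone(client_zone) == lb_region
```

For example, client_in_lb_region("europe-west1-b", "us-central1") is False, which corresponds to the symptom described at the start of this section.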

Troubleshooting failover issues

The following sections describe issues that can occur when you've configured failover for an internal TCP/UDP load balancer.

Connectivity

  • Make sure that you've designated at least one failover backend.
  • Verify your failover policy settings:
    • Failover ratio
    • Dropping traffic when all backend VMs are unhealthy
    • Disabling connection draining on failover

Issues with managed instance groups and failover

  • Symptom: The active pool is changing back and forth (flapping) between the primary and failover backends.
  • Possible reason: Using managed instance groups with autoscaling and failover might cause the active pool to repeatedly fail over and fail back between the primary and failover backends. GCP doesn't prevent you from configuring failover with managed instance groups, because your deployment might benefit from this setup.

Disable connection draining restriction for failover groups

You can disable connection draining on failover only if the backend service uses the TCP protocol.

The following error message appears if you create a backend service with the UDP protocol while connection draining on failover is disabled:

$ gcloud beta compute backend-services create my-failover-bs \
  --load-balancing-scheme internal \
  --health-checks my-tcp-health-check \
  --region us-central1 \
  --no-connection-drain-on-failover \
  --drop-traffic-if-unhealthy \
  --failover-ratio 0.5 \
  --protocol UDP

ERROR: (gcloud.beta.compute.backend-services.create) Invalid value for
[--protocol]: can only specify --connection-drain-on-failover if the protocol is
TCP.

API version for failover backends

Currently, failover options are only available in the Beta API. If creating a backend service fails with an error saying that a failover option is not a valid field, make sure that you've created the backend service using the correct API (gcloud beta compute backend-services...).

Traffic is sent to unexpected backend VMs

First check the following: If the client VM is also a backend VM of the load balancer, it's expected behavior that connections sent to the IP address of the load balancer's forwarding rule are always answered by the backend VM itself. For more information, refer to testing connections from a single client and sending requests from load balanced VMs.

If the client VM is not a backend VM of the load balancer:

  • For requests from a single client, refer to testing connections from a single client so that you understand the limitations of this method.

  • Ensure that you have configured ingress allow firewall rules to allow health checks.

  • For a failover configuration, make sure that you understand how membership in the active pool works, and when GCP performs failover and failback. Inspect your load balancer's configuration:

    • Use the GCP Console to check for the number of healthy backend VMs in each backend instance group. The GCP Console also shows you which VMs are in the active pool.

    • Make sure that your load balancer's failover ratio is set appropriately. For example, if you have ten primary VMs and a failover ratio set to 0.2, this means GCP performs a failover when fewer than two (10 × 0.2 = 2) primary VMs are healthy. A failover ratio of 0.0 has a special meaning: GCP performs a failover when no primary VMs are healthy.
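
The failover ratio arithmetic above can be written out as a small check. The following is a minimal Python sketch of the rule as this guide states it, not GCP's actual implementation. The function name and parameters are hypothetical, and the 0.0 to 1.0 range check is an assumption about valid ratio values.

```python
def should_fail_over(healthy_primary: int, total_primary: int,
                     failover_ratio: float) -> bool:
    """Apply the failover rule described in this guide to a healthy-VM count."""
    if not 0.0 <= failover_ratio <= 1.0:
        raise ValueError("failover_ratio must be between 0.0 and 1.0")
    if not 0 <= healthy_primary <= total_primary:
        raise ValueError("healthy_primary must be between 0 and total_primary")
    if failover_ratio == 0.0:
        # Special case: fail over only when no primary VM is healthy.
        return healthy_primary == 0
    # Otherwise fail over when the healthy count is below ratio * total.
    return healthy_primary < failover_ratio * total_primary
```

With ten primary VMs and a ratio of 0.2, should_fail_over(1, 10, 0.2) is True and should_fail_over(2, 10, 0.2) is False, matching the example in the text.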

Existing connections are terminated during failover or failback

Edit your backend service's failover policy. Ensure that connection draining on failover is enabled.

Troubleshooting load balancer as next hop

When you set an internal TCP/UDP load balancer to be a next hop of a custom static route, the following issues might occur:

Connectivity

  • If you can't ping your backend, keep in mind that a route with the internal TCP/UDP load balancer set to be the next hop is only supported for TCP and UDP traffic. Other protocol packets, such as ICMP, are dropped.

  • When using an internal TCP/UDP load balancer as a next hop for a custom static route, all TCP and UDP traffic is delivered to the load balancer's healthy backend VMs, regardless of the protocol configured for the load balancer's internal backend service, and regardless of the port or ports configured on the load balancer's internal forwarding rule.

  • Ensure that you have created ingress allow firewall rules that correctly identify sources of traffic that should be delivered to backend VMs via the custom static route's next hop. Packets that arrive on backend VMs preserve their source IP addresses, even when delivered by way of a custom static route.

Invalid value for destination range

The destination range of a custom static route can't exactly match or be more specific than any subnet route in your VPC network. If it does, you receive the following error message when creating the custom static route:

Invalid value for field 'resource.destRange': [ROUTE_DESTINATION].
[ROUTE_DESTINATION] hides the address space of the network .... Cannot change
the routing of packets destined for the network.

  • You cannot create a custom static route with a destination that exactly matches or is more specific (with a longer mask) than a subnet route. Refer to applicability and order for further information.

  • If packets go to an unexpected destination, remove other routes in your VPC network with more specific destinations. Review the routing order to understand GCP route selection.
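
Before creating a route, you can check a planned destination range against your subnet ranges for the conflict that causes the error above. The following is a minimal Python sketch that uses the standard ipaddress module. The function name and the sample ranges are hypothetical, and you would need to supply the subnet ranges of your own VPC network.

```python
import ipaddress
from typing import Iterable, List


def conflicting_subnets(route_dest: str, subnet_ranges: Iterable[str]) -> List[str]:
    """Return the subnet ranges that route_dest exactly matches or fits inside."""
    route = ipaddress.ip_network(route_dest)
    conflicts = []
    for cidr in subnet_ranges:
        subnet = ipaddress.ip_network(cidr)
        # subnet_of() is True when the route equals the subnet range or is
        # contained in it, that is, it has the same or a longer mask.
        if route.version == subnet.version and route.subnet_of(subnet):
            conflicts.append(cidr)
    return conflicts
```

For example, with a subnet range of 10.0.0.0/16, the destinations 10.0.0.0/16 and 10.0.1.0/24 are both reported as conflicts, while 10.0.0.0/8 and 0.0.0.0/0 are not, because they are less specific than the subnet route.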

API version for next hop

Currently, the internal TCP/UDP load balancer as next hop (--next-hop-ilb) is only available in the Beta API. If creation of a static route fails, make sure that you've created the route using the correct API (gcloud beta compute routes create...).

Network tags are not supported

You cannot assign a network tag to a custom static route when the next hop is an internal TCP/UDP load balancer. For example, the following gcloud command produces the error message listed below:

$ gcloud beta compute routes create example-route \
--destination-range=0.0.0.0/0 \
--next-hop-ilb=internal-lb-forwarding-rule \
--tags='my_tag'

ERROR: (gcloud.beta.compute.routes.create) Could not fetch resource:
 - Invalid value for field 'resource.tags': ''. Tag is not supported for routes
 with next hop ilb.

What's next
