Troubleshooting Internal TCP/UDP Load Balancing

This guide describes how to troubleshoot configuration issues for a Google Cloud Platform internal TCP/UDP load balancer. Before following this guide, familiarize yourself with the following:


Regional restriction for internal TCP/UDP load balancers

  • Symptom: I can't connect to my internal TCP/UDP load balancer from a VM client in another region.
  • Reason: Internal TCP/UDP load balancers are regional. They're accessible only from clients in their own region.

Unmanaged instance group restriction for failover groups

  • Symptom: The active pool is changing back and forth (flapping) between the primary and failover backends.
  • Possible reason: Using managed instance groups with autoscaling and failover might cause the active pool to repeatedly fail over and fail back between the primary and failover backends. GCP doesn't prevent you from configuring failover with managed instance groups, because your deployment might benefit from this setup.

Disable connection draining restriction for failover groups

Disabling connection draining only works if the backend service is set up with protocol TCP.

The following error message appears if you create a backend service with the UDP protocol while connection draining is disabled:

gcloud beta compute backend-services create my-failover-bs \
  --load-balancing-scheme internal \
  --health-checks my-tcp-health-check \
  --region us-central1 \
  --failover-ratio 0.5 \
  --protocol UDP
ERROR: (gcloud.beta.compute.backend-services.create) Invalid value for
[--protocol]: can only specify --connection-drain-on-failover if the protocol is

API version for failover backends

Currently, failover options are available only in the Beta API. If creating a backend service fails with an error stating that a failover option is not a valid field, make sure that you're creating the backend service with the Beta API (gcloud beta compute backend-services...).

Traffic is sent to unexpected backend VMs

Make sure that you understand how membership in the active pool works, and when GCP performs [failover and failback](failover-overview#failover_failback). Then, inspect your load balancer:

  • Use the GCP Console to check for the number of healthy backend VMs in each backend instance group. The GCP Console also shows you which VMs are in the active pool.
  • Ensure that you have configured ingress allow firewall rules to allow health checks.
  • Make sure that your load balancer's failover ratio is set appropriately. For example, if you have ten primary VMs and a failover ratio set to 0.2, this means GCP performs a failover when fewer than two (10 × 0.2 = 2) primary VMs are healthy. A failover ratio of 0.0 has a special meaning: GCP performs a failover when no primary VMs are healthy.
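
The failover ratio arithmetic described above can be sketched as a small shell helper. This only illustrates the threshold rule as stated in this section and is not GCP's implementation; the function name failover_decision and its three arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper illustrating the failover ratio rule described above.
# Usage: failover_decision HEALTHY_PRIMARY_VMS TOTAL_PRIMARY_VMS FAILOVER_RATIO
failover_decision() {
  awk -v h="$1" -v t="$2" -v r="$3" 'BEGIN {
    if (r == 0) {
      # Special case: a ratio of 0.0 fails over only when no primary VM is healthy.
      if (h == 0) print "failover"; else print "no failover"
    } else if (h < t * r) {
      # Fewer healthy primary VMs than (total primary VMs x failover ratio).
      print "failover"
    } else {
      print "no failover"
    }
  }'
}

failover_decision 1 10 0.2   # prints "failover"
failover_decision 2 10 0.2   # prints "no failover"
failover_decision 0 10 0.0   # prints "failover"
```

With ten primary VMs and a ratio of 0.2, the decision flips once the healthy count drops below two, which matches the worked example above.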

Existing connections are terminated during failover or failback

Edit your backend service's failover policy. Ensure that connection draining on failover is enabled.
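
One possible way to make this change from the command line is sketched below. It assumes that the --connection-drain-on-failover flag named in the error message earlier in this guide is accepted by gcloud beta compute backend-services update. The wrapper function name and the example values my-failover-bs and us-central1 are placeholders.

```shell
#!/bin/sh
# Sketch: re-enable connection draining on failover for a backend service.
# enable_failover_draining is a hypothetical wrapper, not a gcloud command.
# Usage: enable_failover_draining BACKEND_SERVICE REGION
enable_failover_draining() {
  backend_service=$1
  region=$2
  gcloud beta compute backend-services update "$backend_service" \
    --region "$region" \
    --connection-drain-on-failover
}

# Example call (placeholder values):
# enable_failover_draining my-failover-bs us-central1
```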

General connectivity issues

If you can't connect to an internal TCP/UDP load balancer:

  • Ensure that ingress allow firewall rules allow health checks.
  • Ensure that ingress allow firewall rules allow traffic to the backend VMs from clients.
  • Make sure that the client connecting to the load balancer is in the same region as the load balancer.
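
As an illustration of the health check firewall rule mentioned in the list above, the following sketch creates an ingress allow rule. The function name, the rule name prefix, the network, and the target tag are hypothetical. The source ranges 130.211.0.0/22 and 35.191.0.0/16 are assumed to be the health check probe ranges, so verify them against the current health check documentation. Allowing all TCP ports is also an assumption; narrow the rule to your health check port where possible.

```shell
#!/bin/sh
# Sketch: allow health check probes to reach backend VMs.
# allow_health_checks is a hypothetical wrapper, not a gcloud command.
# Usage: allow_health_checks NETWORK TARGET_TAG
allow_health_checks() {
  network=$1
  target_tag=$2
  # Source ranges are assumed to be the health check probe ranges.
  gcloud compute firewall-rules create "fw-allow-health-check-$network" \
    --network "$network" \
    --direction INGRESS \
    --action ALLOW \
    --rules tcp \
    --source-ranges 130.211.0.0/22,35.191.0.0/16 \
    --target-tags "$target_tag"
}

# Example call (placeholder values):
# allow_health_checks my-network lb-backend
```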

If you've configured failover for an internal TCP/UDP load balancer:

  • Make sure that you've designated at least one failover backend.
  • Verify your failover policy settings:
    • Failover ratio
    • Dropping traffic when all backend VMs are unhealthy
    • Disabling connection draining on failover
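
The checks above can be approximated from the command line with a sketch like the following. It assumes that the backend service resource exposes its failover settings under a failoverPolicy field and lists its backends under a backends field. The wrapper function name and the example values are placeholders.

```shell
#!/bin/sh
# Sketch: print the failover policy and backends of a backend service.
# show_failover_policy is a hypothetical wrapper, not a gcloud command.
# Usage: show_failover_policy BACKEND_SERVICE REGION
show_failover_policy() {
  backend_service=$1
  region=$2
  # failoverPolicy and backends are assumed field names of the resource.
  gcloud beta compute backend-services describe "$backend_service" \
    --region "$region" \
    --format "yaml(failoverPolicy,backends)"
}

# Example call (placeholder values):
# show_failover_policy my-failover-bs us-central1
```

In the output, compare the failover ratio, the drop-traffic setting, and the connection draining setting against what you intended, and confirm that at least one backend is marked as a failover backend.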
