You can configure an internal TCP/UDP load balancer to distribute connections among virtual machine (VM) instances in primary backends, and then switch, if needed, to using failover backends. Failover provides one method of increasing availability, while also giving you greater control over how to manage your workload when your primary backend VMs aren't healthy.
This page describes concepts and requirements specific to failover for internal TCP/UDP load balancers. Make sure that you are familiar with the conceptual information in the following documents before you configure failover for Internal TCP/UDP Load Balancing:
These concepts are important to understand because configuring failover modifies the internal TCP/UDP load balancer's standard traffic distribution algorithm.
By default, when you add a backend to an internal TCP/UDP load balancer's backend service, that backend is a primary backend. You can designate a backend to be a failover backend when you add it to the load balancer's backend service, or by editing the backend service later. Failover backends only receive connections from the load balancer after a configurable ratio of primary VMs don't pass health checks.
Supported instance groups
Managed and unmanaged instance groups are supported as backends. For simplicity, the examples on this page show unmanaged instance groups.
Using managed instance groups with autoscaling and failover might cause the active pool to repeatedly fail over and fail back between the primary and failover backends. Google Cloud doesn't prevent you from configuring failover with managed instance groups, because your deployment might benefit from this setup.
The following simple example depicts an internal TCP/UDP load balancer with one
primary backend and one failover backend. The primary backend is an unmanaged
instance group in
us-west1-a, and the failover backend is a different unmanaged
instance group in
The next example depicts an internal TCP/UDP load balancer with two primary
backends and two failover backends, both distributed between two zones in the
us-west1 region. This configuration increases reliability because it doesn't
depend on a single zone for all primary or all failover backends.
- Primary backends are unmanaged instance groups
- Failover backends are unmanaged instance groups
During failover, both of the primary backends become inactive, while the healthy VMs in both of the failover backends become active. For a full explanation of how failover works in this example, see Failover example.
About backend instance groups and VMs
The unmanaged instance groups in Internal TCP/UDP Load Balancing are either primary backends or failover backends. You can designate a backend to be a failover backend at the time that you add it to the backend service or by editing the backend after you add it. Otherwise, unmanaged instance groups are primary, by default.
You can configure multiple primary backends and multiple failover backends in a single internal TCP/UDP load balancer by adding them to the load balancer's backend service.
A primary VM is a member of an instance group that you've defined to be a primary backend. The VMs in a primary backend participate in the load balancer's active pool (described in the next section), unless the load balancer switches to using its failover backends.
A backup VM is a member of an instance group that you've defined to be a failover backend. VMs in a failover backend participate in the load balancer's active pool when a configurable ratio of primary VMs become unhealthy.
VMs - You can have up to 250 VMs in the active pool. In other words, your primary backend instance groups can have a total of up to 250 primary VMs, and your failover backend instance groups can have a total of up to 250 backup VMs.
Unmanaged instance groups - You can have up to 50 primary backend instance groups and up to 50 failover backend instance groups.
As an example, a maximum deployment might have 5 primary backends and 5 failover backends, with each instance group containing 50 VMs.
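The limits above can be checked with a few lines of arithmetic. The following is a minimal sketch, not an official tool: the function name `check_limits` and its input shape (a list of VM counts, one per instance group) are assumptions made for illustration. It applies the limits to one role at a time, because the primary and failover limits are counted separately.

```python
# Limits as stated above; they apply separately to primary and to failover backends.
MAX_VMS_PER_ROLE = 250
MAX_GROUPS_PER_ROLE = 50


def check_limits(group_sizes):
    """Return a list of limit violations for one role (primary or failover).

    group_sizes: list of VM counts, one entry per instance group (hypothetical input shape).
    """
    problems = []
    if len(group_sizes) > MAX_GROUPS_PER_ROLE:
        problems.append(
            f"{len(group_sizes)} instance groups exceeds the limit of {MAX_GROUPS_PER_ROLE}"
        )
    total_vms = sum(group_sizes)
    if total_vms > MAX_VMS_PER_ROLE:
        problems.append(f"{total_vms} VMs exceeds the limit of {MAX_VMS_PER_ROLE}")
    return problems


# The maximum deployment from the example: 5 groups of 50 VMs for each role.
primary_problems = check_limits([50] * 5)
failover_problems = check_limits([50] * 5)
```

Both lists are empty for the example deployment, because 5 × 50 = 250 VMs per role is exactly at the limit.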
The active pool is the collection of backend VMs to which an internal TCP/UDP load balancer sends new connections. Membership of backend VMs in the active pool is computed automatically based on which backends are healthy and conditions that you can specify, as described in Failover ratio.
The active pool never combines primary VMs and backup VMs. The following examples clarify the membership possibilities. During failover, the active pool contains only backup VMs. During normal operation (failback), the active pool contains only primary VMs.
Failover and failback
Failover and failback are the automatic processes that switch backend VMs into or out of the load balancer's active pool. When Google Cloud removes primary VMs from the active pool and adds healthy backup VMs to the active pool, the process is called failover. When Google Cloud reverses this, the process is called failback.
A failover policy is a collection of parameters that Google Cloud uses for failover and failback. Each internal TCP/UDP load balancer has one failover policy, but the failover policy has multiple settings, as follows:
- Failover ratio
- Dropping traffic when all backend VMs are unhealthy
- Connection draining on failover and failback
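To make the three settings concrete, here is a sketch of how a failover policy might appear as a fragment of a backend service resource body. The field names (`failoverPolicy`, `failoverRatio`, `dropTrafficIfUnhealthy`, `disableConnectionDrainOnFailover`) are given as I understand the Compute Engine API; verify them against the API reference before relying on them. The values are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical values; one entry per setting listed above.
failover_policy = {
    # Failover ratio: fraction of primary VMs that must be healthy.
    "failoverRatio": 0.5,
    # Dropping traffic when all backend VMs are unhealthy (False keeps the last-resort behavior).
    "dropTrafficIfUnhealthy": False,
    # Connection draining on failover and failback (False leaves draining enabled).
    "disableConnectionDrainOnFailover": False,
}

# Assumed shape of the fragment inside a backend service resource.
backend_service_fragment = {"failoverPolicy": failover_policy}

body = json.dumps(backend_service_fragment, indent=2)
print(body)
```

This only builds and prints the JSON fragment; it does not call any Google Cloud API.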
A configurable failover ratio
determines when Google Cloud performs a failover or failback, changing
membership in the active pool. The ratio can be from
If you don't specify a failover ratio, Google Cloud uses a default value of
0.0. It's a best practice to set your failover ratio to a number that works
for your use case rather than relying on this default.
|Conditions||VMs in active pool|
||All healthy primary VMs|
|If at least one backup VM is healthy and:
||All healthy backup VMs|
|When all primary VMs and all backup VMs are unhealthy and you haven't configured your load balancer to drop traffic during this situation||All primary VMs, as a last resort|
The following examples clarify membership in the active pool. Refer to the Failover example for an example with calculations.
- A failover ratio of 1.0 requires all primary VMs to be healthy. When at least one primary VM becomes unhealthy, Google Cloud performs a failover, moving the backup VMs into the active pool.
- A failover ratio of 0.1 requires that at least 10% of the primary VMs be healthy; otherwise, Google Cloud performs a failover.
- A failover ratio of 0.0 means that Google Cloud performs a failover only when all of the primary VMs are unhealthy. Failover doesn't happen if at least one primary VM is healthy.
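The membership rules described above can be sketched as a small function. This is an illustrative model, not Google Cloud's implementation: the function name `active_pool`, its parameters, and the dictionary inputs are all assumptions. One case is not spelled out in the text above (too few healthy primary VMs while no backup VM is healthy); the sketch assumes the healthy primary VMs keep receiving connections in that case, and marks the assumption in a comment.

```python
def active_pool(primary_health, backup_health, failover_ratio=0.0,
                drop_traffic_if_unhealthy=False):
    """Return the VM names that would receive new connections.

    primary_health, backup_health: dicts mapping a VM name to True (healthy)
    or False (unhealthy). All names here are hypothetical.
    """
    healthy_primary = [vm for vm, ok in primary_health.items() if ok]
    healthy_backup = [vm for vm, ok in backup_health.items() if ok]

    if failover_ratio == 0.0:
        # Ratio 0.0: stay on primaries as long as at least one is healthy.
        primaries_sufficient = len(healthy_primary) > 0
    else:
        # Otherwise the healthy fraction must be at least the failover ratio.
        total = len(primary_health)
        primaries_sufficient = total > 0 and len(healthy_primary) / total >= failover_ratio

    if primaries_sufficient:
        return healthy_primary          # normal operation
    if healthy_backup:
        return healthy_backup           # failover
    if healthy_primary:
        # ASSUMPTION: this case is not covered by the text above.
        return healthy_primary
    if drop_traffic_if_unhealthy:
        return []                       # new connections are dropped
    return list(primary_health)         # all primary VMs, as a last resort
```

For example, with four primary VMs, a failover ratio of 1.0, and one unhealthy primary VM, the function returns the healthy backup VMs. With a ratio of 0.0 and the same health states, it returns the three healthy primary VMs.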
An internal TCP/UDP load balancer distributes connections among VMs in the active pool according to the traffic distribution algorithm.
Dropping traffic when all backend VMs are unhealthy
By default, when all primary and backup VMs are unhealthy, Google Cloud distributes new connections among all primary VMs. It does so as a last resort.
If you prefer, you can configure your internal TCP/UDP load balancer to drop new connections when all primary and backup VMs are unhealthy.
Connection draining on failover and failback
Connection draining allows existing TCP sessions to remain active for up to a configurable time period even after backend VMs become unhealthy. If the protocol for your load balancer is TCP:
By default, connection draining is enabled. Existing TCP sessions can persist on a backend VM for up to 300 seconds (5 minutes), even if the backend VM becomes unhealthy or isn't in the load balancer's active pool.
You can disable connection draining during failover and failback events. Disabling connection draining during failover and failback ensures that all TCP sessions, including established ones, are quickly terminated. Connections to backend VMs might be closed with a TCP reset packet.
Disabling connection draining on failover and failback is useful for scenarios such as:
Patching backend VMs: Prior to patching, configure your primary VMs to fail health checks so that the load balancer performs a failover. Disabling connection draining ensures that all connections are moved to the backup VMs quickly and in a planned fashion. This allows you to install updates and restart the primary VMs without existing connections persisting. After patching, Google Cloud can perform a failback when a sufficient number of primary VMs (as defined by the failover ratio) pass their health checks.
Single backend VM for data consistency: If you need to ensure that only one primary VM is the destination for all connections, disable connection draining so that switching from a primary to a backup VM does not allow existing connections to persist on both. This reduces the possibility of data inconsistencies by keeping just one backend VM active at any given time.
The following example describes failover behavior for the multi-zone internal TCP/UDP load balancer example presented in the architecture section.
The primary backends for this load balancer are the unmanaged instance groups
us-west1-c. Each instance group contains
two VMs. All four VMs from both instance groups are primary VMs:
The failover backends for this load balancer are the unmanaged instance groups
us-west1-c. Each instance group contains
two VMs. All four VMs from both instance groups are backup VMs:
Suppose you want to configure a failover policy for this load balancer such that
new connections are delivered to backup VMs when the number of healthy primary
VMs is less than two. To accomplish this, set the failover ratio to
0.5 (50%). Google Cloud uses the failover ratio to calculate the minimum
number of primary VMs that must be healthy by multiplying the failover ratio by
the number of primary VMs:
4 × 0.5 = 2
When all four primary VMs are healthy, Google Cloud distributes new connections to all of them. When primary VMs fail health checks:
- If vm-a1 and vm-d1 become unhealthy, Google Cloud distributes new connections between the remaining two healthy primary VMs, vm-a2 and vm-d2, because the number of healthy primary VMs is at least the minimum.
- If vm-a2 also fails health checks, leaving only one healthy primary VM, vm-d2, Google Cloud recognizes that the number of healthy primary VMs is less than the minimum, so it performs a failover. The active pool is set to the four healthy backup VMs, and new connections are distributed among those four (in the failover instance groups, which include ig-c). Even though vm-d2 remains healthy, it is removed from the active pool and does not receive new connections.
- If vm-a2 recovers and passes its health check, Google Cloud recognizes that the number of healthy primary VMs is at least the minimum of two, so it performs a failback. The active pool is set to the two healthy primary VMs, vm-a2 and vm-d2, and new connections are distributed between them. All backup VMs are removed from the active pool.
- As other primary VMs recover and pass their health checks, Google Cloud adds them to the active pool. For example, if vm-a1 becomes healthy, Google Cloud sets the active pool to the three healthy primary VMs, vm-a1, vm-a2, and vm-d2, and distributes new connections among them.
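The sequence in this example can be replayed in a few lines. This is a sketch under stated assumptions: the primary VM names are the ones used in the example, the backup VM names (`backup-1` through `backup-4`) are hypothetical placeholders, all backup VMs are assumed to stay healthy, and the order of events is my reading of the steps above.

```python
PRIMARY = ["vm-a1", "vm-a2", "vm-d1", "vm-d2"]
BACKUP = ["backup-1", "backup-2", "backup-3", "backup-4"]  # hypothetical names

FAILOVER_RATIO = 0.5
MIN_HEALTHY = FAILOVER_RATIO * len(PRIMARY)  # 4 x 0.5 = 2


def pool(unhealthy_primaries):
    """Return the active pool, assuming every backup VM is healthy."""
    healthy = [vm for vm in PRIMARY if vm not in unhealthy_primaries]
    if len(healthy) >= MIN_HEALTHY:
        return healthy      # normal operation or failback
    return BACKUP           # failover


# Each step lists the primary VMs that are failing health checks at that point.
steps = [
    ("all primary VMs healthy", set()),
    ("vm-a1 and vm-d1 fail", {"vm-a1", "vm-d1"}),
    ("vm-a2 also fails", {"vm-a1", "vm-a2", "vm-d1"}),
    ("vm-a2 recovers", {"vm-a1", "vm-d1"}),
    ("vm-a1 recovers", {"vm-d1"}),
]

for label, unhealthy in steps:
    print(f"{label}: {pool(unhealthy)}")
```

The only step that returns the backup VMs is the third one, where a single healthy primary VM is below the minimum of two.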
- See Internal TCP/UDP Load Balancing Concepts for important fundamentals.
- See Internal Load Balancing and DNS Names for available DNS name options your load balancer can use.
- See Setting Up Internal TCP/UDP Load Balancing for an example internal TCP/UDP load balancer configuration.
- See Configuring failover for Internal TCP/UDP Load Balancing for configuration steps and an example internal TCP/UDP load balancer failover configuration.
- See Internal TCP/UDP Load Balancing Logging and Monitoring for information on configuring Stackdriver logging and monitoring for Internal TCP/UDP Load Balancing.
- See Internal TCP/UDP Load Balancing and Connected Networks for information about accessing internal TCP/UDP load balancers from peer networks connected to your VPC network.
- See Troubleshooting Internal TCP/UDP Load Balancing for information on how to troubleshoot issues with your internal TCP/UDP load balancer.