Internal TCP/UDP Load Balancing is a regional load balancer that enables you to run and scale your services behind a private load balancing IP address that is accessible only to your internal virtual machine instances.
Google Cloud Platform (GCP) Internal TCP/UDP Load Balancing distributes traffic among VM instances in the same region in a VPC network using a private, internal (RFC 1918) IP address.
As shown in the following high-level diagram, an Internal TCP/UDP Load Balancing service has a frontend (the forwarding rule) and a backend (the backend service and instance groups).
Protocols, scheme, and scope
Each internal TCP/UDP load balancer supports either TCP or UDP traffic (not both).
An internal TCP/UDP load balancer uses a single backend service with an internal load balancing scheme, meaning that it balances traffic within GCP across your instances. You cannot use it to balance traffic that originates from the internet.
The scope of an internal TCP/UDP load balancer is regional, not global. This means that an internal TCP/UDP load balancer cannot span multiple regions. Within a single region, the load balancer services all zones. See Regions and Zones.
You can access an internal TCP/UDP load balancer in your VPC network from a connected network, using:
- VPC Network Peering
- Cloud VPN and Cloud Interconnect
For detailed examples, see Internal TCP/UDP Load Balancing and Connected Networks.
Three-tier web service example
You can use Internal TCP/UDP Load Balancing in conjunction with other load balancers, such as HTTP(S) load balancers, where your web tier uses the external load balancer, which then relies on services behind the internal load balancer.
The following diagram depicts an example of a three-tier configuration using external HTTP(S) and internal TCP/UDP load balancers:
How Internal TCP/UDP Load Balancing works
Internal TCP/UDP Load Balancing has the following characteristics:
- The load balancer is a managed service.
- The load balancer is not a proxy.
- It's implemented in virtual networking.
- There's no intermediate device or single point of failure.
- Responses from the backend VMs go directly to the clients, not back through the load balancer.
Unlike a device-based or instance-based load balancer, Internal TCP/UDP Load Balancing doesn't terminate connections from clients. Instead of traffic being sent to a load balancer and then on to backends, traffic is sent to the backends directly. The GCP Linux or Windows guest environment configures each backend VM with the IP address of the load balancer, and GCP virtual networking manages traffic delivery, scaling as appropriate.
An internal TCP/UDP load balancer with multiple backend instance groups distributes connections among backend VMs in all of those instance groups. Refer to traffic distribution for details about the distribution method and its configuration options.
You can use any type of instance group — unmanaged instance groups, managed zonal instance groups, or managed regional instance groups — but not Network Endpoint Groups (NEGs), as backends for your load balancer.
High availability describes how to design an internal load balancer that is not dependent on a single zone.
Instances that participate as backend VMs for internal TCP/UDP load balancers must be running the appropriate Linux Guest Environment, Windows Guest Environment, or other processes that provide equivalent functionality.
This diagram illustrates traffic distribution among VMs located in two separate instance groups. Traffic sent from the client instance to the IP address of the load balancer is distributed among backend VMs in either instance group. Responses sent from any of the serving backend VMs are delivered directly to the client VM.
Only client VMs in the region can access the internal TCP/UDP load balancer. Client VMs in the same VPC network must be located in the same region, but not necessarily in the same subnet, in order to send packets to an internal TCP/UDP load balancer. You can also communicate with an internal TCP/UDP load balancer from a client system in a different network — as long as that other network is properly connected to the VPC network where the load balancer is defined. See Internal TCP/UDP Load Balancing and Connected Networks.
The internal TCP/UDP load balancer is highly available by design. There are no special steps to make the load balancer highly available because the mechanism doesn't rely on a single device or a VM instance.
The following best practices describe how to deploy backend VM instances so that you are not reliant on a single zone:
Use regional managed instance groups if you can deploy your software using instance templates. Regional managed instance groups automatically distribute traffic among multiple zones, providing the best option to avoid potential issues in any given zone.
If you use zonal managed instance groups or unmanaged instance groups, use multiple instance groups in different zones (in the same region) for the same backend service. Using multiple zones protects against potential issues in any given zone.
An internal TCP/UDP load balancer consists of the following GCP components.
|Internal IP address||This is the address for the load balancer.||The internal IP address must be in the same subnet as the internal forwarding rule. The subnet must be in the same region as the backend service.|
|Internal forwarding rule||An internal forwarding rule in combination with the internal IP address is the frontend of the load balancer. It defines the protocol and port(s) that the load balancer accepts, and it directs traffic to a regional internal backend service.||Forwarding rules for internal TCP/UDP load balancers must:
• Have a
• Use an
• Reference a
|Regional internal backend service||The regional internal backend service defines the protocol used to communicate with the backends (instance groups), and it specifies a health check. Backends can be unmanaged instance groups, managed zonal instance groups, or managed regional instance groups.||The backend service must:
• Have a
• Use a
• Have an associated health check
• Reference backends in the same region. Backend instance groups can be in any subnet of the region. The backend service itself is not tied to any specific subnet.
|Health check||Every backend service must have an associated health check. The health check defines the parameters under which GCP considers the backends it manages to be eligible to receive traffic. Only healthy VMs in the backend instance groups will receive traffic sent from client VMs to the IP address of the load balancer.||Even though the forwarding rule and backend service can use either
Internal IP address
Internal TCP/UDP Load Balancing uses an internal (private, RFC 1918) IPv4 address from the primary IP range of the subnet you select when you create the internal forwarding rule. The IP address can't be from a secondary IP range of the subnet.
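The address rule above can be checked before you request a specific IP address. The following is a minimal sketch, not part of any GCP library: the function name `can_use_as_ilb_address` and the example subnet range `10.5.6.0/24` are assumptions made for illustration. It only tests that an address is an RFC 1918 IPv4 address inside the subnet's primary range.

```python
import ipaddress

# The three RFC 1918 private IPv4 blocks.
RFC1918_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def can_use_as_ilb_address(ip: str, primary_range: str) -> bool:
    """Return True if ip is an RFC 1918 IPv4 address in the primary range.

    primary_range is the subnet's primary IP range in CIDR notation
    (a hypothetical value supplied by the caller). Addresses that fall
    only in a secondary range are rejected because they are not in
    the primary range.
    """
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        return False
    in_rfc1918 = any(addr in net for net in RFC1918_NETWORKS)
    return in_rfc1918 and addr in ipaddress.ip_network(primary_range)
```

For example, `can_use_as_ilb_address("10.5.6.99", "10.5.6.0/24")` is true, while an address from a different range, such as `10.8.0.5`, is rejected.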
Internal TCP/UDP Load Balancing requires at least one internal forwarding rule in a subnet in the same region as its backend service and instance groups (collectively, its backend components). An internal forwarding rule must be in the same region and use the same protocol as the load balancer's backend service.
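The region and protocol constraints can be expressed as a small consistency check. This is an illustrative sketch only: the `BackendService` and `ForwardingRule` classes and the `forwarding_rule_problems` function are hypothetical names, not GCP API objects, and they model just the two constraints stated above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BackendService:
    name: str
    region: str
    protocol: str  # "TCP" or "UDP"


@dataclass(frozen=True)
class ForwardingRule:
    name: str
    region: str
    protocol: str  # "TCP" or "UDP"
    backend_service: BackendService


def forwarding_rule_problems(rule: ForwardingRule) -> List[str]:
    """List the ways a rule violates the region and protocol constraints."""
    problems = []
    if rule.protocol not in ("TCP", "UDP"):
        problems.append(f"{rule.name}: protocol must be TCP or UDP")
    if rule.region != rule.backend_service.region:
        problems.append(f"{rule.name}: region differs from backend service")
    if rule.protocol != rule.backend_service.protocol:
        problems.append(f"{rule.name}: protocol differs from backend service")
    return problems
```

An empty list means the rule is consistent with its backend service under this simplified model.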
The forwarding rule is where you define the ports on which the load balancer accepts traffic. Internal TCP/UDP load balancers are not proxies; they pass traffic to backends on the same port on which the traffic is received. You must specify at least one port number per forwarding rule.
In addition to ports, you must reference a specific subnet in your VPC network when you create an internal forwarding rule. The subnet you specify for the forwarding rule does not have to be the same as any of the subnets used by backend VMs; however, the forwarding rule, subnet, and backend service must all be in the same region. When you create an internal forwarding rule, GCP chooses an available internal IP address from the primary IP address range of the subnet you select. Alternatively, you can specify an internal IP address in the subnet's primary IP range.
Forwarding rules and port specifications
When you create an internal forwarding rule, you must choose one of the following port specifications:
- Specify at least one and up to five ports, by number
- Specify ALL to forward traffic on all ports
Create an internal forwarding rule that supports all ports in order to forward all traffic for a given protocol (such as TCP) to a single internal load balancer. This allows backend VMs to run multiple applications, one per port. Traffic sent to a given port is delivered to the corresponding application, and all applications use the same IP address.
If you are concerned about opening up backend applications to all ports, you can deploy firewall rules in the backend VMs to set the scope of received traffic to expected ports.
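The two port specifications (one to five port numbers, or ALL) can be validated as follows. This is a sketch under stated assumptions: `normalize_port_spec` is a hypothetical helper, and representing "all ports" as the string `"ALL"` is a choice made for this example.

```python
def normalize_port_spec(spec):
    """Return "ALL" or a sorted tuple of one to five distinct port numbers.

    Raises ValueError for anything else. The "ALL" string and the
    tuple return type are conventions of this sketch only.
    """
    if spec == "ALL":
        return "ALL"
    if isinstance(spec, str):
        raise ValueError('ports must be a list of numbers or "ALL"')
    ports = set(spec)
    for port in ports:
        if not isinstance(port, int) or not 1 <= port <= 65535:
            raise ValueError(f"invalid port: {port!r}")
    if not 1 <= len(ports) <= 5:
        raise ValueError("specify at least one and up to five ports")
    return tuple(sorted(ports))
```

For example, `normalize_port_spec([443, 80])` returns `(80, 443)`, and a list of six ports raises `ValueError`.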
You cannot modify forwarding rules after you create them. If you need to change the specified ports or the internal IP address for an internal forwarding rule, you must delete and replace the forwarding rule.
Multiple forwarding rules
You can configure multiple internal forwarding rules for the same internal load balancer. Each forwarding rule must have its own, unique IP address, and can only reference a single backend service. Multiple internal forwarding rules can reference the same backend service.
Configuring multiple internal forwarding rules can be useful if you need more than one IP address for the same internal TCP/UDP load balancer or if you need to associate certain ports with different IP addresses. When using multiple internal forwarding rules, make sure that you configure the software running on your backend VMs so that it binds to all necessary IP addresses, because the destination IP address of a packet delivered through the load balancer is the internal IP address associated with the respective internal forwarding rule.
On the Setting Up Internal TCP/UDP Load Balancing page, refer to the procedure for creating a secondary forwarding rule. In that example, two forwarding rules with different IP addresses, one of which is 10.5.6.99, are configured for the same load balancer. The backend VMs must be configured to receive packets on either of these IP addresses. One way to do this is to configure the backends to bind to any IP address.
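One way for backend software to accept packets addressed to every forwarding rule IP is to listen on the IPv4 wildcard address, 0.0.0.0. The following minimal sketch uses Python's standard `socket` module; the function name `open_listener` is an assumption for illustration, and a real service would add its own accept loop and error handling.

```python
import socket


def open_listener(port: int = 0) -> socket.socket:
    """Open a TCP listener bound to the wildcard address.

    Binding to 0.0.0.0 accepts connections addressed to any IPv4
    address configured on the VM, which covers each forwarding rule
    address. Port 0 asks the OS for a free port (handy for testing).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("0.0.0.0", port))
    sock.listen()
    return sock
```

Binding to one specific address instead would limit the service to traffic for that forwarding rule only.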
Each internal TCP/UDP load balancer uses one regional internal backend service. The name of the backend service is the name of the internal TCP/UDP load balancer shown in the GCP Console.
The backend service accepts traffic directed to it by one or more internal forwarding rules. A regional internal backend service accepts either TCP or UDP traffic, but not both, and it delivers traffic to backend VMs on the same ports to which traffic was sent, based on the forwarding rule.
Backend services must have at least one backend instance group and an associated health check. Backends can be unmanaged instance groups, zonal managed instance groups, or regional managed instance groups in the same region as the backend service (and forwarding rule). The backend service distributes traffic to the backend VMs and manages session affinity, if configured.
The load balancer's backend service must be associated with a health check. You can use an existing health check or define a new one. Internal TCP/UDP load balancers only send traffic to backend VMs that pass their health checks; however, if all the backend VMs fail their health checks, GCP distributes traffic among all of them.
You can use any of the following health check protocols. The protocol of the health check does not have to match the protocol of the load balancer itself.
- HTTP, HTTPS, or HTTP/2: If your backend VMs serve traffic using HTTP, HTTPS, or HTTP/2, it's best to use a health check that matches that protocol, because HTTP-based health checks offer options appropriate to that protocol. Note that serving HTTP type traffic through an internal TCP/UDP load balancer means that the load balancer's protocol is TCP.
- SSL or TCP: If your backend VMs do not serve HTTP-type traffic, you should use either an SSL or TCP health check.
Regardless of the type of health check you create, GCP sends health check probes to the IP address of the internal TCP/UDP load balancer, simulating how load balanced traffic would be delivered. Software running on your backend VMs must respond to both load balanced traffic and health check probes sent to the IP address of the load balancer itself. For more information, see Destination for health check packets.
Health checks and UDP traffic
GCP does not offer a health check that uses the UDP protocol. When you use Internal TCP/UDP Load Balancing with UDP traffic, you must run a TCP-based service on your backend VMs to provide health check information.
In this configuration, client requests are load balanced using the UDP protocol, and a TCP service is used to provide information to GCP health check probers. For example, you can run a simple HTTP server on each backend VM that returns an HTTP 200 response to GCP. In this example, you should use your own logic running on the backend VM to ensure that the HTTP server returns 200 only if the UDP service is properly configured and running.
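The following is a minimal sketch of such an HTTP health endpoint, using only the Python standard library. Everything specific here is an assumption for illustration: the `udp_service_ok` probe (which sends `b"ping"` and expects any reply), the default ports, and the choice of 503 as the unhealthy status. Replace the probe with logic that matches your own UDP service.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def udp_service_ok(host="127.0.0.1", port=5000, timeout=1.0):
    """Hypothetical probe: send a datagram and wait for any reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        probe.settimeout(timeout)
        try:
            probe.sendto(b"ping", (host, port))
            probe.recvfrom(1024)
            return True
        except OSError:  # timeout or port unreachable
            return False


def status_for(check):
    """Map the result of a health callable to an HTTP status code."""
    return 200 if check() else 503


def make_handler(check):
    """Build a request handler that reports the result of check()."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(status_for(check))
            self.send_header("Content-Length", "0")
            self.end_headers()

        def log_message(self, format, *args):
            pass  # keep health probes out of the log

    return HealthHandler


def serve_health(check, port=8080):
    """Serve the health endpoint on all addresses in a background thread."""
    server = HTTPServer(("0.0.0.0", port), make_handler(check))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `serve_health(udp_service_ok)` starts the endpoint. It listens on all addresses because health check probes are sent to the IP address of the load balancer.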
Under normal circumstances, internal TCP/UDP load balancers distribute traffic among healthy backend VMs. By default, the traffic distribution method uses a hash calculated from five pieces of information: the client's IP address, source port, the load balancer's internal forwarding rule IP address, destination port, and the protocol. You can modify the traffic distribution method for TCP traffic by specifying a session affinity option.
If all of your backend VMs fail their health checks, an internal TCP/UDP load balancer will distribute new connections among them all as if they were all healthy. However, as long as at least one backend VM is healthy, the load balancer will only distribute new connections to the healthy backend VMs.
The health check state only controls the distribution of new connections. An established TCP session can remain on an unhealthy backend VM, provided that the backend VM is still handling the connection.
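The rule for which backends receive new connections can be summarized in a few lines. This sketch models only the behavior described above; the function name `eligible_backends` and the dictionary input are assumptions for illustration.

```python
from typing import Dict, List


def eligible_backends(health: Dict[str, bool]) -> List[str]:
    """Return the backend VMs that can receive new connections.

    health maps a VM name to True (healthy) or False (unhealthy).
    If at least one VM is healthy, only healthy VMs are eligible.
    If none are healthy, every VM is eligible.
    """
    healthy = [name for name, ok in health.items() if ok]
    return healthy if healthy else list(health)
```

For example, `eligible_backends({"vm-a": True, "vm-b": False})` returns `["vm-a"]`, while `eligible_backends({"vm-a": False, "vm-b": False})` returns both VMs.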
Session affinity options
Session affinity controls the distribution of new connections from clients to the load balancer's backend VMs. Set session affinity when your backend VMs need to keep track of state information for their clients when sending TCP traffic. This is a common requirement for web applications.
Session affinity works on a best-effort basis for TCP traffic. Because the UDP protocol does not support sessions, session affinity has no effect on UDP traffic.
Internal TCP/UDP load balancers support the following session affinity options, which you specify for the entire internal backend service, not per backend instance group:
- None: This is the default setting, which is effectively the same as Client IP, protocol and port.
- Client IP: Directs a particular client's requests to the same backend VM based on a hash created from the client's IP address and the load balancer's IP address (the internal IP address of an internal forwarding rule).
- Client IP and protocol: Directs a particular client's requests to the same backend VM based on a hash created from three pieces of information: the client's IP address, the load balancer's (internal forwarding rule) IP address, and the load balancer's protocol.
- Client IP, protocol and port: Directs a particular client's requests to the same backend VM based on a hash created from these five pieces of information:
  - Source IP address of the client sending the request
  - Source port of the client sending the request
  - Load balancer's (internal forwarding rule) IP address
  - Destination port
  - Protocol (TCP or UDP)
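The difference between the options comes down to which fields feed the hash. The sketch below illustrates that idea only: the actual hash function GCP uses is not described here, so SHA-256 and the modulo step are stand-ins, and the names `AFFINITY_FIELDS` and `pick_backend` are hypothetical.

```python
import hashlib
from typing import Dict, List

FIVE_TUPLE = ("client_ip", "client_port", "lb_ip", "dest_port", "protocol")

# Fields hashed for each session affinity option listed above.
AFFINITY_FIELDS = {
    "None": FIVE_TUPLE,
    "Client IP": ("client_ip", "lb_ip"),
    "Client IP and protocol": ("client_ip", "lb_ip", "protocol"),
    "Client IP, protocol and port": FIVE_TUPLE,
}


def pick_backend(packet: Dict[str, object], backends: List[str],
                 affinity: str = "None") -> str:
    """Choose a backend from a hash of the fields the option uses."""
    key = "|".join(str(packet[field]) for field in AFFINITY_FIELDS[affinity])
    digest = hashlib.sha256(key.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]
```

With the Client IP option, two connections from the same client that differ only in source port produce the same key, so they map to the same backend. With the default setting, the source port is part of the key, so they may map to different backends.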
Session affinity and health check state
Changing health states of backend VMs can cause a loss of session affinity. For example, if a backend VM becomes unhealthy, and there is at least one other healthy backend VM, an internal TCP/UDP load balancer will not distribute new connections to the unhealthy VM. If a client had session affinity with that unhealthy VM, it will be directed to the other, healthy backend VM instead, losing its session affinity.
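The loss of affinity can be seen in a small simulation. This is a sketch, assuming a Client IP style hash over whichever backends are currently eligible; SHA-256 stands in for the real hash, and the `pick` function and VM names are hypothetical.

```python
import hashlib
from typing import Dict


def pick(client_ip: str, lb_ip: str, health: Dict[str, bool]) -> str:
    """Pick a backend for a new connection from the eligible VMs."""
    healthy = sorted(name for name, ok in health.items() if ok)
    # If no VM is healthy, all VMs are treated as eligible.
    candidates = healthy or sorted(health)
    digest = hashlib.sha256(f"{client_ip}|{lb_ip}".encode()).digest()
    return candidates[int.from_bytes(digest[:8], "big") % len(candidates)]


health = {"vm-a": True, "vm-b": True, "vm-c": True}
before = pick("10.1.2.3", "10.5.6.99", health)

# The VM the client was pinned to fails its health check.
health[before] = False
after = pick("10.1.2.3", "10.5.6.99", health)
```

Here `after` is always a different, healthy VM, so any state the client had built up on `before` is not available to its new connections.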
Testing connections from a single client
Because each of the session affinity options relies on at least the client's IP address, requests from the same client could be distributed to the same backend VM more frequently than you might expect. Practically, this means that you cannot monitor traffic distribution through an internal TCP/UDP load balancer by sending requests from a single client.
|Internal forwarding rules per network||50||This is a VPC network limit. Refer to VPC network quotas for more information.|
|Internal forwarding rules per peering group||75||This is a VPC network limit. Refer to VPC network quotas for more information.|
|Number of ports you can specify per internal forwarding rule||• 5 • if you specify a range • if you specify all ports||This limit cannot be changed.|
|Number of backend services per forwarding rule||1||This limit cannot be increased. Each forwarding rule can only send traffic to one backend service.|
|Number of forwarding rules per backend service||No separate limit||Multiple internal forwarding rules can reference the same internal backend service, subject to the maximum number of internal forwarding rules per network.|
|Maximum number of VM instances per internal load balancer||250||This limit cannot be increased. It represents the total number of VMs that a backend service can manage, regardless of how the VMs are allocated among backends (instance groups).|
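Because the 250 VM limit applies to the backend service as a whole, a planned configuration can be checked by summing the sizes of all its instance groups. This is a minimal sketch; the function name `within_vm_limit` and the dictionary input are assumptions for illustration.

```python
from typing import Dict

# Limit from the table above: VMs per internal load balancer.
MAX_VMS_PER_LOAD_BALANCER = 250


def within_vm_limit(instance_groups: Dict[str, int]) -> bool:
    """Check the total VM count across all backend instance groups.

    instance_groups maps an instance group name to its VM count.
    """
    return sum(instance_groups.values()) <= MAX_VMS_PER_LOAD_BALANCER
```

For example, two groups of 125 VMs each are within the limit, while groups of 200 and 100 VMs are not, even though each group on its own is under 250.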
See Setting Up Internal TCP/UDP Load Balancing for an example internal TCP/UDP load balancer configuration.
See Internal TCP/UDP Load Balancing Logging and Monitoring for information on configuring Stackdriver logging and monitoring for Internal TCP/UDP Load Balancing.
See Internal TCP/UDP Load Balancing and Connected Networks for information about accessing internal TCP/UDP load balancers from other networks connected to your VPC network.
Learn about load balancing and forwarding rules pricing.