Network Load Balancing is a regional, non-proxied load balancer.
Google Cloud Network Load Balancing distributes traffic among VM instances in the same region in a VPC network.
In a network load balancer, forwarding rules direct TCP and UDP traffic across regional backends.
The following diagram shows users in California, New York, and Singapore. They're all connecting into their backend resources, which are myapp, test, and travel. When a user in Singapore connects into the U.S. West backend, the traffic ingresses closest to Singapore, because the range is anycasted. From there, the traffic is routed to the regional backend.
Protocols, scheme, and scope
Each network load balancer supports either TCP or UDP traffic (not both).
A network load balancer uses a target pool to contain the backend instances among which traffic is load balanced.
A network load balancer balances traffic originating from the internet. You cannot use it to load balance traffic that originates within GCP between your instances.
The scope of a network load balancer is regional, not global. This means that a network load balancer cannot span multiple regions. Within a single region, the load balancer services all zones. See Regions and Zones.
Use Network Load Balancing in the following circumstances:
- You need to load balance UDP traffic, or you need to load balance a TCP port that isn't supported by other load balancers.
- It is acceptable to have SSL traffic decrypted by your backends instead of by the load balancer. The network load balancer cannot perform this task. When the backends decrypt SSL traffic, there is a greater CPU burden on the VMs.
- Self-managing the load balancer's SSL certificates is acceptable to you. Google-managed SSL certificates are only available for HTTPS and SSL Proxy Load Balancing.
- You need to forward the original packets unproxied.
- You have an existing setup that uses a pass-through load balancer, and you want to migrate it without changes.
About Network Load Balancing
Network Load Balancing has the following characteristics:
- Network Load Balancing is a managed service.
- Network Load Balancing is implemented using Andromeda virtual networking and Google Maglev.
- Network load balancers are not proxies.
- Responses from the backend VMs go directly to the clients, not back through the load balancer. The industry term for this is direct server return.
- The load balancer preserves the source IP addresses of packets.
- The destination IP address for packets is the regional external IP address associated with the load balancer's forwarding rule.
Instances that participate as backend VMs for network load balancers must be running the appropriate Linux Guest Environment, Windows Guest Environment, or other processes that provide equivalent functionality.
The guest OS environment (or an equivalent process) is responsible for configuring local routes on each backend VM. These routes allow the VM to accept packets that have a destination that matches the IP address of the load balancer's forwarding rule.
On the backend instances that accept load balanced traffic, you must configure the software to bind to the IP address associated with the load balancer's forwarding rule (or to any IP address).
Network load balancers balance the load on your systems based on incoming IP protocol data, such as address, port, and protocol type.
The network load balancer is a pass-through load balancer, so your backends receive the original client request. The network load balancer doesn't do any Transport Layer Security (TLS) offloading or proxying. Traffic is directly routed to your VMs.
When you create a forwarding rule for the load balancer, you receive an ephemeral virtual IP address (VIP) or reserve a VIP that originates from a regional network block.
You then associate that forwarding rule with your backends. The VIP is anycasted from Google's global points of presence, but the backends for a network load balancer are regional. The load balancer cannot have backends that span multiple regions.
You can use GCP firewalls to control or filter access to the backend VMs.
The network load balancer examines the source and destination ports, IP address, and protocol to determine how to forward packets. For TCP traffic, you can modify the forwarding behavior of the load balancer by configuring session affinity.
Load distribution algorithm
By default, the session affinity value is NONE. With this setting, Cloud Load Balancing picks an instance based on a hash of the source IP and port, destination IP and port, and protocol. This means that incoming TCP connections are spread across instances, and each new connection may go to a different instance. All packets for a connection are directed to the same instance until the connection is closed. Established connections are not taken into account in the load balancing process.
Regardless of the session affinity setting, all packets for a connection are directed to the chosen instance until the connection is closed. An existing connection has no impact on load balancing decisions for new incoming connections. This can result in an imbalance among backends if long-lived TCP connections are in use.
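The default behavior described above can be illustrated with a minimal sketch. It assumes a plain SHA-256 hash and a modulo step as stand-ins; the actual hashing used by Cloud Load Balancing is not described in this document, and the function name, instance names, and addresses are hypothetical.

```python
import hashlib

def pick_backend(src_ip, src_port, dst_ip, dst_port, protocol, backends):
    """Pick a backend from a hash of the 5-tuple (illustrative stand-in only)."""
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{protocol}".encode()
    digest = hashlib.sha256(key).digest()
    # Reduce the first 8 bytes of the digest to an index into the backend list.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

backends = ["vm-a", "vm-b", "vm-c"]  # hypothetical instance names

# Packets of one connection share a 5-tuple, so they reach the same instance.
first = pick_backend("203.0.113.7", 40001, "198.51.100.10", 80, "TCP", backends)
again = pick_backend("203.0.113.7", 40001, "198.51.100.10", 80, "TCP", backends)
print(first == again)  # True

# A new connection from the same client uses a new source port, so it is
# hashed independently and may or may not land on the same instance.
other = pick_backend("203.0.113.7", 40002, "198.51.100.10", 80, "TCP", backends)
print(other in backends)  # True
```

Because the choice depends only on the packet header fields, the number of connections already open on an instance never enters the calculation, which is why long-lived connections can leave backends unevenly loaded.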
You can choose a different session affinity setting if you need multiple connections from a client to go to the same instance. See the Target Pools documentation for more information.
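To show why a different session affinity setting keeps a client on one instance, here is a small sketch that changes which header fields feed the hash. The hash function is a stand-in, and the field set used for CLIENT_IP (source and destination IP only) is an assumption of this sketch; all names and addresses are hypothetical.

```python
import hashlib

# Header fields that feed the hash for each session affinity value.
# NONE is the 5-tuple; the CLIENT_IP field set is an assumption here.
AFFINITY_FIELDS = {
    "NONE": ("src_ip", "src_port", "dst_ip", "dst_port", "protocol"),
    "CLIENT_IP": ("src_ip", "dst_ip"),
}

def pick_with_affinity(conn, affinity, backends):
    """Hash only the fields selected by the affinity setting."""
    fields = AFFINITY_FIELDS[affinity]
    key = "|".join(str(conn[name]) for name in fields).encode()
    digest = hashlib.sha256(key).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]

backends = ["vm-a", "vm-b", "vm-c"]  # hypothetical instance names
conn1 = {"src_ip": "203.0.113.7", "src_port": 40001,
         "dst_ip": "198.51.100.10", "dst_port": 80, "protocol": "TCP"}
conn2 = dict(conn1, src_port=40002)  # second connection from the same client

# With CLIENT_IP the source port is ignored, so both connections match.
same = (pick_with_affinity(conn1, "CLIENT_IP", backends)
        == pick_with_affinity(conn2, "CLIENT_IP", backends))
print(same)  # True
```

With NONE, the two connections above are hashed separately because their source ports differ, so they are not guaranteed to share an instance.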
A Target Pool resource defines a group of instances that should receive incoming traffic from forwarding rules. When a forwarding rule directs traffic to a target pool, Cloud Load Balancing picks an instance from these target pools based on a hash of the source IP and port and the destination IP and port. See the section on the Load distribution algorithm for more information about how traffic is distributed to instances.
Target pools can only be used with forwarding rules that handle TCP and UDP traffic. For all other protocols, you must create a target instance. You must create a target pool before you can use it with a forwarding rule. Each project can have up to 50 target pools.
If you intend for your target pool to contain a single virtual machine instance, you should consider using the Protocol Forwarding feature instead.
Network Load Balancing supports Cloud Load Balancing Autoscaler, which allows users to perform autoscaling on the instance groups in a target pool based on CPU utilization. For more information, see Scaling based on CPU utilization.
Learn more about target pools and how to configure them.
Forwarding rules work in conjunction with target pools and target instances to support load balancing and protocol forwarding features. To use load balancing and protocol forwarding, you must create a forwarding rule that directs traffic to specific target pools (for load balancing) or target instances (for protocol forwarding). It is not possible to use either of these features without a forwarding rule.
Forwarding Rule resources live in the Forwarding Rules collection. Each forwarding rule matches a particular IP address, protocol, and optionally, port range to a single target pool or target instance. When traffic is sent to an external IP address that is served by a forwarding rule, the forwarding rule directs that traffic to the corresponding target pool or target instances. You can create up to 50 forwarding rule objects per project.
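The matching described above can be sketched as a lookup over a list of rules. This is a simplified model rather than the Compute Engine API: the `ForwardingRule` class, its field names, and the pool names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass(frozen=True)
class ForwardingRule:
    ip_address: str                        # external IP served by the rule
    protocol: str                          # "TCP" or "UDP"
    port_range: Optional[Tuple[int, int]]  # inclusive range; None means any port
    target: str                            # a single target pool or target instance

def match_rule(rules: List[ForwardingRule], dst_ip: str,
               protocol: str, dst_port: int) -> Optional[str]:
    """Return the target of the first rule matching the packet, if any."""
    for rule in rules:
        if rule.ip_address != dst_ip or rule.protocol != protocol:
            continue
        if rule.port_range is None:
            return rule.target
        low, high = rule.port_range
        if low <= dst_port <= high:
            return rule.target
    return None

rules = [
    ForwardingRule("198.51.100.10", "TCP", (80, 80), "web-pool"),
    ForwardingRule("198.51.100.10", "UDP", (5000, 5100), "media-pool"),
]
print(match_rule(rules, "198.51.100.10", "TCP", 80))   # web-pool
print(match_rule(rules, "198.51.100.10", "TCP", 443))  # None
```

Traffic that matches no rule has no target in this model, which mirrors the point that load balancing and protocol forwarding cannot be used without a forwarding rule.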
Learn more about forwarding rules and how to configure them.
If you are load balancing UDP packets that are likely to be fragmented before arriving at your Google Cloud Platform (GCP) Virtual Private Cloud (VPC) network, see Load balancing and fragmented UDP packets.
Multiple forwarding rules
You can configure multiple regional external forwarding rules for the same network TCP/UDP load balancer. Each forwarding rule can have a unique regional external IP address, or multiple forwarding rules can reference the same regional external IP address. Multiple forwarding rules can reference the same target pool.
Configuring multiple regional external forwarding rules can be useful for these use cases:
- You need to configure more than one external IP address for the same target pool.
- You need to configure different port ranges or different protocols, using the same external IP address, for the same target pool.
When using multiple forwarding rules, make sure that you configure the software running on your backend VMs so that it binds to all necessary IP addresses. This is required because the destination IP address of each packet delivered through the load balancer is the regional external IP address associated with the respective regional external forwarding rule.
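One simple way to meet the binding requirement is to listen on the wildcard address, so the server accepts packets addressed to any forwarding rule IP that the guest environment has routed to the VM. The following is a minimal sketch using Python's standard `socket` module; the function name `open_listener` and the port number in the demo are illustrative choices, not part of any Google Cloud tooling.

```python
import socket

def open_listener(bind_ip: str = "0.0.0.0", port: int = 0) -> socket.socket:
    """Open a TCP listener; 0.0.0.0 accepts traffic for any local IPv4 address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((bind_ip, port))  # port 0 lets the OS choose a free port
    sock.listen()
    return sock

if __name__ == "__main__":
    # Illustrative port; a real service would use its own well-known port.
    listener = open_listener("0.0.0.0", 8080)
    print("listening on", listener.getsockname())
    listener.close()
```

If you prefer to bind to specific addresses instead, open one listener per forwarding rule IP address by passing that address as `bind_ip`.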
Health checks ensure that Compute Engine forwards new connections only to instances that are up and ready to receive them. Compute Engine sends health check requests to each instance at the specified frequency. Once an instance exceeds its allowed number of health check failures, it is no longer considered an eligible instance for receiving new traffic. Existing connections are not actively terminated, which allows instances to shut down gracefully and close TCP connections.
The health checker continues to query unhealthy instances, and returns an instance to the pool when the specified number of successful checks occur. If all instances are marked as UNHEALTHY, the load balancer directs new traffic to all existing instances.
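The health state transitions described above can be modeled with a small sketch. It assumes consecutive-probe thresholds of 2 for both directions; those values, the class, and the function names are hypothetical, and the real thresholds are whatever you configure on the health check.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class InstanceHealth:
    name: str
    healthy: bool = True
    failures: int = 0   # consecutive failed probes
    successes: int = 0  # consecutive successful probes

def record_probe(inst: InstanceHealth, ok: bool,
                 unhealthy_threshold: int = 2,
                 healthy_threshold: int = 2) -> None:
    """Update one instance's state after a single health check probe."""
    if ok:
        inst.failures = 0
        inst.successes += 1
        if not inst.healthy and inst.successes >= healthy_threshold:
            inst.healthy = True   # returned to the pool
    else:
        inst.successes = 0
        inst.failures += 1
        if inst.healthy and inst.failures >= unhealthy_threshold:
            inst.healthy = False  # no longer eligible for new connections

def eligible(instances: List[InstanceHealth]) -> List[str]:
    """Instances that may receive new connections."""
    healthy = [i.name for i in instances if i.healthy]
    # If every instance is unhealthy, new traffic goes to all instances.
    return healthy if healthy else [i.name for i in instances]

a, b = InstanceHealth("vm-a"), InstanceHealth("vm-b")
record_probe(a, ok=False)
record_probe(a, ok=False)
print(eligible([a, b]))  # ['vm-b']
```

The model only decides where new connections go; it does not terminate existing connections, which matches the graceful shutdown behavior described earlier.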
Network Load Balancing relies on legacy HTTP health checks to determine instance health. Even if your service does not use HTTP, you must run a basic web server on each instance that the health check system can query.
Google Cloud uses special routes not defined in your VPC network for health checks. For complete information on this, read Load balancer return paths.
Firewall rules and Network Load Balancing
Health checks for Network load balancers are sent from the following IP ranges. You'll need to create ingress allow firewall rules that permit traffic from those ranges. For an example firewall rule, refer to rules for Network Load Balancing.
Network Load Balancing is a pass-through load balancer, which means that your firewall rules must allow traffic from the client source IP addresses. If your service is open to the Internet, then it is easiest to allow traffic from all IP ranges. If you want to restrict access so that only certain source IP addresses are allowed, you can set up firewall rules to enforce that restriction, but you must allow access from the health check IP ranges.
For a configuration example, refer to Rules for Network Load Balancing.
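The source-range reasoning above can be checked with a short sketch built on Python's standard `ipaddress` module. The CIDR ranges below are placeholders from reserved documentation address space, not the real health check ranges; substitute the ranges published for your load balancer. The function and variable names are hypothetical.

```python
import ipaddress
from typing import Iterable

# Placeholder ranges (documentation address space), NOT the real ones.
HEALTH_CHECK_RANGES = ["192.0.2.0/24"]
ALLOWED_CLIENT_RANGES = ["203.0.113.0/24"]

def is_allowed(source_ip: str, allowed_ranges: Iterable[str]) -> bool:
    """Return True if source_ip falls inside any allowed CIDR range."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_ranges)

# A restricted setup must still include the health check ranges.
ingress_ranges = ALLOWED_CLIENT_RANGES + HEALTH_CHECK_RANGES

print(is_allowed("203.0.113.25", ingress_ranges))  # True: permitted client
print(is_allowed("192.0.2.10", ingress_ranges))    # True: health check probe
print(is_allowed("198.51.100.9", ingress_ranges))  # False: not in any range
```

Because the load balancer passes packets through unchanged, the source address evaluated here is the client's own address, not an address belonging to the load balancer.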
Network Load Balancing doesn't use backend services session affinity. Instead, network load balancers use target pools for session affinity, which is set through the target pool's sessionAffinity parameter.
Load balancing and fragmented UDP packets
If you are load balancing UDP packets, be aware of the following:
- Unfragmented packets are handled normally in all configurations.
- UDP packets may become fragmented before reaching GCP. Intervening networks may wait for all fragments to arrive before forwarding them, causing delay, or may drop fragments. GCP does not wait for all fragments; it forwards each fragment as soon as it arrives.
Since subsequent UDP fragments do not contain the destination port, problems can occur in these situations:
- If Target Pools session affinity is set to NONE (5-tuple affinity), the subsequent fragments may be dropped because the load balancer cannot calculate the 5-tuple hash.
- If there is more than one UDP forwarding rule for the same load balanced IP address, subsequent fragments may arrive at the wrong forwarding rule.
If you expect fragmented UDP packets, do the following:
- Set session affinity to CLIENT_IP. Do not use NONE (5-tuple hashing). Because CLIENT_IP does not use the destination port for hashing, the load balancer can calculate the same hash for subsequent fragments as for the first fragment.
- Use only one UDP forwarding rule per load balanced IP address. This ensures that all fragments arrive at the same forwarding rule.
With these settings, UDP fragments from the same packet are forwarded to the same instance for reassembly.
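The effect of the affinity setting on fragmented datagrams can be sketched as follows. This is a simplified model: fragments after the first are represented with ports set to `None`, the hash is a SHA-256 stand-in, the CLIENT_IP field set (source and destination IP) is an assumption, and all names and addresses are hypothetical.

```python
import hashlib
from typing import Optional, Sequence

def _hash_pick(parts: Sequence[object], backends: Sequence[str]) -> str:
    key = "|".join(map(str, parts)).encode()
    digest = hashlib.sha256(key).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]

def route_fragment(frag: dict, affinity: str,
                   backends: Sequence[str]) -> Optional[str]:
    """Return the chosen backend, or None if the hash cannot be computed."""
    if affinity == "NONE":
        if frag["src_port"] is None or frag["dst_port"] is None:
            return None  # later fragment: no ports, so no 5-tuple hash
        return _hash_pick((frag["src_ip"], frag["src_port"],
                           frag["dst_ip"], frag["dst_port"], "UDP"), backends)
    if affinity == "CLIENT_IP":
        return _hash_pick((frag["src_ip"], frag["dst_ip"]), backends)
    raise ValueError(f"unsupported affinity in this sketch: {affinity}")

backends = ["vm-a", "vm-b", "vm-c"]  # hypothetical instance names
first = {"src_ip": "203.0.113.7", "src_port": 5353,
         "dst_ip": "198.51.100.10", "dst_port": 5000}
later = dict(first, src_port=None, dst_port=None)  # ports absent

print(route_fragment(later, "NONE", backends))  # None
print(route_fragment(first, "CLIENT_IP", backends)
      == route_fragment(later, "CLIENT_IP", backends))  # True
```

With CLIENT_IP, every fragment of the datagram produces the same hash input, so all fragments reach one instance, where they can be reassembled.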
See Setting Up Network Load Balancing for information on how to configure a network load balancer and distribute traffic across a set of Apache instances.