Setting Up Network Load Balancing

Network load balancing lets you balance the load on your systems based on incoming IP protocol data, such as address, port, and protocol type.

Network load balancing uses forwarding rules that point to target pools, which list the instances available for load balancing and define the type of health check that should be performed on these instances. See the Network load balancing example for more information.

Network load balancing is a regional, non-proxied load balancer. You can use it to load balance UDP traffic, and TCP and SSL traffic on ports that are not supported by the SSL proxy and TCP proxy load balancers.

A network load balancer is a pass-through load balancer. It does not proxy connections from clients.

For other types of load balancers, see the main load balancing page.

Load distribution algorithm

By default, the Session Affinity setting is NONE, and Google Compute Engine distributes traffic by picking an instance based on a hash of the source IP and port, the destination IP and port, and the protocol. This means that incoming TCP connections are spread across instances, and each new connection may go to a different instance. All packets for a connection are directed to the same instance until the connection is closed.

Regardless of the session affinity setting, all packets for a connection are directed to the chosen instance until the connection is closed, and established connections have no impact on load balancing decisions for new incoming connections. This can result in imbalance between backends if long-lived TCP connections are in use.
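As an illustration only, the default 5-tuple selection behaves like hashing the connection tuple and taking it modulo the number of backends. The sketch below is not Compute Engine's actual algorithm; the addresses, ports, backend count, and the use of `cksum` as the hash are made-up assumptions for demonstration:

```shell
#!/bin/sh
# Illustrative sketch: pick a backend index from a 5-tuple hash,
# the way NONE (5-tuple) affinity spreads new connections.
# All values below are example data, not real traffic.
SRC_IP=203.0.113.5;   SRC_PORT=33421
DST_IP=198.51.100.10; DST_PORT=80
PROTO=TCP
BACKENDS=3   # number of instances in the target pool

TUPLE="$SRC_IP:$SRC_PORT:$DST_IP:$DST_PORT:$PROTO"
# cksum gives a deterministic CRC; field 1 is the checksum value.
HASH=$(printf '%s' "$TUPLE" | cksum | cut -d' ' -f1)
echo "backend index: $((HASH % BACKENDS))"
```

Because the source port participates in the hash, two connections from the same client usually map to different backends; this is why long-lived connections can still pile up unevenly once established.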

You can choose a different Session Affinity setting if you need multiple connections from a client to go to the same instance. See sessionAffinity in the Target Pools documentation for more information.

Target pools

A Target Pool resource defines a group of instances that should receive incoming traffic from forwarding rules. When a forwarding rule directs traffic to a target pool, Google Compute Engine picks an instance from that target pool based on a hash of the source IP and port and the destination IP and port. See Load distribution algorithm for more information about how traffic is distributed to instances.

Target pools can only be used with forwarding rules that handle TCP and UDP traffic. For all other protocols, you must create a target instance. You must create a target pool before you can use it with a forwarding rule. Each project can have up to 50 target pools.
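For example, a target pool can be created and populated with the gcloud CLI. This is a sketch; the pool, region, zone, and instance names are hypothetical, and flag spellings may vary by gcloud version:

```shell
# Create a target pool in one region (names are examples).
gcloud compute target-pools create www-pool \
    --region us-central1

# Add existing instances to the pool.
gcloud compute target-pools add-instances www-pool \
    --instances www-1,www-2 \
    --instances-zone us-central1-b
```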

If you intend for your target pool to contain a single virtual machine instance, you should consider using the Protocol Forwarding feature instead.

Network load balancing supports Compute Engine Autoscaler, which allows users to perform autoscaling on the instance groups in a target pool based on CPU utilization. For more information, see Scaling based on CPU utilization.

Learn more about target pools and how to configure them.

Forwarding rules

Forwarding rules work in conjunction with target pools and target instances to support load balancing and protocol forwarding features. To use load balancing and protocol forwarding, you must create a forwarding rule that directs traffic to specific target pools (for load balancing) or target instances (for protocol forwarding). It is not possible to use either of these features without a forwarding rule.

Forwarding Rule resources live in the Forwarding Rules collection. Each forwarding rule matches a particular IP address, protocol, and optionally, port range to a single target pool or target instance. When traffic is sent to an external IP address that is served by a forwarding rule, the forwarding rule directs that traffic to the corresponding target pool or target instances. You can create up to 50 forwarding rule objects per project.
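As a sketch of how a forwarding rule ties an address to a target pool, the following gcloud commands use hypothetical names, and assume a target pool called www-pool already exists in the region:

```shell
# Direct TCP port 80 traffic arriving on the rule's external IP
# address to the target pool (rule and pool names are examples).
gcloud compute forwarding-rules create www-rule \
    --region us-central1 \
    --ports 80 \
    --target-pool www-pool
```

If you omit an explicit address, an ephemeral external IP address is assigned to the rule.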

Learn more about forwarding rules and how to configure them.

If you are load balancing UDP packets that are likely to be fragmented before arriving at your Google Cloud Platform (GCP) network, see Load balancing and fragmented UDP packets.

Health checking

Health checks ensure that Compute Engine forwards new connections only to instances that are up and ready to receive them. Compute Engine sends health check requests to each instance at the specified frequency; once an instance exceeds its allowed number of health check failures, it is no longer considered eligible to receive new traffic. It continues to receive packets for its existing connections until those connections close or the instance shuts down. This allows instances to shut down gracefully without abruptly breaking TCP connections.

The health check continues to query unhealthy instances, and returns an instance to the pool once the specified number of successful checks is met.

Network load balancing relies on legacy HTTP Health checks for determining instance health. Even if your service does not use HTTP, you'll need to at least run a basic web server on each instance that the health check system can query.
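A legacy HTTP health check might be created and attached to a target pool as follows. This is a hedged sketch with example names and illustrative threshold values:

```shell
# Create a legacy HTTP health check (values are illustrative).
gcloud compute http-health-checks create basic-check \
    --port 80 \
    --request-path / \
    --check-interval 5s \
    --timeout 5s \
    --healthy-threshold 2 \
    --unhealthy-threshold 3

# Attach the check to an existing target pool.
gcloud compute target-pools add-health-checks www-pool \
    --http-health-check basic-check \
    --region us-central1
```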

Firewall rules and Network load balancing

HTTP health check probes are sent from the IP ranges 209.85.152.0/22, 209.85.204.0/22, and 35.191.0.0/16. You must create firewall rules that allow traffic from those ranges to reach your instances on port 80.
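A firewall rule allowing the health check ranges could look like the following; the rule name and network are example assumptions:

```shell
# Allow health check probes from the documented ranges to port 80
# (rule and network names are examples).
gcloud compute firewall-rules create allow-health-checks \
    --network default \
    --source-ranges 209.85.152.0/22,209.85.204.0/22,35.191.0.0/16 \
    --allow tcp:80
```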

Network load balancing is a pass-through load balancer, which means that your firewall rules must allow traffic from the client source IP addresses. If your service is open to the Internet, then it is easiest to allow traffic from all IP ranges. If you want to restrict access so that only certain source IPs are allowed, you may set up firewall rules to enforce that restriction, but you must allow access from the health check IP ranges.

Session affinity

See the sessionAffinity parameter in Target Pools.

Load balancing and fragmented UDP packets

If you are load balancing UDP packets, be aware of the following:

  1. Unfragmented packets are handled normally in all configurations.
  2. UDP packets may become fragmented before reaching GCP. Intervening networks may wait for all fragments to arrive before forwarding them, causing delay, or may drop fragments. GCP does not wait for all fragments; it forwards each fragment as soon as it arrives.
  3. Because UDP fragments after the first one do not contain the destination port, problems arise in two situations:

    • if Target Pools session affinity is set to NONE (5-tuple affinity), subsequent fragments may be dropped because the load balancer cannot calculate the 5-tuple hash.
    • if there is more than one UDP forwarding rule for the same load balanced IP address, subsequent fragments may arrive at the wrong forwarding rule.

If you expect fragmented UDP packets, do the following:

  • set session affinity to CLIENT_IP_PROTO or CLIENT_IP. Do not use NONE (5-tuple hashing). CLIENT_IP_PROTO and CLIENT_IP do not use the destination port for hashing, so they can calculate the same hash for subsequent fragments as for the first fragment.
  • use only one UDP forwarding rule per load balanced IP address. This ensures that all fragments arrive at the same forwarding rule.

With these settings, UDP fragments from the same packet are forwarded to the same instance for reassembly.
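A fragmentation-safe UDP setup along these lines might be sketched as follows, with hypothetical names and an example port:

```shell
# Create a UDP target pool with per-client affinity so subsequent
# fragments hash the same as the first (names are examples).
gcloud compute target-pools create udp-pool \
    --region us-central1 \
    --session-affinity CLIENT_IP

# Use only one UDP forwarding rule per load-balanced IP address.
gcloud compute forwarding-rules create udp-rule \
    --region us-central1 \
    --ip-protocol UDP \
    --ports 5000 \
    --target-pool udp-pool
```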

Get started

The network load balancing guide demonstrates how to quickly configure a load balancing solution and distribute traffic across a set of Apache instances. You can build on top of this scenario to work for other types of traffic or more complex configurations.
