TCP Proxy Load Balancing overview

TCP Proxy Load Balancing is a reverse proxy load balancer that distributes TCP traffic coming from the internet to virtual machine (VM) instances in your Google Cloud VPC network. When using TCP Proxy Load Balancing, traffic coming over a TCP connection is terminated at the load balancing layer, and then forwarded to the closest available backend using TCP or SSL.

TCP Proxy Load Balancing lets you use a single IP address for all users worldwide. The TCP proxy load balancer automatically routes traffic to the backends that are closest to the user.

With the Premium Tier, TCP Proxy Load Balancing can be configured as a global load balancing service. With Standard Tier, the TCP proxy load balancer handles load balancing regionally. For details, see Load balancer behavior in Network Service Tiers.

In this example, the connections for traffic from users in Seoul and Boston are terminated at the load balancing layer. These connections are labeled 1a and 2a. Separate connections are established from the load balancer to the selected backend instances. These connections are labeled 1b and 2b.

Cloud Load Balancing with TCP termination

TCP Proxy Load Balancing is intended for TCP traffic on specific well-known ports, such as port 25 for Simple Mail Transfer Protocol (SMTP). For more information, see Port specifications. For client traffic that is encrypted on these same ports, use SSL Proxy Load Balancing.

For information about how the Google Cloud load balancers differ from each other, see the following documents:


Some benefits of the TCP proxy load balancer include:

  • IPv6 termination. TCP Proxy Load Balancing supports both IPv4 and IPv6 addresses for client traffic. Client IPv6 requests are terminated at the load balancing layer, and then proxied over IPv4 to your backends.
  • Intelligent routing. The load balancer can route requests to backend locations where there is capacity. In contrast, an L3/L4 load balancer must route to regional backends without considering capacity. The use of smarter routing allows provisioning at N+1 or N+2 instead of x*N.
  • Security patching. If vulnerabilities arise in the TCP stack, Cloud Load Balancing applies patches at the load balancer automatically to keep your backends safe.
  • Support for the following well-known TCP ports. 25, 43, 110, 143, 195, 443, 465, 587, 700, 993, 995, 1883, 3389, 5222, 5432, 5671, 5672, 5900, 5901, 6379, 8085, 8099, 9092, 9200, and 9300.

Load balancer behavior in Network Service Tiers

TCP Proxy Load Balancing can be configured as a global load balancing service with Premium Tier, and as a regional service in the Standard Tier.

Premium Tier

You can have only one backend service, and the backend service can have backends in multiple regions. For global load balancing, you deploy your backends in multiple regions, and the load balancer automatically directs traffic to the region closest to the user. If a region is at capacity, the load balancer automatically directs new connections to another region with available capacity. Existing user connections remain in the current region.

Traffic is allocated to backends as follows:

  1. When a client sends a request, the load balancing service determines the approximate origin of the request from the source IP address.
  2. The load balancing service determines the locations of the backends owned by the backend service, their overall capacity, and their overall current usage.
  3. If the closest backend instances to the user have available capacity, the request is forwarded to that closest set of backends.
  4. Incoming requests to the given region are distributed evenly across all available backend instances in that region. However, at very small loads, the distribution might appear to be uneven.
  5. If there are no healthy backend instances with available capacity in a given region, the load balancer instead sends the request to the next closest region with available capacity.
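
The allocation steps above can be sketched as a small selection function. This is an illustrative model only, not the load balancer's actual implementation: the Region class, its fields, and the distance values are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Region:
    """Hypothetical summary of one region's healthy backends."""
    name: str
    distance_km: float  # assumed distance from the client's approximate origin
    capacity: int       # connections the healthy backends can serve in total
    in_use: int         # connections currently being served

    def has_capacity(self) -> bool:
        return self.in_use < self.capacity


def pick_region(regions: List[Region]) -> Optional[Region]:
    """Return the closest region with spare capacity, or None if none has any."""
    candidates = [r for r in regions if r.has_capacity()]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.distance_km)


regions = [
    Region("near", 30.0, capacity=100, in_use=100),  # closest, but at capacity
    Region("mid", 1150.0, capacity=100, in_use=40),
    Region("far", 11000.0, capacity=100, in_use=10),
]
chosen = pick_region(regions)
print(chosen.name)  # mid
```

Because the "near" region is full, the new connection goes to the next closest region that still has capacity, which mirrors step 5.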

Standard Tier

With Standard Tier, TCP Proxy Load Balancing is a regional service. Its backends must all be located in the region used by the load balancer's external IP address and forwarding rule.


The following are components of TCP proxy load balancers.

Forwarding rules and IP addresses

Forwarding rules route traffic by IP address, port, and protocol to a load balancing configuration consisting of a target proxy and a backend service.

Each forwarding rule provides a single IP address that you can use in DNS records for your application. No DNS-based load balancing is required. You can either reserve a static IP address or let Cloud Load Balancing assign an ephemeral one for you. We recommend that you reserve a static IP address; otherwise, you must update your DNS record with the newly assigned ephemeral IP address whenever you delete a forwarding rule and create a new one.

External forwarding rules used in the definition of a TCP proxy load balancer can reference exactly one of the ports listed in Port specifications for forwarding rules.
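
As a rough illustration of the single-port constraint, the following sketch checks a candidate port against the well-known ports listed earlier in this document. The function name and the idea of validating on the client side are assumptions made for the example; the authoritative list is the one in Port specifications for forwarding rules.

```python
# Ports copied from the "Support for the following well-known TCP ports" list.
SUPPORTED_PORTS = frozenset({
    25, 43, 110, 143, 195, 443, 465, 587, 700, 993, 995, 1883, 3389,
    5222, 5432, 5671, 5672, 5900, 5901, 6379, 8085, 8099, 9092, 9200, 9300,
})


def validate_forwarding_rule_port(port: int) -> int:
    """Return the port if a forwarding rule may reference it, else raise ValueError."""
    if port not in SUPPORTED_PORTS:
        raise ValueError(f"port {port} is not in the supported port list")
    return port


print(validate_forwarding_rule_port(5432))  # 5432
```

A forwarding rule references exactly one port, so the function takes a single integer rather than a range or a list.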

Target proxies

TCP Proxy Load Balancing terminates TCP connections from the client and creates new connections to the backends. By default, the original client IP address and port information is not preserved. You can preserve this information by using the PROXY protocol. The target proxies route incoming requests directly to backend services.
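
When the PROXY protocol is enabled, a backend has to read the extra header before the application data. The sketch below assumes the human-readable version 1 header format from the PROXY protocol specification (a single line such as "PROXY TCP4 <source> <destination> <source port> <destination port>" terminated by CRLF). The function name and the returned dictionary keys are hypothetical.

```python
from typing import Dict, Optional, Union


def parse_proxy_v1(header: bytes) -> Optional[Dict[str, Union[str, int]]]:
    """Parse one PROXY protocol v1 header line (including its trailing CRLF).

    Returns the original client and destination addresses, or None when the
    sender reports the connection as UNKNOWN.
    """
    if not header.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    parts = header[:-2].decode("ascii").split(" ")
    if parts[0] != "PROXY" or len(parts) < 2:
        raise ValueError("not a PROXY protocol v1 header")
    if parts[1] == "UNKNOWN":
        return None
    if parts[1] not in ("TCP4", "TCP6") or len(parts) != 6:
        raise ValueError("malformed PROXY protocol v1 header")
    _, family, src_ip, dst_ip, src_port, dst_port = parts
    return {
        "family": family,
        "client_ip": src_ip,
        "client_port": int(src_port),
        "destination_ip": dst_ip,
        "destination_port": int(dst_port),
    }


info = parse_proxy_v1(b"PROXY TCP4 192.0.2.10 203.0.113.5 51234 25\r\n")
print(info["client_ip"], info["client_port"])  # 192.0.2.10 51234
```

The addresses in the example come from the documentation-only ranges. A real backend would read the header line from the socket first and then hand the remaining bytes to the application protocol.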

Backend services

Backend services direct incoming traffic to one or more attached backends. Each backend is composed of an instance group or network endpoint group, and information about the backend's serving capacity. For TCP Proxy Load Balancing, backend serving capacity can be based on CPU utilization or on the number of concurrent connections.

TCP proxy load balancers each have a single backend service resource. Changes to the backend service are not instantaneous. It can take several minutes for changes to propagate to Google Front Ends (GFEs).

Each backend service specifies the health checks to perform for the available backends.

To ensure minimal interruptions to your users, you can enable connection draining on backend services. Such interruptions might happen when a backend is terminated, removed manually, or removed by an autoscaler. To learn more about using connection draining to minimize service interruptions, see Enabling connection draining.

Protocol for communicating with the backends

When you configure a backend service for the TCP proxy load balancer, you set the protocol that the backend service uses to communicate with the backends. You can choose either SSL or TCP. The load balancer uses only the protocol that you specify, and does not attempt to negotiate a connection with the other protocol.

Firewall rules

The backend instances must allow connections from the load balancer GFE/health check ranges. This means that you must create an ingress allow firewall rule that permits traffic from those ranges to your backend instances or endpoints. These IP address ranges are used as sources for health check packets and for all load-balanced packets sent to your backends.

The ports you configure for this firewall rule must:

  • Allow traffic to the destination port needed by each backend service's configured health check.

  • For instance group backends: Allow traffic to the destination port matching the port number(s) to which the backend service's named port subscribes.

  • For GCE_VM_IP_PORT NEG backends: Allow traffic to the port(s) of the endpoints in the NEGs.
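
The three port requirements above amount to taking the union of the relevant ports. A minimal sketch, with a hypothetical function name and parameters:

```python
from typing import Iterable, List


def required_firewall_ports(
    health_check_port: int,
    named_port_numbers: Iterable[int] = (),
    neg_endpoint_ports: Iterable[int] = (),
) -> List[int]:
    """Return the sorted set of destination ports the ingress rule must allow."""
    return sorted({health_check_port, *named_port_numbers, *neg_endpoint_ports})


print(required_firewall_ports(8080, named_port_numbers=[443], neg_endpoint_ports=[443, 8443]))
# [443, 8080, 8443]
```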

Firewall rules are implemented at the VM instance level, not on Google Front End (GFE) proxies. You cannot use Google Cloud firewall rules to prevent traffic from reaching the load balancer.

For more information about health check probes and why it's necessary to allow traffic from the probe IP ranges, see Probe IP ranges and firewall rules.

Source IP addresses

The source IP address of each packet, as seen by a backend virtual machine (VM) instance or container, comes from the health check probe IP ranges (see Probe IP ranges and firewall rules). Load-balanced traffic and health check probes use the same source ranges.

The source IP address for traffic, as seen by the backends, is not the Google Cloud external IP address of the load balancer. In other words, there are two HTTP, SSL, or TCP sessions:

  • Session 1, from original client to the load balancer (GFE):

    • Source IP address: the original client (or external IP address if the client is behind NAT).
    • Destination IP address: your load balancer's IP address.
  • Session 2, from the load balancer (GFE) to the backend VM or container:

    • Source IP address: an IP address in one of the health check probe IP ranges.

      You cannot predict the actual source address.

    • Destination IP address: the internal IP address of the backend VM or container in the Virtual Private Cloud (VPC) network.

Open ports

The TCP proxy load balancers are reverse proxy load balancers. The load balancer terminates incoming connections, and then opens new connections from the load balancer to the backends. The reverse proxy functionality is provided by Google Front Ends (GFE).

GFEs have several open ports to support other Google Cloud load balancers and other Google services. If you run a security or port scan against the external IP address of your load balancer, additional ports appear to be open.

This does not affect TCP proxy load balancers. Each external forwarding rule that you use in a TCP proxy load balancer can reference exactly one of the ports listed in Port specifications for forwarding rules. Traffic with a different TCP destination port is not forwarded to the load balancer's backend. You can verify that traffic to additional ports is not processed by attempting to open a TCP session to an unauthorized port. The GFE that handles your request closes the connection with a TCP reset (RST) packet.
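
One way to run the check described above is a small probe that opens a TCP session and reports what happens. This is a sketch under assumptions: the function name and the result labels are hypothetical, and whether the reset arrives during or after the handshake is not stated here, so both cases are handled.

```python
import socket


def probe_tcp_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Try to open a TCP session and classify the outcome.

    Returns "refused" if the connection attempt is rejected, "reset" if the
    peer resets an established connection, "closed" if the peer closes it
    cleanly, "open" if the connection stays up, and "unreachable" otherwise.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        # Covers timeouts, unreachable hosts, and name resolution failures.
        return "unreachable"
    with sock:
        try:
            data = sock.recv(1)
        except ConnectionResetError:
            return "reset"
        except socket.timeout:
            return "open"  # connected, and the peer sent nothing before the timeout
        except OSError:
            return "unreachable"
        return "closed" if data == b"" else "open"
```

For example, probe_tcp_port("203.0.113.10", 8080) would probe port 8080 on a placeholder address; substitute your load balancer's external IP address and a port that no forwarding rule references. A result of "refused" or "reset" is consistent with the behavior described above.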

Traffic distribution

The way a TCP proxy load balancer distributes traffic to its backends depends on the balancing mode and the hashing method selected to choose a backend (session affinity).

Balancing mode

When you add a backend to the backend service, you set a load balancing mode.

For TCP Proxy Load Balancing, the balancing mode can be CONNECTION or UTILIZATION.

If the load balancing mode is CONNECTION, the load is spread based on how many concurrent connections the backend can handle. You must also specify exactly one of the following parameters: maxConnections (except for regional managed instance groups), maxConnectionsPerInstance, or maxConnectionsPerEndpoint.

If the load balancing mode is UTILIZATION, the load is spread based on the utilization of instances in an instance group.
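
The parameter rules for the two balancing modes can be expressed as a short validation routine. The function below is a hypothetical client-side check written for illustration. It encodes only the constraints stated above and is not part of any Google Cloud library.

```python
from typing import Optional

VALID_MODES = {"CONNECTION", "UTILIZATION"}


def validate_backend(
    balancing_mode: str,
    *,
    max_connections: Optional[int] = None,
    max_connections_per_instance: Optional[int] = None,
    max_connections_per_endpoint: Optional[int] = None,
    regional_mig: bool = False,
) -> bool:
    """Raise ValueError if the settings break the rules described above."""
    if balancing_mode not in VALID_MODES:
        raise ValueError(f"unsupported balancing mode: {balancing_mode}")
    if balancing_mode == "CONNECTION":
        targets = {
            "maxConnections": max_connections,
            "maxConnectionsPerInstance": max_connections_per_instance,
            "maxConnectionsPerEndpoint": max_connections_per_endpoint,
        }
        provided = [name for name, value in targets.items() if value is not None]
        if len(provided) != 1:
            raise ValueError("CONNECTION mode needs exactly one connection target")
        if regional_mig and provided[0] == "maxConnections":
            raise ValueError("maxConnections is not allowed for regional managed instance groups")
    return True


print(validate_backend("CONNECTION", max_connections_per_instance=100))  # True
```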

For information about comparing the load balancer types and the supported balancing modes, see Load balancing methods.

Session affinity

Session affinity sends all requests from the same client to the same backend, if the backend is healthy and has capacity.

TCP Proxy Load Balancing offers client IP affinity, which forwards all requests from the same client IP address to the same backend.
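
Client IP affinity can be pictured as hashing the client address to pick a backend. The hashing scheme that the load balancer actually uses is not described here, so the following is only a sketch of the general idea; the function name and the choice of SHA-256 are assumptions.

```python
import hashlib
from typing import Optional, Sequence, Set


def pick_backend(client_ip: str, backends: Sequence[str], healthy: Set[str]) -> Optional[str]:
    """Map a client IP to a backend, keeping the mapping stable while that backend is healthy."""
    if not backends:
        return None
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    bucket = int.from_bytes(digest[:8], "big")
    preferred = backends[bucket % len(backends)]
    if preferred in healthy:
        return preferred
    # The preferred backend is unavailable: fall back to a healthy one.
    fallback = [b for b in backends if b in healthy]
    if not fallback:
        return None
    return fallback[bucket % len(fallback)]


backends = ["vm-a", "vm-b", "vm-c"]
first = pick_backend("198.51.100.7", backends, set(backends))
again = pick_backend("198.51.100.7", backends, set(backends))
print(first == again)  # True
```

The same client IP address always maps to the same backend as long as that backend is healthy. The sketch omits the capacity check mentioned above for brevity.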


If a backend becomes unhealthy, traffic is automatically redirected to healthy backends within the same region. If all backends within a region are unhealthy, traffic is distributed to healthy backends in other regions (Premium Tier only). If all backends are unhealthy, the load balancer drops traffic.
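
The failover rules in the preceding paragraph can be summarized in a few lines. This is a simplified model with hypothetical names; it ignores capacity and proximity and only captures the order in which healthy backends are considered.

```python
from typing import Dict, List


def failover_targets(
    region: str,
    healthy_by_region: Dict[str, List[str]],
    premium_tier: bool,
) -> List[str]:
    """Return the healthy backends eligible for traffic; an empty list means traffic is dropped."""
    local = healthy_by_region.get(region, [])
    if local:
        return local  # healthy backends remain in the same region
    if premium_tier:
        # Premium Tier only: use healthy backends in other regions.
        return [
            backend
            for other, group in healthy_by_region.items()
            if other != region
            for backend in group
        ]
    return []


healthy = {"region-a": [], "region-b": ["vm-3", "vm-4"]}
print(failover_targets("region-a", healthy, premium_tier=True))   # ['vm-3', 'vm-4']
print(failover_targets("region-a", healthy, premium_tier=False))  # []
```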

What's next