Internal proxy Network Load Balancer overview

The Google Cloud internal proxy Network Load Balancer is a proxy-based load balancer powered by open source Envoy proxy software and the Andromeda network virtualization stack.

The internal proxy Network Load Balancer is a Layer 4 load balancer that lets you run and scale your TCP service traffic behind a regional internal IP address that is accessible only to clients in the same VPC network or clients connected to your VPC network. The load balancer first terminates the TCP connection between the client and the load balancer at an Envoy proxy. The proxy opens a second TCP connection to backends hosted in Google Cloud, on-premises, or other cloud environments. For more use cases, see Proxy Network Load Balancer overview.

Modes of operation

You can configure an internal proxy Network Load Balancer in the following modes:

Regional internal proxy Network Load Balancer. This is a regional load balancer that is implemented as a managed service based on the open source Envoy proxy. With regional mode, all clients and backends are from a specified region, which helps when you need regional compliance.

Cross-region internal proxy Network Load Balancer. This is a multi-region load balancer that is implemented as a managed service based on the open source Envoy proxy. The cross-region mode lets you load balance traffic to backend services that are globally distributed with backends in multiple regions, including traffic management that ensures traffic is directed to the closest backend. This load balancer also enables high availability. Placing backends in multiple regions helps avoid failures in a single region. If one region's backends are down, traffic can fail over to another region. The load balancer's forwarding rule IP addresses are always accessible in all regions.

The following table describes the important differences between regional and cross-region modes:

Feature	Regional internal proxy Network Load Balancer	Cross-region internal proxy Network Load Balancer
Virtual IP address (VIP) of the load balancer	Allocated from a subnet in a specific Google Cloud region.	Allocated from a subnet in a specific Google Cloud region. VIP addresses from multiple regions can share the same global backend service.
Client access	Not globally accessible by default. You can optionally enable global access.	Always globally accessible. Clients from any Google Cloud region can send traffic to the load balancer.
Load balanced backends	Regional backends. Load balancer can only send traffic to backends that are in the same region as the proxy of the load balancer.	Global backends. Load balancer can send traffic to backends in any region.
High availability and failover	Automatic failover to healthy backends in the same region.	Automatic failover to healthy backends in the same or different regions.

Identify the mode

Console

In the Google Cloud console, go to the Load balancing page.

In the Load Balancers tab, you can see the load balancer type, protocol, and region. If the region is blank, then the load balancer is in the cross-region mode. The following table summarizes how to identify the mode of the load balancer.

Load balancer mode	Load balancer type	Access type	Region
Regional internal proxy Network Load Balancer	Network (Proxy)	Internal	Specifies a region
Cross-region internal proxy Network Load Balancer	Network (Proxy)	Internal	(blank)

gcloud

To determine the mode of a load balancer, run the following command:

gcloud compute forwarding-rules describe FORWARDING_RULE_NAME

In the command output, check the load balancing scheme and the region. A regional forwarding rule lists a region; a global forwarding rule doesn't. The following table summarizes how to identify the mode of the load balancer.

Load balancer mode	Load balancing scheme	Forwarding rule
Regional internal proxy Network Load Balancer	INTERNAL_MANAGED	Regional
Cross-region internal proxy Network Load Balancer	INTERNAL_MANAGED	Global
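
The table above can be expressed as a small check. The following Python sketch classifies a forwarding rule from the JSON that the describe command prints when run with `--format=json`. It assumes that only regional rules carry a `region` field and that the `target` URL names a TCP proxy. The function name `classify_mode` and the sample values are illustrative, not part of any Google API.

```python
def classify_mode(rule: dict) -> str:
    """Classify a forwarding rule description as regional or cross-region."""
    if rule.get("loadBalancingScheme") != "INTERNAL_MANAGED":
        return "not an internal proxy Network Load Balancer"
    # Assumption: INTERNAL_MANAGED is also used by other Envoy-based internal
    # load balancers, so additionally require a TCP proxy target.
    if "tcpproxies" not in rule.get("target", "").lower():
        return "not an internal proxy Network Load Balancer"
    # Assumption: only regional forwarding rules have a "region" field.
    return "regional" if rule.get("region") else "cross-region"


# Hypothetical sample input.
sample = {
    "loadBalancingScheme": "INTERNAL_MANAGED",
    "region": "projects/example/regions/us-west1",
    "target": "projects/example/regions/us-west1/targetTcpProxies/example-proxy",
}
print(classify_mode(sample))
```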

Architecture

The following diagram shows the Google Cloud resources required for internal proxy Network Load Balancers.

Regional

This diagram shows the components of a regional internal proxy Network Load Balancer deployment in Premium Tier.

Cross-region

This diagram shows the components of a cross-region internal proxy Network Load Balancer deployment in Premium Tier within the same VPC network. Each global forwarding rule uses a regional IP address that the clients use to connect.

Proxy-only subnet

In the previous diagram, the proxy-only subnet provides a set of IP addresses that Google uses to run Envoy proxies on your behalf. You must create a proxy-only subnet in each region of a VPC network where you use an Envoy-based internal proxy Network Load Balancer.

The following table describes proxy-only subnet requirements for internal proxy Network Load Balancers:

Load balancer mode	Value of the proxy-only subnet `--purpose` flag
Regional internal proxy Network Load Balancer	`REGIONAL_MANAGED_PROXY`. Regional and cross-region load balancers cannot share the same subnets. All the regional Envoy-based load balancers in a region and VPC network share the same proxy-only subnet.
Cross-region internal proxy Network Load Balancer	`GLOBAL_MANAGED_PROXY`. Regional and cross-region load balancers cannot share the same subnets. The cross-region Envoy-based load balancer must have a proxy-only subnet in each region in which the load balancer is configured. Cross-region load balancer proxies in the same region and network share the same proxy-only subnet.

Further:

Proxy-only subnets are only used for Envoy proxies, not your backends.
Backend virtual machine (VM) instances or endpoints of all internal proxy Network Load Balancers in a region and VPC network receive connections from the proxy-only subnet.
The IP address of an internal proxy Network Load Balancer is not located in the proxy-only subnet. The load balancer's IP address is defined by its internal managed forwarding rule.
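
As a rough illustration of the last point, the following Python sketch checks that a candidate load balancer IP address lies outside the proxy-only subnet range, and records the `--purpose` value that each mode expects. The helper name `vip_outside_proxy_subnet` and the example ranges are assumptions made for this sketch.

```python
import ipaddress

# Values of the proxy-only subnet --purpose flag, taken from the table above.
PROXY_SUBNET_PURPOSE = {
    "regional": "REGIONAL_MANAGED_PROXY",
    "cross-region": "GLOBAL_MANAGED_PROXY",
}


def vip_outside_proxy_subnet(vip: str, proxy_only_range: str) -> bool:
    """Return True if the VIP is not inside the proxy-only subnet range."""
    return ipaddress.ip_address(vip) not in ipaddress.ip_network(proxy_only_range)


# Hypothetical addresses: the VIP comes from a regular subnet, not the
# proxy-only subnet.
print(vip_outside_proxy_subnet("10.1.2.99", "10.129.0.0/23"))
```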

Forwarding rules and IP addresses

Forwarding rules route traffic by IP address, port, and protocol to a load balancing configuration consisting of a target proxy and a backend service.

IP address specification. Each forwarding rule references a single regional IP address that you can use in DNS records for your application. You can either reserve a static IP address that you can use or let Cloud Load Balancing assign one for you. We recommend that you reserve a static IP address; otherwise, you must update your DNS record with the newly assigned ephemeral IP address whenever you delete a forwarding rule and create a new one.

Clients use the IP address and port to connect to the load balancer's Envoy proxies—the forwarding rule's IP address is the IP address of the load balancer (sometimes called a virtual IP address or VIP). Clients connecting to a load balancer must use TCP. For the complete list of supported protocols, see Load balancer feature comparison.

The internal IP address associated with the forwarding rule can come from a subnet in the same network and region as your backends.

Port specification. Each forwarding rule that you use in an internal proxy Network Load Balancer can reference a single port from 1-65535. To support multiple ports, you must configure multiple forwarding rules.
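
Because each forwarding rule covers one port, a service that listens on several ports needs one rule per port. The following Python sketch plans such a set of rules. The function name `plan_forwarding_rules`, the naming pattern, and the dictionary layout are hypothetical and only meant to show the one-rule-per-port relationship.

```python
def plan_forwarding_rules(name_prefix: str, ports: list[int]) -> list[dict]:
    """Return one forwarding rule description per distinct port."""
    rules = []
    for port in sorted(set(ports)):
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
        rules.append(
            {
                "name": f"{name_prefix}-{port}",  # hypothetical naming pattern
                "port": port,
                "loadBalancingScheme": "INTERNAL_MANAGED",
            }
        )
    return rules


for rule in plan_forwarding_rules("example-rule", [5432, 6379]):
    print(rule)
```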

The following table shows the forwarding rule requirements for internal proxy Network Load Balancers:

Load balancer mode	Forwarding rule, IP address, and proxy-only subnet `--purpose`	Routing from the client to the load balancer's frontend
Regional internal proxy Network Load Balancer	Regional `forwardingRules`; regional IP address; load balancing scheme: `INTERNAL_MANAGED`; proxy-only subnet `--purpose`: `REGIONAL_MANAGED_PROXY`; IP address `--purpose`: `SHARED_LOADBALANCER_VIP`	You can enable global access to allow clients from any region to access your load balancer. Backends must also be in the same region as the load balancer.
Cross-region internal proxy Network Load Balancer	Global `globalForwardingRules`; regional IP addresses; load balancing scheme: `INTERNAL_MANAGED`; proxy-only subnet `--purpose`: `GLOBAL_MANAGED_PROXY`; IP address `--purpose`: `SHARED_LOADBALANCER_VIP`	Global access is enabled by default to allow clients from any region to access your load balancer. Backends can be in multiple regions.

Forwarding rules and VPC networks

This section describes how forwarding rules used by internal proxy Network Load Balancers are associated with VPC networks.

Load balancer mode	VPC network association
Cross-region internal proxy Network Load Balancer and regional internal proxy Network Load Balancer	Regional internal IPv4 addresses always exist inside VPC networks. When you create the forwarding rule, you're required to specify the subnet from which the internal IP address is taken. This subnet must be in the same region and VPC network where a proxy-only subnet has been created. Thus, there is an implied network association.

Target proxies

The internal proxy Network Load Balancer terminates TCP connections from the client and creates new connections to the backends. By default, the original client IP address and port information isn't preserved. You can preserve this information by using the PROXY protocol. The target proxy routes incoming requests directly to the load balancer's backend service.
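
When the PROXY protocol is enabled, the backend sees a header line at the start of each connection and must parse it before reading application data. The following Python sketch shows one way a backend could recover the client address. It assumes version 1 of the PROXY protocol (the human-readable, single-line format); confirm which version applies to your deployment. The function name `parse_proxy_v1` and the sample addresses are illustrative.

```python
def parse_proxy_v1(header: bytes) -> dict:
    """Parse a single PROXY protocol v1 header line."""
    # Example header: b"PROXY TCP4 192.0.2.10 10.1.2.99 51234 80\r\n"
    if not header.startswith(b"PROXY ") or not header.endswith(b"\r\n"):
        raise ValueError("not a PROXY protocol v1 header")
    parts = header[:-2].decode("ascii").split(" ")
    if parts[1] == "UNKNOWN":
        # The sender could not determine the original addresses.
        return {"family": "UNKNOWN"}
    if len(parts) != 6 or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("malformed PROXY protocol v1 header")
    _, family, src_ip, dst_ip, src_port, dst_port = parts
    return {
        "family": family,
        "client_ip": src_ip,
        "client_port": int(src_port),
        "destination_ip": dst_ip,
        "destination_port": int(dst_port),
    }


info = parse_proxy_v1(b"PROXY TCP4 192.0.2.10 10.1.2.99 51234 80\r\n")
print(info["client_ip"], info["client_port"])
```

A backend that doesn't expect this header treats it as application data, so enable the PROXY protocol only when the backend software is configured to read it.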

The following table shows the target proxy APIs required by internal proxy Network Load Balancers:

Load balancer mode	Target proxy
Regional internal proxy Network Load Balancer	Regional `regionTargetTcpProxies`
Cross-region internal proxy Network Load Balancer	Global `targetTcpProxies`

Backend service

A backend service directs incoming traffic to one or more attached backends. A backend is either an instance group or a network endpoint group. The backend contains balancing mode information to define fullness based on connections (or, for instance group backends only, utilization).

Each internal proxy Network Load Balancer has a single backend service resource.

The following table specifies the backend service requirements for internal proxy Network Load Balancers:

Load balancer mode	Backend service type
Regional internal proxy Network Load Balancer	Regional `regionBackendServices`
Cross-region internal proxy Network Load Balancer	Global `backendServices`

Supported backends

The backend service supports the following types of backends:

Load balancer mode	Supported backends on a backend service
Load balancer mode	Instance groups	Zonal NEGs	Internet NEGs	Serverless NEGs	Hybrid NEGs	Private Service Connect NEGs	GKE
Regional internal proxy Network Load Balancer		`GCE_VM_IP_PORT` type endpoints	Regional NEGs only			Add a Private Service Connect NEG
Cross-region internal proxy Network Load Balancer		`GCE_VM_IP_PORT` type endpoints				Add a Private Service Connect NEG

All of the backends must be of the same type (instance groups or NEGs). You can simultaneously use different types of instance group backends, or you can simultaneously use different types of NEG backends, but you cannot use instance group and NEG backends together on the same backend service.

You can mix zonal NEGs and hybrid NEGs within the same backend service.

To minimize service interruptions to your users, enable connection draining on backend services. Such interruptions can happen when a backend is terminated, removed manually, or removed by an autoscaler. To learn more about using connection draining to minimize service interruptions, see Enable connection draining.

Backends and VPC networks

For instance groups, zonal NEGs, and hybrid connectivity NEGs, all backends must be located in the same project and region as the backend service. However, a load balancer can reference a backend that uses a different VPC network in the same project as the backend service. Connectivity between the load balancer's VPC network and the backend VPC network can be configured by using VPC Network Peering, Cloud VPN tunnels, Cloud Interconnect VLAN attachments, or a Network Connectivity Center framework.

Backend network definition

- For zonal NEGs and hybrid NEGs, you explicitly specify the VPC network when you create the NEG.
- For managed instance groups, the VPC network is defined in the instance template.
- For unmanaged instance groups, the instance group's VPC network is set to match the VPC network of the nic0 interface for the first VM added to the instance group.

Backend network requirements

Your backend's network must satisfy one of the following network requirements:
- The backend's VPC network must exactly match the forwarding rule's VPC network.
- The backend's VPC network must be connected to the forwarding rule's VPC network using VPC Network Peering. You must configure subnet route exchanges to allow communication between the proxy-only subnet in the forwarding rule's VPC network and the subnets used by the backend instances or endpoints.
- Both the backend's VPC network and the forwarding rule's VPC network must be VPC spokes attached to the same Network Connectivity Center hub. Import and export filters must allow communication between the proxy-only subnet in the forwarding rule's VPC network and the subnets used by backend instances or endpoints.

For all other backend types, all backends must be located in the same VPC network and region.

Backends and network interfaces

If you use instance group backends, packets are always delivered to nic0. If you want to send packets to non-nic0 interfaces (either vNICs or Dynamic Network Interfaces), use NEG backends instead.

If you use zonal NEG backends, packets are sent to whatever network interface is represented by the endpoint in the NEG. The NEG endpoints must be in the same VPC network as the NEG's explicitly defined VPC network.

Protocol for communicating with the backends

When you configure a backend service for an internal proxy Network Load Balancer, you set the protocol that the backend service uses to communicate with the backends. The load balancer uses only the protocol that you specify, and doesn't attempt to negotiate a connection with the other protocol. Internal proxy Network Load Balancers support only TCP for communicating with the backends.

Health check

Each backend service specifies a health check that periodically monitors the backends' readiness to receive a connection from the load balancer. This reduces the risk that requests might be sent to backends that can't service the request. Health checks don't check if the application itself is working.

Health check protocol

Although it is not required and not always possible, it is a best practice to use a health check whose protocol matches the protocol of the backend service. For example, a TCP health check most accurately tests TCP connectivity to backends. For the list of supported health check protocols, see the Health checks section of the Load balancer feature comparison page.

The following table specifies the scope of health checks supported by internal proxy Network Load Balancers in each mode:

Load balancer mode	Health check type
Regional internal proxy Network Load Balancer	Regional `regionHealthChecks`
Cross-region internal proxy Network Load Balancer	Global `healthChecks`

Firewall rules

Internal proxy Network Load Balancers require the following firewall rules:

An ingress allow rule that permits traffic from the Google health check probes. For more information about the specific health check probe IP address ranges and why it's necessary to allow traffic from them, see Probe IP ranges and firewall rules.
An ingress allow rule that permits traffic from the proxy-only subnet.

The ports for these firewall rules must be configured as follows:

Allow traffic to the destination port for each backend service's health check.
For instance group backends: Determine the ports to be configured by the mapping between the backend service's named port and the port numbers associated with that named port on each instance group. The port numbers can vary among instance groups assigned to the same backend service.
For GCE_VM_IP_PORT NEG backends, allow traffic to the port numbers of the endpoints.
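
For instance group backends, the set of ports to allow follows from the named port mapping on each group. The following Python sketch collects those ports together with the health check port. The function name `firewall_ports`, the data layout, and the sample values are assumptions for illustration only.

```python
def firewall_ports(
    health_check_port: int,
    named_port: str,
    instance_group_named_ports: dict[str, dict[str, list[int]]],
) -> list[int]:
    """Collect the destination ports that the ingress allow rules must cover."""
    ports = {health_check_port}
    # Each instance group can map the same named port to different numbers.
    for mapping in instance_group_named_ports.values():
        ports.update(mapping.get(named_port, []))
    return sorted(ports)


# Hypothetical instance groups attached to the same backend service.
groups = {
    "ig-a": {"tcp-app": [8080]},
    "ig-b": {"tcp-app": [8081, 8082]},
}
print(firewall_ports(8080, "tcp-app", groups))
```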

There are certain exceptions to the firewall rule requirements for these load balancers:

Allowing traffic from Google's health check probe ranges isn't required for hybrid NEGs. However, if you're using a combination of hybrid and zonal NEGs in a single backend service, you need to allow traffic from the Google health check probe ranges for the zonal NEGs.
For regional internet NEGs, health checks are optional. Traffic from load balancers using regional internet NEGs originates from the proxy-only subnet and is then NAT-translated (by using Cloud NAT) to either manually or automatically allocated NAT IP addresses. This traffic includes both health check probes and user requests from the load balancer to the backends. For details, see Regional NEGs: Use a Cloud NAT gateway.

Client access

Clients can be in the same network or in a VPC network connected by using VPC Network Peering.

For regional internal proxy Network Load Balancers, clients must be in the same region as the load balancer by default. You can enable global access to allow clients from any region to access your load balancer.

For cross-region internal proxy Network Load Balancers, global access is enabled by default. Clients from any region can access your load balancer.

The following table summarizes client access for regional internal proxy Network Load Balancers:

Global access disabled	Global access enabled
Clients must be in the same region as the load balancer. They also must be in the same VPC network as the load balancer or in a VPC network that is connected to the load balancer's VPC network by using VPC Network Peering.	Clients can be in any region. They still must be in the same VPC network as the load balancer or in a VPC network that's connected to the load balancer's VPC network by using VPC Network Peering.
On-premises clients can access the load balancer through Cloud VPN tunnels or VLAN attachments. These tunnels or attachments must be in the same region as the load balancer.	On-premises clients can access the load balancer through Cloud VPN tunnels or VLAN attachments. These tunnels or attachments can be in any region.
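
The region rules in this section reduce to a short predicate. The following Python sketch assumes that the client already has network connectivity to the load balancer's VPC network (same network, peering, or hybrid connectivity) and only evaluates the region condition. The function name `region_allows_access` is hypothetical.

```python
def region_allows_access(
    mode: str, client_region: str, lb_region: str, global_access: bool = False
) -> bool:
    """Return True if a client's region permits it to reach the load balancer."""
    if mode == "cross-region":
        # Global access is always enabled in cross-region mode.
        return True
    # Regional mode: same region, unless global access is enabled.
    return global_access or client_region == lb_region


print(region_allows_access("regional", "europe-west1", "us-west1"))
print(region_allows_access("regional", "europe-west1", "us-west1", global_access=True))
```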

Shared VPC architecture

The internal proxy Network Load Balancer supports networks that use Shared VPC. Shared VPC lets you maintain a clear separation of responsibilities between network administrators and service developers. Your development teams can focus on building services in service projects, and the network infrastructure teams can provision and administer load balancing. If you're not already familiar with Shared VPC, read the Shared VPC overview documentation.

IP address	Forwarding rule	Target proxy	Backend components
An internal IP address must be defined in the same project as the backends. For the load balancer to be available in a Shared VPC network, the internal IP address must be defined in the same service project where the backend VMs are located, and it must reference a subnet in the desired Shared VPC network in the host project. The address itself comes from the primary IP range of the referenced subnet.	An internal forwarding rule must be defined in the same project as the backends. For the load balancer to be available in a Shared VPC network, the internal forwarding rule must be defined in the same service project where the backend VMs are located, and it must reference the same subnet (in the Shared VPC network) that the associated internal IP address references.	The target proxy must be defined in the same project as the backends.	In a Shared VPC scenario, the backend VMs are typically located in a service project. A regional internal backend service and health check must be defined in that service project.

Traffic distribution

An internal proxy Network Load Balancer distributes traffic to its backends as follows:

Connections originating from a single client are sent to the same zone as long as healthy backends (instance groups or NEGs) within that zone are available and have capacity, as determined by the balancing mode. For regional internal proxy Network Load Balancers, the balancing mode can be CONNECTION (instance group or NEG backends) or UTILIZATION (instance group backends only).
Connections from a client are sent to the same backend if you have configured session affinity.
After a backend is selected, traffic is then distributed among instances (in an instance group) or endpoints (in a NEG) according to a load balancing policy. For the load balancing policy algorithms supported, see the localityLbPolicy setting in the regional backend service API documentation.

Session affinity

Session affinity lets you configure the load balancer's backend service to send all requests from the same client to the same backend, as long as the backend is healthy and has capacity.

Internal proxy Network Load Balancers offer the following types of session affinity:

None

A session affinity setting of NONE does not mean that there is no session affinity. It means that no session affinity option is explicitly configured.

Hashing is always performed to select a backend. With a session affinity setting of NONE, the load balancer uses a 5-tuple hash to select a backend. The 5-tuple hash consists of the source IP address, the source port, the protocol, the destination IP address, and the destination port.

A session affinity of NONE is the default value.

Client IP affinity

Client IP session affinity (CLIENT_IP) is a 2-tuple hash created from the source and destination IP addresses of the packet. Client IP affinity forwards all requests from the same client IP address to the same backend, as long as that backend has capacity and remains healthy.

When you use client IP affinity, keep the following in mind:
- The packet destination IP address is only the same as the load balancer forwarding rule's IP address if the packet is sent directly to the load balancer.
- The packet source IP address might not match an IP address associated with the original client if the packet is processed by an intermediate NAT or proxy system before being delivered to a Google Cloud load balancer. In situations where many clients share the same effective source IP address, some backend VMs might receive more connections or requests than others.
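
The difference between the two settings is which packet fields feed the hash. The following Python sketch contrasts a 5-tuple key with a 2-tuple key. The SHA-256 hash and modulo selection are stand-ins chosen for this sketch; the load balancer's actual hashing algorithm isn't described here. The names `affinity_key` and `pick_backend` are hypothetical.

```python
import hashlib


def affinity_key(conn: dict, affinity: str) -> tuple:
    """Return the fields that are hashed for the given session affinity."""
    if affinity == "CLIENT_IP":
        # 2-tuple: source and destination IP addresses.
        return (conn["src_ip"], conn["dst_ip"])
    # NONE: 5-tuple.
    return (
        conn["src_ip"],
        conn["src_port"],
        conn["protocol"],
        conn["dst_ip"],
        conn["dst_port"],
    )


def pick_backend(key: tuple, backends: list[str]) -> str:
    """Map a key to a backend (illustrative hash, not the real algorithm)."""
    digest = hashlib.sha256("|".join(map(str, key)).encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


backends = ["backend-a", "backend-b", "backend-c"]
first = {"src_ip": "10.0.0.5", "src_port": 40001, "protocol": "TCP",
         "dst_ip": "10.1.2.99", "dst_port": 80}
second = dict(first, src_port=40002)  # same client, new connection

# With CLIENT_IP, both connections hash to the same backend.
print(pick_backend(affinity_key(first, "CLIENT_IP"), backends))
print(pick_backend(affinity_key(second, "CLIENT_IP"), backends))
```

With a setting of NONE, the source port is part of the key, so two connections from the same client can land on different backends.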

Keep the following in mind when configuring session affinity:

Don't rely on session affinity for authentication or security purposes. Session affinity can break whenever the number of serving and healthy backends changes. For more details, see Losing session affinity.
The default values of the --session-affinity and --subsetting-policy flags are both NONE, and only one of them at a time can be set to a different value.

Losing session affinity

All session affinity options require the following:

The selected backend instance or endpoint must remain configured as a backend. Session affinity can break when one of the following events occurs:
- You remove the selected instance from its instance group.
- Managed instance group autoscaling or autohealing removes the selected instance from its managed instance group.
- You remove the selected endpoint from its NEG.
- You remove the instance group or NEG that contains the selected instance or endpoint from the backend service.
The selected backend instance or endpoint must remain healthy. Session affinity can break when the selected instance or endpoint fails health checks.

All session affinity options have the following additional requirements:

The instance group or NEG that contains the selected instance or endpoint must not be full as defined by its target capacity. (For regional managed instance groups, the zonal component of the instance group that contains the selected instance must not be full.) Session affinity can break when the instance group or NEG is full and other instance groups or NEGs are not. Because fullness can change in unpredictable ways when using the UTILIZATION balancing mode, you should use the CONNECTION balancing mode to minimize situations when session affinity can break.
The total number of configured backend instances or endpoints must remain constant. When at least one of the following events occurs, the number of configured backend instances or endpoints changes, and session affinity can break:
- Adding new instances or endpoints:
  - You add instances to an existing instance group on the backend service.
  - Managed instance group autoscaling adds instances to a managed instance group on the backend service.
  - You add endpoints to an existing NEG on the backend service.
  - You add non-empty instance groups or NEGs to the backend service.
- Removing any instance or endpoint, not just the selected instance or endpoint:
  - You remove any instance from an instance group backend.
  - Managed instance group autoscaling or autohealing removes any instance from a managed instance group backend.
  - You remove any endpoint from a NEG backend.
  - You remove any existing, non-empty backend instance group or NEG from the backend service.
The total number of healthy backend instances or endpoints must remain constant. When at least one of the following events occurs, the number of healthy backend instances or endpoints changes, and session affinity can break:
- Any instance or endpoint passes its health check, transitioning from unhealthy to healthy.
- Any instance or endpoint fails its health check, transitioning from healthy to unhealthy or timeout.
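
To see why a change in the number of backends can break affinity, consider a simple hash-and-modulo selection. The following Python sketch counts how many clients map to a different backend index when the backend count changes. Modulo hashing is an assumption used only to illustrate the effect; it isn't a description of the load balancer's algorithm, and the names `bucket` and `moved_fraction` are hypothetical.

```python
import hashlib


def bucket(client_ip: str, backend_count: int) -> int:
    """Map a client IP address to a backend index (illustrative only)."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return int.from_bytes(digest[:8], "big") % backend_count


def moved_fraction(clients: list[str], before: int, after: int) -> float:
    """Fraction of clients whose backend index changes with the backend count."""
    moved = sum(1 for c in clients if bucket(c, before) != bucket(c, after))
    return moved / len(clients)


clients = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]
print(moved_fraction(clients, 4, 4))  # unchanged backend count
print(moved_fraction(clients, 4, 5))  # one backend added
```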

Failover

If a backend becomes unhealthy, traffic is automatically redirected to healthy backends.

The following table describes the failover behavior for internal proxy Network Load Balancers:

Load balancer mode	Failover behavior	Behavior when all backends are unhealthy
Regional internal proxy Network Load Balancer	The load balancer implements a gentle failover algorithm per zone. Rather than waiting for all the backends in a zone to become unhealthy, the load balancer starts redirecting traffic to a different zone when the ratio of healthy to unhealthy backends in any zone is less than a certain percentage threshold (70%; this threshold can't be configured). If all backends in all zones are unhealthy, the load balancer immediately terminates the client connection. Envoy proxy sends traffic to healthy backends in a region based on the configured traffic distribution.	Terminates the connection
Cross-region internal proxy Network Load Balancer	Automatic failover to healthy backends in the same region or other regions. Traffic is distributed among healthy backends spanning multiple regions based on the configured traffic distribution.	Terminates the connection
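
The per-zone threshold for regional mode can be sketched as follows. This Python example reads the 70% threshold as the share of healthy backends in a zone, which is an interpretation of the description above rather than a documented formula. The names `zones_below_threshold` and `all_unhealthy` are hypothetical.

```python
FAILOVER_THRESHOLD = 0.70  # from the table above; not configurable


def zones_below_threshold(zones: dict[str, tuple[int, int]]) -> list[str]:
    """Return zones whose healthy share is below the threshold.

    zones maps a zone name to (healthy_backends, total_backends).
    """
    return sorted(
        zone
        for zone, (healthy, total) in zones.items()
        if total and healthy / total < FAILOVER_THRESHOLD
    )


def all_unhealthy(zones: dict[str, tuple[int, int]]) -> bool:
    """True when no zone has a healthy backend; connections are terminated."""
    return all(healthy == 0 for healthy, _ in zones.values())


# Hypothetical health state for two zones.
state = {"us-west1-a": (7, 10), "us-west1-b": (6, 10)}
print(zones_below_threshold(state))
print(all_unhealthy(state))
```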

Load balancing for GKE applications

If you are building applications in Google Kubernetes Engine (GKE), you can use standalone zonal NEGs to load balance traffic directly to containers. With standalone NEGs, you are responsible for creating the Service object that creates the zonal NEG, and then associating the NEG with the backend service so that the load balancer can connect to the Pods.

Quotas and limits

For information about quotas and limits, see Quotas and limits.

Limitations

The internal proxy Network Load Balancer doesn't support IPv6 traffic.
The internal proxy Network Load Balancer doesn't support Shared VPC deployments where the load balancer's frontend is in one host or service project and the backend service and backends are in another service project (also known as cross-project service referencing).
