Internal proxy Network Load Balancer overview

The Google Cloud internal proxy Network Load Balancer is a proxy-based load balancer powered by open source Envoy proxy software and the Andromeda network virtualization stack.

The internal proxy Network Load Balancer is a layer 4 load balancer that enables you to run and scale your TCP service traffic behind a regional internal IP address that is accessible only to clients in the same VPC network or clients connected to your VPC network. The load balancer first terminates the TCP connection between the client and the load balancer at an Envoy proxy. The proxy opens a second TCP connection to backends hosted in Google Cloud, on-premises, or other cloud environments. For more use cases, see Proxy Network Load Balancer overview.

Modes of operation

You can configure an internal proxy Network Load Balancer in the following modes:

  • Regional internal proxy Network Load Balancer. This is a regional load balancer that is implemented as a managed service based on the open source Envoy proxy. Regional mode ensures that all clients and backends are from a specified region, which helps when you need regional compliance.
  • Cross-region internal proxy Network Load Balancer. This is a multi-region load balancer that is implemented as a managed service based on the open source Envoy proxy. The cross-region mode lets you load balance traffic to backend services that are globally distributed, including traffic management that ensures traffic is directed to the closest backend. This load balancer also enables high availability. Placing backends in multiple regions helps avoid failures in a single region. If one region's backends are down, traffic can fail over to another region.

    The following table describes the important differences between regional and cross-region modes:

    Feature | Regional internal proxy Network Load Balancer | Cross-region internal proxy Network Load Balancer
    Virtual IP address (VIP) of the load balancer | Allocated from a subnet in a specific Google Cloud region. | Allocated from a subnet in a specific Google Cloud region. VIP addresses from multiple regions can share the same global backend service.
    Client access | Not globally accessible by default. You can optionally enable global access. | Always globally accessible. Clients from any Google Cloud region can send traffic to the load balancer.
    Load balanced backends | Regional backends. The load balancer can only send traffic to backends that are in the same region as the proxy of the load balancer. | Global backends. The load balancer can send traffic to backends in any region.
    High availability and failover | Automatic failover to healthy backends in the same region. | Automatic failover to healthy backends in the same or different regions.

Identify the mode

Cloud console

  1. In the Google Cloud console, go to the Load balancing page.

  2. In the Load Balancers tab, you can see the load balancer type, protocol, and region. If the region is blank, then the load balancer is in the cross-region mode. The following table summarizes how to identify the mode of the load balancer.

    Load balancer mode | Load balancer type | Access type | Region
    Regional internal proxy Network Load Balancer | Network (Proxy) | Internal | Specifies a region
    Cross-region internal proxy Network Load Balancer | Network (Proxy) | Internal | (blank)

gcloud

  1. To determine the mode of a load balancer, run the following command:

    gcloud compute forwarding-rules describe FORWARDING_RULE_NAME
    

    In the command output, check the load balancing scheme, region, and network tier. The following table summarizes how to identify the mode of the load balancer.

    Load balancer mode | Load balancing scheme | Forwarding rule
    Regional internal proxy Network Load Balancer | INTERNAL_MANAGED | Regional
    Cross-region internal proxy Network Load Balancer | INTERNAL_MANAGED | Global
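
The lookup in the preceding table can also be scripted. The following is a minimal Python sketch, not an official tool. It assumes that you saved the output of the describe command in JSON form (for example, by adding the --format=json flag) and that the JSON uses the Compute Engine API field names loadBalancingScheme, region, and target. The function name identify_mode is hypothetical.

```python
def identify_mode(forwarding_rule: dict) -> str:
    """Classify a forwarding rule description by load balancer mode."""
    scheme = forwarding_rule.get("loadBalancingScheme")
    target = forwarding_rule.get("target", "")

    # Other load balancers also use INTERNAL_MANAGED, so additionally require
    # that the rule points at a target TCP proxy.
    if scheme != "INTERNAL_MANAGED" or "targetTcpProxies/" not in target:
        return "not an internal proxy Network Load Balancer"

    # Assumption: regional forwarding rules carry a "region" URL and
    # global forwarding rules do not.
    if forwarding_rule.get("region"):
        return "regional"
    return "cross-region"


# Illustrative input only; the project, region, and resource names are made up.
example = {
    "loadBalancingScheme": "INTERNAL_MANAGED",
    "region": "https://www.googleapis.com/compute/v1/projects/p/regions/us-west1",
    "target": "https://www.googleapis.com/compute/v1/projects/p/regions/us-west1/targetTcpProxies/tp",
}
mode = identify_mode(example)
```

To use the sketch on real output, load the saved JSON with json.load and pass the resulting dictionary to identify_mode.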

Architecture

The following diagrams show the Google Cloud resources required for internal proxy Network Load Balancers in each mode.

Regional

This diagram shows the components of a regional internal proxy Network Load Balancer deployment in Premium Tier.

Figure: Regional internal proxy Network Load Balancer components.

Cross-region

This diagram shows the components of a cross-region internal proxy Network Load Balancer deployment in Premium Tier within the same VPC network. Each global forwarding rule uses a regional IP address that the clients use to connect.

Figure: Cross-region internal proxy Network Load Balancer components.

Proxy-only subnet

In the diagram above, the proxy-only subnet provides a set of IP addresses that Google uses to run Envoy proxies on your behalf. You must create a proxy-only subnet in each region of a VPC network where you use an Envoy-based internal proxy Network Load Balancer. The following table describes the differences between proxy-only subnets in the regional and cross-region modes:

Load balancer mode | Value of the proxy-only subnet --purpose flag
Regional internal proxy Network Load Balancer | REGIONAL_MANAGED_PROXY. Regional and cross-region load balancers cannot share the same subnets. All the regional Envoy-based load balancers in a region and VPC network share the same proxy-only subnet.
Cross-region internal proxy Network Load Balancer | GLOBAL_MANAGED_PROXY. Regional and cross-region load balancers cannot share the same subnets. The cross-region Envoy-based load balancer must have a proxy-only subnet in each region in which the load balancer is configured. Cross-region load balancer proxies in the same region and network share the same proxy-only subnet.

Further:

  • Proxy-only subnets are only used for Envoy proxies, not your backends.
  • Backend virtual machine (VM) instances or endpoints of all internal proxy Network Load Balancers in a region and VPC network receive connections from the proxy-only subnet.
  • The IP address of an internal proxy Network Load Balancer is not located in the proxy-only subnet. The load balancer's IP address is defined by its internal managed forwarding rule.
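
As a planning aid, the proxy-only subnet requirement can be checked before you deploy. The following Python sketch is illustrative only. It takes the two --purpose values from the preceding table and reports the regions that still need a proxy-only subnet for a given mode. The function name and the (region, purpose) pairs used to describe subnets are hypothetical simplifications, not an API format.

```python
# Purpose values as listed in the table above.
REQUIRED_PURPOSE = {
    "regional": "REGIONAL_MANAGED_PROXY",
    "cross-region": "GLOBAL_MANAGED_PROXY",
}


def regions_missing_proxy_only_subnet(mode, lb_regions, subnets):
    """Return the regions that lack a proxy-only subnet for `mode`.

    `subnets` is an iterable of (region, purpose) pairs describing the
    subnets of a single VPC network (a hypothetical, simplified input).
    """
    required = REQUIRED_PURPOSE[mode]
    # A region is covered only by a subnet with the purpose for this mode,
    # because regional and cross-region load balancers cannot share subnets.
    covered = {region for region, purpose in subnets if purpose == required}
    return sorted(set(lb_regions) - covered)


subnets = [
    ("us-west1", "GLOBAL_MANAGED_PROXY"),
    ("us-east1", "REGIONAL_MANAGED_PROXY"),
]
missing = regions_missing_proxy_only_subnet(
    "cross-region", ["us-west1", "us-east1"], subnets
)
```

In this example, us-east1 has only a REGIONAL_MANAGED_PROXY subnet, so a cross-region load balancer configured there would still need its own GLOBAL_MANAGED_PROXY subnet.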

Forwarding rules and IP addresses

Forwarding rules route traffic by IP address, port, and protocol to a load balancing configuration consisting of a target proxy and a backend service.

Clients use the IP address and port to connect to the load balancer's Envoy proxies—the forwarding rule's IP address is the IP address of the load balancer (sometimes called a virtual IP address or VIP). Clients connecting to a load balancer must use TCP. For the complete list of supported protocols, see Load balancer feature comparison.

Each forwarding rule that you use in an internal proxy Network Load Balancer can reference exactly one TCP port. The internal IP address associated with the forwarding rule can come from any subnet in the same network and region as your backends.

The following table shows the differences between forwarding rules in the regional and cross-region modes:

Load balancer mode | Forwarding rule, IP address, and proxy-only subnet --purpose | Routing from the client to the load balancer's frontend
Regional internal proxy Network Load Balancer | Regional forwardingRules. Regional IP address. Load balancing scheme: INTERNAL_MANAGED. Proxy-only subnet --purpose: REGIONAL_MANAGED_PROXY. IP address --purpose: SHARED_LOADBALANCER_VIP. | You can enable global access to allow clients from any region to access your load balancer. Backends must also be in the same region as the load balancer.
Cross-region internal proxy Network Load Balancer | Global globalForwardingRules. Regional IP addresses. Load balancing scheme: INTERNAL_MANAGED. Proxy-only subnet --purpose: GLOBAL_MANAGED_PROXY. IP address --purpose: SHARED_LOADBALANCER_VIP. | Global access is enabled by default to allow clients from any region to access your load balancer. Backends can be in multiple regions.


Target proxies

The internal proxy Network Load Balancer terminates TCP connections from the client and creates new connections to the backends. By default, the original client IP address and port information is not preserved. You can preserve this information by using the PROXY protocol. The target proxy routes incoming requests directly to the load balancer's backend service.
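
When the PROXY protocol is enabled, the backend application has to read the header that the proxy prepends to each connection before it handles the payload. The following Python sketch shows one way a backend might do that. It assumes the version 1 (human-readable) header format defined in the HAProxy PROXY protocol specification; the function name parse_proxy_v1 is hypothetical, and the sketch skips details such as the UNKNOWN address family and the maximum header length.

```python
def parse_proxy_v1(data: bytes):
    """Split a PROXY protocol v1 header from the start of a TCP stream.

    Returns (client_ip, client_port, remaining_payload).
    """
    # A v1 header is a single ASCII line terminated by CRLF.
    header, sep, rest = data.partition(b"\r\n")
    if not sep or not header.startswith(b"PROXY "):
        raise ValueError("no complete PROXY protocol v1 header")

    fields = header.decode("ascii").split(" ")
    # Expected layout: PROXY <TCP4|TCP6> <src ip> <dst ip> <src port> <dst port>
    if len(fields) != 6 or fields[1] not in ("TCP4", "TCP6"):
        raise ValueError("unsupported PROXY header: %r" % header)

    _, _, src_ip, _dst_ip, src_port, _dst_port = fields
    return src_ip, int(src_port), rest


# Illustrative bytes only; the addresses and ports are made up.
client_ip, client_port, payload = parse_proxy_v1(
    b"PROXY TCP4 10.1.2.3 10.1.2.99 54321 110\r\nhello"
)
```

Without this step, a backend that is not PROXY-aware would treat the header line as application data, so enable the header only for backends that expect it.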

The following table shows the target proxy APIs required by internal proxy Network Load Balancers in each mode:

Target proxy | Regional internal proxy Network Load Balancer | Cross-region internal proxy Network Load Balancer
TCP | Regional regionTargetTcpProxies | Global targetTcpProxies

Backend service

A backend service directs incoming traffic to one or more attached backends. A backend is either an instance group or a network endpoint group. The backend contains balancing mode information to define fullness based on connections (or, for instance group backends only, utilization).

Each internal proxy Network Load Balancer has a single backend service resource. The following table specifies the backend service type required by internal proxy Network Load Balancers in each mode:

Property | Regional internal proxy Network Load Balancer | Cross-region internal proxy Network Load Balancer
Backend service type | Regional regionBackendServices | Global backendServices

Supported backends

The backend service supports the following types of backends:

Supported backends on a backend service:

Load balancer mode | Instance groups | Zonal NEGs | Internet NEGs | Serverless NEGs | Hybrid NEGs | Private Service Connect NEGs | GKE
Regional internal proxy Network Load Balancer |  | GCE_VM_IP_PORT type endpoints | Regional NEGs only |  |  | Add a Private Service Connect NEG |
Cross-region internal proxy Network Load Balancer |  | GCE_VM_IP_PORT type endpoints |  |  |  | Add a Private Service Connect NEG |

All of the backends must be of the same type (instance groups or NEGs). You can simultaneously use different types of instance group backends, or you can simultaneously use different types of NEG backends, but you cannot use instance group and NEG backends together on the same backend service.

You can mix zonal NEGs and hybrid NEGs within the same backend service.
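
The mixing rule above can be expressed as a small check. The following Python sketch is only an illustration of the rule as stated in this section. The backend kind labels (such as "managed-instance-group" and "zonal-neg") and the function name are hypothetical and are not API values.

```python
# Hypothetical labels for the backend kinds discussed in this section.
INSTANCE_GROUP_KINDS = {"managed-instance-group", "unmanaged-instance-group"}
NEG_KINDS = {"zonal-neg", "hybrid-neg"}


def backend_mix_is_valid(kinds) -> bool:
    """Return True if every backend is an instance group or every backend is a NEG."""
    kinds = set(kinds)
    unknown = kinds - INSTANCE_GROUP_KINDS - NEG_KINDS
    if unknown:
        raise ValueError("unknown backend kinds: %s" % sorted(unknown))
    uses_instance_groups = bool(kinds & INSTANCE_GROUP_KINDS)
    uses_negs = bool(kinds & NEG_KINDS)
    # Instance groups and NEGs cannot share one backend service.
    return not (uses_instance_groups and uses_negs)


mixed_negs_ok = backend_mix_is_valid(["zonal-neg", "hybrid-neg"])
groups_and_negs_ok = backend_mix_is_valid(["managed-instance-group", "zonal-neg"])
```

Here mixed_negs_ok is True because zonal and hybrid NEGs can share a backend service, and groups_and_negs_ok is False because instance groups and NEGs cannot.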

To ensure minimal interruptions to your users, you can enable connection draining on backend services. Such interruptions might happen when a backend is terminated, removed manually, or removed by an autoscaler. To learn more about using connection draining to minimize service interruptions, see Enabling connection draining.

Protocol for communicating with the backends

When you configure a backend service for an internal proxy Network Load Balancer, you set the protocol that the backend service uses to communicate with the backends. The load balancer uses only the protocol that you specify, and does not attempt to negotiate a connection with any other protocol. Both the regional and the cross-region internal proxy Network Load Balancers support only TCP for communicating with the backends.

Health check

Each backend service specifies a health check that periodically monitors the backends' readiness to receive a connection from the load balancer. This reduces the risk that requests might be sent to backends that can't service the request. Health checks don't check if the application itself is working.

Health check protocol

Although it is not required and not always possible, it is a best practice to use a health check whose protocol matches the protocol of the backend service. For example, a TCP health check most accurately tests TCP connectivity to backends. For the list of supported health check protocols, see Load balancing features.

The following table specifies the scope of health checks supported by internal proxy Network Load Balancers in each mode:

Load balancer mode | Health check type
Regional internal proxy Network Load Balancer | Regional regionHealthChecks
Cross-region internal proxy Network Load Balancer | Global healthChecks
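
Because each mode uses its own scope for every component, mixing regional and global resources in one configuration is an easy mistake to make. The following Python sketch collects the API resource names from the tables in this document and checks that a planned set of resources belongs to a single mode. The function name infer_mode is hypothetical, and the sketch is a consistency check on names only, not a call to any API.

```python
# API resource names per mode, as listed in the tables of this document.
API_COLLECTIONS = {
    "regional": {
        "forwardingRules",
        "regionTargetTcpProxies",
        "regionBackendServices",
        "regionHealthChecks",
    },
    "cross-region": {
        "globalForwardingRules",
        "targetTcpProxies",
        "backendServices",
        "healthChecks",
    },
}


def infer_mode(collections) -> str:
    """Return the mode whose resource names cover every name in `collections`."""
    used = set(collections)
    matches = [mode for mode, names in API_COLLECTIONS.items() if used <= names]
    if len(matches) != 1:
        raise ValueError("resources do not belong to exactly one mode: %s" % sorted(used))
    return matches[0]


planned_mode = infer_mode(["regionBackendServices", "regionHealthChecks"])
```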

Firewall rules

Internal proxy Network Load Balancers require the following firewall rules:

  • An ingress allow rule that permits traffic from the Google health check probes.
    • 35.191.0.0/16
    • 130.211.0.0/22
  • An ingress allow rule that permits traffic from the proxy-only subnet.

The ports for these firewall rules must be configured as follows:

  • Allow traffic to the destination port for each backend service's health check.

  • For instance group backends: Determine the ports to be configured by the mapping between the backend service's named port and the port numbers associated with that named port on each instance group. The port numbers can vary among instance groups assigned to the same backend service.

  • For GCE_VM_IP_PORT NEG backends: Allow traffic to the port numbers of the endpoints.

There are certain exceptions to the firewall rule requirements for these load balancers:

  • Allowlisting Google's health check probe ranges isn't required for hybrid NEGs. However, if you're using a combination of hybrid and zonal NEGs in a single backend service, you need to allowlist the Google health check probe ranges for the zonal NEGs.
  • For regional internet NEGs, health checks are optional. Traffic from load balancers using regional internet NEGs originates from the proxy-only subnet and is then NAT-translated (by using Cloud NAT) to either manual or auto-allocated NAT IP addresses. This traffic includes both health check probes and user requests from the load balancer to the backends. For details, see Regional NEGs: Use Cloud NAT to egress.
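
To reason about which traffic the ingress allow rules need to admit, you can test a source address against the two health check probe ranges and your proxy-only subnet. The following Python sketch uses the standard ipaddress module. The probe ranges come from the list above, while the proxy-only subnet range 10.129.0.0/23 and the function name classify_source are assumptions made for the example.

```python
import ipaddress

# Google health check probe ranges, as listed above.
HEALTH_CHECK_RANGES = [
    ipaddress.ip_network("35.191.0.0/16"),
    ipaddress.ip_network("130.211.0.0/22"),
]


def classify_source(source_ip: str, proxy_only_subnet: str):
    """Return which required ingress rule covers `source_ip`, or None."""
    ip = ipaddress.ip_address(source_ip)
    if any(ip in network for network in HEALTH_CHECK_RANGES):
        return "health-check"
    if ip in ipaddress.ip_network(proxy_only_subnet):
        return "proxy-only-subnet"
    return None


# 10.129.0.0/23 is a made-up proxy-only subnet range for illustration.
probe = classify_source("35.191.10.5", "10.129.0.0/23")
proxy = classify_source("10.129.0.7", "10.129.0.0/23")
other = classify_source("130.211.4.1", "10.129.0.0/23")
```

The last address falls just outside 130.211.0.0/22, which ends at 130.211.3.255, so neither rule would admit it.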

Client access

Clients can be in the same network or in a VPC network connected by using VPC Network Peering.

For regional internal proxy Network Load Balancers, clients must be in the same region as the load balancer by default. You can enable global access to allow clients from any region to access your load balancer.

For cross-region internal proxy Network Load Balancers, global access is enabled by default. Clients from any region can access your load balancer.

The following table summarizes client access for regional internal proxy Network Load Balancers:

Global access disabled | Global access enabled
Clients must be in the same region as the load balancer. They also must be in the same VPC network as the load balancer or in a VPC network that is connected to the load balancer's VPC network by using VPC Network Peering. | Clients can be in any region. They still must be in the same VPC network as the load balancer or in a VPC network that's connected to the load balancer's VPC network by using VPC Network Peering.
On-premises clients can access the load balancer through Cloud VPN tunnels or VLAN attachments. These tunnels or attachments must be in the same region as the load balancer. | On-premises clients can access the load balancer through Cloud VPN tunnels or VLAN attachments. These tunnels or attachments can be in any region.
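
The region rules in this section reduce to a short decision. The following Python sketch encodes them for illustration. It assumes that the network requirement is already met (the client is in the same VPC network, a peered network, or on-premises behind a Cloud VPN tunnel or VLAN attachment); for on-premises clients, pass the region of the tunnel or attachment as the client region. The function name and the mode labels are hypothetical.

```python
def client_region_allowed(mode: str, client_region: str, lb_region: str,
                          global_access: bool = False) -> bool:
    """Return True if a client in `client_region` can reach the load balancer."""
    if mode == "cross-region":
        # Cross-region load balancers are always globally accessible.
        return True
    if mode != "regional":
        raise ValueError("unknown mode: %r" % mode)
    # Regional mode: same region unless global access is enabled.
    return global_access or client_region == lb_region


same_region = client_region_allowed("regional", "us-west1", "us-west1")
other_region = client_region_allowed("regional", "europe-west1", "us-west1")
with_global_access = client_region_allowed(
    "regional", "europe-west1", "us-west1", global_access=True
)
```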

Shared VPC architecture

The internal proxy Network Load Balancer supports networks that use Shared VPC. Shared VPC lets you maintain a clear separation of responsibilities between network administrators and service developers. Your development teams can focus on building services in service projects, and the network infrastructure teams can provision and administer load balancing. If you're not already familiar with Shared VPC, read the Shared VPC overview documentation.

The following list describes where each component must be defined:

  • IP address. An internal IP address must be defined in the same project as the backends. For the load balancer to be available in a Shared VPC network, the internal IP address must be defined in the same service project where the backend VMs are located, and it must reference a subnet in the desired Shared VPC network in the host project. The address itself comes from the primary IP range of the referenced subnet.
  • Forwarding rule. An internal forwarding rule must be defined in the same project as the backends. For the load balancer to be available in a Shared VPC network, the internal forwarding rule must be defined in the same service project where the backend VMs are located, and it must reference the same subnet (in the Shared VPC network) that the associated internal IP address references.
  • Target proxy. The target proxy must be defined in the same project as the backends.
  • Backend components. In a Shared VPC scenario, the backend VMs are typically located in a service project. A regional internal backend service and health check must be defined in that service project.

Traffic distribution

An internal proxy Network Load Balancer distributes traffic to its backends as follows:

  1. Connections originating from a single client are sent to the same zone as long as healthy backends (instance groups or NEGs) within that zone are available and have capacity, as determined by the balancing mode. For regional internal proxy Network Load Balancers, the balancing mode can be CONNECTION (instance group or NEG backends) or UTILIZATION (instance group backends only).
  2. Connections from a client are sent to the same backend if you have configured session affinity.
  3. After a backend is selected, traffic is then distributed among instances (in an instance group) or endpoints (in a NEG) according to a load balancing policy. For the load balancing policy algorithms supported, see the localityLbPolicy setting in the regional backend service API documentation.

Session affinity

Session affinity lets you configure the load balancer's backend service to send all requests from the same client to the same backend, as long as the backend is healthy and has capacity.

The internal proxy Network Load Balancer offers client IP affinity, which forwards all requests from the same client IP address to the same backend, as long as that backend has capacity and remains healthy.

Failover

If a backend becomes unhealthy, traffic is automatically redirected to healthy backends. The following table describes the failover behavior in each mode:

Load balancer mode | Failover behavior | Behavior when all backends are unhealthy
Regional internal proxy Network Load Balancer | The load balancer implements a gentle failover algorithm per zone. Rather than waiting for all the backends in a zone to become unhealthy, the load balancer starts redirecting traffic to a different zone when the ratio of healthy to unhealthy backends in any zone is less than a certain percentage threshold (70%; this threshold is non-configurable). If all backends in all zones are unhealthy, the load balancer immediately terminates the client connection. The Envoy proxy sends traffic to healthy backends in a region based on the configured traffic distribution. | Terminates the connection
Cross-region internal proxy Network Load Balancer | Automatic failover to healthy backends in the same region or other regions. Traffic is distributed among healthy backends spanning multiple regions based on the configured traffic distribution. | Terminates the connection
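
The per-zone threshold for the regional mode can be made concrete with a little arithmetic. The following Python sketch is one reading of the rule, not the load balancer's actual algorithm. It assumes that the 70% threshold applies to the share of healthy backends among all backends in a zone; the function names are hypothetical.

```python
FAILOVER_THRESHOLD_PERCENT = 70  # stated in the table above; not configurable


def zone_starts_failover(healthy: int, total: int) -> bool:
    """Return True if a zone's healthy share is below the threshold.

    Assumption: the threshold is compared with healthy / total for the zone.
    """
    if total <= 0 or not 0 <= healthy <= total:
        raise ValueError("expected 0 <= healthy <= total and total > 0")
    # Integer comparison avoids floating-point rounding at the boundary.
    return healthy * 100 < total * FAILOVER_THRESHOLD_PERCENT


def connection_terminated(zones: dict) -> bool:
    """Return True if every backend in every zone is unhealthy.

    `zones` maps a zone name to a (healthy, total) pair.
    """
    return all(healthy == 0 for healthy, _total in zones.values())


# Illustrative numbers only.
zone_a_fails_over = zone_starts_failover(healthy=6, total=10)
zone_b_fails_over = zone_starts_failover(healthy=7, total=10)
```

Under this reading, a zone with 6 of 10 healthy backends starts shifting traffic to another zone, while a zone with 7 of 10 does not.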

Load balancing for GKE applications

If you are building applications in Google Kubernetes Engine, you can use standalone zonal NEGs to load balance traffic directly to containers. With standalone NEGs you are responsible for creating the Service object that creates the zonal NEG, and then associating the NEG with the backend service so that the load balancer can connect to the Pods.

Quotas and limits

For information about quotas and limits, see Load balancing resource quotas.

Limitations

  • The internal proxy Network Load Balancer does not support IPv6 traffic.
  • The internal proxy Network Load Balancer does not support Shared VPC deployments where the load balancer's frontend is in one host or service project and the backend service and backends are in another service project (also known as cross-project service referencing).

What's next