LoadBalancer Service concepts


This page provides a general overview of how Google Kubernetes Engine (GKE) creates and manages Google Cloud load balancers when you apply a Kubernetes LoadBalancer Service manifest. It describes the different types of load balancers and how settings like the externalTrafficPolicy and GKE subsetting for L4 internal load balancers determine how the load balancers are configured.

Before reading this page, you should be familiar with GKE networking concepts. After you're familiar with the concepts on this page, see GKE LoadBalancer Service parameters for additional configuration options.

Overview

When you create a LoadBalancer Service, GKE configures a Google Cloud pass-through load balancer whose characteristics depend on parameters of your Service manifest. You choose whether your load balancer has an internal or an external IP address, control how packets are routed to serving Pods, and specify other load balancer parameters using the Service manifest. This page provides the necessary background information.

Load balancer types

When you create a LoadBalancer Service in GKE, you specify whether the load balancer has an internal or external address by using an annotation in the Service manifest:

  • To create an internal LoadBalancer Service, place one of the following annotations in the metadata.annotations[] of the Service manifest:

    • networking.gke.io/load-balancer-type: "Internal" (GKE 1.17 and later)
    • cloud.google.com/load-balancer-type: "Internal" (versions earlier than 1.17)

    Internal LoadBalancer Services are powered by internal TCP/UDP load balancers which GKE creates in the cluster's Virtual Private Cloud (VPC) network. Clients located in the same VPC network or a network connected to the cluster's VPC network can access the Service using the load balancer's IP address.

  • To create an external LoadBalancer Service, omit both of the annotations listed in the previous point. External LoadBalancer Services are powered by two types of external network load balancers, both of which are accessible on the internet. You control the type of external network load balancer by including or omitting an annotation:

    • Include the cloud.google.com/l4-rbs: "enabled" annotation to create a backend service-based network load balancer.
    • Omit the cloud.google.com/l4-rbs: "enabled" annotation to create a target pool-based network load balancer.

    Example manifests for both an internal and an external LoadBalancer Service are sketched after this list.
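For example, the following minimal manifests sketch one internal and one external LoadBalancer Service. The Service names, the app: example selector, and the port numbers are placeholders; only the annotations shown are taken from this page.

apiVersion: v1
kind: Service
metadata:
  name: internal-example  # placeholder name
  annotations:
    # Requests an internal TCP/UDP load balancer (GKE 1.17 and later).
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  selector:
    app: example  # placeholder Pod label
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: external-example  # placeholder name
  annotations:
    # Requests a backend service-based external network load balancer.
    # Omit this annotation to get a target pool-based network load balancer.
    cloud.google.com/l4-rbs: "enabled"
spec:
  type: LoadBalancer
  selector:
    app: example  # placeholder Pod label
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080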

GKE subsetting

GKE subsetting for L4 internal load balancers, or GKE subsetting, is a cluster-wide configuration option that improves the scalability of internal TCP/UDP load balancers by more efficiently grouping node endpoints for the load balancer backends. If your cluster contains more than 250 nodes and you need to create internal LoadBalancer Services, you must enable GKE subsetting.

The following diagram shows two Services in a zonal cluster with three nodes and GKE subsetting enabled. Each Service has two Pods. GKE creates one GCE_VM_IP network endpoint group (NEG) for each Service. Endpoints in each NEG are the nodes with the serving Pods for the respective Service.

You can enable GKE subsetting when you create a cluster or by editing an existing cluster. Once enabled, you cannot disable GKE subsetting. For more information, see GKE subsetting.

GKE subsetting requires:

  • GKE version 1.18.19-gke.1400 or later, and
  • The HttpLoadBalancing add-on enabled for the cluster. This add-on is enabled by default. It allows the cluster to manage load balancers which use backend services.

Node grouping

The annotations in the Service manifest and the status of GKE subsetting determine the resulting Google Cloud load balancer and the type of backends. Backends for Google Cloud pass-through load balancers identify the network interface (NIC) of the GKE node, not a particular node or Pod IP address. The type of load balancer and backends determines how nodes are grouped into GCE_VM_IP NEGs, instance groups, or target pools.

The following list shows, for each type of GKE LoadBalancer Service, the resulting Google Cloud load balancer and the node grouping method:

  • Internal LoadBalancer Service created in a cluster with GKE subsetting enabled¹

    Resulting Google Cloud load balancer: An internal TCP/UDP load balancer whose backend service uses GCE_VM_IP network endpoint group (NEG) backends.

    Node grouping method: Node VMs are grouped zonally into GCE_VM_IP NEGs on a per-Service basis according to the externalTrafficPolicy of the Service and the number of nodes in the cluster. The externalTrafficPolicy of the Service also controls which nodes pass the load balancer health check and how packets are processed.

  • Internal LoadBalancer Service created in a cluster with GKE subsetting disabled

    Resulting Google Cloud load balancer: An internal TCP/UDP load balancer whose backend service uses zonal unmanaged instance group backends.

    Node grouping method: All node VMs are placed into zonal unmanaged instance groups, which GKE uses as backends for the internal TCP/UDP load balancer's backend service. The externalTrafficPolicy of the Service controls which nodes pass the load balancer health check and how packets are processed. The same unmanaged instance groups are used for other load balancer backend services created in the cluster because of the single load-balanced instance group limitation.

  • External LoadBalancer Service with the cloud.google.com/l4-rbs: "enabled" annotation²

    Resulting Google Cloud load balancer: A backend service-based network load balancer whose backend service uses zonal unmanaged instance group backends.

    Node grouping method: All node VMs are placed into zonal unmanaged instance groups, which GKE uses as backends for the network load balancer's backend service. The externalTrafficPolicy of the Service controls which nodes pass the load balancer health check and how packets are processed. The same unmanaged instance groups are used for other load balancer backend services created in the cluster because of the single load-balanced instance group limitation.

  • External LoadBalancer Service without the cloud.google.com/l4-rbs: "enabled" annotation³

    Resulting Google Cloud load balancer: A target pool-based network load balancer whose target pool contains all nodes of the cluster.

    Node grouping method: The target pool is a legacy API that does not rely on instance groups. All nodes have direct membership in the target pool. The externalTrafficPolicy of the Service controls which nodes pass the load balancer health check and how packets are processed.

¹ Only the internal TCP/UDP load balancers created after enabling GKE subsetting use GCE_VM_IP NEGs. Any internal LoadBalancer Services created before enabling GKE subsetting continue to use unmanaged instance group backends. For examples and configuration guidance, see Creating internal LoadBalancer Services.

² GKE does not automatically migrate existing external LoadBalancer Services from target pool-based network load balancers to backend service-based network load balancers. To create an external LoadBalancer Service powered by a backend service-based network load balancer, you must include the cloud.google.com/l4-rbs: "enabled" annotation in the Service manifest at the time of creation.

³ Removing the cloud.google.com/l4-rbs: "enabled" annotation from an existing external LoadBalancer Service powered by a backend service-based network load balancer does not cause GKE to create a target pool-based network load balancer. To create an external LoadBalancer Service powered by a target pool-based network load balancer, you must omit the cloud.google.com/l4-rbs: "enabled" annotation from the Service manifest at the time of creation.

Node membership in GCE_VM_IP NEG backends

When GKE subsetting is enabled for a cluster, GKE creates a unique GCE_VM_IP NEG in each zone for each internal LoadBalancer Service. Unlike instance groups, nodes can be members of more than one load-balanced GCE_VM_IP NEG. The externalTrafficPolicy of the Service and the number of nodes in the cluster determine which nodes are added as endpoints to the Service's GCE_VM_IP NEG(s).

The cluster's control plane adds nodes as endpoints to the GCE_VM_IP NEGs according to the value of the Service's externalTrafficPolicy and the number of nodes in the cluster, as summarized in the following list:

  • externalTrafficPolicy: Cluster, 1 to 25 nodes in the cluster: GKE uses all nodes in the cluster as endpoints for the Service's NEG(s), even if a node does not contain a serving Pod for the Service.

  • externalTrafficPolicy: Cluster, more than 25 nodes in the cluster: GKE uses a random subset of 25 nodes as endpoints for the Service's NEG(s), even if a node does not contain a serving Pod for the Service.

  • externalTrafficPolicy: Local, any number of nodes¹: GKE only uses nodes that have at least one of the Service's serving Pods as endpoints for the Service's NEG(s).

¹ Limited to 250 nodes with serving Pods for internal LoadBalancer Services. More than 250 nodes can be present in the cluster, but internal TCP/UDP load balancers only distribute to 250 backend VMs when Internal TCP/UDP Load Balancing backend subsetting is disabled. Even with GKE subsetting enabled, GKE never configures internal TCP/UDP load balancers with Internal TCP/UDP Load Balancing backend subsetting. For details about this limit, see Maximum number of VM instances per internal backend service.
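For illustration, the following sketch shows an internal LoadBalancer Service manifest with externalTrafficPolicy: Local; the name, selector, and ports are placeholders. With GKE subsetting enabled, only nodes that run at least one serving Pod for this Service become endpoints in its GCE_VM_IP NEG(s).

apiVersion: v1
kind: Service
metadata:
  name: internal-local-example  # placeholder name
  annotations:
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  # Only nodes with at least one ready, serving Pod for this Service are
  # added as endpoints to the Service's GCE_VM_IP NEG(s).
  externalTrafficPolicy: Local
  selector:
    app: example  # placeholder Pod label
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080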

Single load-balanced instance group limitation

The Compute Engine API prohibits VMs from being members of more than one load-balanced instance group. GKE nodes are subject to this constraint.

When using unmanaged instance group backends, GKE creates or updates unmanaged instance groups containing all nodes from all node pools in each zone the cluster uses. These unmanaged instance groups are used for:

  • Internal TCP/UDP load balancers created for internal LoadBalancer Services when GKE subsetting is disabled.
  • Backend service-based network load balancers created for external LoadBalancer Services with the cloud.google.com/l4-rbs: "enabled" annotation.
  • External HTTP(S) load balancers created for external GKE Ingress resources, using the GKE Ingress controller, but not using container-native load balancing.

Because node VMs can't be members of more than one load-balanced instance group, GKE can't create and manage internal TCP/UDP load balancers, backend service-based network load balancers, and external HTTP(S) load balancers created for GKE Ingress resources if either of the following is true:

  • Outside of GKE, you created at least one backend service-based load balancer, and you used the cluster's managed instance groups as backends for the load balancer's backend service.
  • Outside of GKE, you created a custom unmanaged instance group that contains some or all of the cluster's nodes, and then attached that custom unmanaged instance group to a backend service for a load balancer.

To work around this limitation, you can instruct GKE to use NEG backends where possible:

  • Enable GKE subsetting. As a result, new internal LoadBalancer Services use GCE_VM_IP NEGs instead.
  • Configure external GKE Ingress resources to use container-native load balancing, as sketched after this list. For more information, see GKE container-native load balancing.
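The following is a minimal sketch of the second workaround: a Service annotated so that GKE Ingress uses container-native load balancing with NEG backends instead of instance groups. The Service name, selector, and ports are placeholders.

apiVersion: v1
kind: Service
metadata:
  name: neg-backed-example  # placeholder name
  annotations:
    # Tells GKE to create NEG backends for load balancers created by Ingress,
    # so the cluster's load-balanced unmanaged instance groups are not needed.
    cloud.google.com/neg: '{"ingress": true}'
spec:
  type: ClusterIP
  selector:
    app: example  # placeholder Pod label
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080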

Load balancer health checks

All GKE LoadBalancer Services require a load balancer health check. The load balancer's health check is implemented outside of the cluster and is different from a readiness or liveness probe.

The externalTrafficPolicy of the Service defines how the load balancer's health check operates. In all cases, the load balancer's health check probers send packets to the kube-proxy software running on each node. The load balancer's health check is a proxy for information that kube-proxy gathers, such as whether a Pod exists, is running, and has passed its readiness probe. Health checks for LoadBalancer Services cannot be routed to serving Pods; instead, the load balancer uses its health check to determine which nodes can receive new TCP connections.

The following list describes the health check behavior for each externalTrafficPolicy:

  • externalTrafficPolicy: Cluster

    Which nodes pass the health check: All nodes of the cluster pass the health check, regardless of whether the node is running a serving Pod.

    What port is used: The load balancer health check port must be TCP port 10256. It cannot be customized.

  • externalTrafficPolicy: Local

    Which nodes pass the health check: Only the nodes with at least one ready, serving Pod pass the health check. Nodes without a serving Pod, and nodes whose serving Pods have not yet passed their readiness probes, fail the health check.

    If the serving Pod has failed its readiness probe or is about to terminate, a node might still pass the load balancer's health check even though it does not contain a ready and serving Pod. This situation happens when the load balancer's health check has not yet reached its failure threshold. How the packet is processed in this situation depends on the GKE version. For additional details, see the next section, Packet processing.

    What port is used: The health check port is TCP port 10256 unless you specify a custom health check port.
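As a sketch of the custom port case, assuming the custom health check port is set with the Kubernetes spec.healthCheckNodePort field, a Service manifest might look like the following; the name, selector, and port values are placeholders.

apiVersion: v1
kind: Service
metadata:
  name: custom-hc-port-example  # placeholder name
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  # Custom health check port; without this field, the load balancer's
  # health check uses TCP port 10256 (kube-proxy).
  healthCheckNodePort: 32000  # placeholder port in the node port range
  selector:
    app: example  # placeholder Pod label
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080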

Packet processing

The following sections detail how the load balancer and cluster nodes work together to route packets received for LoadBalancer Services.

Pass-through load balancing

The Google Cloud pass-through load balancer routes packets to the nic0 interface of the GKE cluster's nodes. Each load-balanced packet received by a node has the following characteristics:

  • The packet's destination IP address matches the load balancer's forwarding rule IP address.
  • The protocol and destination port of the packet match both of these:
    • a protocol and port specified in spec.ports[] of the Service manifest
    • a protocol and port configured on the load balancer's forwarding rule

Destination Network Address Translation on nodes

After the node receives the packet, the node performs additional packet processing. In GKE clusters without GKE Dataplane V2 enabled, nodes use iptables to process load-balanced packets. In GKE clusters with GKE Dataplane V2 enabled, nodes use eBPF instead. The node-level packet processing always includes the following actions:

  • The node performs Destination Network Address Translation (DNAT) on the packet, setting its destination IP address to a serving Pod IP address.
  • The node changes the packet's destination port to the targetPort of the corresponding Service's spec.ports[].
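To make the DNAT step concrete, the following sketch shows how a spec.ports[] entry maps to the packet rewrite; the name, selector, and port numbers are placeholders. A packet arriving for the forwarding rule's IP address on TCP port 80 has its destination rewritten to a serving Pod IP address on TCP port 8080.

apiVersion: v1
kind: Service
metadata:
  name: dnat-example  # placeholder name
spec:
  type: LoadBalancer
  selector:
    app: example  # placeholder Pod label
  ports:
  - protocol: TCP
    port: 80          # matches the forwarding rule; incoming destination port
    targetPort: 8080  # DNAT rewrites the destination to a Pod IP and this port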

Source Network Address Translation on nodes

The externalTrafficPolicy determines whether the node-level packet processing also performs source network address translation (SNAT), and it determines the path the packet takes from the node to the Pod:

  • externalTrafficPolicy: Cluster

    Node SNAT behavior: The node changes the source IP address of load-balanced packets to match the IP address of the node that received them from the load balancer.

    Routing behavior: The node routes packets to any serving Pod. The serving Pod might or might not be on the same node. If the node that receives the packets from the load balancer lacks a ready and serving Pod, the node routes the packets to a different node that does contain a ready and serving Pod. Response packets from the Pod are routed from its node back to the node that received the request packets from the load balancer. That first node then sends the response packets to the original client using Direct Server Return.

  • externalTrafficPolicy: Local

    Node SNAT behavior: The node does not change the source IP address of load-balanced packets.

    Routing behavior: In most situations, the node routes the packet to a serving Pod running on the node that received the packet from the load balancer. That node sends response packets to the original client using Direct Server Return. This is the primary intent of this type of traffic policy.

    In some situations, a node might receive packets from the load balancer even though the node lacks a ready and serving Pod for the Service. This situation can happen when the load balancer's health check has not yet reached its failure threshold, but a previously ready and serving Pod is no longer ready. How the packets are processed in this situation depends on the GKE version:

    • In GKE 1.14 and earlier, packets are dropped.
    • In GKE 1.15 and later, GKE routes the packets to a different node that has a ready and serving Pod.
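As a sketch of the practical difference (the name, selector, and ports are placeholders): with Cluster, serving Pods see a node IP address as the packet source; with Local, they see the original client IP address.

apiVersion: v1
kind: Service
metadata:
  name: client-ip-example  # placeholder name
spec:
  type: LoadBalancer
  # Cluster (the default): the node SNATs packets, so Pods see a node IP as
  # the source, and any node can forward traffic to any serving Pod.
  # Local: no SNAT, so Pods see the original client IP, and traffic stays on
  # nodes that have a ready, serving Pod.
  externalTrafficPolicy: Local
  selector:
    app: example  # placeholder Pod label
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080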

Pricing and quotas

Network pricing applies to packets processed by a load balancer. For more information, see Cloud Load Balancing and forwarding rules pricing. You can also estimate billing charges using the Google Cloud pricing calculator.

The number of forwarding rules that you can create is controlled by load balancer quotas.

What's next