Container-native load balancing


This page explains what container-native load balancing is in Google Kubernetes Engine (GKE). Container-native load balancing enables several kinds of load balancers to target Pods directly and to evenly distribute traffic to Pods.

Container-native load balancing architecture

Container-native load balancing uses GCE_VM_IP_PORT network endpoint groups (NEGs). The endpoints of the NEG are Pod IP addresses.

Container-native load balancing is always used for internal GKE Ingress and is optional for external Ingress. The Ingress controller creates the load balancer, including the virtual IP address, forwarding rules, health checks, and firewall rules.

To learn how to use container-native load balancing with Ingress, see Container-native load balancing through Ingress.

For more flexibility, you can also create standalone NEGs. In this case, you are responsible for creating and managing all aspects of the load balancer.
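As an illustration of the two modes described above, the following sketch builds Service manifests that carry the cloud.google.com/neg annotation, once for a NEG managed through Ingress and once for a standalone NEG. The helper name service_manifest, the Service names, the selector, and the port numbers are hypothetical and chosen only for this example.

```python
import json

NEG_ANNOTATION = "cloud.google.com/neg"


def service_manifest(name, selector, port, target_port, neg_config):
    """Build a Service manifest (as a dict) that carries the NEG annotation.

    Hypothetical helper for illustration; not part of any GKE client library.
    """
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            # The annotation value is a JSON string, not a nested object.
            "annotations": {NEG_ANNOTATION: json.dumps(neg_config)},
        },
        "spec": {
            "type": "ClusterIP",
            "selector": selector,
            "ports": [
                {"port": port, "targetPort": target_port, "protocol": "TCP"}
            ],
        },
    }


# NEG created and managed on behalf of an Ingress.
ingress_svc = service_manifest(
    "neg-demo-svc", {"run": "neg-demo-app"}, 80, 9376, {"ingress": True}
)

# Standalone NEG for Service port 80; you manage the load balancer yourself.
standalone_svc = service_manifest(
    "neg-demo-standalone",
    {"run": "neg-demo-app"},
    80,
    9376,
    {"exposed_ports": {"80": {}}},
)

if __name__ == "__main__":
    print(json.dumps(ingress_svc, indent=2))
```

The resulting dictionaries can be serialized to JSON or YAML and applied with kubectl.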

Benefits of container-native load balancing

Container-native load balancing offers the following benefits:

Pods are core objects for load balancing: kube-proxy configures nodes' iptables rules to distribute traffic to Pods. Without container-native load balancing, load balancer traffic travels to the node instance groups and is routed by iptables rules to Pods that might or might not be on the same node. With container-native load balancing, load balancer traffic is distributed directly to the Pods that should receive it, eliminating the extra network hop. Because the load balancer targets Pods directly, health checking also improves.

Figure: Comparison of default behavior (left) with container-native load balancer behavior (right).
Improved network performance: Because the container-native load balancer talks directly with the Pods, and connections have fewer network hops, both latency and throughput are improved.
Increased visibility: With container-native load balancing, you have visibility into the latency from the Application Load Balancer to each Pod. With node IP-based load balancing, these latencies were aggregated. Per-Pod visibility makes troubleshooting your Services at the NEG level easier.

Support for advanced load balancing features: Container-native load balancing in GKE supports several features of external Application Load Balancers, such as integration with Google Cloud services like Google Cloud Armor, Cloud CDN, and Identity-Aware Proxy. It also features load balancing algorithms for accurate traffic distribution.

Support for Cloud Service Mesh: The NEG data model is required to use Cloud Service Mesh, Google Cloud's fully managed traffic control plane for service mesh.
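To make the "extra network hop" from the first benefit concrete, here is a toy model of the non-container-native path. It assumes, purely for illustration, that the load balancer picks a node uniformly at random and that the node's iptables rules then pick a Pod uniformly at random across the whole cluster. The function name second_hop_fraction is hypothetical, and the model ignores health checks, traffic policies, and connection reuse.

```python
from fractions import Fraction


def second_hop_fraction(pods_per_node):
    """Fraction of requests forwarded to a Pod on a different node.

    Toy model (assumption): the load balancer chooses a node uniformly at
    random, then iptables chooses a Pod uniformly at random cluster-wide.
    pods_per_node[i] is the number of serving Pods on node i.
    """
    nodes = len(pods_per_node)
    total_pods = sum(pods_per_node)
    if nodes == 0 or total_pods == 0:
        raise ValueError("need at least one node and one Pod")
    # Probability that the chosen Pod happens to live on the chosen node.
    same_node = sum(
        Fraction(1, nodes) * Fraction(pods, total_pods)
        for pods in pods_per_node
    )
    return 1 - same_node


if __name__ == "__main__":
    # Three nodes hosting 2, 1, and 0 Pods respectively.
    print(second_hop_fraction([2, 1, 0]))
```

Under these assumptions the result is always 1 - 1/N for N nodes, regardless of how the Pods are spread, so the share of traffic that takes a second hop grows with cluster size. With container-native load balancing the load balancer targets the Pod directly, so this forwarding step does not occur.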

Pod readiness

For relevant Pods, the corresponding Ingress controller manages a readiness gate of type cloud.google.com/load-balancer-neg-ready. The Ingress controller polls the load balancer's health check status, which includes the health of all endpoints in the NEG. When the load balancer's health check status indicates that the endpoint corresponding to a particular Pod is healthy, the Ingress controller sets the Pod's readiness gate value to True. The kubelet running on each Node then computes the Pod's effective readiness, considering both the value of this readiness gate and, if defined, the Pod's readiness probe.
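The readiness computation described above can be sketched as a small function over a Pod object in the shape returned by the Kubernetes API (for example, from kubectl get pod -o json). This is a simplified approximation of what the kubelet does, not its actual implementation. The function name effective_ready is hypothetical, and the sketch assumes that the ContainersReady condition already reflects the outcome of any readiness probes.

```python
NEG_READY_GATE = "cloud.google.com/load-balancer-neg-ready"


def effective_ready(pod):
    """Approximate a Pod's effective readiness from its spec and status.

    A Pod counts as ready only if its containers are ready and every
    condition listed in spec.readinessGates has status "True". A gate
    whose condition is missing from status is treated as not ready.
    """
    conditions = {
        c["type"]: c["status"]
        for c in pod.get("status", {}).get("conditions", [])
    }
    containers_ready = conditions.get("ContainersReady") == "True"
    gates = [
        g["conditionType"]
        for g in pod.get("spec", {}).get("readinessGates", [])
    ]
    gates_ready = all(conditions.get(gate) == "True" for gate in gates)
    return containers_ready and gates_ready


# Example: containers are ready, but the load balancer has not yet
# reported the endpoint as healthy, so the gate condition is absent.
waiting_pod = {
    "spec": {"readinessGates": [{"conditionType": NEG_READY_GATE}]},
    "status": {"conditions": [{"type": "ContainersReady", "status": "True"}]},
}

# Same Pod after the Ingress controller sets the gate to True.
ready_pod = {
    "spec": {"readinessGates": [{"conditionType": NEG_READY_GATE}]},
    "status": {
        "conditions": [
            {"type": "ContainersReady", "status": "True"},
            {"type": NEG_READY_GATE, "status": "True"},
        ]
    },
}
```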

Pod readiness gates are automatically enabled when using container-native load balancing through Ingress.

Readiness gates control the rate of a rolling update. When you initiate a rolling update, as GKE creates new Pods, an endpoint for each new Pod is added to a NEG. When the endpoint is healthy from the perspective of the load balancer, the Ingress controller sets the readiness gate to True. A newly created Pod must at least pass its readiness gate before GKE removes an old Pod. This ensures that the corresponding endpoint for the Pod has already passed the load balancer's health check and that the backend capacity is maintained.

If a Pod's readiness gate never indicates that the Pod is ready, due to a bad container image or a misconfigured load balancer health check, the load balancer won't direct traffic to the new Pod. If such a failure occurs while rolling out an updated Deployment, the rollout stalls after attempting to create one new Pod because that Pod's readiness gate is never True. See the troubleshooting section for information on how to detect and fix this situation.
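The rollout behavior described in the two preceding paragraphs can be illustrated with a toy simulation. It is a sketch under stated assumptions, not a model of the real Deployment controller: it assumes a surge of one Pod, no tolerated unavailability, and a single flag, gate_passes, that stands in for whether the load balancer health check ever reports new endpoints as healthy. The function and parameter names are hypothetical.

```python
def simulate_rollout(replicas, gate_passes, max_surge=1, max_steps=50):
    """Simulate a rolling update that is paced by a readiness gate.

    Returns (old_pods_left, new_pods_ready, new_pods_pending) once the
    rollout can make no further progress.
    """
    old, new_ready, new_pending = replicas, 0, 0
    for _ in range(max_steps):
        progressed = False
        # Create new Pods while there is surge capacity left.
        while (
            old + new_ready + new_pending < replicas + max_surge
            and new_ready + new_pending < replicas
        ):
            new_pending += 1
            progressed = True
        # The gate flips to True only when the endpoint is healthy.
        if gate_passes and new_pending:
            new_ready += new_pending
            new_pending = 0
            progressed = True
        # Remove an old Pod only if ready capacity stays at `replicas`.
        while old > 0 and (old - 1) + new_ready >= replicas:
            old -= 1
            progressed = True
        if not progressed:
            break
    return old, new_ready, new_pending


if __name__ == "__main__":
    print(simulate_rollout(3, gate_passes=True))
    print(simulate_rollout(3, gate_passes=False))
```

When the gate passes, every old Pod is eventually replaced. When it never passes, the simulation stops with all old Pods still serving and exactly one new Pod pending, which mirrors the stalled rollout described above.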

Without container-native load balancing and readiness gates, GKE can't detect if a load balancer's endpoints are healthy before marking Pods as ready. In previous Kubernetes versions, you controlled the rate at which Pods were removed and replaced by specifying a delay period (minReadySeconds in the Deployment specification).

GKE sets the value of cloud.google.com/load-balancer-neg-ready for a Pod to True if any of the following conditions are met:

None of the Pod's IP addresses are endpoints in a GCE_VM_IP_PORT NEG managed by the GKE control plane.
One or more of the Pod's IP addresses are endpoints in a GCE_VM_IP_PORT NEG managed by the GKE control plane. The NEG is attached to a backend service. The backend service has a successful load balancer health check.
One or more of the Pod's IP addresses are endpoints in a GCE_VM_IP_PORT NEG managed by the GKE control plane. The NEG is attached to a backend service. The load balancer health check for the backend service times out.
One or more of the Pod's IP addresses are endpoints in one or more GCE_VM_IP_PORT NEGs. None of the NEGs are attached to a backend service. No load balancer health check data is available.
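The four conditions above can be restated as a small decision function. This is only a restatement of the list for clarity, not GKE's implementation. The type and field names (PodNegState, HealthCheck, and so on) are hypothetical. Returning False for an attached NEG with a failing health check is an inference from that case not appearing in the list.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class HealthCheck(Enum):
    HEALTHY = "healthy"
    UNHEALTHY = "unhealthy"
    TIMED_OUT = "timed_out"


@dataclass
class PodNegState:
    # A Pod IP is an endpoint in a GCE_VM_IP_PORT NEG managed by GKE.
    in_managed_neg: bool
    # That NEG is attached to a backend service.
    neg_attached_to_backend_service: bool
    # None means no load balancer health check data is available.
    health_check: Optional[HealthCheck]


def neg_ready_gate(state: PodNegState) -> bool:
    """Return the gate value implied by the conditions listed above."""
    if not state.in_managed_neg:
        return True  # Condition 1: the Pod is not an endpoint in any NEG.
    if state.neg_attached_to_backend_service:
        # Conditions 2 and 3: health check succeeded or timed out.
        return state.health_check in (
            HealthCheck.HEALTHY,
            HealthCheck.TIMED_OUT,
        )
    # Condition 4: NEG not attached and no health check data.
    return state.health_check is None
```

Note that a health check timeout sets the gate to True, so a load balancer that never reports health does not block Pods indefinitely.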

Session affinity

Container-native load balancing supports Pod-based session affinity.

Requirements for using container-native load balancing

Container-native load balancers through Ingress on GKE have the following requirements:

The cluster must be VPC-native.
The cluster must have the HttpLoadBalancing add-on enabled. GKE clusters have the HttpLoadBalancing add-on enabled by default; you must not disable it.
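As a rough way to check both requirements, the following sketch inspects a cluster description that has been loaded into a dictionary, for example from the JSON output of gcloud container clusters describe. It assumes that the fields ipAllocationPolicy.useIpAliases and addonsConfig.httpLoadBalancing.disabled appear in that output; verify the field names against the output for your own cluster. The function name is hypothetical.

```python
def ingress_neg_requirement_problems(cluster):
    """Return a list of unmet requirements for a cluster description dict.

    Assumed field names (check them against your cluster's JSON):
      ipAllocationPolicy.useIpAliases        -> True for VPC-native clusters
      addonsConfig.httpLoadBalancing.disabled -> True if the add-on is off
    """
    problems = []
    ip_policy = cluster.get("ipAllocationPolicy", {})
    if not ip_policy.get("useIpAliases", False):
        problems.append("cluster is not VPC-native")
    http_lb = cluster.get("addonsConfig", {}).get("httpLoadBalancing", {})
    if http_lb.get("disabled", False):
        problems.append("HttpLoadBalancing add-on is disabled")
    return problems


# Example input: a VPC-native cluster with the add-on left at its default.
example_cluster = {
    "ipAllocationPolicy": {"useIpAliases": True},
    "addonsConfig": {"httpLoadBalancing": {}},
}
```

An empty list means that both requirements appear to be met under these assumptions.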

Limitations for container-native load balancers

Container-native load balancers through Ingress on GKE have the following limitations:

Container-native load balancers don't support external passthrough Network Load Balancers.
You must not manually change or update the configuration of the Application Load Balancer that GKE creates. Any changes that you make are overwritten by GKE.

Pricing for container-native load balancers

You are charged for the Application Load Balancer provisioned by the Ingress that you create. For load balancer pricing information, refer to Load balancing and forwarding rules on the VPC pricing page.

What's next

Learn more about NEGs.
Learn more about VPC-native clusters.
Learn more about external Application Load Balancers.
Watch a KubeCon talk about Pod readiness gates.