This page explains how to create an internal passthrough Network Load Balancer or internal load balancer on Google Kubernetes Engine (GKE). To create an external passthrough Network Load Balancer, see Create a Service of type LoadBalancer.
Before reading this page, ensure that you're familiar with the following concepts:
- LoadBalancer Service.
- LoadBalancer Service parameters.
- Backend service-based external passthrough Network Load Balancer.
Using internal passthrough Network Load Balancer
Internal passthrough Network Load Balancers make your cluster's Services accessible to clients within your cluster's VPC network and to clients in networks connected to your cluster's VPC network. Clients don't have to be located within your cluster. For example, an internal LoadBalancer Service can be accessible to virtual machine (VM) instances located in the cluster's VPC network.
By default, GKE subsetting is disabled, which limits the load balancer's backend service to distributing traffic to at most 250 backend node VMs.
Using GKE subsetting
GKE subsetting improves the scalability of internal LoadBalancer
Services because it uses GCE_VM_IP
network endpoint groups (NEGs) as backends
instead of instance groups. When GKE subsetting is enabled,
GKE creates one NEG per compute zone per internal LoadBalancer Service.
The member endpoints in the NEG are the IP addresses of nodes that have at least one
of the Service's serving Pods. For more information about GKE
subsetting, see Node grouping.
Requirements and limitations
Following are the requirements and limitations for internal load balancers.
Requirements
GKE subsetting has the following requirements and limitations:
- You can enable GKE subsetting in new and existing Standard clusters in GKE versions 1.18.19-gke.1400 and later. GKE subsetting cannot be disabled once it has been enabled.
- GKE subsetting is disabled by default in Autopilot clusters. However, you can enable it when you create the cluster or later.
- GKE subsetting requires that the HttpLoadBalancing add-on is enabled. This add-on is enabled by default. In Autopilot clusters, you cannot disable this required add-on.
- Quotas for Network Endpoint Groups apply. Google Cloud creates one GCE_VM_IP NEG per internal LoadBalancer Service per zone.
- Quotas for forwarding rules, backend services, and health checks apply. For more information, see Quotas and limits.
- GKE subsetting cannot be used with the annotation to share a backend service among multiple load balancers, alpha.cloud.google.com/load-balancer-backend-share.
- You must have Google Cloud CLI version 345.0.0 or later.
Limitations
Internal passthrough Network Load Balancers
- For clusters running Kubernetes version 1.7.4 and later, you can use internal load balancers with custom-mode subnets in addition to auto-mode subnets.
- Clusters running Kubernetes version 1.7.X and later support using a reserved IP address for the internal passthrough Network Load Balancer if you create the reserved IP address with the --purpose flag set to SHARED_LOADBALANCER_VIP. Refer to Enabling Shared IP for step-by-step directions. GKE only preserves the IP address of an internal passthrough Network Load Balancer if the Service references an internal IP address with that purpose. Otherwise, GKE might change the load balancer's IP address (spec.loadBalancerIP) if the Service is updated (for example, if ports are changed).
- Even if the load balancer's IP address changes (see previous point), the spec.clusterIP remains constant.
- Internal UDP load balancers don't support using sessionAffinity: ClientIP.
Before you begin
Before you start, make sure you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
Enable GKE subsetting in a new Standard cluster
You can create a Standard cluster with GKE subsetting enabled using the Google Cloud CLI, the Google Cloud console, or Terraform. A cluster created with GKE subsetting enabled always uses GKE subsetting.
Console
Go to the Google Kubernetes Engine page in the Google Cloud console.
Click Create.
Configure your cluster as desired.
From the navigation pane, under Cluster, click Networking.
Select the Enable subsetting for L4 internal load balancers checkbox.
Click Create.
gcloud
gcloud container clusters create CLUSTER_NAME \
--cluster-version=VERSION \
--enable-l4-ilb-subsetting \
--location=COMPUTE_LOCATION
Replace the following:
- CLUSTER_NAME: the name of the new cluster.
- VERSION: the GKE version, which must be 1.18.19-gke.1400 or later. You can also use the --release-channel option to select a release channel. The release channel must have a default version 1.18.19-gke.1400 or later.
- COMPUTE_LOCATION: the Compute Engine location for the cluster.
If you want to use a non-default network or subnetwork, run the following command:
gcloud container clusters create CLUSTER_NAME \
--cluster-version=VERSION \
--network NETWORK_NAME \
--subnetwork SUBNETWORK_NAME \
--enable-l4-ilb-subsetting \
--location=COMPUTE_LOCATION
Replace the following:
- SUBNETWORK_NAME: the name of the new subnet. In GKE versions 1.22.4-gke.100 and later, you can specify a subnet in a different project by using the fully qualified resource URL for this field. You can get the fully qualified resource URL using the command gcloud compute networks subnets describe.
- NETWORK_NAME: the name of the VPC network for the subnet.
Terraform
To create a Standard cluster with GKE subsetting enabled using Terraform, refer to the following example:
To learn more about using Terraform, see Terraform support for GKE.
Enable GKE subsetting in an existing Standard cluster
You can enable GKE subsetting for an existing Standard cluster using the gcloud CLI or the Google Cloud console. You cannot disable GKE subsetting after you have enabled it.
Console
In the Google Cloud console, go to the Google Kubernetes Engine page.
In the cluster list, click the name of the cluster you want to modify.
Under Networking, next to the Subsetting for L4 Internal Load Balancers field, click Enable subsetting for L4 internal load balancers.
Select the Enable subsetting for L4 internal load balancers checkbox.
Click Save Changes.
gcloud
gcloud container clusters update CLUSTER_NAME \
--enable-l4-ilb-subsetting
Replace the following:
- CLUSTER_NAME: the name of the cluster.
Enabling GKE subsetting does not disrupt existing
internal LoadBalancer Services. If you want to migrate existing internal
LoadBalancer Services to use backend services with GCE_VM_IP
NEGs as backends,
you must deploy a replacement Service manifest. For more details, see
Node grouping
in the LoadBalancer Service concepts documentation.
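After the update completes, you might want to confirm that the setting took effect. The following is a minimal sketch, not an official procedure: the helper name subsetting_enabled is hypothetical, and it assumes that the cluster resource reports the setting in the networkConfig.enableL4ilbSubsetting field.

```shell
# Hypothetical helper: prints the subsetting setting of a cluster.
# Assumption: the field is networkConfig.enableL4ilbSubsetting and
# gcloud prints "True" when subsetting is enabled.
subsetting_enabled() {
  cluster_name="$1"
  location="$2"
  gcloud container clusters describe "$cluster_name" \
    --location="$location" \
    --format="value(networkConfig.enableL4ilbSubsetting)"
}

# Example (same placeholders as the commands above):
#   subsetting_enabled CLUSTER_NAME COMPUTE_LOCATION
```

An empty result most likely means that subsetting is not enabled on that cluster.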
Deploy a workload
The following manifest describes a Deployment that runs a sample web application container image.
Save the manifest as ilb-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ilb-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ilb-deployment
  template:
    metadata:
      labels:
        app: ilb-deployment
    spec:
      containers:
      - name: hello-app
        image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
Apply the manifest to your cluster:
kubectl apply -f ilb-deployment.yaml
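Before you create the Service, it can help to confirm that the three replicas are running. The following sketch wraps two standard kubectl commands in a hypothetical helper named wait_for_ilb_deployment; the 120-second timeout is an arbitrary assumption.

```shell
# Hypothetical helper: waits for the sample Deployment to finish rolling out,
# then lists its Pods together with the nodes that they run on.
wait_for_ilb_deployment() {
  kubectl rollout status deployment/ilb-deployment --timeout=120s &&
    kubectl get pods --selector=app=ilb-deployment --output=wide
}

# Example:
#   wait_for_ilb_deployment
```

The NODE column of the Pod list shows which nodes host serving Pods. With GKE subsetting enabled, those are the nodes that become NEG endpoints.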
Create an internal LoadBalancer Service
(Optional) Disable automatic VPC firewall rules creation:
While GKE automatically creates VPC firewall rules to allow traffic to your internal load balancer, you have the option to disable the automatic VPC firewall rules creation and manage firewall rules on your own. You can disable VPC firewall rules only if you have enabled GKE subsetting for your internal LoadBalancer Service. However, managing VPC firewall rules is optional and you can rely on the automatic rules.
Before you disable automatic VPC firewall rules creation, ensure that you define allow rules that permit traffic to reach your load balancer and application Pods.
For more information on managing VPC firewall rules, see Manage automatic firewall rule creation. To learn how to disable automatic firewall rule creation, see User-managed firewall rules for GKE LoadBalancer Services.
The following example creates an internal LoadBalancer Service using TCP port 80. GKE deploys an internal passthrough Network Load Balancer whose forwarding rule uses port 80, but then forwards traffic to backend Pods on port 8080:
Save the manifest as ilb-svc.yaml:
apiVersion: v1
kind: Service
metadata:
  name: ilb-svc
  annotations:
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Cluster
  selector:
    app: ilb-deployment
  ports:
  - name: tcp-port
    protocol: TCP
    port: 80
    targetPort: 8080
Your manifest must contain the following:
- A name for the internal LoadBalancer Service, in this case ilb-svc.
- An annotation that specifies that you require an internal LoadBalancer Service. For GKE versions 1.17 and later, use the annotation networking.gke.io/load-balancer-type: "Internal" as shown in the example manifest. For earlier versions, use cloud.google.com/load-balancer-type: "Internal" instead.
- The type: LoadBalancer.
- A spec: selector field to specify the Pods the Service should target, for example, app: ilb-deployment.
- Port information:
  - The port represents the destination port on which the forwarding rule of the internal passthrough Network Load Balancer receives packets.
  - The targetPort must match a containerPort defined on each serving Pod.
  - The port and targetPort values don't need to be the same. Nodes always perform destination NAT, changing the destination load balancer forwarding rule IP address and port to a destination Pod IP address and targetPort. For more details, see Destination Network Address Translation on nodes in the LoadBalancer Service concepts documentation.
Your manifest can contain the following:
- spec.ipFamilyPolicy and ipFamilies to define how GKE allocates IP addresses to the Service. GKE supports either single-stack (IPv4 only or IPv6 only), or dual-stack IP LoadBalancer Services. A dual-stack LoadBalancer Service is implemented with two separate internal passthrough Network Load Balancer forwarding rules: one for IPv4 traffic and one for IPv6 traffic. The GKE dual-stack LoadBalancer Service is available in version 1.29 or later. To learn more, see IPv4/IPv6 dual-stack Services.
For more information, see LoadBalancer Service parameters.
Apply the manifest to your cluster:
kubectl apply -f ilb-svc.yaml
Get detailed information about the Service:
kubectl get service ilb-svc --output yaml
The output is similar to the following:
apiVersion: v1
kind: Service
metadata:
  annotations:
    cloud.google.com/neg: '{"ingress":true}'
    cloud.google.com/neg-status: '{"network_endpoint_groups":{"0":"k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r"},"zones":["ZONE_NAME","ZONE_NAME","ZONE_NAME"]}'
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"networking.gke.io/load-balancer-type":"Internal"},"name":"ilb-svc","namespace":"default"},"spec":{"externalTrafficPolicy":"Cluster","ports":[{"name":"tcp-port","port":80,"protocol":"TCP","targetPort":8080}],"selector":{"app":"ilb-deployment"},"type":"LoadBalancer"}}
    networking.gke.io/load-balancer-type: Internal
    service.kubernetes.io/backend-service: k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r
    service.kubernetes.io/firewall-rule: k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r
    service.kubernetes.io/firewall-rule-for-hc: k8s2-pn2h9n5f-l4-shared-hc-fw
    service.kubernetes.io/healthcheck: k8s2-pn2h9n5f-l4-shared-hc
    service.kubernetes.io/tcp-forwarding-rule: k8s2-tcp-pn2h9n5f-default-ilb-svc-3bei4n1r
  creationTimestamp: "2022-07-22T17:26:04Z"
  finalizers:
  - gke.networking.io/l4-ilb-v2
  - service.kubernetes.io/load-balancer-cleanup
  name: ilb-svc
  namespace: default
  resourceVersion: "51666"
  uid: d7a1a865-7972-44e1-aa9e-db5be23d6567
spec:
  allocateLoadBalancerNodePorts: true
  clusterIP: 10.88.2.141
  clusterIPs:
  - 10.88.2.141
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: tcp-port
    nodePort: 30521
    port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: ilb-deployment
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - ip: 10.128.15.245
The output has the following attributes:
- The IP address of the internal passthrough Network Load Balancer's forwarding rule is included in status.loadBalancer.ingress. This IP address is different from the value of clusterIP. In this example, the load balancer's forwarding rule IP address is 10.128.15.245.
- Any Pod that has the label app: ilb-deployment is a serving Pod for this Service. These are the Pods that receive packets routed by the internal passthrough Network Load Balancer.
- Clients call the Service by using this loadBalancer IP address and the TCP destination port specified in the port field of the Service manifest. For complete details about how packets are routed once received by a node, see Packet processing.
- GKE assigned a nodePort to the Service; in this example, port 30521 is assigned. The nodePort is not relevant to the internal passthrough Network Load Balancer.
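If a script needs only the forwarding rule IP address, it can read that single field instead of parsing the full YAML output. The following sketch uses kubectl's JSONPath output; the helper name get_ilb_ip is hypothetical.

```shell
# Hypothetical helper: prints the load balancer IP address of a Service,
# read from status.loadBalancer.ingress.
get_ilb_ip() {
  service_name="$1"
  kubectl get service "$service_name" \
    --output=jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Example:
#   LOAD_BALANCER_IP="$(get_ilb_ip ilb-svc)"
```

The result is empty while the load balancer is still being provisioned, so a script that depends on it should check for an empty value and retry.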
Inspect the Service network endpoint group:
kubectl get svc ilb-svc -o=jsonpath="{.metadata.annotations.cloud\.google\.com/neg-status}"
The output is similar to the following:
{"network_endpoint_groups":{"0":"k8s2-knlc4c77-default-ilb-svc-ua5ugas0"},"zones":["ZONE_NAME"]}
The response indicates that GKE has created a network endpoint group named k8s2-knlc4c77-default-ilb-svc-ua5ugas0. This annotation is present in Services of type LoadBalancer that use GKE subsetting and is not present in Services that do not use GKE subsetting.
Verify internal passthrough Network Load Balancer components
The internal passthrough Network Load Balancer's forwarding rule IP address is 10.128.15.245
in
the example included in the Create an internal LoadBalancer Service
section. You can see this forwarding rule is included in the list of forwarding
rules in the cluster's project by using the Google Cloud CLI:
gcloud compute forwarding-rules list --filter="loadBalancingScheme=INTERNAL"
The output includes the relevant internal passthrough Network Load Balancer forwarding rule, its IP
address, and the backend service referenced by the forwarding rule
(k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r
in this example).
NAME ... IP_ADDRESS ... TARGET
...
k8s2-tcp-pn2h9n5f-default-ilb-svc-3bei4n1r 10.128.15.245 ZONE_NAME/backendServices/k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r
You can describe the load balancer's backend service by using the Google Cloud CLI:
gcloud compute backend-services describe k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r --region=COMPUTE_REGION
Replace COMPUTE_REGION
with the
compute region of the backend service.
The output includes the backend GCE_VM_IP
NEG or NEGs for the Service
(k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r
in this example):
backends:
- balancingMode: CONNECTION
group: .../ZONE_NAME/networkEndpointGroups/k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r
...
kind: compute#backendService
loadBalancingScheme: INTERNAL
name: aae3e263abe0911e9b32a42010a80008
...
To determine the list of nodes in a subset for a service, use the following command:
gcloud compute network-endpoint-groups list-network-endpoints NEG_NAME \
--zone=COMPUTE_ZONE
Replace the following:
- NEG_NAME: the name of the network endpoint group created by the GKE controller.
- COMPUTE_ZONE: the compute zone of the network endpoint group to operate on.
To determine the list of healthy nodes for an internal passthrough Network Load Balancer, use the following command:
gcloud compute backend-services get-health SERVICE_NAME \
--region=COMPUTE_REGION
Replace the following:
- SERVICE_NAME: the name of the backend service. This value is the same as the name of the network endpoint group created by the GKE controller.
- COMPUTE_REGION: the compute region of the backend service to operate on.
Test connectivity to the internal passthrough Network Load Balancer
Run the following command from a client, such as a VM, that is in the cluster's VPC network (or a connected network) and in the same region as the cluster:
curl LOAD_BALANCER_IP:80
Replace LOAD_BALANCER_IP
with the load balancer's
forwarding rule IP address.
The response shows the output of ilb-deployment:
Hello, world!
Version: 1.0.0
Hostname: ilb-deployment-77b45987f7-pw54n
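To see that requests reach more than one serving Pod, you can repeat the request and compare the Hostname lines. The following is a small sketch; the helper name probe_ilb, the default of five attempts, and the 5-second timeout are assumptions made for illustration.

```shell
# Hypothetical helper: sends several requests to the load balancer and
# prints only the Hostname line of each response.
probe_ilb() {
  lb_ip="$1"
  attempts="${2:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    curl --silent --max-time 5 "http://$lb_ip:80" | grep '^Hostname:'
    i=$((i + 1))
  done
}

# Example:
#   probe_ilb LOAD_BALANCER_IP 10
```

Different Pod names across the responses indicate that traffic is being spread over the Deployment's replicas.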
The internal passthrough Network Load Balancer is only accessible within the same VPC network (or a connected network). By default, the load balancer's forwarding rule has global access disabled, so client VMs, Cloud VPN tunnels, or Cloud Interconnect attachments (VLANs) must be located in the same region as the internal passthrough Network Load Balancer. To support clients in all regions, you can enable global access on the load balancer's forwarding rule by including the global access annotation in the Service manifest.
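As a sketch of the global access option, the following hypothetical helper adds the annotation to an existing Service. It assumes that the annotation key is networking.gke.io/internal-load-balancer-allow-global-access, as described in LoadBalancer Service parameters, and that annotating the live Service has the same effect as adding the annotation to the manifest and reapplying it.

```shell
# Hypothetical helper: requests global access for an internal LoadBalancer Service.
# Assumption: the annotation key below is the one GKE reads for global access.
enable_global_access() {
  service_name="$1"
  kubectl annotate service "$service_name" --overwrite \
    networking.gke.io/internal-load-balancer-allow-global-access=true
}

# Example:
#   enable_global_access ilb-svc
```

If you manage the Service declaratively, adding the annotation to ilb-svc.yaml keeps the manifest and the cluster state in sync.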
Delete the internal LoadBalancer Service and load balancer resources
You can delete the Deployment and Service using kubectl delete
or the
Google Cloud console.
kubectl
Delete the Deployment
To delete the Deployment, run the following command:
kubectl delete deployment ilb-deployment
Delete the Service
To delete the Service, run the following command:
kubectl delete service ilb-svc
Console
Delete the Deployment
To delete the Deployment, perform the following steps:
Go to the Workloads page in the Google Cloud console.
Select the Deployment you want to delete, then click Delete.
When prompted to confirm, select the Delete Horizontal Pod Autoscaler associated with selected Deployment checkbox, then click Delete.
Delete the Service
To delete the Service, perform the following steps:
Go to the Services & Ingress page in the Google Cloud console.
Select the Service you want to delete, then click Delete.
When prompted to confirm, click Delete.
Shared IP
The internal passthrough Network Load Balancer allows the
sharing of a Virtual IP address amongst multiple forwarding rules.
This is useful for expanding the number of simultaneous ports on the same IP or
for accepting UDP and TCP traffic on the same IP. It allows up to a maximum of
50 exposed ports per IP address. Shared IPs are supported natively on
GKE clusters with internal LoadBalancer Services.
When deploying, the Service's loadBalancerIP
field is used to indicate
which IP should be shared across Services.
Limitations
A shared IP for multiple load balancers has the following limitations and capabilities:
- Each forwarding rule can have up to five ports (contiguous or non-contiguous), or it can be configured to match and forward traffic on all ports. If an Internal LoadBalancer Service defines more than five ports, the forwarding rule will automatically be set to match all ports.
- A maximum of ten Services (forwarding rules) can share an IP address. This results in a maximum of 50 ports per shared IP.
- Each forwarding rule that shares the same IP address must use a unique combination of protocols and ports. Therefore, every internal LoadBalancer Service must use a unique set of protocols and ports.
- A combination of TCP-only and UDP-only Services is supported on the same shared IP; however, you cannot expose both TCP and UDP ports in the same Service.
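The limits above reduce to simple arithmetic: at most ten Services, each with at most five ports, for at most 50 ports per shared IP. The following sketch checks a planned set of Services against those numbers. The helper name check_shared_ip_plan is hypothetical, and treating a Service with more than five ports as a failure is a simplifying assumption, because such a Service gets an all-ports forwarding rule.

```shell
# Hypothetical helper. Usage: check_shared_ip_plan PORT_COUNT...
# Each argument is the number of ports that one Service exposes on the shared IP.
check_shared_ip_plan() {
  max_services=10
  max_ports_per_rule=5
  if [ "$#" -gt "$max_services" ]; then
    echo "too many Services: $# (maximum $max_services)"
    return 1
  fi
  total=0
  for count in "$@"; do
    if [ "$count" -gt "$max_ports_per_rule" ]; then
      echo "a Service with $count ports makes its forwarding rule match all ports"
      return 1
    fi
    total=$((total + count))
  done
  echo "ok: $# Services, $total ports"
}

# A five-port TCP Service plus a two-port UDP Service:
check_shared_ip_plan 5 2   # prints "ok: 2 Services, 7 ports"
```

The check covers only the counts. It does not verify that each Service uses a unique combination of protocol and ports.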
Enabling Shared IP
To enable internal LoadBalancer Services to share a common IP, follow these steps:
Create a static internal IP with --purpose SHARED_LOADBALANCER_VIP. An IP address must be created with this purpose to enable its ability to be shared. If you create the static internal IP address in a Shared VPC, you must create the IP address in the same service project as the instance that will use the IP address, even though the value of the IP address will come from the range of available IPs in a selected shared subnet of the Shared VPC network. Refer to reserving a static internal IP on the Provisioning Shared VPC page for more information.
Deploy up to ten internal LoadBalancer Services using this static IP in the loadBalancerIP field. The internal passthrough Network Load Balancers are reconciled by the GKE service controller and deploy using the same frontend IP.
The following example demonstrates how this is done to support multiple TCP and UDP ports against the same internal load balancer IP.
Create a static IP in the same region as your GKE cluster. The subnet must be the same subnet that the load balancer uses, which by default is the same subnet that is used by the GKE cluster node IPs.
If your cluster and the VPC network are in the same project:
gcloud compute addresses create IP_ADDR_NAME \
--project=PROJECT_ID \
--subnet=SUBNET \
--addresses=IP_ADDRESS \
--region=COMPUTE_REGION \
--purpose=SHARED_LOADBALANCER_VIP
If your cluster is in a Shared VPC service project but uses a Shared VPC network in a host project:
gcloud compute addresses create IP_ADDR_NAME \
--project=SERVICE_PROJECT_ID \
--subnet=projects/HOST_PROJECT_ID/regions/COMPUTE_REGION/subnetworks/SUBNET \
--addresses=IP_ADDRESS \
--region=COMPUTE_REGION \
--purpose=SHARED_LOADBALANCER_VIP
Replace the following:
- IP_ADDR_NAME: a name for the IP address object.
- SERVICE_PROJECT_ID: the ID of the service project.
- PROJECT_ID: the ID of your project (single project).
- HOST_PROJECT_ID: the ID of the Shared VPC host project.
- COMPUTE_REGION: the compute region containing the shared subnet.
- IP_ADDRESS: an unused internal IP address from the selected subnet's primary IP address range. If you omit specifying an IP address, Google Cloud selects an unused internal IP address from the selected subnet's primary IP address range. To determine an automatically selected address, you'll need to run gcloud compute addresses describe.
- SUBNET: the name of the shared subnet.
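If you let Google Cloud select the address, the following sketch shows one way to read it back with gcloud compute addresses describe. The helper name get_reserved_ip is hypothetical, and the address field name is assumed from the Compute Engine address resource.

```shell
# Hypothetical helper: prints the IP address of a reserved address object.
# Assumption: the address resource stores the value in its "address" field.
get_reserved_ip() {
  ip_addr_name="$1"
  compute_region="$2"
  gcloud compute addresses describe "$ip_addr_name" \
    --region="$compute_region" \
    --format="value(address)"
}

# Example (same placeholders as the commands above):
#   IP_ADDRESS="$(get_reserved_ip IP_ADDR_NAME COMPUTE_REGION)"
```

For a Shared VPC setup, you would also pass the service project, for example with the --project flag used in the create command.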
Save the following TCP Service configuration to a file named tcp-service.yaml and then deploy to your cluster. Replace IP_ADDRESS with the IP address you chose in the previous step.
apiVersion: v1
kind: Service
metadata:
  name: tcp-service
  namespace: default
  annotations:
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  loadBalancerIP: IP_ADDRESS
  selector:
    app: myapp
  ports:
  - name: 8001-to-8001
    protocol: TCP
    port: 8001
    targetPort: 8001
  - name: 8002-to-8002
    protocol: TCP
    port: 8002
    targetPort: 8002
  - name: 8003-to-8003
    protocol: TCP
    port: 8003
    targetPort: 8003
  - name: 8004-to-8004
    protocol: TCP
    port: 8004
    targetPort: 8004
  - name: 8005-to-8005
    protocol: TCP
    port: 8005
    targetPort: 8005
Apply this Service definition against your cluster:
kubectl apply -f tcp-service.yaml
Save the following UDP Service configuration to a file named udp-service.yaml and then deploy it. It also uses the IP_ADDRESS that you specified in the previous step.
apiVersion: v1
kind: Service
metadata:
  name: udp-service
  namespace: default
  annotations:
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  loadBalancerIP: IP_ADDRESS
  selector:
    app: my-udp-app
  ports:
  - name: 9001-to-9001
    protocol: UDP
    port: 9001
    targetPort: 9001
  - name: 9002-to-9002
    protocol: UDP
    port: 9002
    targetPort: 9002
Apply this file against your cluster:
kubectl apply -f udp-service.yaml
Validate that the VIP is shared amongst load balancer forwarding rules by listing them out and filtering for the static IP. This shows that there is a UDP and a TCP forwarding rule both listening across seven different ports on the shared IP_ADDRESS, which in this example is 10.128.2.98.
gcloud compute forwarding-rules list | grep 10.128.2.98
ab4d8205d655f4353a5cff5b224a0dde us-west1 10.128.2.98 UDP us-west1/backendServices/ab4d8205d655f4353a5cff5b224a0dde
acd6eeaa00a35419c9530caeb6540435 us-west1 10.128.2.98 TCP us-west1/backendServices/acd6eeaa00a35419c9530caeb6540435
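As an alternative to grep, the list command's own filter can select the forwarding rules and show their ports. The following is a sketch under assumptions: the helper name list_rules_for_ip is hypothetical, and the field names IPAddress, IPProtocol, and ports are taken from the Compute Engine forwarding rule resource.

```shell
# Hypothetical helper: lists the forwarding rules that use a given IP address,
# including the protocol and ports of each rule.
list_rules_for_ip() {
  shared_ip="$1"
  gcloud compute forwarding-rules list \
    --filter="IPAddress=$shared_ip" \
    --format="table(name,IPAddress,IPProtocol,ports)"
}

# Example:
#   list_rules_for_ip 10.128.2.98
```

For the two Services in this example, the table should contain one TCP rule with five ports and one UDP rule with two ports.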
Known issues
Connection timeout every 10 minutes
Internal LoadBalancer Services created with GKE subsetting might observe traffic disruptions roughly every 10 minutes. This bug has been fixed in the following versions:
- 1.18.19-gke.1700 and later
- 1.19.10-gke.1000 and later
- 1.20.6-gke.1000 and later
Error creating load balancer in Standard tier
When you create an internal passthrough Network Load Balancer in a project with the project default network tier set to Standard, the following error message appears:
Error syncing load balancer: failed to ensure load balancer: googleapi: Error 400: STANDARD network tier (the project's default network tier) is not supported: Network tier other than PREMIUM is not supported for loadBalancingScheme=INTERNAL., badRequest
To resolve this issue in GKE versions earlier than 1.23.3-gke.900, configure the project default network tier to Premium.
This issue is resolved in GKE versions 1.23.3-gke.900 and later when GKE subsetting is enabled.
The GKE controller creates internal passthrough Network Load Balancers in the Premium network tier even if the project default network tier is set to Standard.
What's next
- Read the GKE network overview.
- Learn more about Compute Engine load balancers.
- Learn how to create a VPC-native cluster.
- Troubleshoot load balancing in GKE.