Create an internal load balancer

Autopilot Standard

This page explains how to create an internal passthrough Network Load Balancer, or internal load balancer, on Google Kubernetes Engine (GKE). To create an external passthrough Network Load Balancer, see Create a Service of type LoadBalancer.

Before reading this page, ensure that you're familiar with the following concepts:

Using internal passthrough Network Load Balancer

Internal passthrough Network Load Balancers make your cluster's Services accessible to clients within your cluster's VPC network and to clients in networks connected to your cluster's VPC network. Clients don't have to be located within your cluster. For example, an internal LoadBalancer Service can be accessible to virtual machine (VM) instances located in the cluster's VPC network.

By default, GKE subsetting is disabled, which limits the backend service to distributing traffic to at most 250 backend node VMs.

Using GKE subsetting

GKE subsetting improves the scalability of internal LoadBalancer Services because it uses GCE_VM_IP network endpoint groups (NEGs) as backends instead of instance groups. When GKE subsetting is enabled, GKE creates one NEG per compute zone per internal LoadBalancer Service. The member endpoints in the NEG are the IP addresses of nodes that have at least one of the Service's serving Pods. For more information about GKE subsetting, see Node grouping.
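
The membership rule described above can be illustrated with a small model. The following Python sketch is not GKE's controller logic; it only mirrors the rule as stated on this page. The function name, input shapes, node names, zones, and IP addresses are hypothetical.

```python
def neg_endpoints_by_zone(nodes, serving_pod_nodes):
    """Model the GCE_VM_IP NEG membership rule described above.

    nodes: maps a node name to its (zone, node IP address) pair.
    serving_pod_nodes: node names that host at least one serving Pod.
    Returns one entry per zone, listing the member node IP addresses.
    """
    # One NEG per compute zone, even if it currently has no endpoints.
    endpoints = {zone: set() for zone, _ in nodes.values()}
    for node_name in serving_pod_nodes:
        zone, node_ip = nodes[node_name]
        # A node is added once, no matter how many serving Pods it hosts.
        endpoints[zone].add(node_ip)
    return {zone: sorted(ips) for zone, ips in endpoints.items()}


# Hypothetical cluster: three nodes in two zones, Pods on two of the nodes.
nodes = {
    "node-a": ("us-central1-a", "10.128.0.2"),
    "node-b": ("us-central1-a", "10.128.0.3"),
    "node-c": ("us-central1-b", "10.128.0.4"),
}
negs = neg_endpoints_by_zone(nodes, ["node-a", "node-a", "node-c"])
print(negs)
```

In this model, node-b is left out of its zone's NEG because it hosts no serving Pod.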

Requirements and limitations

Following are the requirements and limitations for internal load balancers.

Requirements

GKE subsetting has the following requirements and limitations:

You can enable GKE subsetting in new and existing Standard clusters in GKE versions 1.18.19-gke.1400 and later. GKE subsetting cannot be disabled once it has been enabled.

GKE subsetting is disabled by default in Autopilot clusters. However, you can enable it after you create the cluster.
GKE subsetting requires that the HttpLoadBalancing add-on is enabled. This add-on is enabled by default. In Autopilot clusters, you cannot disable this required add-on.
Quotas for Network Endpoint Groups apply. Google Cloud creates one GCE_VM_IP NEG per internal LoadBalancer Service per zone.
Quotas for forwarding rules, backend services, and health checks apply. For more information, see Quotas and limits.
GKE subsetting cannot be used with the annotation to share a backend service among multiple load balancers, alpha.cloud.google.com/load-balancer-backend-share.
You must have Google Cloud CLI version 345.0.0 or later.
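
If you script cluster checks, the minimum version requirement can be tested numerically rather than as a string. The following Python sketch assumes version strings of the form MAJOR.MINOR.PATCH-gke.BUILD; the function names are hypothetical and not part of any Google Cloud library.

```python
import re

# Minimum GKE version for GKE subsetting, as stated in the requirements above.
MIN_SUBSETTING_VERSION = "1.18.19-gke.1400"


def parse_gke_version(version):
    """Split a version such as '1.18.19-gke.1400' into a tuple of integers."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-gke\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized GKE version: {version!r}")
    return tuple(int(part) for part in match.groups())


def supports_subsetting(version):
    """Return True if the version is at or above the documented minimum."""
    return parse_gke_version(version) >= parse_gke_version(MIN_SUBSETTING_VERSION)


print(supports_subsetting("1.18.19-gke.1400"))
print(supports_subsetting("1.9.7-gke.11"))
```

Comparing integer tuples avoids the pitfall of plain string comparison, which would rank 1.9 above 1.18.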

Limitations

Internal passthrough Network Load Balancers

For clusters running Kubernetes version 1.7.4 and later, you can use internal load balancers with custom-mode subnets in addition to auto-mode subnets.
Clusters running Kubernetes version 1.7.X and later support using a reserved IP address for the internal passthrough Network Load Balancer if you create the reserved IP address with the --purpose flag set to SHARED_LOADBALANCER_VIP. Refer to Enabling Shared IP for step-by-step directions. GKE only preserves the IP address of an internal passthrough Network Load Balancer if the Service references an internal IP address with that purpose. Otherwise, GKE might change the load balancer's IP address (spec.loadBalancerIP) if the Service is updated (for example, if ports are changed).
Even if the load balancer's IP address changes (see previous point), the spec.clusterIP remains constant.
Internal UDP load balancers don't support using sessionAffinity: ClientIP.
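
The UDP session affinity limitation can be caught before you apply a manifest. The following Python sketch assumes the Service manifest has already been loaded into a dictionary (for example, from JSON); the function name is hypothetical.

```python
def has_unsupported_udp_affinity(service):
    """Return True if a Service combines UDP ports with sessionAffinity: ClientIP."""
    spec = service.get("spec", {})
    if spec.get("sessionAffinity", "None") != "ClientIP":
        return False
    # Kubernetes treats a port without an explicit protocol as TCP.
    return any(
        port.get("protocol", "TCP") == "UDP" for port in spec.get("ports", [])
    )


# Hypothetical manifest fragment used only for illustration.
udp_service = {
    "spec": {
        "type": "LoadBalancer",
        "sessionAffinity": "ClientIP",
        "ports": [{"protocol": "UDP", "port": 9001}],
    }
}
print(has_unsupported_udp_affinity(udp_service))
```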

Before you begin

Before you start, make sure you have performed the following tasks:

Enable the Google Kubernetes Engine API.

If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
Note: For existing gcloud CLI installations, make sure to set the compute/region and compute/zone properties. By setting default locations, you can avoid errors in gcloud CLI like the following: One of [--zone, --region] must be supplied: Please specify location.

Enable GKE subsetting in a new Standard cluster

You can create a Standard cluster with GKE subsetting enabled using the Google Cloud CLI, the Google Cloud console, or Terraform. A cluster created with GKE subsetting enabled always uses GKE subsetting.

Console

Go to the Google Kubernetes Engine page in the Google Cloud console.

Click Create.
Configure your cluster as desired.
From the navigation pane, under Cluster, click Networking.
Select the Enable subsetting for L4 internal load balancers checkbox.
Click Create.

gcloud

```
gcloud container clusters create CLUSTER_NAME \
  --cluster-version=VERSION \
  --enable-l4-ilb-subsetting \
  --location=COMPUTE_LOCATION
```

Replace the following:

CLUSTER_NAME: the name of the new cluster.
VERSION: the GKE version, which must be 1.18.19-gke.1400 or later. You can also use the --release-channel option to select a release channel. The release channel must have a default version 1.18.19-gke.1400 or later.
COMPUTE_LOCATION: the Compute Engine location for the cluster.

If you want to use a non-default network or subnetwork, run the following command:

```
gcloud container clusters create CLUSTER_NAME \
  --cluster-version=VERSION \
  --network NETWORK_NAME \
  --subnetwork SUBNETWORK_NAME \
  --enable-l4-ilb-subsetting \
  --location=COMPUTE_LOCATION
```

Replace the following:

SUBNETWORK_NAME: the name of the new subnet. In GKE versions 1.22.4-gke.100 and later, you can specify a subnet in a different project by using the fully qualified resource URL for this field. You can get the fully qualified resource URL using the command gcloud compute networks subnets describe.
NETWORK_NAME: the name of the VPC network for the subnet.

Terraform

To create a Standard cluster with GKE subsetting enabled using Terraform, refer to the following example:

```
resource "google_container_cluster" "default" {
  name               = "gke-standard-regional-cluster"
  location           = "us-central1"
  initial_node_count = 1

  enable_l4_ilb_subsetting = true

  # Setting `deletion_protection` to `true` ensures that the cluster
  # cannot be accidentally deleted through Terraform.
  deletion_protection = false
}
```

To learn more about using Terraform, see Terraform support for GKE.

Enable GKE subsetting in an existing cluster

You can enable GKE subsetting for an existing cluster using the gcloud CLI or the Google Cloud console. You cannot disable GKE subsetting after you have enabled it.

Console

In the Google Cloud console, go to the Google Kubernetes Engine page.

In the cluster list, click the name of the cluster you want to modify.
Under Networking, next to the Subsetting for L4 Internal Load Balancers field, click Enable subsetting for L4 internal load balancers.
Select the Enable subsetting for L4 internal load balancers checkbox.
Click Save Changes.

gcloud

```
gcloud container clusters update CLUSTER_NAME \
    --enable-l4-ilb-subsetting
```

Replace the following:

CLUSTER_NAME: the name of the cluster.

Enabling GKE subsetting does not disrupt existing internal LoadBalancer Services. If you want to migrate existing internal LoadBalancer Services to use backend services with GCE_VM_IP NEGs as backends, you must deploy a replacement Service manifest. For more details, see Node grouping in the LoadBalancer Service concepts documentation.

Deploy a workload

The following manifest describes a Deployment that runs a sample web application container image.

Save the manifest as ilb-deployment.yaml:

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ilb-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ilb-deployment
  template:
    metadata:
      labels:
        app: ilb-deployment
    spec:
      containers:
      - name: hello-app
        image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
```

Apply the manifest to your cluster:
```
kubectl apply -f ilb-deployment.yaml
```

Create an internal LoadBalancer Service

(Optional) Disable automatic VPC firewall rules creation:

While GKE automatically creates VPC firewall rules to allow traffic to your internal load balancer, you can disable automatic VPC firewall rule creation and manage firewall rules on your own. You can disable automatic VPC firewall rule creation only if you have enabled GKE subsetting for your internal LoadBalancer Service. Managing VPC firewall rules yourself is optional; you can rely on the automatic rules.

Before you disable automatic VPC firewall rules creation, ensure that you define allow rules that permit traffic to reach your load balancer and application Pods.

For more information about managing automatic firewall rule creation, including how to disable it, see User-managed firewall rules for GKE LoadBalancer Services.

The following example creates an internal LoadBalancer Service using TCP port 80. GKE deploys an internal passthrough Network Load Balancer whose forwarding rule uses port 80 and forwards traffic to backend Pods on port 8080:

1. Save the manifest as ilb-svc.yaml:
```
apiVersion: v1
kind: Service
metadata:
  name: ilb-svc
  annotations:
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Cluster
  selector:
    app: ilb-deployment
  ports:
  - name: tcp-port
    protocol: TCP
    port: 80
    targetPort: 8080
```
  Your manifest must contain the following:
  - A name for the internal LoadBalancer Service, in this case ilb-svc.
  - An annotation that specifies that you require an internal LoadBalancer Service. For GKE versions 1.17 and later, use the annotation networking.gke.io/load-balancer-type: "Internal" as shown in the example manifest. For earlier versions, use cloud.google.com/load-balancer-type: "Internal" instead.
  - The type: LoadBalancer.
  - A spec: selector field to specify the Pods the Service should target, in this case app: ilb-deployment.
  - Port information:
    - The port represents the destination port on which the forwarding rule of the internal passthrough Network Load Balancer receives packets.
    - The targetPort must match a containerPort defined on each serving Pod.
    - The port and targetPort values don't need to be the same. Nodes always perform destination NAT, changing the destination load balancer forwarding rule IP address and port to a destination Pod IP address and targetPort. For more details, see Destination Network Address Translation on nodes in the LoadBalancer Service concepts documentation.
  Your manifest can contain the following:
  - spec.ipFamilyPolicy and ipFamilies to define how GKE allocates IP addresses to the Service. GKE supports either single-stack (IPv4 only or IPv6 only), or dual-stack IP LoadBalancer Services. A dual-stack LoadBalancer Service is implemented with two separate internal passthrough Network Load Balancer forwarding rules: one for IPv4 traffic and one for IPv6 traffic. The GKE dual-stack LoadBalancer Service is available in version 1.29 or later. To learn more, see IPv4/IPv6 dual-stack Services.
  For more information, see LoadBalancer Service parameters.
2. Apply the manifest to your cluster:
```
kubectl apply -f ilb-svc.yaml
```
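
The required manifest fields listed above can be checked before you run kubectl apply. The following Python sketch assumes the manifest has already been loaded into a dictionary; the function name and the problem messages are hypothetical, and the check cannot confirm that targetPort matches a containerPort on the serving Pods.

```python
# Either annotation key marks the Service as internal, depending on GKE version.
INTERNAL_ANNOTATIONS = (
    "networking.gke.io/load-balancer-type",  # GKE versions 1.17 and later
    "cloud.google.com/load-balancer-type",   # earlier versions
)


def manifest_problems(manifest):
    """Return a list of missing requirements; an empty list means none found."""
    problems = []
    metadata = manifest.get("metadata", {})
    spec = manifest.get("spec", {})
    if not metadata.get("name"):
        problems.append("metadata.name is missing")
    annotations = metadata.get("annotations", {})
    if not any(annotations.get(key) == "Internal" for key in INTERNAL_ANNOTATIONS):
        problems.append("internal load balancer annotation is missing")
    if spec.get("type") != "LoadBalancer":
        problems.append("spec.type must be LoadBalancer")
    if not spec.get("selector"):
        problems.append("spec.selector is missing")
    ports = spec.get("ports", [])
    if not ports:
        problems.append("spec.ports is empty")
    for entry in ports:
        if "port" not in entry:
            problems.append("a ports entry has no port")
    return problems


# The ilb-svc manifest from this section, expressed as a dictionary.
ilb_svc = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {
        "name": "ilb-svc",
        "annotations": {"networking.gke.io/load-balancer-type": "Internal"},
    },
    "spec": {
        "type": "LoadBalancer",
        "externalTrafficPolicy": "Cluster",
        "selector": {"app": "ilb-deployment"},
        "ports": [
            {"name": "tcp-port", "protocol": "TCP", "port": 80, "targetPort": 8080}
        ],
    },
}
print(manifest_problems(ilb_svc))
```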

Get detailed information about the Service:

```
kubectl get service ilb-svc --output yaml
```

The output is similar to the following:

```
apiVersion: v1
kind: Service
metadata:
  annotations:
    cloud.google.com/neg: '{"ingress":true}'
    cloud.google.com/neg-status: '{"network_endpoint_groups":{"0":"k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r"},"zones":["ZONE_NAME","ZONE_NAME","ZONE_NAME"]}'
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"networking.gke.io/load-balancer-type":"Internal"},"name":"ilb-svc","namespace":"default"},"spec":{"externalTrafficPolicy":"Cluster","ports":[{"name":"tcp-port","port":80,"protocol":"TCP","targetPort":8080}],"selector":{"app":"ilb-deployment"},"type":"LoadBalancer"}}
    networking.gke.io/load-balancer-type: Internal
    service.kubernetes.io/backend-service: k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r
    service.kubernetes.io/firewall-rule: k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r
    service.kubernetes.io/firewall-rule-for-hc: k8s2-pn2h9n5f-l4-shared-hc-fw
    service.kubernetes.io/healthcheck: k8s2-pn2h9n5f-l4-shared-hc
    service.kubernetes.io/tcp-forwarding-rule: k8s2-tcp-pn2h9n5f-default-ilb-svc-3bei4n1r
  creationTimestamp: "2022-07-22T17:26:04Z"
  finalizers:
  - gke.networking.io/l4-ilb-v2
  - service.kubernetes.io/load-balancer-cleanup
  name: ilb-svc
  namespace: default
  resourceVersion: "51666"
  uid: d7a1a865-7972-44e1-aa9e-db5be23d6567
spec:
  allocateLoadBalancerNodePorts: true
  clusterIP: 10.88.2.141
  clusterIPs:
  - 10.88.2.141
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: tcp-port
    nodePort: 30521
    port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: ilb-deployment
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - ip: 10.128.15.245
```

The output has the following attributes:

The IP address of the internal passthrough Network Load Balancer's forwarding rule is included in status.loadBalancer.ingress. This IP address is different from the value of clusterIP. In this example, the load balancer's forwarding rule IP address is 10.128.15.245.
Any Pod that has the label app: ilb-deployment is a serving Pod for this Service. These are the Pods that receive packets routed by the internal passthrough Network Load Balancer.
Clients call the Service by using this loadBalancer IP address and the TCP destination port specified in the port field of the Service manifest. For complete details about how packets are routed once received by a node, see Packet processing.
GKE assigned a nodePort to the Service; in this example, port 30521 is assigned. The nodePort is not relevant to the internal passthrough Network Load Balancer.
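
If you need the forwarding rule IP address in a script, you can read it from the same Service object. The following Python sketch assumes you have captured the output of kubectl get service ilb-svc --output json; the function name is hypothetical, and the sample below is trimmed from the example output above.

```python
import json


def load_balancer_ips(service):
    """Return the IP addresses listed in status.loadBalancer.ingress."""
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress", [])
    return [entry["ip"] for entry in ingress if "ip" in entry]


# Trimmed sample based on the example output in this section.
sample = json.loads("""
{
  "spec": {"clusterIP": "10.88.2.141", "type": "LoadBalancer"},
  "status": {"loadBalancer": {"ingress": [{"ip": "10.128.15.245"}]}}
}
""")
print(load_balancer_ips(sample))
```

An empty list means that no address is recorded in the Service status, which can be the case while the load balancer is still being provisioned.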

Inspect the Service network endpoint group:
```
kubectl get svc ilb-svc -o=jsonpath="{.metadata.annotations.cloud\.google\.com/neg-status}"
```
The output is similar to the following:
```
{"network_endpoint_groups":{"0":"k8s2-knlc4c77-default-ilb-svc-ua5ugas0"},"zones":["ZONE_NAME"]}
```
The response indicates that GKE has created a network endpoint group named k8s2-knlc4c77-default-ilb-svc-ua5ugas0. This annotation is present in Services of type LoadBalancer that use GKE subsetting and is not present in Services that do not use GKE subsetting.
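
Because the annotation value is a JSON string, a script can decode it to get the NEG names and zones. The following Python sketch assumes you pass in the Service's metadata.annotations as a dictionary; the function name is hypothetical.

```python
import json

NEG_STATUS_KEY = "cloud.google.com/neg-status"


def neg_names_and_zones(annotations):
    """Decode the neg-status annotation into (NEG names, zones).

    Returns None when the annotation is absent, which indicates that the
    Service does not use GKE subsetting.
    """
    raw = annotations.get(NEG_STATUS_KEY)
    if raw is None:
        return None
    status = json.loads(raw)
    names = sorted(set(status.get("network_endpoint_groups", {}).values()))
    return names, status.get("zones", [])


# Annotation value copied from the example output above.
annotations = {
    NEG_STATUS_KEY: '{"network_endpoint_groups":{"0":"k8s2-knlc4c77-default-ilb-svc-ua5ugas0"},"zones":["ZONE_NAME"]}'
}
print(neg_names_and_zones(annotations))
```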

Verify internal passthrough Network Load Balancer components

The internal passthrough Network Load Balancer's forwarding rule IP address is 10.128.15.245 in the example included in the Create an internal LoadBalancer Service section. You can see this forwarding rule is included in the list of forwarding rules in the cluster's project by using the Google Cloud CLI:

```
gcloud compute forwarding-rules list --filter="loadBalancingScheme=INTERNAL"
```

The output includes the relevant internal passthrough Network Load Balancer forwarding rule, its IP address, and the backend service referenced by the forwarding rule (k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r in this example).

```
NAME                          ... IP_ADDRESS  ... TARGET
...
k8s2-tcp-pn2h9n5f-default-ilb-svc-3bei4n1r   10.128.15.245   ZONE_NAME/backendServices/k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r
```

You can describe the load balancer's backend service by using the Google Cloud CLI:

```
gcloud compute backend-services describe k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r --region=COMPUTE_REGION
```

Replace COMPUTE_REGION with the compute region of the backend service.

The output includes the backend GCE_VM_IP NEG or NEGs for the Service (k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r in this example):

```
backends:
- balancingMode: CONNECTION
  group: .../ZONE_NAME/networkEndpointGroups/k8s2-pn2h9n5f-default-ilb-svc-3bei4n1r
...
kind: compute#backendService
loadBalancingScheme: INTERNAL
name: aae3e263abe0911e9b32a42010a80008
...
```

To determine the list of nodes in a subset for a service, use the following command:

```
gcloud compute network-endpoint-groups list-network-endpoints NEG_NAME \
    --zone=COMPUTE_ZONE
```

Replace the following:

NEG_NAME: the name of the network endpoint group created by the GKE controller.
COMPUTE_ZONE: the compute zone of the network endpoint group to operate on.

To determine the list of healthy nodes for an internal passthrough Network Load Balancer, use the following command:

```
gcloud compute backend-services get-health SERVICE_NAME \
    --region=COMPUTE_REGION
```

Replace the following:

SERVICE_NAME: the name of the backend service. This value is the same as the name of the network endpoint group created by the GKE controller.
COMPUTE_REGION: the compute region of the backend service to operate on.

Test connectivity to the internal passthrough Network Load Balancer

Run the following command from a client, such as a VM, that is in the cluster's VPC network and in the same region as the cluster:

```
curl LOAD_BALANCER_IP:80
```

Replace LOAD_BALANCER_IP with the load balancer's forwarding rule IP address.

The response shows the output of ilb-deployment:

```
Hello, world!
Version: 1.0.0
Hostname: ilb-deployment-77b45987f7-pw54n
```

The internal passthrough Network Load Balancer is only accessible within the same VPC network (or a connected network). By default, the load balancer's forwarding rule has global access disabled, so client VMs, Cloud VPN tunnels, or Cloud Interconnect attachments (VLANs) must be located in the same region as the internal passthrough Network Load Balancer. To support clients in all regions, you can enable global access on the load balancer's forwarding rule by including the global access annotation in the Service manifest.
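
As an illustration of the global access annotation, the following Python sketch adds it to a Service manifest that is held as a dictionary. The annotation key shown here, networking.gke.io/internal-load-balancer-allow-global-access, comes from GKE's LoadBalancer Service parameters documentation rather than from this page, so confirm it against that reference. The function name is hypothetical.

```python
import copy

GLOBAL_ACCESS_ANNOTATION = (
    "networking.gke.io/internal-load-balancer-allow-global-access"
)


def with_global_access(manifest):
    """Return a copy of the manifest with the global access annotation set."""
    updated = copy.deepcopy(manifest)
    annotations = updated.setdefault("metadata", {}).setdefault("annotations", {})
    # Kubernetes annotation values are strings, so use "true" and not True.
    annotations[GLOBAL_ACCESS_ANNOTATION] = "true"
    return updated


base = {
    "metadata": {
        "name": "ilb-svc",
        "annotations": {"networking.gke.io/load-balancer-type": "Internal"},
    }
}
print(with_global_access(base)["metadata"]["annotations"])
```

The function returns a copy so that the original manifest stays unchanged until you decide to apply the updated one.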

Delete the internal LoadBalancer Service and load balancer resources

You can delete the Deployment and Service using kubectl delete or the Google Cloud console.

kubectl

Delete the Deployment

To delete the Deployment, run the following command:

```
kubectl delete deployment ilb-deployment
```

Delete the Service

To delete the Service, run the following command:

```
kubectl delete service ilb-svc
```

Console

Delete the Deployment

To delete the Deployment, perform the following steps:

Go to the Workloads page in the Google Cloud console.

Select the Deployment you want to delete, then click Delete.
When prompted to confirm, select the Delete Horizontal Pod Autoscaler associated with selected Deployment checkbox, then click Delete.

Delete the Service

To delete the Service, perform the following steps:

Go to the Services & Ingress page in the Google Cloud console.

Select the Service you want to delete, then click Delete.
When prompted to confirm, click Delete.

Shared IP

The internal passthrough Network Load Balancer allows multiple forwarding rules to share a virtual IP address. This is useful for expanding the number of simultaneous ports on the same IP address or for accepting UDP and TCP traffic on the same IP address. A shared IP address supports a maximum of 50 exposed ports. Shared IP addresses are supported natively on GKE clusters with internal LoadBalancer Services. When deploying, the Service's loadBalancerIP field indicates which IP address should be shared across Services.

Limitations

A shared IP for multiple load balancers has the following limitations and capabilities:

Each forwarding rule can have up to five ports (contiguous or non-contiguous), or it can be configured to match and forward traffic on all ports. If an internal LoadBalancer Service defines more than five ports, the forwarding rule is automatically set to match all ports.
A maximum of ten Services (forwarding rules) can share an IP address. This results in a maximum of 50 ports per shared IP.
Each forwarding rule that shares the same IP address must use a unique combination of protocols and ports. Therefore, every internal LoadBalancer Service must use a unique set of protocols and ports.
A combination of TCP-only and UDP-only Services is supported on the same shared IP address; however, you cannot expose both TCP and UDP ports in the same Service.
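
These limits can be checked for a planned set of Services before you deploy them. The following Python sketch is a simplified model with hypothetical function names and input shapes. It treats "unique combination of protocols and ports" as "no overlapping ports within the same protocol" and treats a match-all-ports rule as overlapping every other rule of that protocol; both are assumptions, not statements from this page.

```python
ALL_PORTS = "ALL"  # marker for a forwarding rule that matches all ports


def forwarding_rule(service_ports):
    """Map a Service's (protocol, port) pairs to a (protocol, ports) rule."""
    protocols = {protocol for protocol, _ in service_ports}
    if len(protocols) != 1:
        raise ValueError("a Service must expose exactly one protocol")
    ports = frozenset(port for _, port in service_ports)
    # More than five ports: the rule is set to match all ports.
    return protocols.pop(), (ports if len(ports) <= 5 else ALL_PORTS)


def shared_ip_problems(services):
    """services: maps a Service name to its list of (protocol, port) pairs."""
    problems = []
    if len(services) > 10:
        problems.append("more than ten Services share the IP address")
    seen = {}  # protocol -> list of (Service name, ports)
    for name, service_ports in services.items():
        try:
            protocol, ports = forwarding_rule(service_ports)
        except ValueError as error:
            problems.append(f"{name}: {error}")
            continue
        for other_name, other_ports in seen.get(protocol, []):
            overlap = ALL_PORTS in (ports, other_ports) or bool(ports & other_ports)
            if overlap:
                problems.append(f"{name} and {other_name} overlap on {protocol}")
        seen.setdefault(protocol, []).append((name, ports))
    return problems


# Five TCP ports and two UDP ports on one shared IP address.
plan = {
    "tcp-service": [("TCP", port) for port in range(8001, 8006)],
    "udp-service": [("UDP", 9001), ("UDP", 9002)],
}
print(shared_ip_problems(plan))
```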

Enabling Shared IP

To enable internal LoadBalancer Services to share a common IP address, follow these steps:

Create a static internal IP with --purpose SHARED_LOADBALANCER_VIP. An IP address must be created with this purpose to enable its ability to be shared. If you create the static internal IP address in a Shared VPC, you must create the IP address in the same service project as the instance that will use the IP address, even though the value of the IP address will come from the range of available IPs in a selected shared subnet of the Shared VPC network. Refer to reserving a static internal IP on the Provisioning Shared VPC page for more information.
Deploy up to ten internal LoadBalancer Services using this static IP in the loadBalancerIP field. The internal passthrough Network Load Balancers are reconciled by the GKE service controller and deploy using the same frontend IP.

The following example demonstrates how this is done to support multiple TCP and UDP ports against the same internal load balancer IP.

Create a static IP in the same region as your GKE cluster. The subnet must be the same subnet that the load balancer uses, which by default is the same subnet that is used by the GKE cluster node IPs.

If your cluster and the VPC network are in the same project:
```
gcloud compute addresses create IP_ADDR_NAME \
    --project=PROJECT_ID \
    --subnet=SUBNET \
    --addresses=IP_ADDRESS \
    --region=COMPUTE_REGION \
    --purpose=SHARED_LOADBALANCER_VIP
```
If your cluster is in a Shared VPC service project but uses a Shared VPC network in a host project:
```
gcloud compute addresses create IP_ADDR_NAME \
    --project=SERVICE_PROJECT_ID \
    --subnet=projects/HOST_PROJECT_ID/regions/COMPUTE_REGION/subnetworks/SUBNET \
    --addresses=IP_ADDRESS \
    --region=COMPUTE_REGION \
    --purpose=SHARED_LOADBALANCER_VIP
```
Replace the following:
- IP_ADDR_NAME: a name for the IP address object.
- SERVICE_PROJECT_ID: the ID of the service project.
- PROJECT_ID: the ID of your project (single project).
- HOST_PROJECT_ID: the ID of the Shared VPC host project.
- COMPUTE_REGION: the compute region containing the shared subnet.
- IP_ADDRESS: an unused internal IP address from the selected subnet's primary IP address range. If you omit the IP address, Google Cloud selects an unused internal IP address from the selected subnet's primary IP address range. To determine an automatically selected address, run gcloud compute addresses describe.
- SUBNET: the name of the shared subnet.
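
Before you reserve a specific address, you can confirm that it falls inside the subnet's primary IP address range. The following Python sketch uses the standard ipaddress module; the function name is hypothetical and the 10.128.0.0/20 range is an assumed example, so substitute your subnet's actual range. The check cannot tell whether the address is already in use.

```python
import ipaddress


def in_primary_range(ip_address, primary_range):
    """Return True if ip_address belongs to the CIDR range primary_range."""
    return ipaddress.ip_address(ip_address) in ipaddress.ip_network(primary_range)


# Assumed example range; replace it with your subnet's primary range.
print(in_primary_range("10.128.2.98", "10.128.0.0/20"))
```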

Save the following TCP Service configuration to a file named tcp-service.yaml and then deploy to your cluster. Replace IP_ADDRESS with the IP address you chose in the previous step.

```
apiVersion: v1
kind: Service
metadata:
  name: tcp-service
  namespace: default
  annotations:
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  loadBalancerIP: IP_ADDRESS
  selector:
    app: myapp
  ports:
  - name: 8001-to-8001
    protocol: TCP
    port: 8001
    targetPort: 8001
  - name: 8002-to-8002
    protocol: TCP
    port: 8002
    targetPort: 8002
  - name: 8003-to-8003
    protocol: TCP
    port: 8003
    targetPort: 8003
  - name: 8004-to-8004
    protocol: TCP
    port: 8004
    targetPort: 8004
  - name: 8005-to-8005
    protocol: TCP
    port: 8005
    targetPort: 8005
```

Apply this Service definition against your cluster:
```
kubectl apply -f tcp-service.yaml
```

Save the following UDP Service configuration to a file named udp-service.yaml and then deploy it. It also uses the IP_ADDRESS that you specified in the previous step.

```
apiVersion: v1
kind: Service
metadata:
  name: udp-service
  namespace: default
  annotations:
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  loadBalancerIP: IP_ADDRESS
  selector:
    app: my-udp-app
  ports:
  - name: 9001-to-9001
    protocol: UDP
    port: 9001
    targetPort: 9001
  - name: 9002-to-9002
    protocol: UDP
    port: 9002
    targetPort: 9002
```

Apply this file against your cluster:
```
kubectl apply -f udp-service.yaml
```

Validate that the VIP is shared among the load balancer forwarding rules by listing them and filtering for the static IP address. The output shows a UDP forwarding rule and a TCP forwarding rule that together listen on seven different ports on the shared IP_ADDRESS, which in this example is 10.128.2.98.

```
gcloud compute forwarding-rules list | grep 10.128.2.98
ab4d8205d655f4353a5cff5b224a0dde                         us-west1   10.128.2.98     UDP          us-west1/backendServices/ab4d8205d655f4353a5cff5b224a0dde
acd6eeaa00a35419c9530caeb6540435                         us-west1   10.128.2.98     TCP          us-west1/backendServices/acd6eeaa00a35419c9530caeb6540435
```

Known issues

Connection timeout every 10 minutes

Internal LoadBalancer Services created with GKE subsetting might observe traffic disruptions roughly every 10 minutes. This bug has been fixed in the following versions:

1.18.19-gke.1700 and later
1.19.10-gke.1000 and later
1.20.6-gke.1000 and later
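
To check a cluster version against this list in a script, compare it with the fixed version for its minor release. The following Python sketch uses a hypothetical function name and assumes version strings of the form MAJOR.MINOR.PATCH-gke.BUILD. It also assumes that minor releases after 1.20 include the fix and that releases before 1.18 do not.

```python
import re

# First fixed (patch, build) for each minor release listed above.
FIXED_IN = {
    (1, 18): (19, 1700),
    (1, 19): (10, 1000),
    (1, 20): (6, 1000),
}


def has_timeout_fix(version):
    """Return True if the GKE version includes the 10-minute timeout fix."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-gke\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized GKE version: {version!r}")
    major, minor, patch, build = (int(part) for part in match.groups())
    if (major, minor) in FIXED_IN:
        return (patch, build) >= FIXED_IN[(major, minor)]
    # Assumption: later minor releases carry the fix; earlier ones do not.
    return (major, minor) > (1, 20)


print(has_timeout_fix("1.18.19-gke.1400"))
print(has_timeout_fix("1.20.6-gke.1000"))
```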

Error creating load balancer in Standard tier

When you create an internal passthrough Network Load Balancer in a project with the project default network tier set to Standard, the following error message appears:

```
Error syncing load balancer: failed to ensure load balancer: googleapi: Error 400: STANDARD network tier (the project's default network tier) is not supported: Network tier other than PREMIUM is not supported for loadBalancingScheme=INTERNAL., badRequest
```

To resolve this issue in GKE versions earlier than 1.23.3-gke.900, configure the project default network tier to Premium.

This issue is resolved in GKE versions 1.23.3-gke.900 and later when GKE subsetting is enabled.

The GKE controller creates internal passthrough Network Load Balancers in the Premium network tier even if the project default network tier is set to Standard.

What's next

Learn about IP masquerade agent.

Learn about configuring authorized networks.