Internal Load Balancing

This page explains how to create a Compute Engine internal load balancer on Google Kubernetes Engine.

Overview

Internal load balancing makes your cluster's services accessible to applications running on the same network but outside of the cluster. For example, if you run a cluster alongside some Compute Engine VM instances in the same network and you would like your cluster-internal services to be available to the cluster-external instances, you need to configure one of your cluster's Service resources to add an internal load balancer.

Without internal load balancing, you would need to set up external load balancers, create firewall rules to limit the access, and set up network routes to make the IP address of the application accessible outside of the cluster.

Internal load balancing gives your cluster a private (RFC 1918) LoadBalancer Ingress IP address, allocated from an IP range in your subnet, that receives traffic from clients on the same network within the same compute region.

You create an internal load balancer by using kubectl to create a Service resource with a cloud.google.com/load-balancer-type: "Internal" annotation and a LoadBalancer specification.

Pricing

You are charged according to Compute Engine's pricing model. For more information, refer to Compute Engine's internal load balancing pricing page.

Before you begin

To prepare for this task, perform the following steps:

  • Ensure that you have enabled the Google Kubernetes Engine API.
  • Ensure that you have installed the Cloud SDK.
  • Set your default project ID:
    gcloud config set project [PROJECT_ID]
  • If you are working with zonal clusters, set your default compute zone:
    gcloud config set compute/zone [COMPUTE_ZONE]
  • If you are working with regional clusters, set your default compute region:
    gcloud config set compute/region [COMPUTE_REGION]
  • Update gcloud to the latest version:
    gcloud components update

Creating an Internal Load Balancer

The following sections explain how to create an internal load balancer using a Service. Internal load balancers support Service parameters, such as externalTrafficPolicy, sessionAffinity, and loadBalancerSourceRanges.
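
For instance, a Service spec might combine these parameters as follows; the values shown are illustrative only:

spec:
  type: LoadBalancer
  externalTrafficPolicy: Local # route traffic only to node-local Pods and preserve client source IPs
  sessionAffinity: ClientIP # send a given client's requests to the same Pod
  loadBalancerSourceRanges:
  - 10.0.0.0/8 # restrict access to this RFC 1918 range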

Writing the Service Configuration File

The following is an example of a Service, service.yaml, that creates an internal load balancer:

apiVersion: v1
kind: Service
metadata:
  name: [SERVICE_NAME]
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
  labels:
    [KEY]: [VALUE]
spec:
  type: LoadBalancer
  loadBalancerIP: [IP_ADDRESS] # if omitted, an IP is generated
  loadBalancerSourceRanges:
  - [IP_RANGE] # defaults to 0.0.0.0/0
  ports:
  - name: [PORT_NAME]
    port: 9000
    protocol: TCP # default; can also specify UDP
  selector:
    [KEY]: [VALUE] # label selector for Pods to target

Your Service configuration file must contain the following:

  • [SERVICE_NAME], the name you choose for the Service
  • An annotation, cloud.google.com/load-balancer-type: "Internal", which specifies that an internal load balancer is to be configured
  • The type field, set to LoadBalancer, and the ports field

You should also include the following:

  • a spec: loadBalancerSourceRanges array to specify one or more RFC 1918 ranges used by your VPC Networks, Subnetworks, or VPN Gateways. loadBalancerSourceRanges restricts traffic through the load balancer to the IPs specified in this field. If you do not set this field, it defaults to 0.0.0.0/0, which allows all IPv4 traffic to reach the nodes.
  • a spec: selector field to specify the Pods the Service should target. For example, the selector might target Pods labelled app: web.

You can also include the following optional fields:

  • spec: loadBalancerIP enables you to choose a specific IP address for the load balancer. The IP address must not be in use by another internal load balancer or Service. If omitted, an ephemeral IP is assigned. For more information about reserving private IP addresses within subnets, see Reserving a Static Internal IP Address.
  • spec: ports: protocol defines the network protocol the internal load balancer's port should use. If omitted, the port uses TCP.

For more information about configuring loadBalancerSourceRanges to restrict access to your internal load balancer, refer to Configure Your Cloud Provider's Firewalls. For more information about the Service specification, see the Service API reference.
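
As a concrete illustration, the following manifest fills in the template with hypothetical values; the Service name, labels, port, and IP range are examples only:

apiVersion: v1
kind: Service
metadata:
  name: echo-internal
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
  labels:
    app: echo
spec:
  type: LoadBalancer
  loadBalancerSourceRanges:
  - 10.128.0.0/20 # allow only this subnet to reach the load balancer
  ports:
  - name: tcp-9000
    port: 9000
    protocol: TCP
  selector:
    app: echo # target Pods labelled app: echo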

Deploying the Service

To create the internal load balancer, run the following command in your shell or terminal window:

kubectl apply -f service.yaml
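
Provisioning can take a minute or two. To watch the Service until the load balancer's IP address is assigned, you can run:

kubectl get service [SERVICE_NAME] --watch

Press Ctrl+C to stop watching once the EXTERNAL-IP column shows an address.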

Inspecting the Service

After deployment, inspect the Service to verify that it has been configured successfully.

gcloud

To inspect the internal load balancer, run the following command:

kubectl describe service [SERVICE_NAME]

The command's output is similar to the following:

Name:     [SERVICE_NAME]
Namespace:    default
Labels:     app=echo
Annotations:    cloud.google.com/load-balancer-type=Internal
Selector:   app=echo
Type:     LoadBalancer
IP:     10.0.146.226
LoadBalancer Ingress: 10.128.0.6
Port:       9000/TCP
NodePort:   30387/TCP
Endpoints:    10.100.1.10:80,10.100.2.10:80,10.100.3.8:80
Session Affinity: ClientIP

In this output, IP is the Service's cluster IP address, and LoadBalancer Ingress is the internal load balancer's IP address.
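
If you need only the load balancer's IP address, for example in a script, you can extract it with a jsonpath query:

kubectl get service [SERVICE_NAME] -o jsonpath='{.status.loadBalancer.ingress[0].ip}'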

Console

To inspect the internal load balancer, perform the following steps:

  1. Visit the Google Kubernetes Engine Services menu in GCP Console.

  2. Select the desired Service.

The Service details menu includes the following:

  • External endpoints
  • Cluster IP
  • Load balancer IP
  • A list of Pods served by the Service

Using the Internal Load Balancer

You can access the Service from within the cluster using the cluster IP address. To access the Service from outside the cluster, use the LoadBalancer Ingress IP address.
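
For example, using the LoadBalancer Ingress address from the sample output above, and assuming the Pods behind the Service serve HTTP on the Service's port, a Compute Engine VM in the same network and region could reach the Service with:

curl http://10.128.0.6:9000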

Considerations for Existing Ingresses

If your cluster has an existing Ingress resource, that resource must use the balancing mode RATE. UTILIZATION balancing mode is not compatible with internal load balancers.

BackendService resources created by earlier Kubernetes Ingress objects had no balancing mode specified. By default, the API used the UTILIZATION balancing mode for HTTP load balancers. However, internal load balancers cannot share instance groups with other load balancers that use UTILIZATION.

To ensure compatibility with an internal load balancer and Ingress resources, you may need to perform some manual steps.

Determining if your Ingress is Compatible

To determine if your Ingress is compatible, run the following commands from your shell or terminal window:

GROUPNAME=`kubectl get configmaps ingress-uid -o jsonpath='k8s-ig--{.data.uid}' --namespace=kube-system`
gcloud compute backend-services list --format="table(name,backends[].balancingMode,backends[].group)" | grep $GROUPNAME

The first command sets a shell variable, GROUPNAME, to your cluster's instance group name. The second command lists your project's Compute Engine BackendService resources and filters the results by the contents of $GROUPNAME.

The output is similar to the following:

k8s-be-31210--...  [u'RATE']       us-central1-b/instanceGroups/k8s-ig--...
k8s-be-32125--...  [u'RATE']       us-central1-b/instanceGroups/k8s-ig--...

If the output returns RATE entries or zero entries, internal load balancers are compatible and no additional work is needed.

If the output returns entries marked with UTILIZATION, your Ingresses are not compatible.

Updating your Existing Ingresses

An Ingress's balancing mode can change only when there are no existing HTTP(S) load balancers pointing to the cluster.

To update your Ingress resources to be compatible with an internal load balancer, you can create a new cluster running Kubernetes version 1.7.2 or higher, then migrate your services to that cluster. Migrating to the new cluster ensures that no Ingresses can exist with the incompatible balancing mode.
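
For example, you might create the replacement cluster with a command like the following. The cluster name here is illustrative, and the available versions depend on your project; run gcloud container get-server-config to list them:

gcloud container clusters create [NEW_CLUSTER_NAME] --cluster-version=1.7.2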

Restrictions for Internal Load Balancers

  • Your master and nodes must be running Kubernetes version 1.7.2 or higher.
  • Internal load balancers are only accessible from within the same network and region.
  • Internal load balancer ports can only serve traffic on one type of protocol, TCP or UDP. The internal load balancer uses the protocol of the first port specified in the Service definition (see the example after this list).
  • Internal load balancers do not support using UDP and sessionAffinity: ClientIP together.
  • For clusters running Kubernetes version 1.7.4 or later, you can use internal load balancers with custom-mode subnets in addition to auto-mode subnets.
  • For clusters running Kubernetes 1.7.X, internal load balancer IP addresses cannot be reserved, although the clusterIP remains unchanged. Changes made to the Service's ports, protocols, or session affinity may cause the internal load balancer's IP address to change.
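
As an example of the single-protocol restriction, a Service exposing multiple ports through an internal load balancer should declare the same protocol for every port. The port names and numbers below are hypothetical:

ports:
- name: http
  port: 80
  protocol: TCP
- name: metrics
  port: 9090
  protocol: TCP # must match the protocol of the first port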

Limits

  • Each cluster node is a backend instance and counts against the backend instance limit. For example, if a cluster contains 300 nodes and the limit is 250 backend instances, only 250 instances receive traffic. This may adversely affect Services with externalTrafficPolicy set to Local.
  • A maximum of 50 internal load balancer forwarding rules is allowed per network.

For information on the limitations of internal load balancers, see the Limits section of the Compute Engine internal load balancing page.
