Setting Up Internal HTTP(S) Load Balancing for GKE pods

This document provides instructions for configuring Internal HTTP(S) Load Balancing for your services running in Google Kubernetes Engine (GKE) pods.

Before you begin

Before following the instructions in this guide, review the overview and prerequisite documents for Internal HTTP(S) Load Balancing.

Configuring Internal HTTP(S) Load Balancing with a GKE-based service

This section shows the configuration required for services that run on GKE pods. Client VMs connect to the IP address and port that you configure in the forwarding rule. When your client applications send traffic to this IP address and port, their requests are forwarded to your backend GKE pods according to your internal HTTP(S) load balancer's URL map.

The example on this page explicitly sets a reserved internal IP address for the internal forwarding rule, rather than allowing an ephemeral internal IP address to be allocated. Reserving an address is the recommended practice for forwarding rules.

Configuring backends: GKE cluster

This section creates a demonstration GKE cluster and a deployment that runs a simple web server serving its hostname. It also creates a ClusterIP service, which causes GKE to create a NEG automatically.
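The automatic NEG creation is driven by an annotation on the Service. As a sketch (the full manifest appears later in this guide), the relevant fragment of the Service metadata looks like this:

```yaml
# Service annotation that tells GKE to create a NEG
# whose endpoints expose the Service's port 80
metadata:
  annotations:
    cloud.google.com/neg: '{"exposed_ports":{"80":{}}}'
```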


To create a cluster using Cloud Console, perform the following steps:

  1. Go to the Google Kubernetes Engine menu in Cloud Console.

    Visit the Google Kubernetes Engine menu

  2. Click Create cluster.

  3. In the Name field, enter l7-ilb-cluster.

  4. For the Location type, select Zonal.

  5. For the Zone, select us-west1-b.

  6. Choose the default Master version.

  7. Expand Availability, networking, security, and additional features:

    1. Ensure that the Enable VPC-native (using alias IP) box is checked.
    2. For the Network, select lb-network.
    3. For the Node subnet, select backend-subnet.
    4. Make sure Load balancing is enabled by checking the box next to Enable HTTP load balancing.
  8. For the remaining fields, keep the default values or adjust them to meet your needs.

  9. Click Create.

When you use the Cloud Console to create the cluster, you must add the system-generated network tag to the proxy firewall filter:

  1. Find the network tag that GKE added to the nodes in the cluster. The tag is generated from the cluster hash.

    1. Go to the VM instances page.

      Go to the VM instances page

    2. In the Columns pulldown, select Network Tags.

    3. Copy the network tag for the GKE nodes.

  2. Edit the fw-allow-proxies firewall rule and add the tag.

    1. Go to the Firewall rules page in the Google Cloud Console.
      Go to the Firewall rules page
    2. Click the fw-allow-proxies firewall rule, and then click Edit.
    3. In the Target tags field, add the network tag that you copied in the previous step.
    4. Click Save.


  1. Create a GKE cluster with the gcloud container clusters create command.

    gcloud container clusters create l7-ilb-cluster \
      --zone=us-west1-b \
      --network=lb-network \
      --subnetwork=backend-subnet \
      --enable-ip-alias


Create a GKE cluster with the projects.zones.clusters.create method, replacing [project-id] with your project ID.

  {
    "cluster": {
      "name": "l7-ilb-cluster",
      "network": "projects/[project-id]/global/networks/lb-network",
      "subnetwork": "projects/[project-id]/regions/us-west1/subnetworks/backend-subnet",
      "initialClusterVersion": "1.11",
      "location": "us-west1-b",
      "nodePools": [{
        "name": "l7-ilb-node-pool",
        "initialNodeCount": 3
      }],
      "defaultMaxPodsConstraint": {
        "maxPodsPerNode": "110"
      },
      "ipAllocationPolicy": {
        "useIpAliases": true
      }
    }
  }

Getting credentials to operate the cluster

Use the gcloud container clusters get-credentials command.

gcloud container clusters get-credentials l7-ilb-cluster \
    --zone=us-west1-b

Defining a deployment with test containers that serve their hostname

Create hostname.yaml with the deployment and service specification.

cat << EOF > hostname.yaml
apiVersion: v1
kind: Service
metadata:
  name: hostname
  annotations:
    cloud.google.com/neg: '{"exposed_ports":{"80":{}}}'
spec:
  ports:
  - port: 80
    name: host1
    protocol: TCP
    targetPort: 8000
  selector:
    run: hostname
  type: ClusterIP
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    run: hostname
  name: hostname
spec:
  replicas: 3
  selector:
    matchLabels:
      run: hostname
  template:
    metadata:
      labels:
        run: hostname
    spec:
      containers:
      - image: k8s.gcr.io/serve_hostname:v1.4
        name: host1
        command:
        - /bin/sh
        - -c
        - /serve_hostname -http=true -udp=false -port=8000
        ports:
        - protocol: TCP
          containerPort: 8000
EOF

Applying the configuration

kubectl apply -f hostname.yaml

Verifying the deployment and GKE configuration

Make sure that the new service hostname is created and the application pod is running.

kubectl get svc

The example output is as follows:

NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
hostname   ClusterIP   10.x.x.x     <none>        80/TCP    41m

Note that your cluster IP address is likely different because GKE automatically creates the cluster IP range as a secondary IP range on the backend-subnet.

If the kubectl get svc command fails with credential/OAuth issues, run the gcloud auth application-default login command.

kubectl get pods

The example output is as follows:

NAME                        READY     STATUS    RESTARTS   AGE
hostname-6db459dcb9-896kh   1/1       Running   0          33m
hostname-6db459dcb9-k6ddk   1/1       Running   0          50m
hostname-6db459dcb9-x72kb   1/1       Running   0          33m

Note that your pod names are different.

You should see that the new service hostname was created, and pods for the hostname application are running.

Getting the name of the NEG

  1. Look up the name of the NEG by using the gcloud compute network-endpoint-groups list command, filtering by the cluster's zone and the name of the deployment:

    gcloud compute network-endpoint-groups list \
       --filter="us-west1-b AND hostname"

Examining the NEG configuration

  1. Examine details and list endpoints in the NEG with the gcloud compute network-endpoint-groups list-network-endpoints and gcloud compute network-endpoint-groups describe commands, replacing neg-name with the name of the NEG that you created.

    gcloud compute network-endpoint-groups describe neg-name \
        --zone=us-west1-b
    gcloud compute network-endpoint-groups list-network-endpoints neg-name \
        --zone=us-west1-b

Configuring the load balancer for GKE

The example demonstrates the following internal HTTP(S) load balancer configuration tasks:

  • Create a health check using the HTTP protocol
  • Create a regional internal managed backend service
  • Add the NEG as a backend to the backend service
  • Create a URL map
    • Make sure to refer to a regional URL map if a region is defined for the target HTTP(S) proxy. A regional URL map routes requests to a regional backend service based on rules that you define for the host and path of an incoming URL. A regional URL map can be referenced only by a regional target proxy in the same region.
  • Create a target proxy
  • Create a forwarding rule
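For illustration, a regional URL map that routes by host and path might look like the following sketch. The host example.net and the path-matcher name are hypothetical; the example in this guide uses only the default service.

```json
{
  "name": "l7-ilb-gke-map",
  "defaultService": "projects/[project-id]/regions/us-west1/backendServices/l7-ilb-gke-backend-service",
  "hostRules": [
    {
      "hosts": ["example.net"],
      "pathMatcher": "path-matcher-1"
    }
  ],
  "pathMatchers": [
    {
      "name": "path-matcher-1",
      "defaultService": "projects/[project-id]/regions/us-west1/backendServices/l7-ilb-gke-backend-service"
    }
  ]
}
```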


Select a load balancer type

  1. Go to the Load balancing page in the Google Cloud Console.
    Go to the Load balancing page
  2. Click Create load balancer.
  3. Under HTTP(S) Load Balancing, click Start configuration.
  4. Select Only between my VMs. This setting means that the load balancer is internal.
  5. Click Continue.

Prepare the load balancer

  1. For the Name of the load balancer, enter l7-ilb-gke-map.
  2. Ensure the Protocol is HTTP.
  3. For the Region, select us-west1.
  4. For the VPC network, select lb-network.
  5. Keep the window open to continue.

Reserve a proxy-only subnet

For Internal HTTP(S) Load Balancing, reserve a proxy subnet:

  1. Click Reserve a Subnet.
  2. For the Name, enter proxy-subnet.
  3. For the Network, select lb-network.
  4. For the Region, select us-west1.
  5. For the IP address range, enter
  6. Click Add.

Configure the backend service

  1. Click Backend configuration.
  2. From the Create or select backend services menu, select Create a backend service.
  3. Set the Name of the backend service to l7-ilb-gke-backend-service.
  4. Under Backend type, select Network endpoint groups.
  5. In the New backend card of the Backends section:
    1. Set the Network endpoint group to the NEG that was created by GKE. Refer to Getting the name of the NEG to determine its name.
    2. Enter a maximum rate of 5 RPS per endpoint. Google Cloud can exceed this maximum if necessary.
    3. Click Done.
  6. In the Health check section, choose Create a health check with the following parameters:
    1. Name: l7-ilb-gke-basic-check
    2. Protocol: HTTP
    3. Port specification: Serving port
    4. Click Save and Continue.
  7. Click Create.

Configure the URL map

  1. Click Host and path rules. Ensure that the l7-ilb-gke-backend-service is the only backend service for any unmatched host and any unmatched path.

Configure the frontend components

  1. Click Frontend configuration and edit the New frontend IP and port section.
  2. Set the Name to l7-ilb-gke-forwarding-rule.
  3. Set the Protocol to HTTP.
  4. Set the Subnet to backend-subnet.
  5. Choose Reserve a static internal IP address from the Internal IP pop-up button.
  6. In the panel that appears, provide the following details:
    1. Name: l7-ilb-gke-ip
    2. In the Static IP address section, select Let me choose.
    3. In the Custom IP address section, enter
    4. Click Reserve.
  7. Set the Port to 80.
  8. Click Done.

Complete the configuration

  1. Click Create.


  1. Define the HTTP health check with the gcloud compute health-checks create http command.

    gcloud beta compute health-checks create http l7-ilb-gke-basic-check \
    --region=us-west1 \
    --use-serving-port
  2. Define the backend service with the gcloud compute backend-services create command.

    gcloud beta compute backend-services create l7-ilb-gke-backend-service \
    --load-balancing-scheme=INTERNAL_MANAGED \
    --protocol=HTTP \
    --health-checks=l7-ilb-gke-basic-check \
    --health-checks-region=us-west1 \
    --region=us-west1
  3. Add NEG backends to the backend service with the gcloud compute backend-services add-backend command.

    gcloud beta compute backend-services add-backend l7-ilb-gke-backend-service \
     --network-endpoint-group=neg-name \
     --network-endpoint-group-zone=us-west1-b \
     --region=us-west1 \
     --balancing-mode=RATE \
     --max-rate-per-endpoint=5
  4. Create the URL map with the gcloud compute url-maps create command.

    gcloud beta compute url-maps create l7-ilb-gke-map \
    --default-service=l7-ilb-gke-backend-service \
    --region=us-west1
  5. Create the target proxy with the gcloud compute target-http-proxies create command.

    gcloud beta compute target-http-proxies create l7-ilb-gke-proxy \
    --url-map=l7-ilb-gke-map \
    --url-map-region=us-west1 \
    --region=us-west1
  6. Create the forwarding rule with the gcloud compute forwarding-rules create command.

    For custom networks, you must reference the subnet in the forwarding rule. Note that this is the VM subnet, not the proxy subnet.

    gcloud beta compute forwarding-rules create l7-ilb-gke-forwarding-rule \
    --load-balancing-scheme=INTERNAL_MANAGED \
    --network=lb-network \
    --subnet=backend-subnet \
    --address= \
    --ports=80 \
    --region=us-west1 \
    --target-http-proxy=l7-ilb-gke-proxy \
    --target-http-proxy-region=us-west1


Create the health check by making a POST request to the healthChecks.insert method, replacing [project-id] with your project ID.

{
  "name": "l7-ilb-gke-basic-check",
  "type": "HTTP",
  "httpHealthCheck": {
    "portSpecification": "USE_SERVING_PORT"
  }
}

Create the regional backend service by making a POST request to the regionBackendServices.insert method, replacing [project-id] with your project ID and [neg-name] with the name of the NEG that you created.

{
  "name": "l7-ilb-gke-backend-service",
  "backends": [
    {
      "group": "projects/[project-id]/zones/us-west1-b/networkEndpointGroups/[neg-name]",
      "balancingMode": "RATE",
      "maxRatePerEndpoint": 5
    }
  ],
  "healthChecks": [
    "projects/[project-id]/regions/us-west1/healthChecks/l7-ilb-gke-basic-check"
  ],
  "loadBalancingScheme": "INTERNAL_MANAGED"
}

Create the URL map by making a POST request to the regionUrlMaps.insert method, replacing [project-id] with your project ID.

{
  "name": "l7-ilb-gke-map",
  "defaultService": "projects/[project-id]/regions/us-west1/backendServices/l7-ilb-gke-backend-service"
}

Create the target HTTP proxy by making a POST request to the regionTargetHttpProxies.insert method, replacing [project-id] with your project ID.

{
  "name": "l7-ilb-gke-proxy",
  "urlMap": "projects/[project-id]/regions/us-west1/urlMaps/l7-ilb-gke-map",
  "region": "us-west1"
}

Create the forwarding rule by making a POST request to the forwardingRules.insert method, replacing [project-id] with your project ID.

{
  "name": "l7-ilb-gke-forwarding-rule",
  "IPAddress": "",
  "IPProtocol": "TCP",
  "portRange": "80-80",
  "target": "projects/[project-id]/regions/us-west1/targetHttpProxies/l7-ilb-gke-proxy",
  "loadBalancingScheme": "INTERNAL_MANAGED",
  "subnetwork": "projects/[project-id]/regions/us-west1/subnetworks/backend-subnet",
  "network": "projects/[project-id]/global/networks/lb-network",
  "networkTier": "PREMIUM"
}


Creating a VM instance in the zone to test connectivity

gcloud compute instances create l7-ilb-client-us-west1-b \
    --image-family=debian-9 \
    --image-project=debian-cloud \
    --zone=us-west1-b \
    --network=lb-network \
    --subnet=backend-subnet \
    --tags=l7-ilb-client

Allowing SSH access to the instance

gcloud compute firewall-rules create allow-ssh-to-l7-ilb-client \
    --network=lb-network \
    --target-tags=l7-ilb-client \
    --allow=tcp:22

Testing the load balancer

Log in to the client instance to verify that HTTP(S) services on the backends are reachable through the internal HTTP(S) load balancer's forwarding rule IP address, and that traffic is load balanced among the endpoints in the NEG.

Connecting via SSH to each client instance

gcloud compute ssh l7-ilb-client-us-west1-b \
    --zone=us-west1-b

Verifying that the IP is serving its hostname

Replace [ip-address] with the load balancer's forwarding rule IP address.

curl [ip-address]
Running 100 requests and confirming that they are load balanced

for i in {1..100}
do
    RESULTS="$RESULTS:$(curl --silent [ip-address])"
done
echo "***"
echo "*** Results of load-balancing to [ip-address]"
echo "***"
echo "$RESULTS" | tr ':' '\n' | grep -Ev "^$" | sort | uniq -c

Note that multiple proxies perform the load balancing, one for each curl command, and they don't coordinate their selection of backends. Therefore, in this test, the backends don't receive the same number of requests. However, over the long term (in other words, thousands to millions of requests), the fraction of requests received by each backend approaches an equal distribution.
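The effect described above can be sketched with a quick local simulation (no Google Cloud resources involved; the backend names are placeholders):

```shell
# 100 requests, each routed by an independent proxy that picks
# one of three backends at random.
RESULTS=$(for i in $(seq 1 100); do
  echo "backend-$(( RANDOM % 3 + 1 ))"
done)
# Counts over only 100 requests are typically uneven; as the
# number of requests grows, the fractions approach one third each.
echo "$RESULTS" | sort | uniq -c
```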

What's next
