Container-native load balancing with standalone zonal NEGs

This page shows you how to create a Kubernetes Service that is backed by a zonal network endpoint group (NEG).

See Container-native load balancing for information on the benefits, requirements, and limitations of container-native load balancing.

Overview

A network endpoint group (NEG) represents a group of backends served by a load balancer. NEGs are lists of IP addresses that are managed by a NEG controller, and are used by Google Cloud load balancers. IP addresses in a NEG can be primary or secondary IP addresses of a VM, which means they can be Pod IPs. This enables container-native load balancing that sends traffic directly to Pods from a Google Cloud load balancer.

The following diagram describes how Kubernetes API objects correspond to Compute Engine objects.

Kubernetes Services correspond to Compute Engine network endpoint groups,
while Kubernetes Pods correspond to Compute Engine network endpoints. The NEG
controller component of the GKE master manages this.

Ingress with NEGs

When NEGs are used with GKE Ingress, the Ingress controller facilitates the creation of all aspects of the L7 load balancer. This includes creating the virtual IP address, forwarding rules, health checks, firewall rules, and more.

Ingress is the recommended way to use container-native load balancing as it has many features that simplify the management of NEGs. Standalone NEGs are an option if NEGs managed by Ingress do not serve your use case.

Standalone NEGs

When NEGs are deployed with load balancers provisioned by anything other than Ingress, they are considered standalone NEGs. Standalone NEGs are deployed and managed through the NEG controller, but the forwarding rules, health checks, and other load balancing objects are deployed manually.

Standalone NEGs do not conflict with Ingress-enabled container-native load balancing.

The following illustration shows the differences in how the load balancing objects are deployed in each scenario:

With both standalone NEGs and Ingress-managed NEGs, the NEG controller on the
GKE master manages the NEG and network endpoint objects. With standalone NEGs, every other component is
managed by the user, as described in the previous paragraphs.

Preventing leaked NEGs

With standalone NEGs, you are responsible for managing the lifecycles of NEGs and the resources that make up the load balancer. You could leak NEGs in these ways:

  • When a GKE Service is deleted, the associated NEG is not garbage collected if the NEG is still referenced by a backend service. Dereference the NEG from the backend service to allow NEG deletion (see the sketch after this list).
  • When a cluster is deleted, standalone NEGs are not deleted.
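
If you need to clean up a leaked NEG, the following is a minimal sketch, assuming the backend service name (my-bes) and zone (us-central1-a) used later in this guide; neg-name is a placeholder for the leaked NEG's name:

gcloud compute backend-services remove-backend my-bes --global \
    --network-endpoint-group=neg-name \
    --network-endpoint-group-zone=us-central1-a

gcloud compute network-endpoint-groups delete neg-name \
    --zone=us-central1-a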

Use cases of standalone NEGs

Standalone NEGs have several critical uses. Standalone NEGs are far more flexible than Ingress (used with or without NEGs), which defines a specific set of load balancing objects that were chosen in an opinionated way to make them easy to use.

Use cases for standalone NEGs include:

Heterogeneous services of containers and VMs

NEGs can contain both VM and container IP addresses. This means a single virtual IP address can point to a backend that consists of both Kubernetes and non-Kubernetes workloads. This can also be used to migrate existing workloads to a GKE cluster.

Standalone NEGs can point to VM IPs, which makes it possible to manually configure load balancers whose backends are composed of both VMs and containers behind the same service VIP.

Customized Ingress controllers

You can use a customized Ingress controller (or no Ingress controller) to configure load balancers that target standalone NEGs.

Use Traffic Director with GKE

Traffic Director uses standalone NEGs to provide container-native load balancing for the managed service mesh.

Use TCP Proxy Load Balancing with GKE

You can use standalone NEGs to load balance directly to containers with the TCP proxy load balancer, which is not natively supported by Kubernetes or GKE.
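
The following commands are a minimal sketch of the backend side of such a setup, not a complete walkthrough. They assume a standalone NEG named neg-name in us-central1-a and use hypothetical resource names (tcp-basic-check, my-tcp-bes); you still need to create the target TCP proxy and forwarding rule separately.

gcloud compute health-checks create tcp tcp-basic-check \
    --use-serving-port

gcloud compute backend-services create my-tcp-bes \
    --protocol=TCP \
    --health-checks=tcp-basic-check \
    --global

gcloud compute backend-services add-backend my-tcp-bes --global \
    --network-endpoint-group=neg-name \
    --network-endpoint-group-zone=us-central1-a \
    --balancing-mode=CONNECTION \
    --max-connections-per-endpoint=100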

Pod readiness

Readiness gates are an extensibility feature of Kubernetes that enables the injection of extra feedback or signals into the PodStatus so that the Pod can transition to the Ready state. The NEG controller manages a custom readiness gate to ensure that the full network path, from the Compute Engine load balancer to the Pod, is functional. Pod readiness gates in GKE are explained in Container-native load balancing.

Ingress with NEGs deploys and manages Compute Engine health checks on behalf of the load balancer. However, standalone NEGs make no assumptions about Compute Engine health checks, because those are expected to be deployed and managed separately. Always configure Compute Engine health checks along with the load balancer to prevent traffic from being sent to backends that are not ready to receive it. If there is no health check status associated with the NEG (usually because no health check is configured), the NEG controller marks the Pod's readiness gate value as True when its corresponding endpoint is programmed in the NEG.
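
To see the readiness gate that the NEG controller injects, you can list the gates on the Pods backing a Service. This is a quick check, assuming the neg-demo-app Deployment created later in this guide and a GKE version that supports Pod readiness feedback; the gate reported by GKE is cloud.google.com/load-balancer-neg-ready:

kubectl get pods -l run=neg-demo-app \
    -o custom-columns='NAME:.metadata.name,READINESS_GATES:.spec.readinessGates[*].conditionType'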

Requirements

Standalone NEGs are available in GKE 1.10 and higher. Pod readiness feedback is enabled for standalone NEGs in version 1.16.4 and higher.

Your cluster must be VPC-native. To learn more, see Creating VPC-native clusters using Alias IPs.

Your cluster must have HTTP load-balancing enabled. GKE clusters have HTTP load-balancing enabled by default; you must not disable it.

Before you begin

Before you start, make sure you have performed the following tasks:

Set up default gcloud settings using one of the following methods:

  • Using gcloud init, if you want to be walked through setting defaults.
  • Using gcloud config, to individually set your project ID, zone, and region.

Using gcloud init

If you receive the error One of [--zone, --region] must be supplied: Please specify location, complete this section.

  1. Run gcloud init and follow the directions:

    gcloud init

    If you are using SSH on a remote server, use the --console-only flag to prevent the command from launching a browser:

    gcloud init --console-only
  2. Follow the instructions to authorize gcloud to use your Google Cloud account.
  3. Create a new configuration or select an existing one.
  4. Choose a Google Cloud project.
  5. Choose a default Compute Engine zone.

Using gcloud config

  • Set your default project ID:
    gcloud config set project project-id
  • If you are working with zonal clusters, set your default compute zone:
    gcloud config set compute/zone compute-zone
  • If you are working with regional clusters, set your default compute region:
    gcloud config set compute/region compute-region
  • Update gcloud to the latest version:
    gcloud components update

Using standalone NEGs

The instructions below show how to use standalone NEGs with an external HTTP load balancer on GKE.

This involves creating several objects:

  • A Deployment that creates and manages Pods.
  • A Service that creates a NEG.
  • A load balancer created with the Compute Engine API. This differs from using NEGs with Ingress, where Ingress creates and configures the load balancer for you. With standalone NEGs, you are responsible for associating the NEG and the backend service to connect the Pods to the load balancer. The load balancer consists of several components, shown in the diagram below:

The components of a load balancer are a forwarding rule, target HTTP proxy,
URL map, health check, and backend service. This directs traffic to a NEG that
contains Pod IP addresses.

Create a VPC-native cluster

To use container-native load balancing, you must create a cluster with alias IPs enabled. This cluster:

  • Must run Google Kubernetes Engine version 1.16.4 or later.
  • Must be a VPC-native cluster.
  • Must have the HTTP load-balancing add-on enabled. GKE clusters have HTTP load-balancing enabled by default; you must not disable it.

This command creates a cluster, neg-demo-cluster, in zone us-central1-a, with an autoprovisioned subnetwork:

gcloud container clusters create neg-demo-cluster \
    --enable-ip-alias \
    --create-subnetwork="" \
    --network=default \
    --zone=us-central1-a \
    --cluster-version version

where the cluster version must be 1.16.4 or later.
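
Optionally, confirm that alias IPs are enabled on the new cluster before relying on NEGs. This check uses the cluster name and zone from the command above and prints True for a VPC-native cluster:

gcloud container clusters describe neg-demo-cluster \
    --zone=us-central1-a \
    --format="value(ipAllocationPolicy.useIpAliases)"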

Create a Deployment

The following manifests specify a Deployment, neg-demo-app, that runs three instances of a containerized HTTP server. The HTTP server responds to requests with the hostname of the application server, which is the name of the Pod the server is running on.

We recommend that you use Pod readiness feedback if it is available in the version of GKE you are using. See the Pod readiness section above for more information, and see the Requirements section for the GKE version requirements for using Pod readiness feedback. Consider upgrading your cluster to use Pod readiness feedback.

Using Pod readiness feedback

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: neg-demo-app # Label for the Deployment
  name: neg-demo-app # Name of Deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      run: neg-demo-app
  template: # Pod template
    metadata:
      labels:
        run: neg-demo-app # Labels Pods from this Deployment
    spec: # Pod specification; each Pod created by this Deployment has this specification
      containers:
      - image: k8s.gcr.io/serve_hostname:v1.4 # Application to run in Deployment's Pods
        name: hostname
  

Using a hardcoded delay

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: neg-demo-app # Label for the Deployment
  name: neg-demo-app # Name of Deployment
spec:
  minReadySeconds: 60 # Number of seconds to wait after a Pod is created and its status is Ready
  replicas: 3
  selector:
    matchLabels:
      run: neg-demo-app
  template: # Pod template
    metadata:
      labels:
        run: neg-demo-app # Labels Pods from this Deployment
    spec: # Pod specification; each Pod created by this Deployment has this specification
      containers:
      - image: k8s.gcr.io/serve_hostname:v1.4 # Application to run in Deployment's Pods
        name: hostname
      # Note: The following line is necessary only on clusters running GKE v1.11 and lower.
      # For details, see https://cloud.google.com/kubernetes-engine/docs/how-to/container-native-load-balancing#align_rollouts
      terminationGracePeriodSeconds: 60 # Number of seconds to wait for connections to terminate before shutting down Pods
  

Save this manifest as neg-demo-app.yaml, then create the Deployment by running the following command:

kubectl apply -f neg-demo-app.yaml
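
Optionally, wait for the rollout to complete and confirm that all three replicas are available before creating the Service:

kubectl rollout status deployment neg-demo-app
kubectl get deployment neg-demo-app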

Create a Service

The manifest below specifies a Service, neg-demo-svc.

  • Any Pod with the label run: neg-demo-app is a member of this Service.
  • The Service has one ServicePort field with port 80.
  • The cloud.google.com/neg annotation specifies that port 80 is associated with a NEG.
  • Each member Pod must have a container that is listening on TCP port 9376.

apiVersion: v1
kind: Service
metadata:
  name: neg-demo-svc
  annotations:
    cloud.google.com/neg: '{"exposed_ports": {"80":{}}}'
spec:
  type: ClusterIP
  selector:
    run: neg-demo-app # Selects Pods labelled run: neg-demo-app
  ports:
  - port: 80
    protocol: TCP
    targetPort: 9376

Save this manifest as neg-demo-svc.yaml, then create the Service by running the following command:

kubectl apply -f neg-demo-svc.yaml

Service types

While this example uses a ClusterIP service, all five types of Service support standalone NEGs. We recommend the default type, ClusterIP.

Mapping ports to multiple NEGs

A Service can listen on more than one port. By definition, a NEG has only a single IP address and port. This means that if you specify a Service with multiple ports, the NEG controller creates a NEG for each port.

The format of the cloud.google.com/neg annotation is:

cloud.google.com/neg: '{
   "exposed_ports":{
      "service-port-1":{},
      "service-port-2":{},
      "service-port-3":{},
      ...
   }
}'

where service-port-n are distinct port numbers that refer to existing service ports of the Service. For each service port listed, the NEG controller creates one NEG in each zone the cluster occupies.
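
For example, for a hypothetical Service named my-multi-port-svc that defines service ports 80 and 443, you could expose both ports as NEGs with an annotation like the following (a sketch; the ports must match ports that exist in the Service spec):

kubectl annotate service my-multi-port-svc --overwrite \
    cloud.google.com/neg='{"exposed_ports":{"80":{},"443":{}}}'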

Retrieve NEG statuses

Use this command to retrieve the statuses of the cluster's Services:

kubectl get service neg-demo-svc -o yaml

This command outputs the Service manifest. The NEG status appears in the cloud.google.com/neg-status annotation, in this format:

cloud.google.com/neg-status: '{
   "network_endpoint_groups":{
      "service-port-1": "neg-name-1",
      "service-port-2": "neg-name-2",
      ...
   },
   "zones":["zone-1", "zone-2", ...]
}'

where each entry in the network_endpoint_groups mapping is a service port (like service-port-1) paired with the name of the corresponding managed NEG (like neg-name-1). The zones list contains every zone (like zone-1) that has a NEG in it.

The example below is the full output of the command:

apiVersion: v1
kind: Service
metadata:
  annotations:
    cloud.google.com/neg: '{"exposed_ports": {"80":{}}}'
    cloud.google.com/neg-status: '{"network_endpoint_groups":{"80":"k8s1-cca197ad-default-neg-demo-app-80-4db81e02"},"zones":["us-central1-a", "us-central1-b"]}'
  labels:
    run: neg-demo-app
  name: neg-demo-app
  namespace: default
  selfLink: /api/v1/namespaces/default/services/neg-demo-app
  ...
spec:
  clusterIP: 10.0.14.252
  ports:
  - port: 80
    protocol: TCP
    targetPort: 9376
  selector:
    run: neg-demo-app
  sessionAffinity: None
status:
  loadBalancer: {}

In this example, the annotation shows that service port 80 is exposed to NEGs named k8s1-cca197ad-default-neg-demo-app-80-4db81e02 located in zones us-central1-a and us-central1-b.

Validate NEG creation

A NEG is created within a few minutes of Service creation. If there are Pods that match the label specified in the Service manifest, then upon creation the NEG will contain the IPs of the Pods.

Verify that the NEG exists by listing the NEGs in your Google Cloud project and checking for a NEG that matches the Service you created. The NEG's name has this format: k8s1-cluster-uid-namespace-service-port-random-hash

Use this command to list NEGs:

gcloud compute network-endpoint-groups list

The output resembles this:

NAME                                          LOCATION       ENDPOINT_TYPE   SIZE
k8s1-70aa83a6-default-my-service-80-c9710a6f  us-central1-a  GCE_VM_IP_PORT  3

This output shows that the SIZE of the NEG is 3, meaning that it has three endpoints which correspond to the three Pods in the Deployment.

Identify the individual endpoints with this command:

gcloud compute network-endpoint-groups list-network-endpoints \
    k8s1-70aa83a6-default-my-service-80-c9710a6f

The output shows three endpoints; each endpoint has a Pod's IP address and port:

INSTANCE                                           IP_ADDRESS  PORT
gke-standard-cluster-3-default-pool-4cc71a15-qlpf  10.12.1.43  9376
gke-standard-cluster-3-default-pool-4cc71a15-qlpf  10.12.1.44  9376
gke-standard-cluster-3-default-pool-4cc71a15-w9nk  10.12.2.26  9376
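
As a cross-check, the Pod IP addresses reported by Kubernetes should match the endpoint IP addresses listed above. The label selector matches the Deployment created earlier in this guide:

kubectl get pods -l run=neg-demo-app -o wide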

Attaching a load balancer to your standalone NEGs

Now that you've created your NEGs, you can use them as backends for the following types of load balancers:

  • An external HTTP(S) load balancer
  • An internal HTTP(S) load balancer
  • An SSL proxy load balancer
  • A TCP proxy load balancer

The following examples show you how:

Attaching an external HTTP(S) load balancer to standalone NEGs

The following steps show how to create an external HTTP load balancer using the Compute Engine API.

  1. Create a firewall rule. Load balancers need access to cluster endpoints to perform health checks. This command creates a firewall rule that allows that access:

    gcloud compute firewall-rules create fw-allow-health-check-and-proxy \
      --network=network-name \
      --action=allow \
      --direction=ingress \
      --target-tags=gke-node-network-tags \
      --source-ranges=130.211.0.0/22,35.191.0.0/16 \
      --rules=tcp:9376

    where gke-node-network-tags are the network tags on the GKE nodes and network-name is the network where the cluster runs.

    If you did not create custom network tags for your nodes, GKE generated tags for you automatically. You can look up these generated tags with the following command:

    gcloud compute instances describe node-name
    
  2. Create a global virtual IP address for the load balancer:

    gcloud compute addresses create hostname-server-vip \
      --ip-version=IPV4 \
      --global
  3. Create a health check. This is used by the load balancer to detect the liveness of individual endpoints within the NEG.

    gcloud compute health-checks create http http-basic-check \
      --use-serving-port
  4. Create a backend service that specifies that this is a global external HTTP(S) load balancer:

    gcloud compute backend-services create my-bes \
      --protocol HTTP \
      --health-checks http-basic-check \
      --global
  5. Create a URL map and target proxy for the load balancer. This example is very simple because the serve_hostname app used for this guide has a single endpoint and does not feature URLs.

    gcloud compute url-maps create web-map \
      --default-service my-bes
    gcloud compute target-http-proxies create http-lb-proxy \
      --url-map web-map
  6. Create a forwarding rule. This is what creates the load balancer.

    gcloud compute forwarding-rules create http-forwarding-rule \
      --address=hostname-server-vip \
      --global \
      --target-http-proxy=http-lb-proxy \
      --ports=80

    hostname-server-vip is the IP address to use for the load balancer. You can reserve a new static external IP address for this purpose. You can also omit the --address flag, in which case an ephemeral IP address is assigned automatically.

Checkpoint

These are the resources you have created so far:

  • An external virtual IP address
  • A forwarding rule
  • A firewall rule
  • The target HTTP proxy
  • The URL map
  • The backend service
  • The Compute Engine health check

The relationship between these resources is shown in the diagram below:

""

These resources together are a load balancer. In the next step you will add backends to the load balancer.

One of the benefits of standalone NEGs demonstrated here is that the lifecycles of the load balancer and the backends can be completely independent. The load balancer can continue running after the application, its Services, or the GKE cluster is deleted. You can add NEGs to and remove NEGs from the load balancer without changing any of the frontend load balancer objects.

Add backends to the load balancer

Use gcloud compute backend-services add-backend to connect the NEG to the load balancer by adding it as a backend of the my-bes backend service:

gcloud compute backend-services add-backend my-bes --global \
   --network-endpoint-group network-endpoint-group-name \
   --network-endpoint-group-zone network-endpoint-group-zone \
   --balancing-mode RATE --max-rate-per-endpoint 5

where:

  • network-endpoint-group-name is the name of your network endpoint group. See the instructions below to find this value.
  • network-endpoint-group-zone is the zone your network endpoint group is in. See the instructions below to find this value.

Use this command to get the name and location of the NEG:

gcloud compute network-endpoint-groups list

The output resembles this:

NAME                                          LOCATION       ENDPOINT_TYPE   SIZE
k8s1-70aa83a6-default-my-service-80-c9710a6f  us-central1-a  GCE_VM_IP_PORT  3

In this example output, the name of the NEG is k8s1-70aa83a6-default-my-service-80-c9710a6f and it is in zone us-central1-a.

Multiple NEGs can be added to the same backend service. Global backend services like my-bes can have NEG backends in different regions, while regional backend services must have backends in a single region.
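
For example, if your cluster also had nodes in us-central1-b, the NEG controller would create a NEG with the same name in that zone, and you could attach it to the same backend service. This is a sketch that reuses the example names above:

gcloud compute backend-services add-backend my-bes --global \
   --network-endpoint-group k8s1-70aa83a6-default-my-service-80-c9710a6f \
   --network-endpoint-group-zone us-central1-b \
   --balancing-mode RATE --max-rate-per-endpoint 5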

Validate that the load balancer works

There are two ways to validate that the load balancer you set up is working: verify that the health check is correctly configured and reporting healthy, and access the application to verify its response.

Verify health checks

Check that the backend service is associated with the health check and network endpoint groups, and that the individual endpoints are healthy.

Use this command to check that the backend service is associated with your health check and your network endpoint group:

gcloud compute backend-services describe my-bes --global

The output should resemble this:

backends:
- balancingMode: RATE
  capacityScaler: 1.0
  group: ... /networkEndpointGroups/k8s1-70aa83a6-default-my-service-80-c9710a6f
...
healthChecks:
- ... /healthChecks/http-basic-check
...
name: my-bes
...

Next, check the health of the individual endpoints:

gcloud compute backend-services get-health my-bes --global

The status: section of the output should resemble this:

status:
  healthStatus:
  - healthState: HEALTHY
    instance: ... gke-standard-cluster-3-default-pool-4cc71a15-qlpf
    ipAddress: 10.12.1.43
    port: 50000
  - healthState: HEALTHY
    instance: ... gke-standard-cluster-3-default-pool-4cc71a15-qlpf
    ipAddress: 10.12.1.44
    port: 50000
  - healthState: HEALTHY
    instance: ... gke-standard-cluster-3-default-pool-4cc71a15-w9nk
    ipAddress: 10.12.2.26
    port: 50000

Access the application

Access the application through the load balancer's IP address to confirm that everything is working.

First, get the virtual IP address of the load balancer:

gcloud compute addresses describe hostname-server-vip --global | grep "address:"

The output will include an IP address. Next, send a request to that IP address (34.98.102.37 in this example):

curl 34.98.102.37

The response from the serve_hostname application should be the name of one of the neg-demo-app Pods.
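
To see the traffic spread across the Deployment's three Pods, you can repeat the request. This sketch uses the example IP address above; each response is the name of the Pod that served the request:

for i in `seq 1 10`; do curl -s 34.98.102.37; echo; done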

Attaching an internal HTTP(S) load balancer to standalone NEGs

This section provides instructions for configuring Internal HTTP(S) Load Balancing for services running in GKE Pods, using standalone NEGs.

Configuring the proxy-only subnet

The proxy-only subnet is for all internal HTTP(S) load balancers in the load balancer's region, in this example us-west1.

Console

If you're using the Google Cloud Console, you can wait and create the proxy-only subnet later in the Load Balancing UI.

gcloud

Create the proxy-only subnet with the gcloud compute networks subnets create command.

gcloud compute networks subnets create proxy-only-subnet \
  --purpose=INTERNAL_HTTPS_LOAD_BALANCER \
  --role=ACTIVE \
  --region=us-west1 \
  --network=lb-network \
  --range=10.129.0.0/23

API

Create the proxy-only subnet with the subnetworks.insert method, replacing project-id with your project ID.

POST https://www.googleapis.com/compute/v1/projects/project-id/regions/us-west1/subnetworks

{
  "name": "proxy-only-subnet",
  "ipCidrRange": "10.129.0.0/23",
  "network": "projects/project-id/global/networks/lb-network",
  "region": "projects/project-id/regions/us-west1",
  "purpose": "INTERNAL_HTTPS_LOAD_BALANCER",
  "role": "ACTIVE"
}

Configuring firewall rules

This example uses the following firewall rules:

  • fw-allow-ssh: An ingress rule, applicable to the instances being load balanced, that allows incoming SSH connectivity on TCP port 22 from any address. You can choose a more restrictive source IP range for this rule; for example, you can specify just the IP ranges of the system from which you initiate SSH sessions. This example uses the target tag allow-ssh to identify the VMs to which the firewall rule applies.

  • fw-allow-health-check: An ingress rule, applicable to the instances being load balanced, that allows all TCP traffic from the Google Cloud health checking systems (in 130.211.0.0/22 and 35.191.0.0/16). This example uses the target tag load-balanced-backend to identify the instances to which it should apply.

  • fw-allow-proxies: An ingress rule, applicable to the instances being load balanced, that allows TCP traffic on ports 80, 443, and 8000 from the internal HTTP(S) load balancer's managed proxies. This example uses the target tag load-balanced-backend to identify the instances to which it should apply.

Without these firewall rules, the default deny ingress rule blocks incoming traffic to the backend instances.

Console

  1. Go to the Firewall rules page in the Google Cloud Console.
    Go to the Firewall rules page
  2. Click Create firewall rule to create the rule to allow incoming SSH connections:
    • Name: fw-allow-ssh
    • Network: lb-network
    • Direction of traffic: ingress
    • Action on match: allow
    • Targets: Specified target tags
    • Target tags: allow-ssh
    • Source filter: IP ranges
    • Source IP ranges: 0.0.0.0/0
    • Protocols and ports:
      • Choose Specified protocols and ports.
      • Check tcp and type 22 for the port number.
  3. Click Create.
  4. Click Create firewall rule a second time to create the rule to allow Google Cloud health checks:
    • Name: fw-allow-health-check
    • Network: lb-network
    • Direction of traffic: ingress
    • Action on match: allow
    • Targets: Specified target tags
    • Target tags: load-balanced-backend
    • Source filter: IP ranges
    • Source IP ranges: 130.211.0.0/22 and 35.191.0.0/16
    • Protocols and ports:
      • Choose Specified protocols and ports
      • Check tcp and enter 80.
        As a best practice, limit this rule to just the protocols and ports that match those used by your health check. If you use tcp:80 for the protocol and port, Google Cloud can contact your VMs using HTTP on port 80, but it cannot contact them using HTTPS on port 443.
  5. Click Create.
  6. Click Create firewall rule a third time to create the rule to allow the load balancer's proxy servers to connect to the backends:
    • Name: fw-allow-proxies
    • Network: lb-network
    • Direction of traffic: ingress
    • Action on match: allow
    • Targets: Specified target tags
    • Target tags: load-balanced-backend
    • Source filter: IP ranges
    • Source IP ranges: 10.129.0.0/23
    • Protocols and ports:
      • Choose Specified protocols and ports.
      • Check tcp and type 80, 443, 8000 for the port numbers.
  7. Click Create.

gcloud

  1. Create the fw-allow-ssh firewall rule to allow SSH connectivity to VMs with the network tag allow-ssh. When you omit source-ranges, Google Cloud interprets the rule to mean any source.

    gcloud compute firewall-rules create fw-allow-ssh \
        --network=lb-network \
        --action=allow \
        --direction=ingress \
        --target-tags=allow-ssh \
        --rules=tcp:22
    
  2. Create the fw-allow-health-check rule to allow Google Cloud health checks. This example allows all TCP traffic from health check probers; however, you can configure a narrower set of ports to meet your needs.

    gcloud compute firewall-rules create fw-allow-health-check \
        --network=lb-network \
        --action=allow \
        --direction=ingress \
        --source-ranges=130.211.0.0/22,35.191.0.0/16 \
        --target-tags=load-balanced-backend \
        --rules=tcp
    
  3. Create the fw-allow-proxies rule to allow the internal HTTP(S) load balancer's proxies to connect to your backends.

    gcloud compute firewall-rules create fw-allow-proxies \
      --network=lb-network \
      --action=allow \
      --direction=ingress \
      --source-ranges=10.129.0.0/23 \
      --target-tags=load-balanced-backend \
      --rules=tcp:80,tcp:443,tcp:8000
    

API

Create the fw-allow-ssh firewall rule by making a POST request to the firewalls.insert method, replacing project-id with your project ID.

POST https://www.googleapis.com/compute/v1/projects/project-id/global/firewalls

{
  "name": "fw-allow-ssh",
  "network": "projects/project-id/global/networks/lb-network",
  "sourceRanges": [
    "0.0.0.0/0"
  ],
  "targetTags": [
    "allow-ssh"
  ],
  "allowed": [
   {
     "IPProtocol": "tcp",
     "ports": [
       "22"
     ]
   }
  ],
 "direction": "INGRESS"
}

Create the fw-allow-health-check firewall rule by making a POST request to the firewalls.insert method, replacing project-id with your project ID.

POST https://www.googleapis.com/compute/v1/projects/project-id/global/firewalls

{
  "name": "fw-allow-health-check",
  "network": "projects/project-id/global/networks/lb-network",
  "sourceRanges": [
    "130.211.0.0/22",
    "35.191.0.0/16"
  ],
  "targetTags": [
    "load-balanced-backend"
  ],
  "allowed": [
    {
      "IPProtocol": "tcp"
    }
  ],
  "direction": "INGRESS"
}

Create the fw-allow-proxies firewall rule to allow TCP traffic within the proxy subnet by making a POST request to the firewalls.insert method, replacing project-id with your project ID.

POST https://www.googleapis.com/compute/v1/projects/project-id/global/firewalls

{
  "name": "fw-allow-proxies",
  "network": "projects/project-id/global/networks/lb-network",
  "sourceRanges": [
    "10.129.0.0/23"
  ],
  "targetTags": [
    "load-balanced-backend"
  ],
  "allowed": [
    {
      "IPProtocol": "tcp",
      "ports": [
        "80"
      ]
    },
    {
      "IPProtocol": "tcp",
      "ports": [
        "443"
      ]
    },
    {
      "IPProtocol": "tcp",
      "ports": [
        "8000"
      ]
    }
  ],
  "direction": "INGRESS"
}

Configuring the load balancer

For the forwarding rule's IP address, use the backend subnet. If you try to use the proxy-only subnet, forwarding rule creation fails.

Console

Select a load balancer type

  1. Go to the Load balancing page in the Google Cloud Console.
    Go to the Load balancing page
  2. Under HTTP(S) Load Balancing, click Start configuration.
  3. Select Only between my VMs. This setting means that the load balancer is internal.
  4. Click Continue.

Prepare the load balancer

  1. For the Name of the load balancer, enter l7-ilb-gke-map.
  2. For the Region, select us-west1.
  3. For the Network, select lb-network.
  4. Keep the window open to continue.

Reserve a proxy-only subnet

For Internal HTTP(S) Load Balancing, reserve a proxy subnet:

  1. Click Reserve a Subnet.
  2. For the Name, enter proxy-only-subnet.
  3. For the IP address range, enter 10.129.0.0/23.
  4. Click Add.

Configure the backend service

  1. Click Backend configuration.
  2. From the Create or select backend services menu, select Create a backend service.
  3. Set the Name of the backend service to l7-ilb-gke-backend-service.
  4. Under Backend type, select Network endpoint groups.
  5. In the New backend card of the Backends section:
    1. Set the Network endpoint group to the NEG that was created by GKE. To get the NEG name, see Validate NEG creation.
    2. Enter a maximum rate of 5 RPS per endpoint. Google Cloud will exceed this maximum if necessary.
    3. Click Done.
  6. In the Health check section, choose Create a health check with the following parameters:
    1. Name: l7-ilb-gke-basic-check
    2. Protocol: HTTP
    3. Port specification: Serving port
    4. Click Save and Continue.
  7. Click Create.

Configure the URL map

  1. Click Host and path rules. Ensure that the l7-ilb-gke-backend-service is the only backend service for any unmatched host and any unmatched path.

Configure the frontend

For HTTP:

  1. Click Frontend configuration.
  2. Click Add frontend IP and port.
  3. Set the Name to l7-ilb-gke-forwarding-rule.
  4. Set the Protocol to HTTP.
  5. Set the Subnetwork to backend-subnet.
  6. Under Internal IP, select Reserve a static internal IP address.
  7. In the panel that appears provide the following details:
    1. Name: l7-ilb-gke-ip
    2. In the Static IP address section, select Let me choose.
    3. In the Custom IP address section, enter 10.1.2.199.
    4. Click Reserve.
  8. Set the Port to 80.
  9. Click Done.

For HTTPS:

If you are using HTTPS between the client and the load balancer, you need one or more SSL certificate resources to configure the proxy. See SSL Certificates for information on how to create SSL certificate resources. Google-managed certificates aren't currently supported with internal HTTP(S) load balancers.

  1. Click Frontend configuration.
  2. Click Add frontend IP and port.
  3. In the Name field, enter l7-ilb-gke-forwarding-rule.
  4. In the Protocol field, select HTTPS (includes HTTP/2).
  5. Set the Subnet to backend-subnet.
  6. Under Internal IP, select Reserve a static internal IP address.
  7. In the panel that appears provide the following details:
    1. Name: l7-ilb-gke-ip
    2. In the Static IP address section, select Let me choose.
    3. In the Custom IP address section, enter 10.1.2.199.
    4. Click Reserve.
  8. Ensure that the Port is set to 443, to allow HTTPS traffic.
  9. Click the Certificate drop-down list.
    1. If you already have a self-managed SSL certificate resource you want to use as the primary SSL certificate, select it from the drop-down menu.
    2. Otherwise, select Create a new certificate.
      1. Fill in a Name of l7-ilb-cert.
      2. In the appropriate fields upload your PEM-formatted files:
        • Public key certificate
        • Certificate chain
        • Private key
      3. Click Create.
  10. To add certificate resources in addition to the primary SSL certificate resource:
    1. Click Add certificate.
    2. Select a certificate from the Certificates list or click Create a new certificate and follow the instructions above.
  11. Click Done.

Complete the configuration

  1. Click Create.

gcloud

  1. Define the HTTP health check with the gcloud compute health-checks create http command.

    gcloud compute health-checks create http l7-ilb-gke-basic-check \
       --region=us-west1 \
       --use-serving-port
    
  2. Define the backend service with the gcloud compute backend-services create command.

    gcloud compute backend-services create l7-ilb-gke-backend-service \
      --load-balancing-scheme=INTERNAL_MANAGED \
      --protocol=HTTP \
      --health-checks=l7-ilb-gke-basic-check \
      --health-checks-region=us-west1 \
      --region=us-west1
    
  3. Add NEG backends to the backend service with the gcloud compute backend-services add-backend command.

    gcloud compute backend-services add-backend l7-ilb-gke-backend-service \
       --network-endpoint-group=$DEPLOYMENT_NAME \
       --network-endpoint-group-zone=us-west1-b \
       --region=us-west1 \
       --balancing-mode=RATE \
       --max-rate-per-endpoint=5
    
  4. Create the URL map with the gcloud compute url-maps create command.

    gcloud compute url-maps create l7-ilb-gke-map \
      --default-service=l7-ilb-gke-backend-service \
      --region=us-west1
    
  5. Create the target proxy.

    For HTTP:

    Use the gcloud compute target-http-proxies create command.

    gcloud compute target-http-proxies create l7-ilb-gke-proxy \
      --url-map=l7-ilb-gke-map \
      --url-map-region=us-west1 \
      --region=us-west1
    

    For HTTPS:

    See SSL Certificates for information on how to create SSL certificate resources. Google-managed certificates aren't currently supported with internal HTTP(S) load balancers.

    Assign your filepaths to variable names.

    export LB_CERT=path to PEM-formatted file
    
    export LB_PRIVATE_KEY=path to PEM-formatted file
    

    Create a regional SSL certificate using the gcloud compute ssl-certificates create command.

    gcloud compute ssl-certificates create l7-ilb-cert \
      --certificate=$LB_CERT \
      --private-key=$LB_PRIVATE_KEY \
      --region=us-west1
    

    Use the regional SSL certificate to create a target proxy with the gcloud compute target-https-proxies create command.

    gcloud compute target-https-proxies create l7-ilb-gke-proxy \
      --url-map=l7-ilb-gke-map \
      --region=us-west1 \
      --ssl-certificates=l7-ilb-cert
    
  6. Create the forwarding rule.

    For custom networks, you must reference the subnet in the forwarding rule. Note that this is the VM subnet, not the proxy subnet.

    For HTTP:

    Use the gcloud compute forwarding-rules create command with the correct flags.

    gcloud compute forwarding-rules create l7-ilb-gke-forwarding-rule \
      --load-balancing-scheme=INTERNAL_MANAGED \
      --network=lb-network \
      --subnet=backend-subnet \
      --address=10.1.2.199 \
      --ports=80 \
      --region=us-west1 \
      --target-http-proxy=l7-ilb-gke-proxy \
      --target-http-proxy-region=us-west1
    

    For HTTPS:

    Use the gcloud compute forwarding-rules create command with the correct flags.

    gcloud compute forwarding-rules create l7-ilb-gke-forwarding-rule \
      --load-balancing-scheme=INTERNAL_MANAGED \
      --network=lb-network \
      --subnet=backend-subnet \
      --address=10.1.2.199 \
      --ports=443 \
      --region=us-west1 \
      --target-https-proxy=l7-ilb-gke-proxy \
      --target-https-proxy-region=us-west1
    

API

Create the health check by making a POST request to the regionHealthChecks.insert method, replacing project-id with your project ID.

POST https://compute.googleapis.com/compute/v1/projects/project-id/regions/us-west1/healthChecks

{
   "name": "l7-ilb-gke-basic-check",
   "type": "HTTP",
   "httpHealthCheck": {
     "portSpecification": "USE_SERVING_PORT"
   }
}

Create the regional backend service by making a POST request to the regionBackendServices.insert method, replacing project-id with your project ID and neg-name with the name of the NEG that you created.

POST https://www.googleapis.com/compute/v1/projects/project-id/regions/us-west1/backendServices

{
  "name": "l7-ilb-gke-backend-service",
  "backends": [
    {
      "group": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-west1-b/networkEndpointGroups/neg-name",
      "balancingMode": "RATE",
      "maxRatePerEndpoint": 5
    }
  ],
  "healthChecks": [
    "projects/project-id/regions/us-west1/healthChecks/l7-ilb-gke-basic-check"
  ],
  "loadBalancingScheme": "INTERNAL_MANAGED"
}

Create the URL map by making a POST request to the regionUrlMaps.insert method, replacing project-id with your project ID.

POST https://compute.googleapis.com/compute/v1/projects/project-id/regions/us-west1/urlMaps

{
  "name": "l7-ilb-gke-map",
  "defaultService": "projects/project-id/regions/us-west1/backendServices/l7-ilb-gke-backend-service"
}

Create the target HTTP proxy by making a POST request to the regionTargetHttpProxies.insert method, replacing project-id with your project ID.

POST https://www.googleapis.com/compute/v1/projects/project-id/regions/us-west1/targetHttpProxies

{
  "name": "l7-ilb-gke-proxy",
  "urlMap": "projects/project-id/global/urlMaps/l7-ilb-gke-map",
  "region": "us-west1"
}

Create the forwarding rule by making a POST request to the forwardingRules.insert method, replacing project-id with your project ID.

POST https://www.googleapis.com/compute/v1/projects/project-id/regions/us-west1/forwardingRules

{
  "name": "l7-ilb-gke-forwarding-rule",
  "IPAddress": "10.1.2.199",
  "IPProtocol": "TCP",
  "portRange": "80-80",
  "target": "projects/project-id/regions/us-west1/targetHttpProxies/l7-ilb-gke-proxy",
  "loadBalancingScheme": "INTERNAL_MANAGED",
  "subnetwork": "projects/project-id/regions/us-west1/subnetworks/backend-subnet",
  "network": "projects/project-id/global/networks/lb-network",
  "networkTier": "PREMIUM",
}

Testing

Create a VM instance in the zone to test connectivity:

gcloud compute instances create l7-ilb-client-us-west1-b \
    --image-family=debian-9 \
    --image-project=debian-cloud \
    --zone=us-west1-b \
    --network=lb-network \
    --subnet=backend-subnet \
    --tags=l7-ilb-client,allow-ssh

Log in to the client instance to verify that HTTP(S) services on the backends are reachable via the internal HTTP(S) load balancer's forwarding rule IP address, and that traffic is being load balanced among the endpoints in the NEG.

Connect to the client instance using SSH:

gcloud compute ssh l7-ilb-client-us-west1-b \
    --zone=us-west1-b

Verify that the IP is serving its hostname.

curl 10.1.2.199

For HTTPS testing, replace curl with:

curl -k -s 'https://test.example.com:443' --connect-to test.example.com:443:10.1.2.199:443

The -k flag causes curl to skip certificate validation.

Run 100 requests and confirm that they are load balanced.

For HTTP:

{
RESULTS=
for i in {1..100}
do
    RESULTS="$RESULTS:$(curl --silent 10.1.2.199)"
done
echo "***"
echo "*** Results of load-balancing to 10.1.2.199: "
echo "***"
echo "$RESULTS" | tr ':' '\n' | grep -Ev "^$" | sort | uniq -c
echo
}

For HTTPS:

{
RESULTS=
for i in {1..100}
do
    RESULTS="$RESULTS:$(curl -k -s 'https://test.example.com:443' --connect-to test.example.com:443:10.1.2.199:443
)"
done
echo "***"
echo "*** Results of load-balancing to 10.1.2.199: "
echo "***"
echo "$RESULTS" | tr ':' '\n' | grep -Ev "^$" | sort | uniq -c
echo
}

Implementing heterogeneous services (VMs and containers)

Load balancers can be frontends to mixed Kubernetes and non-Kubernetes workloads. This could be part of a migration from VMs to containers or a permanent architecture that benefits from a shared load balancer. This can be achieved by creating load balancers that target different kinds of backends including standalone NEGs.

VMs and containers in the same backend service

This example shows how to create a NEG that points at an existing VM running a workload, and how to add this NEG as another backend of an existing backendService. This way a single load balancer balances between VMs and GKE containers.

This example extends the earlier example that uses an external HTTP load balancer.

Because all endpoints are grouped by the same backendService, the VM and container endpoints are considered the same service. This means the host/path matching will treat all backends identically based on the URL map rules.

A diagram showing the architecture described above. The load balancer created earlier points to two NEGs, the NEG for containers created earlier and a new NEG containing a VM's IP address

When you use a NEG as a backend for a backend service, all other backends in that backend service must also be NEGs. You can't use instance groups and NEGs as backends in the same backend service. Additionally, containers and VMs cannot exist as endpoints within the same NEG so they must always be configured with separate NEGs.

  1. Deploy a VM to Compute Engine with this command:

    gcloud compute instances create vm1 --zone zone --network=network \
     --subnet=subnet --image-project=cos-cloud \
     --image-family=cos-stable --tags=vm-neg-tag
  2. Deploy an application to the VM:

    gcloud compute ssh vm1 --zone=zone --command="docker run -d --rm --network=host \
     k8s.gcr.io/serve_hostname:v1.4 && sudo iptables -P INPUT ACCEPT"

    This command deploys to the VM the same example application used in the earlier example. For simplicity, the application is run as a Docker container but this is not essential. The iptables command is required to allow firewall access to the running container.

  3. Validate that the application is serving on port 9376 and reporting that it is running on vm1:

    gcloud compute ssh vm1 --zone=zone --command="curl -s localhost:9376"

    The server should respond with vm1.

  4. Create a NEG to use with the VM endpoint. Containers and VMs can both be NEG endpoints, but a single NEG can't have both VM and container endpoints.

    gcloud compute network-endpoint-groups create vm-neg \
    --subnet=subnet --zone=zone
  5. Attach the VM endpoint to the NEG:

    gcloud compute network-endpoint-groups update vm-neg --zone=zone \
     --add-endpoint="instance=vm1,ip=vm-primary-ip,port=9376"
  6. Confirm that the NEG has the VM endpoint:

    gcloud compute network-endpoint-groups list-network-endpoints vm-neg --zone zone
  7. Attach the NEG to the backend service using the same command that you used to add a container backend:

    gcloud compute backend-services add-backend my-bes --global \
     --network-endpoint-group vm-neg \
     --network-endpoint-group-zone zone \
     --balancing-mode RATE --max-rate-per-endpoint 10
  8. Open the firewall to allow health checks of the VM:

    gcloud compute firewall-rules create fw-allow-health-check-to-vm1 \
    --network=network \
    --action=allow \
    --direction=ingress \
    --target-tags=vm-neg-tag \
    --source-ranges=130.211.0.0/22,35.191.0.0/16 \
    --rules=tcp:9376
    
  9. Validate that the load balancer is forwarding traffic to both the new vm1 backend and the existing container backend by sending test traffic:

    for i in `seq 1 100`; do curl ${VIP};echo; done

    You should see responses from both the container (neg-demo-app) and VM (vm1) endpoints.

VMs and containers for different backend services

This example shows how to create a NEG that points at an existing VM running a workload, and how to add this NEG as the backend to a new backendService. This is useful for the case where the containers and VMs are different services but need to share the same L7 load balancer, such as if the services share the same IP address or domain name.

This example extends the previous example that has a VM backend in the same backend service as the container backend. This example reuses that VM.

Because the container and VM endpoints are grouped in separate backendServices, they are considered different services. This means that the URL map will match backends and direct traffic to the VM or container based on the hostname.

The following diagram shows how a single virtual IP address corresponds to two host names, which in turn correspond to a container-based backend service and a VM-based backend service.

The following diagram shows the architecture described above:

The architecture has two NEGs, one for the service implemented with containers
and another for the service implemented with VMs. There is a Backend Service
object for each NEG. The URL Map object directs traffic to the correct backend
service based on the requested URL.

  1. Create a new backend service for the VM:

    gcloud compute backend-services create my-vm-bes \
      --protocol HTTP \
      --health-checks http-basic-check \
      --global
  2. Attach the NEG for the VM, vm-neg, to the backend-service:

    gcloud compute backend-services add-backend my-vm-bes --global \
     --network-endpoint-group vm-neg \
     --network-endpoint-group-zone zone \
     --balancing-mode RATE --max-rate-per-endpoint 10
  3. Add a host rule to the URL map to direct requests for the container.example.com host to the container backend service:

    gcloud compute url-maps add-path-matcher web-map \
      --path-matcher-name=container-path --default-service=my-bes \
      --new-hosts=container.example.com --global
    
  4. Add another host rule to the URL map to direct requests for the vm.example.com host to the VM backend service:

    gcloud compute url-maps add-path-matcher web-map \
      --path-matcher-name=vm-path --default-service=my-vm-bes \
      --new-hosts=vm.example.com --global
    
  5. Validate that the load balancer sends traffic to the VM backend based on the requested host:

    curl -H "HOST:vm.example.com" virtual-ip

Limitations of standalone NEGs

  • Annotation validation errors are exposed to the user through Kubernetes events.
  • NEG names are generated by the NEG controller. You can't supply custom names for NEGs.
  • The limitations of NEGs also apply to standalone NEGs.
  • Standalone NEGs do not work with legacy networks.
  • Standalone NEGs can only be used with compatible network services including Traffic Director and the compatible load balancer types.

Pricing

Refer to the load balancing section of the pricing page for details on load balancer pricing. There is no additional charge for NEGs.

Troubleshooting

No standalone NEG configured

Symptom: No NEG is created.

Potential Resolution:

  • Check the events associated with the Service and look for error messages (example commands follow this list).
  • Verify that the standalone NEG annotation is well-formed JSON, and that the exposed ports match existing ports in the Service spec.
  • Verify the NEG status annotation and check whether the expected service ports have corresponding NEGs.
  • Verify that the NEGs have been created in the expected zones, with the command gcloud compute network-endpoint-groups list.
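
For example, the following commands (a quick sketch using the Service name from this guide) surface the Service events and the NEG annotations in one place:

kubectl describe service neg-demo-svc
kubectl get service neg-demo-svc -o yaml | grep cloud.google.com/neg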

Traffic does not reach the endpoints

Symptom: 502 errors or rejected connections.

Potential Resolution:

  • After the Service is configured, new endpoints generally become reachable shortly after they are attached to the NEG, provided that they respond to health checks.
  • If traffic still cannot reach the endpoints, resulting in a 502 error code for HTTP(S) load balancers or rejected connections for TCP/SSL load balancers, check the following:
    • Verify that firewall rules allow incoming TCP traffic to your endpoints from the following ranges: 130.211.0.0/22 and 35.191.0.0/16.
    • Verify that your endpoints are healthy by using gcloud, or by calling the getHealth API on the backend service or the listEndpoints API on the NEG with the showHealth parameter set to SHOW.

Stalled rollout

Symptom: Rolling out an updated Deployment stalls, and the number of up-to-date replicas does not match the desired number of replicas.

Potential Resolution:

The deployment's health checks are failing. The container image might be bad or the health check might be misconfigured. The rolling replacement of Pods waits until the newly started Pod passes its Pod readiness gate. This only occurs if the Pod is responding to load balancer health checks. If the Pod does not respond, or if the health check is misconfigured, the readiness gate conditions can't be met and the rollout can't continue.

  • If you're using kubectl 1.13 or higher, you can check the status of a Pod's readiness gates with the following command:

    kubectl get pod my-pod -o wide

    Check the READINESS GATES column.

    This column doesn't exist in kubectl 1.12 and lower. A Pod that is marked as being in the READY state may have a failed readiness gate. To verify this, use the following command:

    kubectl get pod my-pod -o yaml

    The readiness gates and their status are listed in the output.

  • Verify that the container image in your Deployment's Pod specification is functioning correctly and is able to respond to health checks.

  • Verify that the health checks are correctly configured.

What's next