Container-native load balancing with standalone zonal NEGs

This page shows you how to create a Kubernetes Service that is backed by a zonal network endpoint group (NEG).

See Container-native load balancing for information on the benefits, requirements, and limitations of container-native load balancing.

Overview

A network endpoint group (NEG) represents a group of backends served by a load balancer. NEGs are lists of IP addresses that are managed by a NEG controller, and are used by Google Cloud load balancers. IP addresses in a NEG can be primary or secondary IP addresses of a VM, which means they can be Pod IPs. This enables container-native load balancing that sends traffic directly to Pods from a Google Cloud load balancer.

The following diagram describes how Kubernetes API objects correspond to Compute Engine objects.

Kubernetes Services correspond to Compute Engine network endpoint groups,
while Kubernetes Pods correspond to Compute Engine network endpoints. The NEG
controller component of the GKE master manages this.

Ingress with NEGs

When NEGs are used with GKE Ingress, the Ingress controller facilitates the creation of all aspects of the L7 load balancer. This includes creating the virtual IP address, forwarding rules, health checks, firewall rules, and more.

Ingress is the recommended way to use container-native load balancing as it has many features that simplify the management of NEGs. Standalone NEGs are an option if NEGs managed by Ingress do not serve your use case.

Standalone NEGs

When NEGs are deployed with load balancers provisioned by anything other than Ingress, they are considered standalone NEGs. Standalone NEGs are deployed and managed through the NEG controller, but the forwarding rules, health checks, and other load balancing objects are deployed manually.

Standalone NEGs do not conflict with Ingress-enabled container-native load balancing.

The following illustration shows the differences in how the load balancing objects are deployed in each scenario:

With both standalone NEGs and Ingress-managed NEGs, the NEG controller on the
GKE master manages the NEG and network endpoint objects. With standalone NEGs, every other component is
managed by the user, as described in the previous paragraphs.

Preventing leaked NEGs

With standalone NEGs, you are responsible for managing the lifecycles of NEGs and the resources that make up the load balancer. You could leak NEGs in these ways:

  • When a GKE Service is deleted, the associated NEG is not garbage collected if the NEG is still referenced by a backend service. Dereference the NEG from the backend service to allow NEG deletion (see the sketch after this list).
  • When a cluster is deleted, standalone NEGs are not deleted.
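
If you need to clean up a leaked NEG, the following is a minimal sketch, assuming the backend service name (my-bes) and zone (us-central1-a) used later in this guide; neg-name is a placeholder for the leaked NEG's name:

gcloud compute backend-services remove-backend my-bes --global \
    --network-endpoint-group=neg-name \
    --network-endpoint-group-zone=us-central1-a

gcloud compute network-endpoint-groups delete neg-name \
    --zone=us-central1-a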

Use cases of standalone NEGs

Standalone NEGs have several critical uses. Standalone NEGs are far more flexible than Ingress (used with or without NEGs), which defines a specific set of load balancing objects that were chosen in an opinionated way to make them easy to use.

Use cases for standalone NEGs include:

Heterogeneous services of containers and VMs

NEGs can contain both VM and container IP addresses. This means a single virtual IP address can point to a backend that consists of both Kubernetes and non-Kubernetes workloads. This can also be used to migrate existing workloads to a GKE cluster.

Standalone NEGs can point to VM IPs, which makes it possible to manually configure load balancers whose backends are composed of both VMs and containers behind the same service VIP.

Customized Ingress controllers

You can use a customized Ingress controller (or no Ingress controller) to configure load balancers that target standalone NEGs.

Use Traffic Director with GKE

Traffic Director uses standalone NEGs to provide container-native load balancing for the managed service mesh.

Use TCP Proxy Load Balancing with GKE

You can use standalone NEGs to load balance directly to containers with the TCP proxy load balancer, which is not natively supported by Kubernetes or GKE.
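
The following commands are a minimal sketch of the backend side of such a setup, not a complete walkthrough. They assume a standalone NEG named neg-name in us-central1-a and use hypothetical resource names (tcp-basic-check, my-tcp-bes); you still need to create the target TCP proxy and forwarding rule separately.

gcloud compute health-checks create tcp tcp-basic-check \
    --use-serving-port

gcloud compute backend-services create my-tcp-bes \
    --protocol=TCP \
    --health-checks=tcp-basic-check \
    --global

gcloud compute backend-services add-backend my-tcp-bes --global \
    --network-endpoint-group=neg-name \
    --network-endpoint-group-zone=us-central1-a \
    --balancing-mode=CONNECTION \
    --max-connections-per-endpoint=100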

Pod readiness

Readiness gates are an extensibility feature of Kubernetes that enables the injection of extra feedback or signals into the PodStatus so that the Pod can transition to the Ready state. The NEG controller manages a custom readiness gate to ensure that the full network path, from the Compute Engine load balancer to the Pod, is functional. Pod readiness gates in GKE are explained in Container-native load balancing.

Ingress with NEGs deploys and manages Compute Engine health checks on behalf of the load balancer. However, standalone NEGs make no assumptions about Compute Engine health checks, because those are expected to be deployed and managed separately. Always configure Compute Engine health checks along with the load balancer to prevent traffic from being sent to backends that are not ready to receive it. If there is no health check status associated with the NEG (usually because no health check is configured), the NEG controller marks the Pod's readiness gate value as True when its corresponding endpoint is programmed in the NEG.
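
To see the readiness gate that the NEG controller injects, you can list the gates on the Pods backing a Service. This is a quick check, assuming the neg-demo-app Deployment created later in this guide and a GKE version that supports Pod readiness feedback; the gate reported by GKE is cloud.google.com/load-balancer-neg-ready:

kubectl get pods -l run=neg-demo-app \
    -o custom-columns='NAME:.metadata.name,READINESS_GATES:.spec.readinessGates[*].conditionType'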

Requirements

Standalone NEGs are available in GKE 1.10 and higher. Pod readiness feedback is enabled for standalone NEGs in version 1.16.4 and higher.

Your cluster must be VPC-native. To learn more, see Creating VPC-native clusters using Alias IPs.

Your cluster must have HTTP load-balancing enabled. GKE clusters have HTTP load-balancing enabled by default; you must not disable it.

Before you begin

Before you start, make sure you have performed the following tasks:

Set up default gcloud settings using one of the following methods:

  • Using gcloud init, if you want to be walked through setting defaults.
  • Using gcloud config, to individually set your project ID, zone, and region.

Using gcloud init

If you receive the error One of [--zone, --region] must be supplied: Please specify location, complete this section.

  1. Run gcloud init and follow the directions:

    gcloud init

    If you are using SSH on a remote server, use the --console-only flag to prevent the command from launching a browser:

    gcloud init --console-only
  2. Follow the instructions to authorize gcloud to use your Google Cloud account.
  3. Create a new configuration or select an existing one.
  4. Choose a Google Cloud project.
  5. Choose a default Compute Engine zone.

Using gcloud config

  • Set your default project ID:
    gcloud config set project project-id
  • If you are working with zonal clusters, set your default compute zone:
    gcloud config set compute/zone compute-zone
  • If you are working with regional clusters, set your default compute region:
    gcloud config set compute/region compute-region
  • Update gcloud to the latest version:
    gcloud components update

Using standalone NEGs

The instructions below show how to use standalone NEGs with an external HTTP load balancer on GKE.

This involves creating several objects:

  • A Deployment that creates and manages Pods.
  • A Service that creates a NEG.
  • A load balancer created with the Compute Engine API. This differs from using NEGs with Ingress, where Ingress creates and configures the load balancer for you. With standalone NEGs, you are responsible for associating the NEG and the backend service to connect the Pods to the load balancer. The load balancer consists of several components, shown in the diagram below:

The components of a load balancer are a forwarding rule, target HTTP proxy,
URL map, health check, and backend service. This directs traffic to a NEG that
contains Pod IP addresses.

Create a VPC-native cluster

To use container-native load balancing, you must create a cluster with alias IPs enabled. This cluster:

  • Must run Google Kubernetes Engine version 1.16.4 or later.
  • Must be a VPC-native cluster.
  • Must have the HTTP load-balancing add-on enabled. GKE clusters have HTTP load-balancing enabled by default; you must not disable it.

This command creates a cluster, neg-demo-cluster, in zone us-central1-a, with an autoprovisioned subnetwork:

gcloud container clusters create neg-demo-cluster \
    --enable-ip-alias \
    --create-subnetwork="" \
    --network=default \
    --zone=us-central1-a \
    --cluster-version version

where the cluster version must be 1.16.4 or later.
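
Optionally, confirm that alias IPs are enabled on the new cluster before relying on NEGs. This check uses the cluster name and zone from the command above and prints True for a VPC-native cluster:

gcloud container clusters describe neg-demo-cluster \
    --zone=us-central1-a \
    --format="value(ipAllocationPolicy.useIpAliases)"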

Create a Deployment

The following manifests specify a Deployment, neg-demo-app, that runs three instances of a containerized HTTP server. The HTTP server responds to requests with the hostname of the application server, which is the name of the Pod the server is running on.

We recommend that you use Pod readiness feedback if it is available in the version of GKE you are using. See the Pod readiness section above for more information, and see the Requirements section for the GKE version requirements for using Pod readiness feedback. Consider upgrading your cluster to use Pod readiness feedback.

Using Pod readiness feedback

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: neg-demo-app # Label for the Deployment
  name: neg-demo-app # Name of Deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      run: neg-demo-app
  template: # Pod template
    metadata:
      labels:
        run: neg-demo-app # Labels Pods from this Deployment
    spec: # Pod specification; each Pod created by this Deployment has this specification
      containers:
      - image: k8s.gcr.io/serve_hostname:v1.4 # Application to run in Deployment's Pods
        name: hostname
  

Using a hardcoded delay

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: neg-demo-app # Label for the Deployment
  name: neg-demo-app # Name of Deployment
spec:
  minReadySeconds: 60 # Number of seconds to wait after a Pod is created and its status is Ready
  replicas: 3
  selector:
    matchLabels:
      run: neg-demo-app
  template: # Pod template
    metadata:
      labels:
        run: neg-demo-app # Labels Pods from this Deployment
    spec: # Pod specification; each Pod created by this Deployment has this specification
      containers:
      - image: k8s.gcr.io/serve_hostname:v1.4 # Application to run in Deployment's Pods
        name: hostname
      # Note: The following line is necessary only on clusters running GKE v1.11 and lower.
      # For details, see https://cloud.google.com/kubernetes-engine/docs/how-to/container-native-load-balancing#align_rollouts
      terminationGracePeriodSeconds: 60 # Number of seconds to wait for connections to terminate before shutting down Pods
  

Save this manifest as neg-demo-app.yaml, then create the Deployment by running the following command:

kubectl apply -f neg-demo-app.yaml
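
Optionally, wait for the rollout to complete and confirm that all three replicas are available before creating the Service:

kubectl rollout status deployment neg-demo-app
kubectl get deployment neg-demo-app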

Create a Service

The manifest below specifies a Service, neg-demo-svc.

  • Any Pod with the label run: neg-demo-app is a member of this Service.
  • The Service has one ServicePort field with port 80.
  • The cloud.google.com/neg annotation specifies that port 80 is associated with a NEG.
  • Each member Pod must have a container that is listening on TCP port 9376.

apiVersion: v1
kind: Service
metadata:
  name: neg-demo-svc
  annotations:
    cloud.google.com/neg: '{"exposed_ports": {"80":{}}}'
spec:
  type: ClusterIP
  selector:
    run: neg-demo-app # Selects Pods labelled run: neg-demo-app
  ports:
  - port: 80
    protocol: TCP
    targetPort: 9376

Save this manifest as neg-demo-svc.yaml, then create the Service by running the following command:

kubectl apply -f neg-demo-svc.yaml

Service types

While this example uses a ClusterIP service, all five types of Service support standalone NEGs. We recommend the default type, ClusterIP.

Mapping ports to multiple NEGs

A Service can listen on more than one port. By definition, a NEG has only a single IP address and port. This means that if you specify a Service with multiple ports, the NEG controller creates a NEG for each port.

The format of the cloud.google.com/neg annotation is:

cloud.google.com/neg: '{
   "exposed_ports":{
      "service-port-1":{},
      "service-port-2":{},
      "service-port-3":{},
      ...
   }
}'

where service-port-n are distinct port numbers that refer to existing service ports of the Service. For each service port listed, the NEG controller creates one NEG in each zone the cluster occupies.
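
For example, for a hypothetical Service named my-multi-port-svc that defines service ports 80 and 443, you could expose both ports as NEGs with an annotation like the following (a sketch; the ports must match ports that exist in the Service spec):

kubectl annotate service my-multi-port-svc --overwrite \
    cloud.google.com/neg='{"exposed_ports":{"80":{},"443":{}}}'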

Retrieve NEG statuses

Use this command to retrieve the statuses of the cluster's Services:

kubectl get service neg-demo-svc -o yaml

This command outputs the Service manifest. The NEG status appears in the cloud.google.com/neg-status annotation, in this format:

cloud.google.com/neg-status: '{
   "network_endpoint_groups":{
      "service-port-1": "neg-name-1",
      "service-port-2": "neg-name-2",
      ...
   },
   "zones":["zone-1", "zone-2", ...]
}'

where each entry in the network_endpoint_groups mapping is a service port (like service-port-1) paired with the name of the corresponding managed NEG (like neg-name-1). The zones list contains every zone (like zone-1) that has a NEG in it.

The example below is the full output of the command:

apiVersion: v1
kind: Service
metadata:
  annotations:
    cloud.google.com/neg: '{"exposed_ports": {"80":{}}}'
    cloud.google.com/neg-status: '{"network_endpoint_groups":{"80":"k8s1-cca197ad-default-neg-demo-app-80-4db81e02"},"zones":["us-central1-a", "us-central1-b"]}'
  labels:
    run: neg-demo-app
  name: neg-demo-app
  namespace: default
  selfLink: /api/v1/namespaces/default/services/neg-demo-app
  ...
spec:
  clusterIP: 10.0.14.252
  ports:
  - port: 80
    protocol: TCP
    targetPort: 9376
  selector:
    run: neg-demo-app
  sessionAffinity: None
status:
  loadBalancer: {}

In this example, the annotation shows that service port 80 is exposed to NEGs named k8s1-cca197ad-default-neg-demo-app-80-4db81e02 located in zones us-central1-a and us-central1-b.

Validate NEG creation

A NEG is created within a few minutes of Service creation. If there are Pods that match the label specified in the Service manifest, then upon creation the NEG will contain the IPs of the Pods.

Verify that the NEG exists by listing the NEGs in your Google Cloud project and checking for a NEG that matches the Service you created. The NEG's name has this format: k8s1-cluster-uid-namespace-service-port-random-hash

Use this command to list NEGs:

gcloud compute network-endpoint-groups list

The output resembles this:

NAME                                          LOCATION       ENDPOINT_TYPE   SIZE
k8s1-70aa83a6-default-my-service-80-c9710a6f  us-central1-a  GCE_VM_IP_PORT  3

This output shows that the SIZE of the NEG is 3, meaning that it has three endpoints which correspond to the three Pods in the Deployment.

Identify the individual endpoints with this command:

gcloud compute network-endpoint-groups list-network-endpoints \
    k8s1-70aa83a6-default-my-service-80-c9710a6f

The output shows three endpoints; each endpoint has a Pod's IP address and port:

INSTANCE                                           IP_ADDRESS  PORT
gke-standard-cluster-3-default-pool-4cc71a15-qlpf  10.12.1.43  9376
gke-standard-cluster-3-default-pool-4cc71a15-qlpf  10.12.1.44  9376
gke-standard-cluster-3-default-pool-4cc71a15-w9nk  10.12.2.26  9376
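
As a cross-check, the Pod IP addresses reported by Kubernetes should match the endpoint IP addresses listed above. The label selector matches the Deployment created earlier in this guide:

kubectl get pods -l run=neg-demo-app -o wide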

Attaching a load balancer to your standalone NEGs

Now that you've created your NEGs, you can use them as backends for the following types of load balancers:

  • An external HTTP(S) load balancer
  • An internal HTTP(S) load balancer
  • An SSL proxy load balancer
  • A TCP proxy load balancer

The following examples show you how:

Attaching an external HTTP(S) load balancer to standalone NEGs

The following steps show how to create an external HTTP load balancer using the Compute Engine API.

  1. Create a firewall rule. Load balancers need access to cluster endpoints to perform health checks. This command creates a firewall rule that allows that access:

    gcloud compute firewall-rules create fw-allow-health-check-and-proxy \
      --network=network-name \
      --action=allow \
      --direction=ingress \
      --target-tags=gke-node-network-tags \
      --source-ranges=130.211.0.0/22,35.191.0.0/16 \
      --rules=tcp:9376

    where gke-node-network-tags are the network tags on the GKE nodes and network-name is the network where the cluster runs.

    If you did not create custom network tags for your nodes, GKE generated tags for you automatically. You can look up these generated tags with the following command:

    gcloud compute instances describe node-name
    
  2. Create a global virtual IP address for the load balancer:

    gcloud compute addresses create hostname-server-vip \
      --ip-version=IPV4 \
      --global
  3. Create a health check. This is used by the load balancer to detect the liveness of individual endpoints within the NEG.

    gcloud compute health-checks create http http-basic-check \
      --use-serving-port
  4. Create a backend service that specifies that this is a global external HTTP(S) load balancer:

    gcloud compute backend-services create my-bes \
      --protocol HTTP \
      --health-checks http-basic-check \
      --global
  5. Create a URL map and target proxy for the load balancer. This example is very simple because the serve_hostname app used for this guide has a single endpoint and does not feature URLs.

    gcloud compute url-maps create web-map \
      --default-service my-bes
    gcloud compute target-http-proxies create http-lb-proxy \
      --url-map web-map
  6. Create a forwarding rule. This is what creates the load balancer.

    gcloud compute forwarding-rules create http-forwarding-rule \
      --address=hostname-server-vip \
      --global \
      --target-http-proxy=http-lb-proxy \
      --ports=80

    hostname-server-vip is the IP address to use for the load balancer. You can reserve a new static external IP address for this purpose. You can also omit the --address flag, in which case an ephemeral IP address is assigned automatically.

Checkpoint

These are the resources you have created so far:

  • An external virtual IP address
  • A forwarding rule
  • A firewall rule
  • The target HTTP proxy
  • The URL map
  • The backend service
  • The Compute Engine health check

The relationship between these resources is shown in the diagram below:

""

These resources together are a load balancer. In the next step you will add backends to the load balancer.

One of the benefits of standalone NEGs demonstrated here is that the lifecycles of the load balancer and the backends can be completely independent. The load balancer can continue running after the application, its Services, or the GKE cluster is deleted. You can add NEGs to and remove NEGs from the load balancer without changing any of the frontend load balancer objects.

Add backends to the load balancer

Use gcloud compute backend-services add-backend to connect the NEG to the load balancer by adding it as a backend of the my-bes backend service:

gcloud compute backend-services add-backend my-bes --global \
   --network-endpoint-group network-endpoint-group-name \
   --network-endpoint-group-zone network-endpoint-group-zone \
   --balancing-mode RATE --max-rate-per-endpoint 5

where:

  • network-endpoint-group-name is the name of your network endpoint group. See the instructions below to find this value.
  • network-endpoint-group-zone is the zone your network endpoint group is in. See the instructions below to find this value.

Use this command to get the name and location of the NEG:

gcloud compute network-endpoint-groups list

The output resembles this:

NAME                                          LOCATION       ENDPOINT_TYPE   SIZE
k8s1-70aa83a6-default-my-service-80-c9710a6f  us-central1-a  GCE_VM_IP_PORT  3

In this example output, the name of the NEG is k8s1-70aa83a6-default-my-service-80-c9710a6f and it is in zone us-central1-a.

Multiple NEGs can be added to the same backend service. Global backend services like my-bes can have NEG backends in different regions, while regional backend services must have backends in a single region.
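
For example, if your cluster also had nodes in us-central1-b, the NEG controller would create a NEG with the same name in that zone, and you could attach it to the same backend service. This is a sketch that reuses the example names above:

gcloud compute backend-services add-backend my-bes --global \
   --network-endpoint-group k8s1-70aa83a6-default-my-service-80-c9710a6f \
   --network-endpoint-group-zone us-central1-b \
   --balancing-mode RATE --max-rate-per-endpoint 5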

Validate that the load balancer works

There are two ways to validate that the load balancer you set up is working: verify that the health check is correctly configured and reporting healthy, and access the application to verify its response.

Verify health checks

Check that the backend service is associated with the health check and network endpoint groups, and that the individual endpoints are healthy.

Use this command to check that the backend service is associated with your health check and your network endpoint group:

gcloud compute backend-services describe my-bes --global

The output should resemble this:

backends:
- balancingMode: RATE
  capacityScaler: 1.0
  group: ... /networkEndpointGroups/k8s1-70aa83a6-default-my-service-80-c9710a6f
...
healthChecks:
- ... /healthChecks/http-basic-check
...
name: my-bes
...

Next, check the health of the individual endpoints:

gcloud compute backend-services get-health my-bes --global

The status: section of the output should resemble this:

status:
  healthStatus:
  - healthState: HEALTHY
    instance: ... gke-standard-cluster-3-default-pool-4cc71a15-qlpf
    ipAddress: 10.12.1.43
    port: 50000
  - healthState: HEALTHY
    instance: ... gke-standard-cluster-3-default-pool-4cc71a15-qlpf
    ipAddress: 10.12.1.44
    port: 50000
  - healthState: HEALTHY
    instance: ... gke-standard-cluster-3-default-pool-4cc71a15-w9nk
    ipAddress: 10.12.2.26
    port: 50000

Access the application

Access the application through the load balancer's IP address to confirm that everything is working.

First, get the virtual IP address of the load balancer:

gcloud compute addresses describe hostname-server-vip --global | grep "address:"

The output will include an IP address. Next, send a request to that IP address (34.98.102.37 in this example):

curl 34.98.102.37

The response from the serve_hostname application should be the name of one of the neg-demo-app Pods.
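
To see the traffic spread across the Deployment's three Pods, you can repeat the request. This sketch uses the example IP address above; each response is the name of the Pod that served the request:

for i in `seq 1 10`; do curl -s 34.98.102.37; echo; done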

Attaching an internal HTTP(S) load balancer to standalone NEGs

This section provides instructions for configuring Internal HTTP(S) Load Balancing for services running in GKE Pods, using standalone NEGs.

Configuring the proxy-only subnet

The proxy-only subnet is for all internal HTTP(S) load balancers in the load balancer's region, in this example us-west1.

Console

If you're using the Google Cloud Console, you can wait and create the proxy-only subnet later in the Load Balancing UI.

gcloud

Create the proxy-only subnet with the gcloud compute networks subnets create command.

gcloud compute networks subnets create proxy-only-subnet \
  --purpose=INTERNAL_HTTPS_LOAD_BALANCER \
  --role=ACTIVE \
  --region=us-west1 \
  --network=lb-network \
  --range=10.129.0.0/23

API

Create the proxy-only subnet with the subnetworks.insert method, replacing project-id with your project ID.

POST https://www.googleapis.com/compute/v1/projects/project-id/regions/us-west1/subnetworks

{
  "name": "proxy-only-subnet",
  "ipCidrRange": "10.129.0.0/23",
  "network": "projects/project-id/global/networks/lb-network",
  "region": "projects/project-id/regions/us-west1",
  "purpose": "INTERNAL_HTTPS_LOAD_BALANCER",
  "role": "ACTIVE"
}

Configuring firewall rules

This example uses the following firewall rules:

  • fw-allow-ssh: An ingress rule, applicable to the instances being load balanced, that allows incoming SSH connectivity on TCP port 22 from any address. You can choose a more restrictive source IP range for this rule; for example, you can specify just the IP ranges of the system from which you initiate SSH sessions. This example uses the target tag allow-ssh to identify the VMs to which the firewall rule applies.

  • fw-allow-health-check: An ingress rule, applicable to the instances being load balanced, that allows all TCP traffic from the Google Cloud health checking systems (in 130.211.0.0/22 and 35.191.0.0/16). This example uses the target tag load-balanced-backend to identify the instances to which it should apply.

  • fw-allow-proxies: An ingress rule, applicable to the instances being load balanced, that allows TCP traffic on ports 80, 443, and 8000 from the internal HTTP(S) load balancer's managed proxies. This example uses the target tag load-balanced-backend to identify the instances to which it should apply.

Without these firewall rules, the default deny ingress rule blocks incoming traffic to the backend instances.

Console

  1. Go to the Firewall rules page in the Google Cloud Console.
    Go to the Firewall rules page
  2. Click Create firewall rule to create the rule to allow incoming SSH connections:
    • Name: fw-allow-ssh
    • Network: lb-network
    • Direction of traffic: ingress
    • Action on match: allow
    • Targets: Specified target tags
    • Target tags: allow-ssh
    • Source filter: IP ranges
    • Source IP ranges: 0.0.0.0/0
    • Protocols and ports:
      • Choose Specified protocols and ports.
      • Check tcp and type 22 for the port number.
  3. Click Create.
  4. Click Create firewall rule a second time to create the rule to allow Google Cloud health checks:
    • Name: fw-allow-health-check
    • Network: lb-network
    • Direction of traffic: ingress
    • Action on match: allow
    • Targets: Specified target tags
    • Target tags: load-balanced-backend
    • Source filter: IP ranges
    • Source IP ranges: 130.211.0.0/22 and 35.191.0.0/16
    • Protocols and ports:
      • Choose Specified protocols and ports
      • Check tcp and enter 80.
        As a best practice, limit this rule to just the protocols and ports that match those used by your health check. If you use tcp:80 for the protocol and port, Google Cloud can contact your VMs using HTTP on port 80, but it cannot contact them using HTTPS on port 443.
  5. Click Create.
  6. Click Create firewall rule a third time to create the rule to allow the load balancer's proxy servers to connect to the backends:
    • Name: fw-allow-proxies
    • Network: lb-network
    • Direction of traffic: ingress
    • Action on match: allow
    • Targets: Specified target tags
    • Target tags: load-balanced-backend
    • Source filter: IP ranges
    • Source IP ranges: 10.129.0.0/23
    • Protocols and ports:
      • Choose Specified protocols and ports.
      • Check tcp and type 80, 443, 8000 for the port numbers.
  7. Click Create.

gcloud

  1. Create the fw-allow-ssh firewall rule to allow SSH connectivity to VMs with the network tag allow-ssh. When you omit source-ranges, Google Cloud interprets the rule to mean any source.

    gcloud compute firewall-rules create fw-allow-ssh \
        --network=lb-network \
        --action=allow \
        --direction=ingress \
        --target-tags=allow-ssh \
        --rules=tcp:22
    
  2. Create the fw-allow-health-check rule to allow Google Cloud health checks. This example allows all TCP traffic from health check probers; however, you can configure a narrower set of ports to meet your needs.

    gcloud compute firewall-rules create fw-allow-health-check \
        --network=lb-network \
        --action=allow \
        --direction=ingress \
        --source-ranges=130.211.0.0/22,35.191.0.0/16 \
        --target-tags=load-balanced-backend \
        --rules=tcp
    
  3. Create the fw-allow-proxies rule to allow the internal HTTP(S) load balancer's proxies to connect to your backends.

    gcloud compute firewall-rules create fw-allow-proxies \
      --network=lb-network \
      --action=allow \
      --direction=ingress \
      --source-ranges=10.129.0.0/23 \
      --target-tags=load-balanced-backend \
      --rules=tcp:80,tcp:443,tcp:8000
    

API

Create the fw-allow-ssh firewall rule by making a POST request to the firewalls.insert method, replacing project-id with your project ID.

POST https://www.googleapis.com/compute/v1/projects/project-id/global/firewalls

{
  "name": "fw-allow-ssh",
  "network": "projects/project-id/global/networks/lb-network",
  "sourceRanges": [
    "0.0.0.0/0"
  ],
  "targetTags": [
    "allow-ssh"
  ],
  "allowed": [
   {
     "IPProtocol": "tcp",
     "ports": [
       "22"
     ]
   }
  ],
 "direction": "INGRESS"
}

Create the fw-allow-health-check firewall rule by making a POST request to the firewalls.insert method, replacing project-id with your project ID.

POST https://www.googleapis.com/compute/v1/projects/project-id/global/firewalls

{
  "name": "fw-allow-health-check",
  "network": "projects/project-id/global/networks/lb-network",
  "sourceRanges": [
    "130.211.0.0/22",
    "35.191.0.0/16"
  ],
  "targetTags": [
    "load-balanced-backend"
  ],
  "allowed": [
    {
      "IPProtocol": "tcp"
    }
  ],
  "direction": "INGRESS"
}

Create the fw-allow-proxies firewall rule to allow TCP traffic within the proxy subnet by making a POST request to the firewalls.insert method, replacing project-id with your project ID.

POST https://www.googleapis.com/compute/v1/projects/project-id/global/firewalls

{
  "name": "fw-allow-proxies",
  "network": "projects/project-id/global/networks/lb-network",
  "sourceRanges": [
    "10.129.0.0/23"
  ],
  "targetTags": [
    "load-balanced-backend"
  ],
  "allowed": [
    {
      "IPProtocol": "tcp",
      "ports": [
        "80"
      ]
    },
    {
      "IPProtocol": "tcp",
      "ports": [
        "443"
      ]
    },
    {
      "IPProtocol": "tcp",
      "ports": [
        "8000"
      ]
    }
  ],
  "direction": "INGRESS"
}

Configuring the load balancer

For the forwarding rule's IP address, use the backend subnet. If you try to use the proxy-only subnet, forwarding rule creation fails.

Console

Select a load balancer type

  1. Go to the Load balancing page in the Google Cloud Console.
    Go to the Load balancing page
  2. Under HTTP(S) Load Balancing, click Start configuration.
  3. Select Only between my VMs. This setting means that the load balancer is internal.
  4. Click Continue.

Prepare the load balancer

  1. For the Name of the load balancer, enter l7-ilb-gke-map.
  2. For the Region, select us-west1.
  3. For the Network, select lb-network.
  4. Keep the window open to continue.

Reserve a proxy-only subnet

For Internal HTTP(S) Load Balancing, reserve a proxy subnet:

  1. Click Reserve a Subnet.
  2. For the Name, enter proxy-only-subnet.
  3. For the IP address range, enter 10.129.0.0/23.
  4. Click Add.

Configure the backend service

  1. Click Backend configuration.
  2. From the Create or select backend services menu, select Create a backend service.
  3. Set the Name of the backend service to l7-ilb-gke-backend-service.
  4. Under Backend type, select Network endpoint groups.
  5. In the New backend card of the Backends section:
    1. Set the Network endpoint group to the NEG that was created by GKE. To get the NEG name, see Validate NEG creation.
    2. Enter a maximum rate of 5 RPS per endpoint. Google Cloud will exceed this maximum if necessary.
    3. Click Done.
  6. In the Health check section, choose Create a health check with the following parameters:
    1. Name: l7-ilb-gke-basic-check
    2. Protocol: HTTP
    3. Port specification: Serving port
    4. Click Save and Continue.
  7. Click Create.

Configure the URL map

  1. Click Host and path rules. Ensure that the l7-ilb-gke-backend-service is the only backend service for any unmatched host and any unmatched path.

Configure the frontend

For HTTP:

  1. Click Frontend configuration.
  2. Click Add frontend IP and port.
  3. Set the Name to l7-ilb-gke-forwarding-rule.
  4. Set the Protocol to HTTP.
  5. Set the Subnetwork to backend-subnet.
  6. Under Internal IP, select Reserve a static internal IP address.
  7. In the panel that appears provide the following details:
    1. Name: l7-ilb-gke-ip
    2. In the Static IP address section, select Let me choose.
    3. In the Custom IP address section, enter 10.1.2.199.
    4. Click Reserve.
  8. Set the Port to 80.
  9. Click Done.

For HTTPS:

If you are using HTTPS between the client and the load balancer, you need one or more SSL certificate resources to configure the proxy. See SSL Certificates for information on how to create SSL certificate resources. Google-managed certificates aren't currently supported with internal HTTP(S) load balancers.

  1. Click Frontend configuration.
  2. Click Add frontend IP and port.
  3. In the Name field, enter l7-ilb-gke-forwarding-rule.
  4. In the Protocol field, select HTTPS (includes HTTP/2).
  5. Set the Subnet to backend-subnet.
  6. Under Internal IP, select Reserve a static internal IP address.
  7. In the panel that appears provide the following details:
    1. Name: l7-ilb-gke-ip
    2. In the Static IP address section, select Let me choose.
    3. In the Custom IP address section, enter 10.1.2.199.
    4. Click Reserve.
  8. Ensure that the Port is set to 443, to allow HTTPS traffic.
  9. Click the Certificate drop-down list.
    1. If you already have a self-managed SSL certificate resource you want to use as the primary SSL certificate, select it from the drop-down menu.
    2. Otherwise, select Create a new certificate.
      1. Fill in a Name of l7-ilb-cert.
      2. In the appropriate fields upload your PEM-formatted files:
        • Public key certificate
        • Certificate chain
        • Private key
      3. Click Create.
  10. To add certificate resources in addition to the primary SSL certificate resource:
    1. Click Add certificate.
    2. Select a certificate from the Certificates list or click Create a new certificate and follow the instructions above.
  11. Click Done.

Complete the configuration

  1. Click Create.

gcloud

  1. Define the HTTP health check with the gcloud compute health-checks create http command.

    gcloud compute health-checks create http l7-ilb-gke-basic-check \
       --region=us-west1 \
       --use-serving-port
    
  2. Define the backend service with the gcloud compute backend-services create command.

    gcloud compute backend-services create l7-ilb-gke-backend-service \
      --load-balancing-scheme=INTERNAL_MANAGED \
      --protocol=HTTP \
      --health-checks=l7-ilb-gke-basic-check \
      --health-checks-region=us-west1 \
      --region=us-west1
    
  3. Add NEG backends to the backend service with the gcloud compute backend-services add-backend command.

    gcloud compute backend-services add-backend l7-ilb-gke-backend-service \
       --network-endpoint-group=$DEPLOYMENT_NAME \
       --network-endpoint-group-zone=us-west1-b \
       --region=us-west1 \
       --balancing-mode=RATE \
       --max-rate-per-endpoint=5
    
  4. Create the URL map with the gcloud compute url-maps create command.

    gcloud compute url-maps create l7-ilb-gke-map \
      --default-service=l7-ilb-gke-backend-service \
      --region=us-west1
    
  5. Create the target proxy.

    For HTTP:

    Use the gcloud compute target-http-proxies create command.

    gcloud compute target-http-proxies create l7-ilb-gke-proxy \
      --url-map=l7-ilb-gke-map \
      --url-map-region=us-west1 \
      --region=us-west1
    

    For HTTPS:

    See SSL Certificates for information on how to create SSL certificate resources. Google-managed certificates aren't currently supported with internal HTTP(S) load balancers.

    Assign your filepaths to variable names.

    export LB_CERT=path to PEM-formatted file
    
    export LB_PRIVATE_KEY=path to PEM-formatted file
    

    Create a regional SSL certificate using the gcloud compute ssl-certificates create command.

    gcloud compute ssl-certificates create l7-ilb-cert \
      --certificate=$LB_CERT \
      --private-key=$LB_PRIVATE_KEY \
      --region=us-west1
    

    Use the regional SSL certificate to create a target proxy with the gcloud compute target-https-proxies create command.

    gcloud compute target-https-proxies create l7-ilb-gke-proxy \
      --url-map=l7-ilb-gke-map \
      --region=us-west1 \
      --ssl-certificates=l7-ilb-cert
    
  6. Create the forwarding rule.

    For custom networks, you must reference the subnet in the forwarding rule. Note that this is the VM subnet, not the proxy subnet.

    For HTTP:

    Use the gcloud compute forwarding-rules create command with the correct flags.

    gcloud compute forwarding-rules create l7-ilb-gke-forwarding-rule \
      --load-balancing-scheme=INTERNAL_MANAGED \
      --network=lb-network \
      --subnet=backend-subnet \
      --address=10.1.2.199 \
      --ports=80 \
      --region=us-west1 \
      --target-http-proxy=l7-ilb-gke-proxy \
      --target-http-proxy-region=us-west1
    

    For HTTPS:

    Use the gcloud compute forwarding-rules create command with the correct flags.

    gcloud compute forwarding-rules create l7-ilb-gke-forwarding-rule \
      --load-balancing-scheme=INTERNAL_MANAGED \
      --network=lb-network \
      --subnet=backend-subnet \
      --address=10.1.2.199 \
      --ports=443 \
      --region=us-west1 \
      --target-https-proxy=l7-ilb-gke-proxy \
      --target-https-proxy-region=us-west1
    

API

Create the health check by making a POST request to the regionHealthChecks.insert method, replacing project-id with your project ID.

POST https://compute.googleapis.com/compute/v1/projects/project-id/regions/us-west1/healthChecks

{
   "name": "l7-ilb-gke-basic-check",
   "type": "HTTP",
   "httpHealthCheck": {
     "portSpecification": "USE_SERVING_PORT"
   }
}

Create the regional backend service by making a POST request to the regionBackendServices.insert method, replacing project-id with your project ID and neg-name with the name of the NEG that you created.

POST https://www.googleapis.com/compute/v1/projects/project-id/regions/us-west1/backendServices

{
  "name": "l7-ilb-gke-backend-service",
  "backends": [
    {
      "group": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-west1-b/networkEndpointGroups/neg-name",
      "balancingMode": "RATE",
      "maxRatePerEndpoint": 5
    }
  ],
  "healthChecks": [
    "projects/project-id/regions/us-west1/healthChecks/l7-ilb-gke-basic-check"
  ],
  "loadBalancingScheme": "INTERNAL_MANAGED"
}

Create the URL map by making a POST request to the regionUrlMaps.insert method, replacing project-id with your project ID.

POST https://compute.googleapis.com/compute/v1/projects/project-id/regions/us-west1/urlMaps

{
  "name": "l7-ilb-gke-map",
  "defaultService": "projects/project-id/regions/us-west1/backendServices/l7-ilb-gke-backend-service"
}

Create the target HTTP proxy by making a POST request to the regionTargetHttpProxies.insert method, replacing project-id with your project ID.

POST https://www.googleapis.com/compute/v1/projects/project-id/regions/us-west1/targetHttpProxies

{
  "name": "l7-ilb-gke-proxy",
  "urlMap": "projects/project-id/global/urlMaps/l7-ilb-gke-map",
  "region": "us-west1"
}

Create the forwarding rule by making a POST request to the forwardingRules.insert method, replacing project-id with your project ID.

POST https://www.googleapis.com/compute/v1/projects/project-id/regions/us-west1/forwardingRules

{
  "name": "l7-ilb-gke-forwarding-rule",
  "IPAddress": "10.1.2.199",
  "IPProtocol": "TCP",
  "portRange": "80-80",
  "target": "projects/project-id/regions/us-west1/targetHttpProxies/l7-ilb-gke-proxy",
  "loadBalancingScheme": "INTERNAL_MANAGED",
  "subnetwork": "projects/project-id/regions/us-west1/subnetworks/backend-subnet",
  "network": "projects/project-id/global/networks/lb-network",
  "networkTier": "PREMIUM",
}

Testing

Create a VM instance in the zone to test connectivity:

gcloud compute instances create l7-ilb-client-us-west1-b \
    --image-family=debian-9 \
    --image-project=debian-cloud \
    --zone=us-west1-b \
    --network=lb-network \
    --subnet=backend-subnet \
    --tags=l7-ilb-client,allow-ssh

Log in to the client instance to verify that HTTP(S) services on the backends are reachable via the internal HTTP(S) load balancer's forwarding rule IP address, and that traffic is being load balanced among the endpoints in the NEG.

Connect to the client instance using SSH:

gcloud compute ssh l7-ilb-client-us-west1-b \
    --zone=us-west1-b

Verify that the IP is serving its hostname.

curl 10.1.2.199

For HTTPS testing, replace curl with:

curl -k -s 'https://test.example.com:443' --connect-to test.example.com:443:10.1.2.199:443

The -k flag causes curl to skip certificate validation.

Run 100 requests and confirm that they are load balanced.

For HTTP:

{
RESULTS=
for i in {1..100}
do
    RESULTS="$RESULTS:$(curl --silent 10.1.2.199)"
done
echo "***"
echo "*** Results of load-balancing to 10.1.2.199: "
echo "***"
echo "$RESULTS" | tr ':' '\n' | grep -Ev "^$" | sort | uniq -c
echo
}

For HTTPS:

{
RESULTS=
for i in {1..100}
do
    RESULTS="$RESULTS:$(curl -k -s 'https://test.example.com:443' --connect-to test.example.com:443:10.1.2.199:443
)"
done
echo "***"
echo "*** Results of load-balancing to 10.1.2.199: "
echo "***"
echo "$RESULTS" | tr ':' '\n' | grep -Ev "^$" | sort | uniq -c
echo
}

Implementing heterogeneous services (VMs and containers)

Load balancers can be frontends to mixed Kubernetes and non-Kubernetes workloads. This could be part of a migration from VMs to containers or a permanent architecture that benefits from a shared load balancer. This can be achieved by creating load balancers that target different kinds of backends including standalone NEGs.

VMs and containers in the same backend service

This example shows how to create a NEG that points at an existing VM running a workload, and how to add this NEG as another backend of an existing backendService. This way a single load balancer balances between VMs and GKE containers.

This example extends the earlier example that uses an external HTTP load balancer.

Because all endpoints are grouped by the same backendService, the VM and container endpoints are considered the same service. This means the host/path matching will treat all backends identically based on the URL map rules.

A diagram showing the architecture described above. The load balancer created earlier points to two NEGs, the NEG for containers created earlier and a new NEG containing a VM's IP address

When you use a NEG as a backend for a backend service, all other backends in that backend service must also be NEGs. You can't use instance groups and NEGs as backends in the same backend service. Additionally, containers and VMs cannot exist as endpoints within the same NEG so they must always be configured with separate NEGs.

  1. Deploy a VM to Compute Engine with this command:

    gcloud compute instances create vm1 --zone zone --network=network \
     --subnet=subnet --image-project=cos-cloud \
     --image-family=cos-stable --tags=vm-neg-tag
  2. Deploy an application to the VM:

    gcloud compute ssh vm1 --zone=zone --command="docker run -d --rm --network=host \
     k8s.gcr.io/serve_hostname:v1.4 && sudo iptables -P INPUT ACCEPT"

    This command deploys to the VM the same example application used in the earlier example. For simplicity, the application is run as a Docker container but this is not essential. The iptables command is required to allow firewall access to the running container.

  3. Validate that the application is serving on port 9376 and reporting that it is running on vm1:

    gcloud compute ssh vm1 --zone=zone --command="curl -s localhost:9376"

    The server should respond with vm1.

  4. Create a NEG to use with the VM endpoint. Containers and VMs can both be NEG endpoints, but a single NEG can't have both VM and container endpoints.

    gcloud compute network-endpoint-groups create vm-neg \
    --subnet=subnet --zone=zone
  5. Attach the VM endpoint to the NEG:

    gcloud compute network-endpoint-groups update vm-neg --zone=zone \
     --add-endpoint="instance=vm1,ip=vm-primary-ip,port=9376"
  6. Confirm that the NEG has the VM endpoint:

    gcloud compute network-endpoint-groups list-network-endpoints vm-neg --zone zone
  7. Attach the NEG to the backend service using the same command that you used to add a container backend:

    gcloud compute backend-services add-backend my-bes --global \
     --network-endpoint-group vm-neg \
     --network-endpoint-group-zone zone \
     --balancing-mode RATE --max-rate-per-endpoint 10
  8. Open the firewall to allow health checks of the VM:

    gcloud compute firewall-rules create fw-allow-health-check-to-vm1 \
    --network=network \
    --action=allow \
    --direction=ingress \
    --target-tags=vm-neg-tag \
    --source-ranges=130.211.0.0/22,35.191.0.0/16 \
    --rules=tcp:9376
    
  9. Validate that the load balancer is forwarding traffic to both the new vm1 backend and the existing container backend by sending test traffic:

    for i in `seq 1 100`; do curl ${VIP};echo; done

    You should see responses from both the container (neg-demo-app) and VM (vm1) endpoints.

VMs and containers for different backend services

This example shows how to create a NEG that points at an existing VM running a workload, and how to add this NEG as the backend to a new backendService. This is useful for the case where the containers and VMs are different services but need to share the same L7 load balancer, such as if the services share the same IP address or domain name.

This example extends the previous example that has a VM backend in the same backend service as the container backend. This example reuses that VM.

Because the container and VM endpoints are grouped in separate backendServices, they are considered different services. This means that the URL map will match backends and direct traffic to the VM or container based on the hostname.

The following diagram shows how a single virtual IP address corresponds to two host names, which in turn correspond to a container-based backend service and a VM-based backend service.

The following diagram shows the architecture described above:

The architecture has two NEGs, one for the service implemented with containers
and another for the service implemented with VMs. There is a Backend Service
object for each NEG. The URL Map object directs traffic to the correct backend
service based on the requested URL.

  1. Create a new backend service for the VM:

    gcloud compute backend-services create my-vm-bes \
      --protocol HTTP \
      --health-checks http-basic-check \
      --global
  2. Attach the NEG for the VM, vm-neg, to the backend-service:

    gcloud compute backend-services add-backend my-vm-bes --global \
     --network-endpoint-group vm-neg \
     --network-endpoint-group-zone zone \
     --balancing-mode RATE --max-rate-per-endpoint 10
  3. Add a host rule to the URL map to direct requests for the container.example.com host to the container backend service:

    gcloud compute url-maps add-path-matcher web-map \
      --path-matcher-name=container-path --default-service=my-bes \
      --new-hosts=container.example.com --global
    
  4. Add another host rule to the URL map to direct requests for the vm.example.com host to the VM backend service:

    gcloud compute url-maps add-path-matcher web-map \
      --path-matcher-name=vm-path --default-service=my-vm-bes \
      --new-hosts=vm.example.com --global
    
  5. Validate that the load balancer sends traffic to the VM backend based on the requested host:

    curl -H "HOST:vm.example.com" virtual-ip

Limitations of standalone NEGs

  • Annotation validation errors are exposed to the user through Kubernetes events.
  • NEG names are generated by the NEG controller. You can't supply custom names for NEGs.
  • The limitations of NEGs also apply to standalone NEGs.
  • Standalone NEGs do not work with legacy networks.
  • Standalone NEGs can only be used with compatible network services including Traffic Director and the compatible load balancer types.

Pricing

Refer to the load balancing section of the pricing page for details on load balancer pricing. There is no additional charge for NEGs.

Troubleshooting

No standalone NEG configured

Symptom: No NEG is created.

Potential Resolution:

  • Check the events associated with the Service and look for error messages (example commands follow this list).
  • Verify that the standalone NEG annotation is well-formed JSON, and that the exposed ports match existing ports in the Service spec.
  • Verify the NEG status annotation and check whether the expected service ports have corresponding NEGs.
  • Verify that the NEGs have been created in the expected zones, with the command gcloud compute network-endpoint-groups list.
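
For example, the following commands (a quick sketch using the Service name from this guide) surface the Service events and the NEG annotations in one place:

kubectl describe service neg-demo-svc
kubectl get service neg-demo-svc -o yaml | grep cloud.google.com/neg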

Traffic does not reach the endpoints

Symptom: 502 errors or rejected connections.

Potential Resolution:

  • After the Service is configured, new endpoints generally become reachable shortly after they are attached to the NEG, provided that they respond to health checks.
  • If traffic still cannot reach the endpoints, resulting in a 502 error code for HTTP(S) load balancers or rejected connections for TCP/SSL load balancers, check the following:
    • Verify that firewall rules allow incoming TCP traffic to your endpoints from the following ranges: 130.211.0.0/22 and 35.191.0.0/16.
    • Verify that your endpoints are healthy by using gcloud, or by calling the getHealth API on the backend service or the listEndpoints API on the NEG with the showHealth parameter set to SHOW.

Stalled rollout

Symptom: Rolling out an updated Deployment stalls, and the number of up-to-date replicas does not match the desired number of replicas.

Potential Resolution:

The deployment's health checks are failing. The container image might be bad or the health check might be misconfigured. The rolling replacement of Pods waits until the newly started Pod passes its Pod readiness gate. This only occurs if the Pod is responding to load balancer health checks. If the Pod does not respond, or if the health check is misconfigured, the readiness gate conditions can't be met and the rollout can't continue.

  • If you're using kubectl 1.13 or higher, you can check the status of a Pod's readiness gates with the following command:

    kubectl get pod my-pod -o wide

    Check the READINESS GATES column.

    This column doesn't exist in kubectl 1.12 and lower. A Pod that is marked as being in the READY state may have a failed readiness gate. To verify this, use the following command:

    kubectl get pod my-pod -o yaml

    The readiness gates and their status are listed in the output.

  • Verify that the container image in your Deployment's Pod specification is functioning correctly and is able to respond to health checks.

  • Verify that the health checks are correctly configured.

What's next