Setting up HTTP(S) Load Balancing with Ingress

This tutorial shows how to run a web application behind an external HTTP(S) load balancer by configuring the Ingress resource.

Background

Google Kubernetes Engine (GKE) offers integrated support for two types of Cloud Load Balancing for a publicly accessible application:

  • When you specify type: LoadBalancer in the resource manifest, GKE creates a Service of type LoadBalancer. GKE makes the appropriate Google Cloud API calls to create either an external network load balancer or an internal TCP/UDP load balancer: GKE creates an internal TCP/UDP load balancer when you add the cloud.google.com/load-balancer-type: "Internal" annotation; otherwise, GKE creates an external network load balancer. (A sketch of such a Service manifest appears after this list.)

    Although you can use either of these types of load balancers for HTTP(S) traffic, they operate in OSI layers 3/4 and are not aware of HTTP connections or individual HTTP requests and responses. Another important characteristic is that the requests are not proxied to the destination.

  • When you specify kind: Ingress in the resource manifest, you instruct GKE to create an Ingress resource. By including annotations and supporting workloads and Services, you can create a custom Ingress controller. Otherwise, GKE makes appropriate Google Cloud API calls to create an external HTTP(S) load balancer. The load balancer's URL map's host rules and path matchers reference one or more backend services, where each backend service corresponds to a GKE Service of type NodePort, as referenced in the Ingress. The backends for each backend service are either instance groups or network endpoint groups (NEGs). NEGs are created when you configure container-native load balancing as part of the configuration for your Ingress. For each backend service, GKE creates a Google Cloud health check, based on the readiness probe settings of the workload referenced by the corresponding GKE Service.

    If you are exposing an HTTP(S) service hosted on GKE, HTTP(S) load balancing is the recommended method for load balancing.
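
For illustration, a minimal Service manifest of type LoadBalancer might look like the following sketch. The name internal-web, the selector, and the port numbers are assumptions for this illustration and are not used elsewhere in this tutorial; removing the annotation would give you an external network load balancer instead.

apiVersion: v1
kind: Service
metadata:
  name: internal-web                                   # hypothetical name, not used later in this tutorial
  annotations:
    cloud.google.com/load-balancer-type: "Internal"    # remove to get an external network load balancer
spec:
  type: LoadBalancer
  selector:
    app: web                                           # assumes Pods labeled app: web
  ports:
  - port: 80                                           # port exposed by the load balancer
    targetPort: 8080                                   # port the container listens on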

Before you begin

Take the following steps to enable the Kubernetes Engine API:
  1. Visit the Kubernetes Engine page in the Google Cloud Console.
  2. Create or select a project.
  3. Wait for the API and related services to be enabled. This can take several minutes.
  4. Make sure that billing is enabled for your Google Cloud project. Learn how to confirm billing is enabled for your project.

Install the following command-line tools used in this tutorial:

  • gcloud is used to create and delete Kubernetes Engine clusters. gcloud is included in the Google Cloud SDK.
  • kubectl is used to manage Kubernetes, the cluster orchestration system used by Kubernetes Engine. You can install kubectl using gcloud:
    gcloud components install kubectl

Set defaults for the gcloud command-line tool

To save time typing your project ID and Compute Engine zone options in the gcloud command-line tool, you can set the defaults (replace project-id and compute-zone with your own values):
gcloud config set project project-id
gcloud config set compute/zone compute-zone

Create a container cluster

Create a container cluster named loadbalancedcluster by running:

gcloud container clusters create loadbalancedcluster

Step 1: Deploy a web application

Create a Deployment using the sample web application container image, which runs an HTTP server that listens on port 8080 (a sketch of the manifest appears after these steps):

  1. Download the web-deployment.yaml manifest.
  2. Apply the resource to the cluster:

    kubectl apply -f web-deployment.yaml
    
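If you cannot download the manifest, the following sketch shows roughly what web-deployment.yaml contains. The hello-app sample image and the run: web labels are assumptions based on the responses shown later in this tutorial.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 1
  selector:
    matchLabels:
      run: web
  template:
    metadata:
      labels:
        run: web
    spec:
      containers:
      - name: web
        image: gcr.io/google-samples/hello-app:1.0    # assumed sample image; serves "Hello, world!" on port 8080
        ports:
        - containerPort: 8080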

Step 2: Expose your Deployment as a Service internally

Create a Service resource to make the web deployment reachable within your container cluster (a sketch of the manifest appears after these steps).

  1. Download the web-service.yaml manifest.
  2. Apply the resource to the cluster:

    kubectl apply -f web-service.yaml
    

    When you create a Service of type NodePort with this command, GKE makes your Service available on a randomly selected high port number (for example, 32640) on all the nodes in your cluster.

  3. Verify the Service was created and a node port was allocated:

    kubectl get service web
    
    Output:
    NAME      TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
    web       NodePort   10.35.245.219   <none>        8080:32640/TCP   5m
    

    In the sample output above, the node port for the web Service is 32640. Also, note that there is no external IP allocated for this Service. Since the GKE nodes are not externally accessible by default, creating this Service does not make your application accessible from the Internet.
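
If you cannot download the manifest, web-service.yaml looks roughly like the following sketch; the run: web selector assumes the Pod labels used in the Deployment sketch above.

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: NodePort
  selector:
    run: web          # assumed Pod label from the Deployment sketch above
  ports:
  - port: 8080        # Service port that the Ingress references as servicePort
    targetPort: 8080  # container port of the web application
    protocol: TCP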

To make your HTTP(S) web server application publicly accessible, you need to create an Ingress resource.

Step 3: Create an Ingress resource

Ingress is a Kubernetes resource that encapsulates a collection of rules and configuration for routing external HTTP(S) traffic to internal services.

On GKE, Ingress is implemented using Cloud Load Balancing. When you create an Ingress in your cluster, GKE creates an HTTP(S) load balancer and configures it to route traffic to your application.

While the Kubernetes Ingress is a beta resource, meaning how you describe the Ingress object is subject to change, the Cloud Load Balancers that GKE provisions to implement the Ingress are production-ready.

The following config file defines an Ingress resource that directs traffic to your web Service:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: basic-ingress
spec:
  backend:
    serviceName: web
    servicePort: 8080

To deploy this Ingress resource:

  1. Download the basic-ingress.yaml manifest.
  2. Apply the resource to the cluster:

    kubectl apply -f basic-ingress.yaml
    

Once you deploy this manifest, Kubernetes creates an Ingress resource on your cluster. The GKE ingress controller creates and configures an HTTP(S) Load Balancer according to the information in the Ingress, routing all external HTTP traffic (on port 80) to the web NodePort Service you exposed.
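
Provisioning the load balancer can take several minutes. As an optional verification step (not required by the tutorial), you can inspect the Ingress events and backend annotations while you wait:

kubectl describe ingress basic-ingress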

Step 4: Visit your application

Find out the external IP address of the load balancer serving your application by running:

kubectl get ingress basic-ingress
Output:
NAME            HOSTS     ADDRESS         PORTS     AGE
basic-ingress   *         203.0.113.12    80        2m
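
The ADDRESS column may be empty for a few minutes while the load balancer is being provisioned; re-run the command until an address appears. If you prefer the command line to a browser, the following sketch extracts the address and sends a test request (the jsonpath expression assumes the address has already been populated):

EXTERNAL_IP=$(kubectl get ingress basic-ingress \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl http://$EXTERNAL_IP/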

Point your browser to the external IP address of your application; you should see a plain-text HTTP response like the following:

Hello, world!
Version: 1.0.0
Hostname: web-6498765b79-fq5q5

You can visit Load Balancing on Cloud Console and inspect the networking resources created by the Ingress controller.

Step 5: (Optional) Configure a static IP address

When you expose a web server on a domain name, the external IP address of your application should be a static IP address that does not change.

By default, GKE allocates ephemeral external IP addresses for HTTP applications exposed through an Ingress. Ephemeral addresses are subject to change. For a web application that you plan to run for a long time, use a static external IP address.

Note that once you configure a static IP for the Ingress resource, deleting the Ingress will not delete the static IP address associated with it. Make sure to clean up static IP addresses once you no longer need them.

Option 1: Convert existing ephemeral IP address to static IP address

If you already have an Ingress deployed, you can convert your application's existing ephemeral IP address to a reserved static IP address, without changing the address itself, by visiting the External IP addresses section in the Cloud Console.
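
Alternatively, you can promote the ephemeral address from the command line. The following sketch assumes the address shown by kubectl get ingress is 203.0.113.12, as in the earlier example output:

# Reserve the in-use ephemeral address as a static global address (the IP itself does not change)
gcloud compute addresses create web-static-ip \
  --global \
  --addresses 203.0.113.12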

Option 2: Reserve a new static IP address

  1. Reserve a static external IP address named web-static-ip:

    Using gcloud:

    gcloud compute addresses create web-static-ip --global
    

    Using Config Connector:

    Note: This step requires Config Connector. Follow the installation instructions to install Config Connector on your cluster.

    apiVersion: compute.cnrm.cloud.google.com/v1beta1
    kind: ComputeAddress
    metadata:
      name: web-static-ip
    spec:
      location: global
    To deploy this manifest, save it to your machine as compute-address.yaml, and run:
    kubectl apply -f compute-address.yaml

  2. Configure the existing Ingress resource to use the reserved IP address. Replace the basic-ingress.yaml manifest used earlier with the following manifest:

    apiVersion: networking.k8s.io/v1beta1
    kind: Ingress
    metadata:
      name: basic-ingress
      annotations:
        kubernetes.io/ingress.global-static-ip-name: "web-static-ip"
    spec:
      backend:
        serviceName: web
        servicePort: 8080
    

    This change adds an annotation to the Ingress so that it uses the static IP resource named web-static-ip.

  3. Apply this modification to the existing Ingress:

    kubectl apply -f basic-ingress.yaml
    
  4. Check the external IP address:

    kubectl get ingress basic-ingress
    

    Wait until the IP address of your application changes to use the reserved IP address of the web-static-ip resource.

    It may take a couple of minutes for GKE to update the existing Ingress resource, reconfigure the load balancer, and propagate the load balancing rules across the globe. Once this operation completes, GKE releases the ephemeral IP address previously allocated to your application.

Step 6: (Optional) Serve multiple applications on a load balancer

You can run multiple services on a single load balancer and public IP by configuring routing rules on the Ingress. By hosting multiple services on the same Ingress, you can avoid creating additional load balancers (which are billable resources) for every Service you expose to the Internet.

Create another web server Deployment with version 2.0 of the same web application.

Download web-deployment-v2.yaml, then apply the resource to the cluster:

kubectl apply -f web-deployment-v2.yaml
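
The v2 manifest mirrors the first Deployment. A sketch, assuming the 2.0 tag of the same sample image and a run: web2 label:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web2
spec:
  replicas: 1
  selector:
    matchLabels:
      run: web2
  template:
    metadata:
      labels:
        run: web2
    spec:
      containers:
      - name: web2
        image: gcr.io/google-samples/hello-app:2.0    # assumed version 2.0 of the sample image
        ports:
        - containerPort: 8080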

Then, expose the web2 Deployment internally to the cluster on a NodePort Service called web2.

Download web-service-v2.yaml, then apply the resource to the cluster:

kubectl apply -f web-service-v2.yaml
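
A sketch of web-service-v2.yaml, mirroring the first Service and assuming the run: web2 label from the Deployment sketch above:

apiVersion: v1
kind: Service
metadata:
  name: web2
spec:
  type: NodePort
  selector:
    run: web2
  ports:
  - port: 8080
    targetPort: 8080
    protocol: TCP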

The following manifest describes an Ingress resource that:

  • routes the requests with path starting with /v2/ to the web2 Service
  • routes all other requests to the web Service
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: fanout-ingress
spec:
  rules:
  - http:
      paths:
      - path: /*
        backend:
          serviceName: web
          servicePort: 8080
      - path: /v2/*
        backend:
          serviceName: web2
          servicePort: 8080

To deploy this manifest, save it to a file named fanout-ingress.yaml, and run:

kubectl create -f fanout-ingress.yaml

Once the Ingress is deployed, run kubectl get ingress fanout-ingress to find out the external IP address of the load balancer.

Then visit the IP address to see that both applications are reachable on the same load balancer:

  • Visit http://<IP_ADDRESS>/ and note that the response contains Version: 1.0.0 (as the request is routed to the web Service)
  • Visit http://<IP_ADDRESS>/v2/ and note that the response contains Version: 2.0.0 (as the request is routed to the web2 Service)
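
From the command line, the same check looks like this; replace <IP_ADDRESS> with the address reported by kubectl get ingress fanout-ingress:

curl http://<IP_ADDRESS>/      # expect a response containing Version: 1.0.0 (web Service)
curl http://<IP_ADDRESS>/v2/   # expect a response containing Version: 2.0.0 (web2 Service)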

The only supported wildcard pattern matching for the path field on GKE Ingress is through the * character. For example, you can have rules with path fields like /* or /foo/bar/*. Refer to the URL Maps documentation for the path limitations.

Step 7: (Optional) Monitor the availability and latency of your service

Google Cloud Uptime checks perform blackbox monitoring of applications from the viewpoint of the user, determining latency and availability from multiple external IPs to the IP address of the load balancer. In comparison, Google Cloud health checks perform an internal check against the Pod IPs, determining availability at the instance level. They are complementary and provide a holistic picture of application health.

You can create an uptime check by using the Google Cloud Console, the Cloud Monitoring API, or by using the Cloud Monitoring client libraries. For information, see Managing uptime checks. If you want to create an uptime check by using the Google Cloud Console, do the following:

  1. In the Google Cloud Console, select Monitoring, or click the following button:

    Go to Monitoring

  2. In the Monitoring navigation pane, select Uptime checks and then click Create uptime check.

  3. For the target of your uptime check, set the following fields:

    • For the protocol type, select TCP.
    • For the Resource type, select URL.
    • For the Hostname, enter the IP address of the Load Balancer.
    • Enter the Load Balancer port number in the Port field.

    For complete documentation on all the fields in an uptime check, see Creating an uptime check.

To monitor an uptime check, you can create an alerting policy or view the uptime check dashboard. An alerting policy can notify you by email or through a different channel if your uptime check fails. For general information about alerting policies, see Introduction to alerting.

Remarks

By default, Ingress performs a periodic health check by making a GET request on the / path to determine the health of the application, and expects an HTTP 200 response. If you want to check a different path or expect a different response code, you can use a custom health check path.
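
Because GKE derives the health check from the readiness probe of the serving Pods (see Background above), one way to change the checked path is to add a readiness probe to the Deployment. The following excerpt is a sketch; the /healthz endpoint is an assumption and must actually be served by your container:

# Excerpt of the container spec in a Deployment manifest
containers:
- name: web
  image: gcr.io/google-samples/hello-app:1.0
  ports:
  - containerPort: 8080
  readinessProbe:
    httpGet:
      path: /healthz        # custom health check path picked up by the GKE Ingress controller
      port: 8080
    periodSeconds: 10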

Ingress supports more advanced use cases, such as:

  • Name-based virtual hosting: You can use Ingress to reuse the load balancer for multiple domain names and subdomains, and to expose multiple Services on a single IP address and load balancer. Check out the simple fanout and name-based virtual hosting examples to learn how to configure Ingress for these tasks.

  • HTTPS termination: You can configure the Ingress to terminate HTTPS traffic using the Cloud Load Balancer, as sketched below.
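
As an illustration of HTTPS termination, the following sketch extends basic-ingress with a tls section that references a Kubernetes Secret. The Secret name web-tls-secret and the certificate it would contain are assumptions; they are not created anywhere in this tutorial.

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: basic-ingress
spec:
  tls:
  - secretName: web-tls-secret    # assumed Secret of type kubernetes.io/tls holding the certificate and key
  backend:
    serviceName: web
    servicePort: 8080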

When an Ingress is deleted, the Ingress controller cleans up the associated resources (except reserved static IP addresses) automatically.

Cleaning up

To avoid incurring charges to your Google Cloud Platform account for the resources used in this tutorial:

  1. Delete any manually created forwarding rules and target proxies that reference the Ingress:

    A dangling target proxy that references an Ingress-controller-managed URL map will cause the deletion of the Ingress to fail in GKE versions 1.15.4-gke.22+. You can inspect the Ingress resource to find an event with an error message similar to the following:

     Error during GC: error running load balancer garbage collection routine: googleapi: Error 400: The url_map resource 'projects/project-id/global/urlMaps/k8s2-um-tlw9rhgp-default-my-ingress-9ifnni82' is already being used by 'projects/project-id/global/targetHttpsProxies/k8s2-um-tlw9rhgp-default-my82-target-proxy', resourceInUseByAnotherResource
     

    In the sample error message above, k8s2-um-tlw9rhgp-default-my82-target-proxy is a manually created target HTTPS proxy that is still referencing the URL map k8s2-um-tlw9rhgp-default-my-ingress-9ifnni82, which was created and is managed by the Ingress controller.

    These manually created frontend resources (both the forwarding rule and the target proxy) need to be deleted before proceeding with the deletion of the Ingress, as sketched below.
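
    As a sketch, you can locate and delete the dangling frontend resources with gcloud; the resource names below are placeholders that vary per project:

    # List global forwarding rules and target HTTPS proxies to find the manually created ones
    gcloud compute forwarding-rules list --global
    gcloud compute target-https-proxies list

    # Delete the manually created frontend resources (placeholder names)
    gcloud compute forwarding-rules delete my-forwarding-rule --global
    gcloud compute target-https-proxies delete my-target-proxy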

  2. Delete the Ingress: This deallocates the ephemeral external IP address and the load balancing resources associated with your application:

    kubectl delete ingress basic-ingress

    If you followed "Step 6", also delete the fanout Ingress by running:

    kubectl delete ingress fanout-ingress

  3. Delete the static IP address: Execute this only if you followed Step 5.

    • If you have followed "Option 1" in Step 5 to convert an existing ephemeral IP address to static IP, visit Cloud Console to delete the static IP.

    • If you have followed "Option 2" in Step 5, run the following command to delete the static IP address:

      gcloud compute addresses delete web-static-ip --global
  4. Delete the cluster: This deletes the compute nodes of your container cluster and other resources such as the Deployments in the cluster:

    gcloud container clusters delete loadbalancedcluster

What's next