GKE Ingress for HTTP(S) load balancing

This page provides a general overview of how Ingress for HTTP(S) load balancing works. Google Kubernetes Engine (GKE) provides a built-in and managed Ingress controller called GKE Ingress. This controller implements Ingress resources as Google Cloud Load Balancers for HTTP(S) workloads in GKE.

Overview

In GKE, an Ingress object defines rules for routing HTTP(S) traffic to applications running in a cluster. An Ingress object is associated with one or more Service objects, each of which is associated with a set of Pods.

When you create an Ingress object, the GKE Ingress controller creates a Google Cloud HTTP(S) load balancer and configures it according to the information in the Ingress and its associated Services.

Ingress for external and internal traffic

GKE Ingress resources come in two types:

  • Ingress for external HTTP(S) load balancing deploys a global external HTTP(S) load balancer, which serves clients on the internet.
  • Ingress for internal HTTP(S) load balancing deploys an internal HTTP(S) load balancer, which serves clients inside your VPC network.

Features of HTTP(S) load balancing

HTTP(S) load balancing, configured by Ingress, includes the following features:

Flexible configuration for Services
An Ingress defines how traffic reaches your Services and how the traffic is routed to your application. In addition, an Ingress can provide a single IP address for multiple Services in your cluster.
Integration with Google Cloud network services
An Ingress can configure Google Cloud features such as Google-managed SSL certificates (Beta), Google Cloud Armor, Cloud CDN, and Identity-Aware Proxy; a BackendConfig sketch showing how such features are attached follows this list.
Support for multiple TLS certificates
An Ingress can specify the use of multiple TLS certificates for request termination.
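
Many of these integrations are attached through a BackendConfig resource that a Service references by annotation. The following is a minimal sketch, not a complete reference: the BackendConfig and policy names are hypothetical, and the BackendConfig API version available to you depends on your GKE version.

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: my-backendconfig            # hypothetical name
spec:
  cdn:
    enabled: true                   # serve this backend through Cloud CDN
  securityPolicy:
    name: my-armor-policy           # hypothetical Google Cloud Armor policy
---
apiVersion: v1
kind: Service
metadata:
  name: my-products
  annotations:
    # Attach the BackendConfig to every port of this Service.
    cloud.google.com/backend-config: '{"default": "my-backendconfig"}'
spec:
  type: NodePort
  selector:
    app: products
    department: sales
  ports:
  - protocol: TCP
    port: 60000
    targetPort: 50000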

Container-native load balancing

Container-native load balancing is the practice of load balancing directly to Pod endpoints in GKE using Network Endpoint Groups (NEGs).

Prior to NEGs, Compute Engine load balancers sent traffic to Instance Groups using Node IPs as backends. This method has a number of limitations:

  • It incurs two hops of load balancing.
  • It adds latency.
  • The Compute Engine load balancer has no direct visibility into Pods, resulting in suboptimal traffic balancing.

With NEGs, traffic is load balanced from the Ingress proxy directly to the Pod IP as opposed to traversing the node IP or kube-proxy networking. In addition, Pod readiness gates are implemented to determine the health of Pods from the perspective of the load balancer and not just the Kubernetes readiness and liveness checks. This ensures that traffic is not dropped during lifecycle events such as Pod startup, Pod loss, or node loss.
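
For reference, the readiness gate surfaces on Pods as a condition type that GKE injects automatically; you do not configure it yourself. The relevant portion of an affected Pod's spec looks something like this (shown for illustration only):

# Injected by GKE into Pods behind NEG-backed Services; illustrative only.
spec:
  readinessGates:
  - conditionType: "cloud.google.com/load-balancer-neg-ready"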

Container-native load balancing, deployed by using Ingress with NEGs, is strongly recommended and should be used whenever possible. It is not the default mode for Services and must be explicitly enabled with the cloud.google.com/neg annotation on each Service that is a backend for an Ingress rule:

kind: Service
metadata:
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
...
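
For illustration, here is a complete Service with the annotation in place; the Service name, labels, and ports are hypothetical. Note that with NEGs the Service can be of type ClusterIP, because traffic no longer needs to traverse node ports:

apiVersion: v1
kind: Service
metadata:
  name: my-neg-service                        # hypothetical name
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
spec:
  type: ClusterIP                             # NodePort is not required with NEGs
  selector:
    app: my-app                               # hypothetical label
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080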

Multiple backend services

An HTTP(S) load balancer provides one stable IP address that you can use to route requests to a variety of backend services.

For example, you can configure the load balancer to route requests to different backend services depending on the URL path. Requests sent to your-store.example could be routed to a backend service that displays full-price items, and requests sent to your-store.example/discounted could be routed to a backend service that displays discounted items.

You can also configure the load balancer to route requests according to the hostname. Requests sent to your-store.example could go to one backend service, and requests sent to your-experimental-store.example could go to another backend service.
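
A sketch of host-based routing follows; my-experimental-products is a hypothetical Service used only for illustration:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-host-ingress                         # hypothetical name
spec:
  rules:
  - host: your-store.example
    http:
      paths:
      - backend:
          serviceName: my-products
          servicePort: 60000
  - host: your-experimental-store.example
    http:
      paths:
      - backend:
          serviceName: my-experimental-products # hypothetical Service
          servicePort: 80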

In a GKE cluster, you create and configure an HTTP(S) load balancer by creating a Kubernetes Ingress object. An Ingress object must be associated with one or more Service objects, each of which is associated with a set of Pods.

Here is a manifest for an Ingress called my-ingress:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-ingress
spec:
  rules:
  - http:
      paths:
      - path: /*
        backend:
          serviceName: my-products
          servicePort: 60000
      - path: /discounted
        backend:
          serviceName: my-discounted-products
          servicePort: 80

When you create the Ingress, the GKE ingress controller creates and configures an HTTP(S) load balancer according to the information in the Ingress and the associated Services. Also, the load balancer is given a stable IP address that you can associate with a domain name.

In the preceding example, assume you have associated the load balancer's IP address with the domain name your-store.example. When a client sends a request to your-store.example, the request is routed to the Kubernetes Service named my-products on port 60000. When a client sends a request to your-store.example/discounted, the request is routed to the Kubernetes Service named my-discounted-products on port 80.

The only supported wildcard character for the path field of an Ingress is the * character. The * character must follow a forward slash (/) and must be the last character in the pattern. For example, /*, /foo/*, and /foo/bar/* are valid patterns, but *, /foo/bar*, and /foo/*/bar are not.

A more specific pattern takes precedence over a less specific pattern. If you have both /foo/* and /foo/bar/*, then /foo/bar/bat is taken to match /foo/bar/*.

For more information about path limitations and pattern matching, see the URL Maps documentation.

The manifest for the my-products Service might look like this:

apiVersion: v1
kind: Service
metadata:
  name: my-products
spec:
  type: NodePort
  selector:
    app: products
    department: sales
  ports:
  - protocol: TCP
    port: 60000
    targetPort: 50000

In the Service manifest, notice that the type is NodePort. This is the required type for an Ingress that is used to configure an HTTP(S) load balancer, unless you use container-native load balancing; with NEGs, ClusterIP Services are also supported.

In the Service manifest, the selector field specifies that any Pod with both the app: products label and the department: sales label is a member of this Service.

When a request comes to the Service on port 60000, it is routed to one of the member Pods on TCP port 50000.

Each member Pod must have a container listening on TCP port 50000.
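
To make the relationship between the selector and targetPort concrete, here is one possible Deployment whose Pods would be members of my-products; the Deployment name and container image are hypothetical:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-products-deployment        # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: products
      department: sales
  template:
    metadata:
      labels:
        app: products                 # both labels match the Service selector
        department: sales
    spec:
      containers:
      - name: products-server
        image: example.com/products:1.0   # hypothetical image that listens on port 50000
        ports:
        - containerPort: 50000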

The manifest for the my-discounted-products Service might look like this:

apiVersion: v1
kind: Service
metadata:
  name: my-discounted-products
spec:
  type: NodePort
  selector:
    app: discounted-products
    department: sales
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080

In the Service manifest, the selector field specifies that any Pod with both the app: discounted-products label and the department: sales label is a member of this Service.

When a request comes to the Service on port 80, it is routed to one of the member Pods on TCP port 8080.

Each member Pod must have a container listening on TCP port 8080.

Default backend

You can specify a default backend by providing a backend field in your Ingress manifest. Any requests that don't match the paths in the rules field are sent to the Service and port specified in the backend field. For example, in the following Ingress, any requests that don't match / or /discounted are sent to a Service named my-products on port 60001.

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-ingress
spec:
  backend:
    serviceName: my-products
    servicePort: 60001
  rules:
  - http:
      paths:
      - path: /
        backend:
          serviceName: my-products
          servicePort: 60000
      - path: /discounted
        backend:
          serviceName: my-discounted-products
          servicePort: 80

If you don't specify a default backend, GKE provides a default backend that returns HTTP 404 responses.

Ingress to Compute Engine resource mappings

The GKE Ingress controller deploys and manages Compute Engine load balancer resources based on the Ingress resources that are deployed in the cluster. The mapping of Compute Engine resources depends on the structure of the Ingress resource. Awareness of these resource mappings helps you with planning, design, and troubleshooting.

The my-ingress manifest shown in the Multiple backend services section specifies an external Ingress resource with two URL path matches that reference two different Kubernetes Services. Here are some of the Compute Engine resources created on behalf of my-ingress:

  • A public VIP corresponding to a forwardingRule.
  • Compute Engine firewall rules that permit traffic for health checks and application traffic.
  • A target HTTP proxy. If TLS is configured, an additional target HTTPS proxy is created.
  • A URL map with a single hostRule and path rules for /* and /discounted, each pointing to its corresponding backendService.
  • NEGs that hold the list of Pod IPs from each Service as endpoints. These are created for the my-discounted-products and my-products Services.

The following diagram provides an overview of the Ingress to Compute Engine resource mappings.

Ingress to Compute Engine resource mapping diagram

Options for providing SSL certificates

There are three ways to provide SSL certificates to an HTTPS load balancer:

Google-managed certificates
Google-managed SSL certificates are provisioned, deployed, renewed, and managed for your domains. Managed certificates do not support wildcard domains.
Self-managed certificates shared with Google Cloud
You can provision your own SSL certificate and create a certificate resource in your Google Cloud project. You can then list the certificate resource in an annotation on an Ingress to create an HTTP(S) load balancer that uses the certificate (a sketch follows this list). Refer to the instructions for pre-shared certificates for more information.
Self-managed certificates as Secret resources
You can provision your own SSL certificate and create a Secret to hold it. You can then refer to the Secret in an Ingress specification to create an HTTP(S) load balancer that uses the certificate. Refer to the instructions for using certificates in Secrets for more information.
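
As a sketch of the pre-shared certificate option, the annotation below names an SSL certificate resource that must already exist in your Google Cloud project; the Ingress and certificate names are hypothetical:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-psc-ingress                                 # hypothetical name
  annotations:
    # Must match the name of an existing SSL certificate
    # resource in the Google Cloud project.
    ingress.gcp.kubernetes.io/pre-shared-cert: "my-cert"
spec:
  backend:
    serviceName: my-products
    servicePort: 60000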

Health checks

A Service exposed through an Ingress must respond to health checks from the load balancer. Any container that is the final destination of load-balanced traffic must do one of the following to indicate that it is healthy:

  • Serve a response with an HTTP 200 status to GET requests on the / path.

  • Configure an HTTP readiness probe. Serve a response with an HTTP 200 status to GET requests on the path specified by the readiness probe. The Service exposed through an Ingress must point to the same container port on which the readiness probe is enabled.

    For example, suppose a container specifies this readiness probe:

    ...
    readinessProbe:
      httpGet:
        path: /healthy
        port: 8080    # required field; illustrative value, must match the container port the Service targets

    Then if the handler for the container's /healthy path returns an HTTP 200 status, the load balancer considers the container to be alive and healthy.

Using multiple TLS certificates

Suppose you want an HTTP(S) load balancer to serve content from two hostnames: your-store.example and your-experimental-store.example. Also, you want the load balancer to use one certificate for your-store.example and a different certificate for your-experimental-store.example.

You can do this by specifying multiple certificates in an Ingress manifest. The load balancer chooses a certificate if the Common Name (CN) in the certificate matches the hostname used in the request. For detailed information on how to configure multiple certificates, see Using multiple SSL certificates in HTTP(S) Load Balancing with Ingress.
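
For instance, building on the host-based routing shown earlier, an Ingress could reference one Secret per certificate. This is a sketch: the Ingress name, Secret names, and my-experimental-products Service are hypothetical, and each Secret is assumed to hold a certificate whose CN matches its hostname:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-multi-cert-ingress                   # hypothetical name
spec:
  tls:
  - secretName: your-store-tls                  # hypothetical Secrets, each holding
  - secretName: your-experimental-store-tls     # one certificate and its private key
  rules:
  - host: your-store.example
    http:
      paths:
      - backend:
          serviceName: my-products
          servicePort: 60000
  - host: your-experimental-store.example
    http:
      paths:
      - backend:
          serviceName: my-experimental-products # hypothetical Service
          servicePort: 80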

Kubernetes Service compared to Google Cloud backend service

A Kubernetes Service and a Google Cloud backend service are different things. There is a strong relationship between the two, but the relationship is not necessarily one to one. The GKE ingress controller creates a Google Cloud backend service for each (serviceName, servicePort) pair in an Ingress manifest. So it is possible for one Kubernetes Service object to be related to several Google Cloud backend services.
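
For example, in the hypothetical Ingress below, the single my-products Service is referenced on two different ports, so the controller creates two distinct Google Cloud backend services:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-two-port-ingress         # hypothetical name
spec:
  rules:
  - http:
      paths:
      - path: /api/*
        backend:
          serviceName: my-products
          servicePort: 60000        # pair (my-products, 60000)
      - path: /admin/*
        backend:
          serviceName: my-products
          servicePort: 60001        # pair (my-products, 60001)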

Limitations

  • The total length of the namespace and name of an Ingress must not exceed 40 characters. Exceeding this limit can cause the GKE Ingress controller to behave unexpectedly. For more information, see this issue on GitHub.

  • The maximum number of rules for a URL map is 50. This means that you can specify a maximum of 50 rules in an Ingress.

  • If you are not using NEGs with the GKE Ingress controller, GKE clusters are limited to 1,000 nodes. When Services are deployed with NEGs, there is no GKE node limit. Non-NEG Services exposed through an Ingress do not function correctly on clusters with more than 1,000 nodes.

  • For the GKE Ingress controller to use your readinessProbes as health checks, the Pods for an Ingress must exist at the time of Ingress creation. If your replicas are scaled to 0, the default health check applies. For more information, see this issue comment.

  • Changes to a Pod's readinessProbe do not affect the Ingress after the Ingress is created.

  • The HTTPS load balancer terminates TLS in locations that are distributed globally, to minimize latency between clients and the load balancer. If you require geographic control over where TLS is terminated, you should use a custom ingress controller and GCP Network Load Balancing instead, and terminate TLS on backends that are located in regions appropriate to your needs.
