Ingress for Anthos

Ingress for Anthos (Ingress) is a cloud-hosted multi-cluster Ingress controller for Anthos GKE clusters. It's a Google-hosted service that supports deploying shared load balancing resources across clusters and across regions. To deploy Ingress for Anthos across multiple clusters, complete Setting up Ingress for Anthos, then see Deploying Ingress across multiple clusters.

Multi-cluster networking

Many factors drive multi-cluster topologies, including close user proximity for apps, cluster and regional high availability, security and organizational separation, cluster migration, and data locality. These use cases are rarely isolated. As the reasons for multiple clusters grow, the need for a formal and productized multi-cluster platform becomes more urgent.

Ingress for Anthos is designed to meet the load balancing needs of multi-cluster, multi-regional environments. It's a controller for the external HTTP(S) load balancer that provides ingress for internet traffic across one or more clusters.

The multi-cluster support in Ingress for Anthos satisfies many use cases, including:

  • A single, consistent virtual IP (VIP) for an app, independent of where the app is deployed globally.
  • Multi-regional, multi-cluster availability through health checking and traffic failover.
  • Proximity-based routing through public Anycast VIPs for low client latency.
  • Transparent cluster migration for upgrades or cluster rebuilds.

How Ingress for Anthos works

Ingress for Anthos builds on the architecture of external HTTP(S) Load Balancing. HTTP(S) Load Balancing is a globally distributed load balancer with proxies deployed at more than 100 Google points of presence (PoPs) around the world. These proxies sit at the edge of Google's network, positioned close to clients. Load balancer VIPs are advertised as Anycast IPs. Client requests are routed to Google PoPs in a cold potato fashion, meaning that internet traffic enters the closest PoP and reaches the Google backbone as quickly as possible.

Terminating HTTP and HTTPS connections at the edge allows the Google load balancer to decide where to route traffic by determining backend availability before traffic enters a data center or region. This gives traffic the most efficient path from the client to the backend while considering the backends' health and capacity.

Ingress for Anthos is an Ingress controller that programs the external HTTP(S) load balancer using network endpoint groups (NEGs). When you create a MultiClusterIngress resource, GKE deploys Compute Engine load balancer resources and configures the appropriate Pods across clusters as backends. The NEGs are used to track Pod endpoints dynamically so the Google load balancer has the right set of healthy backends.

Ingress for Anthos traffic flow

As you deploy applications across clusters in GKE, Ingress for Anthos ensures that the load balancer stays in sync with events that occur in the clusters:

  • A Deployment is created with the right matching labels.
  • A Pod's process dies and fails its health check.
  • A cluster is removed from the pool of backends.

Ingress for Anthos updates the load balancer, keeping it consistent with the environment and desired state of Kubernetes resources.
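
As a minimal sketch of the first event, the following Deployment carries the app: foo labels that the MultiClusterService examples later in this document select. The image name is illustrative, and the readiness probe stands in for the health check that a Pod with a dead process would fail:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
  namespace: blue
spec:
  replicas: 2
  selector:
    matchLabels:
      app: foo
  template:
    metadata:
      labels:
        app: foo # matches the MultiClusterService selector shown later
    spec:
      containers:
      - name: web
        image: gcr.io/my-project/foo:1.0 # illustrative image name
        ports:
        - containerPort: 80
        readinessProbe: # a dying process fails this check and is removed from the NEG
          httpGet:
            path: /
            port: 80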

Ingress for Anthos architecture

Ingress for Anthos uses a centralized Kubernetes API server to deploy Ingress across multiple clusters. This centralized API server is called the config cluster. Any GKE cluster can act as the config cluster. The config cluster uses two custom resource types: MultiClusterIngress and MultiClusterService. By deploying these resources on the config cluster, the Anthos Ingress Controller deploys load balancers across multiple clusters.

The following concepts and components make up Ingress for Anthos:

  • Anthos Ingress controller - This is a globally distributed control plane that runs as a service outside of your clusters. This allows the lifecycle and operations of the controller to be independent of GKE clusters.

  • Config cluster - This is a chosen GKE cluster, running on Google Cloud, where the MultiClusterIngress and MultiClusterService resources are deployed. It's a centralized point of control for these multi-cluster resources, which exist in and are accessible from a single logical API to retain consistency across all clusters. The Ingress controller watches the config cluster and reconciles the load balancing infrastructure.

  • Environ - An environ is a domain that groups clusters and infrastructure, manages resources, and keeps a consistent policy across them. Ingress uses environs to determine how Ingress is applied across different clusters. Clusters that you register to an environ become visible to Ingress, so they can be used as backends for Ingress.

  • Member cluster - Clusters registered to an environ are called member clusters. Member clusters in the environ comprise the full scope of backends that Ingress is aware of. The Google Kubernetes Engine cluster management view provides a secure console to view the state of all your registered clusters.

A diagram of the Ingress for Anthos architecture

Deployment workflow

The following steps illustrate a high-level workflow for using Ingress for Anthos across multiple clusters.

  1. Register GKE clusters as member clusters.

  2. Configure a GKE cluster as the central config cluster. This cluster can be a dedicated control plane, or it can run other workloads.

  3. Deploy applications to the GKE clusters where they need to run.

  4. Deploy one or more MultiClusterService resources in the config cluster with label and cluster matches to select the clusters, Namespace, and Pods that are considered backends for a given Service. This creates NEGs in Compute Engine, which begin to register and manage Service endpoints.

  5. Deploy a MultiClusterIngress resource in the config cluster that references one or more MultiClusterService resources as backends for the load balancer. This deploys the Compute Engine external load balancer resources and exposes the endpoints across clusters through a single load balancer VIP.

Ingress concepts

Ingress for Anthos uses a centralized Kubernetes API server to deploy Ingress across multiple clusters. The following sections describe the Ingress for Anthos resource model, how to deploy Ingress, and concepts important for managing this highly available network control plane.

MultiClusterService resources

A MultiClusterService (MCS) is a custom resource used by Ingress for Anthos that logically represents a Service across multiple clusters. An MCS is similar to, but substantially different from, the core Service type. An MCS exists only in the config cluster and generates derived Services in the target clusters. An MCS does not route anything the way a ClusterIP, LoadBalancer, or NodePort Service does. It simply allows a MultiClusterIngress (MCI) to refer to a singular, distributed resource. The following is a simple MCS for the foo application:

apiVersion: networking.gke.io/v1beta1
kind: MultiClusterService
metadata:
  name: foo
  namespace: blue
spec:
  template:
    spec:
      selector:
        app: foo
      ports:
      - name: web
        protocol: TCP
        port: 80
        targetPort: 80

Like a Service, an MCS is a selector for Pods, but it is also capable of selecting labels and clusters. The pool of clusters that it selects across is called the member clusters: all the clusters registered to the environ. This MCS deploys a derived Service with the selector app: foo in every member cluster. If app: foo Pods exist in a cluster, those Pod IPs are added as backends for the MCI.

The following mci-foo-svc-j726y6p1lilewtu7 is a derived Service that the MCS generated in one of the target clusters. This Service creates a NEG, which tracks Pod endpoints for all Pods that match the specified label selector in this cluster. A derived Service and NEG exist in every target cluster for every MCS (unless cluster selectors are used). If no matching Pods exist in a target cluster, the Service and NEG are empty. The derived Services are fully managed by the MCS and are not managed by users directly.

apiVersion: v1
kind: Service
metadata:
  annotations:
    cloud.google.com/neg: '{"exposed_ports":{"80":{}}}'
    cloud.google.com/neg-status: '{"network_endpoint_groups":{"80":"k8s1-a6b112b6-blue-mci-foo-svc-j726y6p1lilewt-80-e86163b5"},"zones":["us-central1-a"]}'
    networking.gke.io/multiclusterservice-parent: '{"Namespace":"blue","Name":"foo"}'
  name: mci-foo-svc-j726y6p1lilewtu7
  namespace: blue
spec:
  clusterIP: None
  selector:
    app: foo
  ports:
  - name: web
    protocol: TCP
    port: 80
    targetPort: 80

A few notes about the derived Service:

  • It functions as a logical grouping of endpoints that serve as backends for Ingress for Anthos.
  • It manages the lifecycle of the NEG for a given cluster and application.
  • It's created as a headless Service. Note that only the Selector and Ports fields are carried over from the MCS spec to the derived Service spec.
  • The Anthos Ingress controller manages its lifecycle.

MultiClusterIngress resource

A MultiClusterIngress (MCI) resource behaves like the core Ingress resource in many ways. Both have the same specification for defining hosts, paths, protocol termination, and backends. The following simple example of an MCI resource routes traffic to the foo and bar backends depending on the HTTP host header.

apiVersion: networking.gke.io/v1beta1
kind: MultiClusterIngress
metadata:
  name: foobar-ingress
  namespace: blue
spec:
  template:
    spec:
      backend:
        serviceName: default-backend
        servicePort: 80
      rules:
      - host: foo.example.com
        http:
          paths:
          - backend:
              serviceName: foo
              servicePort: 80
      - host: bar.example.com
        http:
          paths:
          - backend:
              serviceName: bar
              servicePort: 80

This MCI matches traffic to the VIP on foo.example.com and bar.example.com and sends it to the MultiClusterService (MCS) resources named foo and bar. The MCI also has a default backend that matches all other traffic and sends that traffic to the default-backend MCS.

A diagram of host header matching

Ingress resources across clusters

The config cluster is the only cluster that can have MultiClusterIngress and MultiClusterService resources. Each target cluster that has Pods matching the MCS label selector also has a corresponding derived Service scheduled on it. If a cluster is explicitly not selected by an MCS, a corresponding derived Service is not created in that cluster.

A diagram of the Ingress resource model

Namespace sameness

Registered GKE clusters become members of an environ.

Environs possess a characteristic known as namespace sameness, which assumes that resources with identical names and namespaces across clusters are instances of the same resource. In effect, this means that Pods in the ns1 namespace with the label app: foo across different clusters are all considered part of the same pool of application backends from the perspective of Ingress for Anthos.

This has ramifications for how different development teams operate across a group of clusters. Different teams can still use the same fleet of clusters by using Namespaces to segment workloads, even across cluster boundaries; however, it is important for each team to reserve its respective Namespaces, explicitly or implicitly, on all clusters in the environ.

In the following example, a blue team has access to deploy in the blue Namespace across all clusters in the fleet. blue is considered to be the same Namespace in each cluster, and the blue Namespace also exists in the Ingress for Anthos config cluster. MultiClusterService resources that the blue team deploys there can only select Pods that also exist in the blue Namespace in the other clusters. MCI and MCS resources have no visibility or access across Namespaces in different clusters, so the significance and "sameness" of resources is preserved across clusters.

A diagram demonstrating namespace sameness
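
As a minimal sketch of this model, the blue team would apply an identical Namespace manifest to the config cluster and to every member cluster where its workloads run (the team label here is illustrative):

apiVersion: v1
kind: Namespace
metadata:
  name: blue
  labels:
    team: blue # illustrative label; any out-of-band ownership convention works

Because the name is identical in every cluster, Ingress for Anthos treats these Namespaces as one logical Namespace, and MCS resources in blue can select Pods in blue across all member clusters.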

There are design ramifications to Namespace sameness. The following principles will help users be successful:

  • Namespaces for different purposes should not have the same name across clusters.
  • Namespaces should be reserved for teams and clusters within an environ, either explicitly (by allocating them) or implicitly (through out-of-band policies).
  • Namespaces for the same purpose across clusters should share the same name.
  • User permission to Namespaces across clusters should be tightly controlled to prevent unauthorized access.
  • The default Namespace or generic Namespaces like "prod" or "dev" should not be used for normal application deployment. It is too easy for users to deploy resources to the default Namespace accidentally and violate the segmentation principles of Namespaces.
  • The same Namespace should be created across clusters wherever a given team or group of users must deploy resources.

Config cluster design

The Ingress for Anthos config cluster is a single GKE cluster that hosts the MCI and MCS resources and acts as the single point of control for Ingress across the fleet of target GKE clusters. You choose the config cluster when you enable Ingress for Anthos. You can choose any GKE cluster as the config cluster and change the config cluster at any time.

Config cluster availability

Because the config cluster is a single point of control, Ingress for Anthos resources cannot be created or updated if the config cluster API is unavailable. Load balancers and the traffic they serve are not affected by a config cluster outage, but changes to MCI and MCS resources are not reconciled by the controller until the config cluster is available again.

Here are some design principles for how to use and manage the config cluster:

  • The config cluster should be chosen such that it is highly available. Regional clusters are preferred over zonal clusters.
  • The config cluster does not have to be dedicated to Ingress for Anthos. The config cluster may host administrative or even application workloads, though care should be taken to ensure that hosted applications do not impact the availability of the config cluster API server. The config cluster can be a target cluster that hosts backends for MCIs, though if extra precautions are needed, the config cluster can also be excluded as an MCI backend through cluster selection.
  • Config clusters should have all the Namespaces that are used by target cluster backends. An MCS can only reference Pods in the same Namespace across clusters, so that Namespace must be present in the config cluster.
  • Users who deploy Ingress across multiple clusters must have access to the config cluster to deploy MCI and MCS resources. However, users should only have access to Namespaces they have permission to use.

Selecting and migrating the config cluster

You must choose the config cluster when you enable Ingress for Anthos. Any member cluster of an environ can be selected as the config cluster. You can update the config cluster at any time, but you must take care to avoid disruption. The Anthos Ingress controller reconciles whatever MCI and MCS resources exist in the config cluster. When migrating the config cluster from the current cluster to a new one, the MCI and MCS resources in the two clusters must be identical. If the resources are not identical, the Compute Engine load balancers may be updated or destroyed after the config cluster update.

In the following diagram, a centralized CI/CD system applies MCI and MCS resources to the GKE API servers of the config cluster (gke-us) and a backup cluster (gke-eu) at all times, so that the resources are identical across the two clusters. If you need to change the config cluster, whether for an emergency or for planned downtime, the config cluster can be updated without any impact because the MCI and MCS resources are identical.

A centralized CI/CD system applying MCI and MCS resources

Cluster selection

MCS resources can select clusters explicitly. By default, an MCS schedules derived Services on every target cluster. Cluster selection defines an explicit list of clusters for a given MCS where derived Services should be scheduled; all other target clusters are ignored. See Setting up Ingress for Anthos to configure cluster selection.

There are many use cases in which you may want to apply Ingress rules to specific clusters:

  • Isolating the config cluster to prevent MCSs from selecting across it.
  • Controlling traffic between clusters in a blue-green fashion for app migration.
  • Routing to application backends that only exist in a subset of clusters.
  • Using a single L7 VIP for host/path routing to backends that live on different clusters.

Cluster selection is done via the clusters field in the MCS. Clusters are explicitly referenced by <region | zone>/<name>. Member clusters within the same environ and region should have unique names so that there are no naming collisions.

In the following example, the foo MCS has a clusters field that references europe-west1-c/gke-eu and asia-northeast1-a/gke-asia-1. As a result, Pods with matching labels in the gke-eu and gke-asia-1 clusters can be included as backends for a given MCI. Any other member cluster is excluded from Ingress, even if it has Pods with the app: foo label. This can be useful for onboarding or migrating to new clusters and for controlling traffic independently of Pod deployment.

apiVersion: networking.gke.io/v1beta1
kind: MultiClusterService
metadata:
  name: foo
  namespace: blue
spec:
  template:
    spec:
      selector:
        app: foo
      ports:
      - name: web
        protocol: TCP
        port: 80
        targetPort: 80
  clusters:
  - link: "europe-west1-c/gke-eu"
  - link: "asia-northeast1-a/gke-asia-1"

The following diagram shows three clusters: gke-eu, gke-asia-1, and gke-asia-2. gke-asia-2 is excluded as a backend even though Pods with matching labels are already placed there, which allows the cluster to be taken out of traffic rotation for maintenance or other operations. Note that if the clusters field is omitted from an MCS, it implicitly selects all member clusters.

A diagram that shows an excluded backend

Pricing and trials

Ingress for Anthos requires Anthos on GCP licensing for all clusters participating as backends for Ingress. Licensing can be purchased on an hourly, per-vCPU basis or as a longer-term subscription. Charges for Ingress for Anthos take effect on August 1, 2020. Until then, Ingress for Anthos is offered free of charge. Standard Compute Engine load balancer pricing applies to ingress traffic and forwarding rules created through Ingress.

After August 1, 2020, Ingress for Anthos can still be trialed, free of charge, for up to 30 days and up to 100 vCPUs across any number of clusters. To begin or end the trial, enable or disable the Anthos API in a Google Cloud project, which provides access to Anthos on Google Cloud products. If you need a longer trial period, contact your account team for extended terms.

What's next