Traffic Director GKE service mesh overview

This document is for Google Kubernetes Engine users who want to deploy a Traffic Director service mesh using the Kubernetes Gateway API.

You can configure Traffic Director for GKE using the Kubernetes Gateway APIs, enabling service-to-service communications, traffic management, global load balancing, and security policy enforcement for service mesh use cases.

Kubernetes APIs and Google Cloud APIs

You can configure a Traffic Director service mesh using two different APIs:

  • The Kubernetes Gateway API
  • The Google Cloud service routing API

This document and the associated setup guides provide instructions for using the Kubernetes Gateway API to configure a Traffic Director service mesh.

We recommend that you use the Kubernetes Gateway APIs on Google Kubernetes Engine. We recommend against using both APIs to configure routing in the same service mesh on GKE.

The service routing API uses the same resource names as the Kubernetes Gateway APIs, which makes it easier to move between the two APIs. The Kubernetes resources that you configure are functionally equivalent to the Google Cloud resources represented by the service routing API for Traffic Director.

The following sections describe the resources and architecture used by the Traffic Director integration with the Kubernetes Gateway APIs.

Gateway API

The Gateway API is a collection of resources that model service networking in Kubernetes. The Kubernetes Gateway API is an open source project that focuses on supporting ingress and load balancer use cases by providing a generic routing API. The generic routing API has many implementations. Traffic Director custom resource definitions (CRDs) are added as an extension to the open source Gateway API. The CRDs support service mesh use cases and use the same generic routing API that is introduced by the Gateway API.

The Gateway API is hierarchically organized, with a Gateway parent resource and associated GatewayClass, to which you attach routes. GKE includes a TDMesh resource that is a peer of the Gateway resource. You can attach the same Route types to the TDMesh resource. The TDMesh resource is where you attach routes and policies for service meshes.

Gateway API, Gateway resource, Mesh resource, and Routes

Fleet

A fleet consists of one or more GKE clusters that are grouped logically. A fleet lets you manage capabilities and apply policies consistently across multiple clusters. When you use a fleet, you can manage a Traffic Director service mesh that spans multiple clusters.

Architecture

Traffic Director supports the Gateway API on GKE by programming your clusters' data planes to implement the networking behaviors specified in Gateway API resources. Traffic Director itself is a Google-managed control plane that does not process any data plane traffic. Data plane traffic is processed either by Envoy proxies running as sidecars to your workloads or by proxyless gRPC clients. Traffic Director configures both Envoy proxies and proxyless gRPC clients through the xDSv3 API.

Traffic Director offers a managed, globally available control plane solution that is more robust and scalable than running in-cluster controllers. Because it's a global solution, Traffic Director can load balance traffic across workloads distributed across multiple GKE clusters. In the following illustration, Traffic Director manages traffic to services in three clusters that are in a single fleet, using Gateway API resources.

A Traffic Director multi-cluster service mesh configured using the Gateway API

You designate one cluster in your fleet as the config cluster. The config cluster is where Gateway API resources are stored. Traffic Director watches only the resources in the config cluster and ignores resources in the other clusters in the fleet. For more detailed information about the config cluster, see Config cluster design in the GKE documentation.
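The config cluster rule can be pictured as a simple filter over the fleet's resources. The following is a minimal Python sketch for illustration only, not how Traffic Director is implemented; the `Resource` type, the `watched_resources` function, and the cluster and resource names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Resource:
    """A Gateway API resource and the cluster it is stored in (hypothetical model)."""
    cluster: str
    kind: str
    name: str


def watched_resources(resources: List[Resource], config_cluster: str) -> List[Resource]:
    """Keep only the resources stored in the config cluster.

    Resources in any other cluster of the fleet are ignored.
    """
    return [r for r in resources if r.cluster == config_cluster]


# Hypothetical fleet contents: cluster-a is designated as the config cluster.
fleet_resources = [
    Resource("cluster-a", "HTTPRoute", "payments-route"),
    Resource("cluster-b", "HTTPRoute", "stray-route"),
    Resource("cluster-a", "TDMesh", "td-mesh"),
]

watched = watched_resources(fleet_resources, config_cluster="cluster-a")
for resource in watched:
    print(resource.kind, resource.name)
```

In this sketch, the HTTPRoute stored in `cluster-b` has no effect on the mesh because it is outside the config cluster.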

With GKE multi-cluster Services, Gateway API resources in the config cluster can reference Kubernetes Services in any cluster within a fleet. For more information about multi-cluster service discovery, see Multi-cluster Services.

Resources

Traffic Director supports both Envoy proxies and proxyless gRPC clients in the data plane of a service mesh. Each client receives configuration from Traffic Director for a particular service mesh by specifying the name of the TDMesh resource and the corresponding project number in its bootstrap configuration. The setup guides for Traffic Director with the Kubernetes Gateway APIs provide demonstration data plane configurations with Envoy and proxyless gRPC.
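As a rough illustration of how a mesh name and project number can identify a mesh in a client's bootstrap, the following Python sketch builds an xDS node ID. It assumes the `projects/PROJECT_NUMBER/networks/mesh:MESH_NAME/nodes/ID` convention used in Traffic Director's service routing bootstrap. Treat that format, the `xds_node_id` helper, and the sample values as assumptions, and rely on the setup guides for the exact bootstrap contents, including how the mesh name is derived from the TDMesh resource.

```python
import uuid
from typing import Optional


def xds_node_id(project_number: str, mesh_name: str, node_uuid: Optional[str] = None) -> str:
    """Build an xDS node ID that names the mesh a client subscribes to.

    Assumption: the ID follows the pattern
    projects/<project number>/networks/mesh:<mesh name>/nodes/<unique id>.
    """
    if not project_number.isdigit():
        raise ValueError("project_number must be the numeric project number")
    if not mesh_name:
        raise ValueError("mesh_name must not be empty")
    unique_id = node_uuid or str(uuid.uuid4())
    return f"projects/{project_number}/networks/mesh:{mesh_name}/nodes/{unique_id}"


# Hypothetical values for illustration only.
node_id = xds_node_id("123456789012", "td-mesh", node_uuid="node-1")
print(node_id)
```

Validating that the project value is numeric guards against the common mistake of passing a project ID where the project number is expected.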

TDMesh resource

The TDMesh resource is a Traffic Director custom resource. It's an extension to the open source Gateway APIs to support Traffic Director's service mesh use cases. Using the TDMesh resource, you create a service mesh instance in your fleet. Routes attached to the TDMesh resource specify service-to-service routing behaviors in the service mesh.

Route resources

A subset of the Gateway API Route resources can be attached to a TDMesh resource to specify service-level routing within the service mesh. Traffic Director supports the following Route resources:

  • HTTPRoute
  • TCPRoute
  • TDGRPCRoute (Traffic Director custom resource)

For example, you can create an HTTPRoute to specify that HTTP requests destined for the host payments.svc.internal are routed to the Kubernetes Service service-payments. When you attach the HTTPRoute resource to a TDMesh resource that data plane instances are subscribed to, HTTP requests sent by workloads within the mesh are routed accordingly.
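The payments example could be expressed as a manifest along the following lines. This Python sketch builds the manifest as a dictionary and prints it as JSON. The `hostnames`, `parentRefs`, `rules`, and `backendRefs` fields come from the open source Gateway API HTTPRoute specification. The `apiVersion`, the `net.gke.io` group used to reference the TDMesh resource, the route name, the mesh name, the namespace, and the port are assumptions made for illustration; check them against the setup guides and the CRD versions installed in your cluster.

```python
import json


def build_http_route(name, hostname, service, port, mesh_name, namespace="default"):
    """Return an HTTPRoute manifest (as a dict) attached to a TDMesh resource."""
    return {
        # Assumption: must match the Gateway API version installed in the cluster.
        "apiVersion": "gateway.networking.k8s.io/v1beta1",
        "kind": "HTTPRoute",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Assumption: the TDMesh parent is referenced by group and kind.
            "parentRefs": [
                {
                    "group": "net.gke.io",
                    "kind": "TDMesh",
                    "name": mesh_name,
                    "namespace": namespace,
                }
            ],
            # Requests for this host are matched by the route.
            "hostnames": [hostname],
            # Matched requests are sent to the named Kubernetes Service.
            "rules": [{"backendRefs": [{"name": service, "port": port}]}],
        },
    }


route = build_http_route(
    name="payments-route",             # hypothetical route name
    hostname="payments.svc.internal",  # host from the example above
    service="service-payments",        # Service from the example above
    port=8080,                         # hypothetical Service port
    mesh_name="td-mesh",               # hypothetical TDMesh name
)
print(json.dumps(route, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output can be piped to `kubectl apply -f -` against the config cluster once the assumed values are replaced with real ones.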

This release augments the generic Route resources in the Gateway API with a new route type, TDGRPCRoute. The new route type provides a first-class experience for routing gRPC requests, by matching on native gRPC primitives, such as method and service definitions.
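To see what matching on gRPC primitives means in practice, recall that a gRPC request is carried over HTTP/2 with a path of the form `/package.Service/Method`. The following Python sketch parses that path and checks it against a service and an optional method. It illustrates the matching idea only and is not the TDGRPCRoute implementation or its schema; the function names and the sample service are hypothetical.

```python
from typing import Optional, Tuple


def parse_grpc_path(path: str) -> Tuple[str, str]:
    """Split a gRPC request path '/package.Service/Method' into (service, method)."""
    parts = path.split("/")
    # A valid path yields exactly ['', 'package.Service', 'Method'].
    if len(parts) != 3 or parts[0] != "" or not parts[1] or not parts[2]:
        raise ValueError(f"not a gRPC request path: {path!r}")
    return parts[1], parts[2]


def matches(path: str, service: str, method: Optional[str] = None) -> bool:
    """Return True if the request targets the service and, if given, the method."""
    request_service, request_method = parse_grpc_path(path)
    if request_service != service:
        return False
    # With no method specified, every method of the service matches.
    return method is None or request_method == method


# Hypothetical service and method names.
print(matches("/payments.v1.Payments/Charge", "payments.v1.Payments", "Charge"))
print(matches("/payments.v1.Payments/Refund", "payments.v1.Payments", "Charge"))
```

Expressing the match as a service and method pair, rather than as a raw HTTP path pattern, is what lets a gRPC-specific route type stay close to the names in the service definition.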

Using the Gateway API resources in GKE to configure Traffic Director

Limitations

  • Traffic Director configures the following default behaviors for all Kubernetes services in the service mesh. You cannot change these behaviors.
    • TCP health checks are configured on service ports referenced by any Gateway API Route resources.
    • A default 30-second timeout is configured for all incoming requests to services.
    • Session affinity is disabled.
  • The Envoy auto injector only supports one mesh per fleet.
  • Traffic Director's security features cannot be enabled using the Gateway API.
  • You must configure the TDMesh and Route resources on GKE using only the Gateway API. You cannot use the Google Cloud console, the gcloud CLI, or the REST APIs.
  • All clusters must be in one project. A service mesh that spans clusters in multiple projects is not supported.
  • You cannot configure or view a GKE service mesh using the Google Cloud console.
  • Control plane observability with Cloud Logging and Cloud Monitoring is not supported.

What's next