Set up Google Kubernetes Engine Pods using automatic Envoy injection

Overview

In a service mesh, your application code doesn't need to know about your networking configuration. Instead, your applications communicate over a data plane, which is configured by a control plane that handles service networking. In this guide, Traffic Director is your control plane and the Envoy sidecar proxies are your data plane.

The Envoy sidecar injector makes it easy to add Envoy sidecar proxies to your Google Kubernetes Engine Pods. When the Envoy sidecar injector adds a proxy, it also sets that proxy up to handle application traffic and connect to Traffic Director for configuration.

The guide walks you through a simple setup of Traffic Director with Google Kubernetes Engine. These steps provide the foundation that you can extend to advanced use cases, such as a service mesh that extends across multiple Google Kubernetes Engine clusters and, potentially, Compute Engine VMs. You can also use these instructions if you are configuring Traffic Director with Shared VPC.

The setup process involves:

  1. Creating a GKE cluster for your workloads.
  2. Installing the Envoy sidecar injector and enabling injection.
  3. Deploying a sample client and verifying injection.
  4. Deploying a Kubernetes service for testing.
  5. Configuring Traffic Director with Cloud Load Balancing components to route traffic to the test service.
  6. Verifying the configuration by sending a request from the sample client to the test service.
Overview of components deployed as part of this setup guide.

Prerequisites

Before you follow the instructions in this guide, review Preparing for Traffic Director setup and make sure that you have completed the prerequisite tasks described in that document.

For information about the Envoy version that is supported, see the Traffic Director release notes.

Additional prerequisites with Shared VPC

If you are setting up Traffic Director in a Shared VPC environment, ensure the following:

  • You have the correct permissions and roles for Shared VPC.
  • You have set up the correct projects and billing.
  • You have enabled billing in the projects.
  • You have enabled the Traffic Director and GKE APIs in each project, including the host project.
  • You have set up the correct service accounts for each project.
  • You have created a VPC network and subnets.
  • You have enabled Shared VPC.

For more information, see Shared VPC.
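The commands in the next section reference environment variables for the host project, the service projects, the regions, and the per-project service accounts. The following is a minimal sketch of how you might define them; the project IDs, project numbers, and regions are placeholder values to replace with your own, and the email formats shown are the standard Google-managed formats for the Compute Engine default service account, the Google APIs service account, and the GKE service account:

# Host and service projects (replace with your own project IDs and numbers).
export HOST_PROJECT=host-project-id
export SVC_PROJECT_1=service-project-1-id
export SVC_PROJECT_2=service-project-2-id
export SVC_PROJECT_1_NUMBER=111111111111
export SVC_PROJECT_2_NUMBER=222222222222

# Regions that contain subnet-1 and subnet-2.
export REGION_1=us-central1
export REGION_2=us-east1

# Google-managed service accounts in each service project.
export SVC_PROJECT_1_COMPUTE_SA=${SVC_PROJECT_1_NUMBER}-compute@developer.gserviceaccount.com
export SVC_PROJECT_1_API_SA=${SVC_PROJECT_1_NUMBER}@cloudservices.gserviceaccount.com
export SVC_PROJECT_1_GKE_SA=service-${SVC_PROJECT_1_NUMBER}@container-engine-robot.iam.gserviceaccount.com

export SVC_PROJECT_2_COMPUTE_SA=${SVC_PROJECT_2_NUMBER}-compute@developer.gserviceaccount.com
export SVC_PROJECT_2_API_SA=${SVC_PROJECT_2_NUMBER}@cloudservices.gserviceaccount.com
export SVC_PROJECT_2_GKE_SA=service-${SVC_PROJECT_2_NUMBER}@container-engine-robot.iam.gserviceaccount.com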

Configure IAM roles

This example of IAM role configuration assumes that the host project for Shared VPC has two subnets and there are two service projects in the Shared VPC.

  1. In Cloud Shell, create a working folder (WORKDIR) where you create the files associated with this section:

    mkdir -p ~/td-shared-vpc
    cd ~/td-shared-vpc
    export WORKDIR=$(pwd)
    
  2. Configure IAM permissions in the host project so that service projects can use the resources in the shared VPC.

    In this step, you configure the IAM permissions so that subnet-1 is accessible by service project 1 and subnet-2 is accessible by service project 2. You assign the Compute Network User IAM role (roles/compute.networkUser) to both the Compute Engine default service account and the Google Cloud API service account in each service project for each subnet.

    1. For service project 1, configure IAM permissions for subnet-1:

      export SUBNET_1_ETAG=$(gcloud beta compute networks subnets get-iam-policy subnet-1 --project ${HOST_PROJECT} --region ${REGION_1} --format=json | jq -r '.etag')
      
      cat > subnet-1-policy.yaml <<EOF
      bindings:
      - members:
        - serviceAccount:${SVC_PROJECT_1_API_SA}
        - serviceAccount:${SVC_PROJECT_1_GKE_SA}
        role: roles/compute.networkUser
      etag: ${SUBNET_1_ETAG}
      EOF
      
      gcloud beta compute networks subnets set-iam-policy subnet-1 \
      subnet-1-policy.yaml \
          --project ${HOST_PROJECT} \
          --region ${REGION_1}
      
    2. For service project 2, configure IAM permissions for subnet-2:

      export SUBNET_2_ETAG=$(gcloud beta compute networks subnets get-iam-policy subnet-2 --project ${HOST_PROJECT} --region ${REGION_2} --format=json | jq -r '.etag')
      
      cat > subnet-2-policy.yaml <<EOF
      bindings:
      - members:
        - serviceAccount:${SVC_PROJECT_2_API_SA}
        - serviceAccount:${SVC_PROJECT_2_GKE_SA}
        role: roles/compute.networkUser
      etag: ${SUBNET_2_ETAG}
      EOF
      
      gcloud beta compute networks subnets set-iam-policy subnet-2 \
      subnet-2-policy.yaml \
          --project ${HOST_PROJECT} \
          --region ${REGION_2}
      
  3. For each service project, grant the Kubernetes Engine Host Service Agent User IAM role (roles/container.hostServiceAgentUser) to the service project's GKE service account in the host project:

    gcloud projects add-iam-policy-binding ${HOST_PROJECT} \
        --member serviceAccount:${SVC_PROJECT_1_GKE_SA} \
        --role roles/container.hostServiceAgentUser
    
    gcloud projects add-iam-policy-binding ${HOST_PROJECT} \
        --member serviceAccount:${SVC_PROJECT_2_GKE_SA} \
        --role roles/container.hostServiceAgentUser
    

    This role lets the GKE service account of the service project use the GKE service account of the host project to configure shared network resources.

  4. For each service project, grant the Compute Engine default service account the Compute Network Viewer IAM role (roles/compute.networkViewer) in the host project:

    gcloud projects add-iam-policy-binding ${HOST_PROJECT} \
        --member serviceAccount:${SVC_PROJECT_1_COMPUTE_SA} \
        --role roles/compute.networkViewer
    
    gcloud projects add-iam-policy-binding ${HOST_PROJECT} \
        --member serviceAccount:${SVC_PROJECT_2_COMPUTE_SA} \
        --role roles/compute.networkViewer
    

    When the Envoy sidecar proxy connects to the xDS service (Traffic Director API), the proxy uses the service account of the Compute Engine virtual machine (VM) host or of the GKE node instance. The service account must have the compute.globalForwardingRules.get project-level IAM permission. The Compute Network Viewer role is sufficient for this step.
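    To confirm that the bindings are in place, you can inspect the host project's IAM policy and filter for a role. The following check is a sketch that uses standard gcloud filtering to list the members that hold the Compute Network Viewer role:

    gcloud projects get-iam-policy ${HOST_PROJECT} \
        --flatten="bindings[].members" \
        --filter="bindings.role:roles/compute.networkViewer" \
        --format="table(bindings.role, bindings.members)"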

Creating a GKE cluster for your workloads

GKE clusters must meet the following requirements to support Traffic Director:

  • Nodes must use OAuth scopes that allow access to the Traffic Director API; the https://www.googleapis.com/auth/cloud-platform scope used in the command below satisfies this requirement.
  • The cluster must be VPC-native (alias IP enabled), because the standalone network endpoint groups used later in this guide require VPC-native clusters; the --enable-ip-alias flag in the command below enables this.

Creating the GKE cluster

Create a GKE cluster called traffic-director-cluster in your preferred zone, for example, us-central1-a.

gcloud container clusters create traffic-director-cluster \
  --zone ZONE \
  --scopes=https://www.googleapis.com/auth/cloud-platform \
  --enable-ip-alias

Pointing kubectl to the newly created cluster

Change the current context for kubectl to the newly created cluster by issuing the following command:

gcloud container clusters get-credentials traffic-director-cluster \
    --zone ZONE

Installing the Envoy sidecar injector

The following sections provide instructions for installing the Envoy sidecar injector. When the sidecar injector is enabled, it automatically deploys sidecar proxies for both new and existing Google Kubernetes Engine workloads. Because the Envoy sidecar injector runs inside your GKE cluster, you must install it once in each cluster if you are using Traffic Director to support a multi-cluster service mesh.

Downloading the sidecar injector

Download and extract the Envoy sidecar injector.

wget https://storage.googleapis.com/traffic-director/td-sidecar-injector-xdsv3.tgz
tar -xzvf td-sidecar-injector-xdsv3.tgz
cd td-sidecar-injector-xdsv3

Configuring the sidecar injector

If you are using the older APIs, configure the sidecar injector by editing the specs/01-configmap.yaml file to:

  • Populate TRAFFICDIRECTOR_GCP_PROJECT_NUMBER by replacing YOUR_PROJECT_NUMBER_HERE with the project number of your project. The project number is a numeric identifier for your project. For information about obtaining a list of all your projects, see Identifying projects.
  • Populate TRAFFICDIRECTOR_NETWORK_NAME by replacing YOUR_NETWORK_NAME_HERE with the Google Cloud Virtual Private Cloud network name that you want to use with Traffic Director. Make note of this VPC network name, because you will need it later when you configure Traffic Director.

If you are using the new service routing APIs, which are currently in preview:

  • Populate TRAFFICDIRECTOR_MESH_NAME by replacing "" with the name of your service mesh so that the proxy obtains the configuration for that mesh.
    • Note that if you are configuring a Gateway, you do not use the sidecar injector. You deploy an Envoy proxy as a Pod.

For example, the file might look like this:

$ cat specs/01-configmap.yaml
   apiVersion: v1
   kind: ConfigMap
   metadata:
     name: injector-mesh
     namespace: istio-control
   data:
     mesh: |-
       defaultConfig:
         discoveryAddress: trafficdirector.googleapis.com:443

         # Envoy proxy port to listen on for the admin interface.
         proxyAdminPort: 15000

         proxyMetadata:
           # Google Cloud Project number where Traffic Director resources are configured.
           # This is a numeric identifier of your project (e.g. "111222333444").
           # You can get a list of all your projects with their corresponding numbers by
           # using "gcloud projects list" command or looking it up under "Project info"
           # section of your Google Cloud console.
           # If left empty, configuration will be attempted to be fetched for the Google Cloud
           # project associated with service credentials.
           # Leaving empty is not recommended as it is not guaranteed to work in future
           # releases.
           TRAFFICDIRECTOR_GCP_PROJECT_NUMBER: "YOUR_PROJECT_NUMBER_HERE"

           # Google Cloud VPC network name for which the configuration is requested (This is the VPC
           # network name referenced in the forwarding rule in Google Cloud API). If left empty,
           # configuration will be attempted to be fetched for the VPC network over which
           # the request to Traffic Director (trafficdirector.googleapis.com) is sent out.
           # Leaving empty is not recommended as it is not guaranteed to work in future
           # releases.
           TRAFFICDIRECTOR_NETWORK_NAME: "default"

You can also optionally enable logging and tracing for each proxy that is injected automatically. For more information about these configurations, review Configuring additional attributes for sidecar proxies. When you use the sidecar injector, the value of TRAFFICDIRECTOR_ACCESS_LOG_PATH can only be set to a file in the directory /etc/envoy/. For example, /etc/envoy/access.log is a valid path.

Note that TRAFFICDIRECTOR_INTERCEPTION_PORT should not be configured in this ConfigMap, because it is already configured by the sidecar injector.
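For example, to enable file-based access logging for every injected proxy, you might add the access log path to the proxyMetadata block of the ConfigMap shown above. This is a sketch; keep your existing project number and network name values:

proxyMetadata:
  TRAFFICDIRECTOR_GCP_PROJECT_NUMBER: "YOUR_PROJECT_NUMBER_HERE"
  TRAFFICDIRECTOR_NETWORK_NAME: "default"
  # Envoy writes access logs to this file; the path must be under /etc/envoy/.
  TRAFFICDIRECTOR_ACCESS_LOG_PATH: "/etc/envoy/access.log"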

Configuring TLS for the sidecar injector

This section shows you how to configure TLS for the sidecar injector.

The sidecar injector uses a Kubernetes mutating admission webhook to inject proxies when new pods are created. This webhook is an HTTPS endpoint so you need to provide a key and certificate for TLS.

You can create a private key and a self-signed certificate using openssl to secure the sidecar injector.

If you already have your own private key and a certificate signed by a trusted certificate authority (CA), you can skip the next step.

CN=istio-sidecar-injector.istio-control.svc

openssl req \
  -x509 \
  -newkey rsa:4096 \
  -keyout key.pem \
  -out cert.pem \
  -days 365 \
  -nodes \
  -subj "/CN=${CN}" \
  -addext "subjectAltName=DNS:${CN}"

cp cert.pem ca-cert.pem

This example openssl command writes a 4096-bit RSA private key to key.pem and a self-signed certificate in X.509 format to cert.pem. Because the certificate is self-signed, it is copied to ca-cert.pem and also treated as the certificate of the signing CA. The certificate remains valid for 365 days and does not require a passphrase. For more information about certificate creation and signing, refer to the Kubernetes documentation about Certificate Signing Requests.

The steps in this section must be repeated annually to regenerate and re-apply new keys and certificates before they expire.
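To check when the current certificate expires before you rotate it, you can inspect it with openssl, for example:

# Print the certificate's expiration date.
openssl x509 -in cert.pem -noout -enddate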

After you have your key and certificates, you must create a Kubernetes secret and update the sidecar injector's webhook.

  1. Create the namespace under which the Kubernetes secret should be created:

    kubectl apply -f specs/00-namespaces.yaml
    
  2. Create the secret for the sidecar injector.

    kubectl create secret generic istio-sidecar-injector -n istio-control \
      --from-file=key.pem \
      --from-file=cert.pem \
      --from-file=ca-cert.pem
    
  3. Modify the caBundle of the sidecar injection webhook named istio-sidecar-injector-istio-control in specs/02-injector.yaml:

    CA_BUNDLE=$(cat cert.pem | base64 | tr -d '\n')
    sed -i "s/caBundle:.*/caBundle:\ ${CA_BUNDLE}/g" specs/02-injector.yaml
    

Installing the sidecar injector to your GKE cluster

  1. Deploy the sidecar injector.

    kubectl apply -f specs/
    
  2. Verify that the sidecar injector is running.

    kubectl get pods -A | grep sidecar-injector
    

    This returns output similar to the following:

    istio-control   istio-sidecar-injector-6b475bfdf9-79965  1/1 Running   0   11s
    istio-control   istio-sidecar-injector-6b475bfdf9-vntjd  1/1 Running   0   11s
    

Opening required port on a private cluster

If you're following the instructions in Setting up Traffic Director service security with Envoy, you can skip this section and proceed to the next section, Enabling sidecar injection.

If you are installing the Envoy sidecar injector on a private cluster, you need to open TCP port 9443 in the firewall rule to the master node(s) for the webhook to work properly.

The following steps describe how to update the required firewall rule. Note that the update command replaces the existing firewall rule, so you need to make sure to include the default ports 443 (HTTPS) and 10250 (kubelet) as well as the new port that you want to open.

  1. Find the name of the firewall rule that allows traffic from the cluster control plane's IP range (master-ipv4-cidr) to the nodes. In the following command, replace CLUSTER_NAME with the name of your cluster, in this example traffic-director-cluster:

    FIREWALL_RULE_NAME=$(gcloud compute firewall-rules list \
     --filter="name~gke-CLUSTER_NAME-[0-9a-z]*-master" \
     --format="value(name)")
    
  2. Update the firewall rule to open TCP port 9443 to enable auto-injection:

    gcloud compute firewall-rules update ${FIREWALL_RULE_NAME} \
     --allow tcp:10250,tcp:443,tcp:9443
    

Enabling sidecar injection

The following command enables injection for the default namespace. The sidecar injector injects sidecar containers into pods created in this namespace:

kubectl label namespace default istio-injection=enabled

You can verify that the default namespace is properly enabled by running the following command:

kubectl get namespace -L istio-injection

This should return:

NAME              STATUS   AGE     ISTIO-INJECTION
default           Active   7d16h   enabled
istio-control     Active   7d15h
istio-system      Active   7d15h

If you are configuring service security for Traffic Director with Envoy, return to the section Setting up a test service in that setup guide.

Deploying a sample client and verifying injection

This section shows how to deploy a sample pod running Busybox, which provides a simple interface for reaching a test service. In a real deployment, you would deploy your own client application instead.

kubectl create -f demo/client_sample.yaml

The Busybox pod consists of two containers. The first container is the client based on the Busybox image and the second container is the Envoy proxy injected by the sidecar injector. You can get more information about the pod by running the following command:

kubectl describe pods -l run=client

This should return:

…
Init Containers:
# istio-init sets up traffic interception for the pod.
  istio-init:
…
Containers:
# busybox is the client container that runs application code.
  busybox:
…
# envoy is the container that runs the injected Envoy proxy.
  envoy:
…

Deploying a Kubernetes service for testing

The following sections provide instructions for setting up a test service that you use later in this guide to provide end-to-end verification of your setup.

Configuring GKE services with NEGs

GKE services must be exposed through network endpoint groups (NEGs) so that you can configure them as backends of a Traffic Director backend service. Add the NEG annotation to your Kubernetes service specification and choose a name for the NEG (service-test-neg in the sample below) so that you can find it easily later. You need the name when you attach the NEG to your Traffic Director backend service. For more information about annotating NEGs, see Naming NEGs.

...
metadata:
  annotations:
    cloud.google.com/neg: '{"exposed_ports": {"80":{"name": "service-test-neg"}}}'
spec:
  ports:
  - port: 80
    name: service-test
    protocol: TCP
    targetPort: 8000

This annotation creates a standalone NEG containing endpoints corresponding with the IP addresses and ports of the service's pods. For more information and examples, refer to Standalone network endpoint groups.
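For context, a complete Service manifest with the annotation might look like the following sketch. The metadata name and the label selector are illustrative and must match your own deployment; the sample service that you deploy in the next step already includes an equivalent annotation.

apiVersion: v1
kind: Service
metadata:
  name: service-test
  annotations:
    cloud.google.com/neg: '{"exposed_ports": {"80":{"name": "service-test-neg"}}}'
spec:
  selector:
    run: app1          # illustrative selector; must match your pod labels
  ports:
  - port: 80
    name: service-test
    protocol: TCP
    targetPort: 8000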

The following sample service includes the NEG annotation. The service serves its hostname over HTTP on port 80. Use the following command to retrieve the service specification and deploy it to your GKE cluster.

wget -q -O - \
https://storage.googleapis.com/traffic-director/demo/trafficdirector_service_sample.yaml \
| kubectl apply -f -

Verify that the new service is created and the application pod is running:

kubectl get svc

The output should be similar to the following:

NAME             TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service-test     ClusterIP   10.71.9.71   <none>        80/TCP    41m
[..skip..]

Verify that the application pod associated with this service is running:

kubectl get pods

This returns:

NAME                        READY     STATUS    RESTARTS   AGE
app1-6db459dcb9-zvfg2       2/2       Running   0          6m
busybox-5dcf86f4c7-jvvdd    2/2       Running   0          10m
[..skip..]

Saving the NEG's name

Find the NEG created from the example above and record its name for Traffic Director configuration in the next section.

gcloud compute network-endpoint-groups list

This returns the following:

NAME                       LOCATION            ENDPOINT_TYPE       SIZE
service-test-neg           ZONE     GCE_VM_IP_PORT      1

Save the NEG's name in the NEG_NAME variable:

NEG_NAME=$(gcloud compute network-endpoint-groups list \
| grep service-test | awk '{print $1}')
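You can optionally confirm the value and inspect the NEG's endpoints. Replace ZONE with the zone shown in the list output:

echo ${NEG_NAME}

gcloud compute network-endpoint-groups describe ${NEG_NAME} \
    --zone ZONE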

Configuring Traffic Director with Cloud Load Balancing components

This section configures Traffic Director using Compute Engine load balancing resources. This enables the sample client's sidecar proxy to receive configuration from Traffic Director. Outbound requests from the sample client are handled by the sidecar proxy and routed to the test service.

You must configure the following components:

  • A health check and a firewall rule that allows the health check probes to reach your backends.
  • A global backend service that uses the NEG you created earlier as its backend.
  • A routing rule map, which consists of a URL map, a target HTTP proxy, and a forwarding rule.

Creating the health check and firewall rule

Use the following instructions to create a health check and the firewall rule that is required for the health check probes. For more information, see Firewall rules for health checks.

Console

  1. Go to the Health checks page in the Google Cloud console.
    Go to the Health checks page
  2. Click Create Health Check.
  3. For the name, enter td-gke-health-check.
  4. For the protocol, select HTTP.
  5. Click Create.

  6. Go to the Firewall policies page in the Google Cloud console.
    Go to the Firewall policies page

  7. Click Create firewall rules.

  8. On the Create a firewall rule page, supply the following information:

    • Name: Provide a name for the rule. For this example, use fw-allow-health-checks.
    • Network: Choose a VPC network.
    • Priority: Enter a number for the priority. Lower numbers have higher priorities. Be sure that the firewall rule has a higher priority than other rules that might deny ingress traffic.
    • Direction of traffic: Choose Ingress.
    • Action on match: Choose Allow.
    • Targets: Choose All instances in the network.
    • Source filter: Choose the correct IP range type.
    • Source IP ranges: 35.191.0.0/16,130.211.0.0/22
    • Destination filter: Select the IP type.
    • Protocols and ports: Click Specified ports and protocols, then check tcp. TCP is the underlying protocol for all health check protocols.
    • Click Create.

gcloud

  1. Create the health check.

    gcloud compute health-checks create http td-gke-health-check \
      --use-serving-port
    
  2. Create the firewall rule to allow the health checker IP address ranges.

    gcloud compute firewall-rules create fw-allow-health-checks \
      --action ALLOW \
      --direction INGRESS \
      --source-ranges 35.191.0.0/16,130.211.0.0/22 \
      --rules tcp
    

Creating the backend service

Create a global backend service with a load balancing scheme of INTERNAL_SELF_MANAGED. In the Google Cloud console, the load balancing scheme is set implicitly. Add the health check to the backend service.

Console

  1. Go to the Traffic Director page in the Google Cloud console.

    Go to the Traffic Director page

  2. On the Services tab, click Create Service.

  3. Click Continue.

  4. For the service name, enter td-gke-service.

  5. For Network, select the VPC network that you configured in the sidecar injector ConfigMap.

  6. Under Backend type, select Network endpoint groups.

  7. Select the network endpoint group you created.

  8. Set the Maximum RPS to 5.

  9. Set the Balancing mode to Rate.

  10. Click Done.

  11. Under Health check, select td-gke-health-check, which is the health check you created.

  12. Click Continue.

gcloud

  1. Create the backend service and associate the health check with the backend service.

    gcloud compute backend-services create td-gke-service \
     --global \
     --health-checks td-gke-health-check \
     --load-balancing-scheme INTERNAL_SELF_MANAGED
    
  2. Add the previously created NEG as a backend to the backend service. If you are configuring Traffic Director with a target TCP proxy, you must use UTILIZATION balancing mode. If you are using an HTTP or HTTPS target proxy, you can use RATE mode.

    gcloud compute backend-services add-backend td-gke-service \
     --global \
     --network-endpoint-group ${NEG_NAME} \
     --network-endpoint-group-zone ZONE \
     --balancing-mode [RATE | UTILIZATION] \
     --max-rate-per-endpoint 5
    

Creating the routing rule map

The routing rule map defines how Traffic Director routes traffic in your mesh. As part of the routing rule map, you configure a virtual IP (VIP) address and a set of associated traffic management rules, such as host-based routing. When an application sends a request to the VIP, the attached Envoy sidecar proxy does the following:

  1. Intercepts the request.
  2. Evaluates it according to the traffic management rules in the URL map.
  3. Selects a backend service based on the hostname in the request.
  4. Chooses a backend or endpoint associated with the selected backend service.
  5. Sends traffic to that backend or endpoint.

Console

In the console, the target proxy is combined with the forwarding rule. When you create the forwarding rule, Google Cloud automatically creates a target HTTP proxy and attaches it to the URL map.

The routing rule map consists of the forwarding rule and the host and path rules (also known as the URL map).

  1. Go to the Traffic Director page in the Google Cloud console.

    Go to the Traffic Director page

  2. Click Routing rule maps.

  3. Click Create Routing Rule.

  4. Enter td-gke-url-map as the Name of the URL map.

  5. Click Add forwarding rule.

  6. For the forwarding rule name, enter td-gke-forwarding-rule.

  7. Select your network.

  8. Select your Internal IP.

  9. Click Save.

  10. Optionally, add custom host and path rules or leave the path rules as the defaults.

  11. Set the host to service-test.

  12. Click Save.

gcloud

  1. Create a URL map that uses td-gke-service as the default backend service.

    gcloud compute url-maps create td-gke-url-map \
       --default-service td-gke-service
    
  2. Create a URL map path matcher and a host rule to route traffic for your service based on hostname and a path. This example uses service-test as the service name and a default path matcher that matches all path requests for this host (/*).

    gcloud compute url-maps add-path-matcher td-gke-url-map \
       --default-service td-gke-service \
       --path-matcher-name td-gke-path-matcher
    
    gcloud compute url-maps add-host-rule td-gke-url-map \
       --hosts service-test \
       --path-matcher-name td-gke-path-matcher
    
  3. Create the target HTTP proxy.

    gcloud compute target-http-proxies create td-gke-proxy \
       --url-map td-gke-url-map
    
  4. Create the forwarding rule.

    gcloud compute forwarding-rules create td-gke-forwarding-rule \
      --global \
      --load-balancing-scheme=INTERNAL_SELF_MANAGED \
      --address=0.0.0.0 \
      --target-http-proxy=td-gke-proxy \
      --ports 80 --network default
    

At this point, Traffic Director configures your sidecar proxies to route requests that specify the service-test hostname to backends of td-gke-service. In this case, those backends are endpoints in the network endpoint group associated with the Kubernetes test service that you deployed earlier.
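If you want to confirm that the resources are in place before testing, you can describe them with gcloud, for example:

gcloud compute backend-services describe td-gke-service --global

gcloud compute url-maps describe td-gke-url-map

gcloud compute forwarding-rules describe td-gke-forwarding-rule --global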

Verifying the configuration

This section shows how to verify that traffic sent from the sample Busybox client is routed to your service-test Kubernetes service. To send a test request, access a shell in the Busybox container and run the following verification commands. The serving pod behind the service-test service should return its hostname.

# Get the name of the pod running Busybox.
BUSYBOX_POD=$(kubectl get po -l run=client -o=jsonpath='{.items[0].metadata.name}')

# Command to execute that tests connectivity to the service service-test at
# the VIP 10.0.0.1. Because 0.0.0.0 is configured in the forwarding rule, this
# can be any VIP.
TEST_CMD="wget -q -O - 10.0.0.1; echo"

# Execute the test command on the pod.
kubectl exec -it $BUSYBOX_POD -c busybox -- /bin/sh -c "$TEST_CMD"
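If the mesh is configured correctly, the command prints the hostname of the pod backing the service-test service, similar to the following:

app1-6db459dcb9-zvfg2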

Here's how the configuration is verified:

  • The sample client sent a request that specified the service-test hostname.
  • The sample client has an Envoy sidecar proxy that was injected by the Envoy sidecar injector.
  • The sidecar proxy intercepted the request.
  • Using the URL map, the Envoy matched the service-test hostname to the td-gke-service Traffic Director service.
  • The Envoy chose an endpoint from the network endpoint group associated with td-gke-service.
  • The Envoy sent the request to a pod associated with the service-test Kubernetes service.

What's next