This guide describes how to set up managed Cloud Service Mesh on a Google Kubernetes Engine (GKE) Autopilot cluster. Cloud Service Mesh is a fully managed service mesh based on Istio.
This tutorial shows you how to configure a production-ready service mesh running on a single GKE Autopilot cluster with default settings. We recommend that you also consult the full Cloud Service Mesh provisioning guide when you design your environment.
Advantages of running managed Cloud Service Mesh with GKE Autopilot
When you use GKE in Autopilot mode, Google automatically sets up and manages your cluster. Autopilot mode streamlines the experience of operating a cluster and lets you focus on your applications. Similarly, managed Cloud Service Mesh is a fully managed service mesh that you can provision in a few steps:
- You provision managed Cloud Service Mesh using the Fleet API, without the need for client-side tools like istioctl.
- Cloud Service Mesh automatically injects sidecar proxies into workloads without needing to grant elevated privileges to your containers.
- You can view rich dashboards for your mesh and services without any extra configuration, and then use these metrics to configure service level objectives (SLOs) and alerts to monitor the health of your applications.
- The managed Cloud Service Mesh control plane is upgraded automatically to ensure that you get the latest security patches and features.
- The Cloud Service Mesh managed data plane automatically upgrades the sidecar proxies in your workloads, so you don't need to restart services yourself when proxy upgrades and security patches are available.
- Cloud Service Mesh is a supported product and can be configured using standard open source Istio APIs. See supported features.
Set up your environment
You can set up your environment using the gcloud CLI or Terraform.
gcloud
Set environment variables:
PROJECT_ID=PROJECT_ID
gcloud config set project ${PROJECT_ID}
Enable the Mesh API:
gcloud services enable mesh.googleapis.com
Enabling mesh.googleapis.com enables the following APIs:
| API | Purpose | Can be disabled |
| --- | --- | --- |
| meshconfig.googleapis.com | Cloud Service Mesh uses the Mesh Configuration API to relay configuration data from your mesh to Google Cloud. Additionally, enabling the Mesh Configuration API allows you to access the Cloud Service Mesh pages in the Google Cloud console and to use the Cloud Service Mesh certificate authority. | No |
| meshca.googleapis.com | Related to the Cloud Service Mesh certificate authority used by managed Cloud Service Mesh. | No |
| container.googleapis.com | Required to create Google Kubernetes Engine (GKE) clusters. | No |
| gkehub.googleapis.com | Required to manage the mesh as a fleet. | No |
| monitoring.googleapis.com | Required to capture telemetry for mesh workloads. | No |
| stackdriver.googleapis.com | Required to use the Services UI. | No |
| opsconfigmonitoring.googleapis.com | Required to use the Services UI for off-Google Cloud clusters. | No |
| connectgateway.googleapis.com | Required so that the managed Cloud Service Mesh control plane can access mesh workloads. | Yes* |
| trafficdirector.googleapis.com | Enables a highly available and scalable managed control plane. | Yes* |
| networkservices.googleapis.com | Enables a highly available and scalable managed control plane. | Yes* |
| networksecurity.googleapis.com | Enables a highly available and scalable managed control plane. | Yes* |
Terraform
gcloud config set project PROJECT_ID
GOOGLE_CLOUD_PROJECT=$(gcloud config get-value project)
export GOOGLE_CLOUD_PROJECT
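The exported GOOGLE_CLOUD_PROJECT variable is one of the environment variables that the Terraform google provider can read to determine the target project. A minimal provider block for the examples that follow might look like this sketch; the region value is an assumption based on the location used later in this guide:
provider "google" {
  # The project is picked up from the GOOGLE_CLOUD_PROJECT environment
  # variable exported above; you can also set it explicitly here.
  region = "us-central1"
}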
Create a GKE cluster
Create a GKE cluster in Autopilot mode.
gcloud
Create a cluster, registered as a member of a Fleet:
gcloud container clusters create-auto asm-cluster \
    --location="us-central1" \
    --enable-fleet
Verify the cluster is registered with the Fleet:
gcloud container fleet memberships list
The output is similar to the following:
NAME: asm-cluster
EXTERNAL_ID:
LOCATION: us-central1
Make note of the membership name, as you need it to configure Cloud Service Mesh.
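If you want to reuse the membership name in later commands, you can optionally capture it in a shell variable; this assumes the cluster you just created is the only membership in the project:
MEMBERSHIP_NAME=$(gcloud container fleet memberships list \
    --format="value(name)")
echo "${MEMBERSHIP_NAME}"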
Terraform
To create a GKE cluster, you can use the google_container_cluster resource. You set the fleet block so that the cluster is added to a fleet when it is created.
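A minimal sketch of what this configuration might look like follows; the cluster name and location are illustrative values matching the gcloud steps above, not required settings:
resource "google_container_cluster" "asm_cluster" {
  # Illustrative name and location, matching the gcloud example above.
  name             = "asm-cluster"
  location         = "us-central1"
  enable_autopilot = true

  # Registers the cluster with the project's fleet when it is created.
  fleet {
    project = "PROJECT_ID"  # Replace with your project ID.
  }
}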
To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.
Provision managed Cloud Service Mesh
You provision managed Cloud Service Mesh using the servicemesh feature on the fleet membership for your cluster.
gcloud
Enable the Cloud Service Mesh fleet feature on the project:
gcloud container fleet mesh enable
Enable automatic management of the mesh:
gcloud container fleet mesh update \
    --management=automatic \
    --memberships=MEMBERSHIP_NAME \
    --location=us-central1
Replace MEMBERSHIP_NAME with the membership name listed when you verified that your cluster is registered to the fleet.
Terraform
To enable the Mesh API, you can use the google_project_service resource. You then use the google_gke_hub_feature and google_gke_hub_feature_membership resources to configure managed Cloud Service Mesh on your cluster.
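A hedged sketch of how these resources might be wired together is shown below; the membership name and location mirror the values used in the gcloud steps and are assumptions, not fixed requirements:
# Enables the Mesh API in the project.
resource "google_project_service" "mesh_api" {
  service = "mesh.googleapis.com"
}

# Enables the Cloud Service Mesh feature on the fleet.
resource "google_gke_hub_feature" "mesh" {
  name     = "servicemesh"
  location = "global"

  depends_on = [google_project_service.mesh_api]
}

# Turns on automatic management for the cluster's fleet membership.
resource "google_gke_hub_feature_membership" "mesh_member" {
  feature             = google_gke_hub_feature.mesh.name
  location            = "global"
  membership          = "asm-cluster"  # Assumed membership name from the cluster above.
  membership_location = "us-central1"  # Assumed membership location.

  mesh {
    management = "MANAGEMENT_AUTOMATIC"
  }
}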
To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.
Verify the control plane is active
Wait until the controlPlaneManagement.state is ACTIVE. This might take up to 15 minutes.
watch -n 30 gcloud container fleet mesh describe
The output is similar to:
membershipSpecs:
  projects/746296320118/locations/us-central1/memberships/asm-cluster:
    mesh:
      management: MANAGEMENT_AUTOMATIC
membershipStates:
  projects/746296320118/locations/us-central1/memberships/asm-cluster:
    servicemesh:
      controlPlaneManagement:
        details:
        - code: REVISION_READY
          details: 'Ready: asm-managed'
        state: ACTIVE
      dataPlaneManagement:
        details:
        - code: PROVISIONING
          details: Service is provisioning.
        state: PROVISIONING
    state:
      code: OK
      description: 'Revision(s) ready for use: asm-managed.'
The dataPlaneManagement
section remains in the PROVISIONING
state until
you deploy the ingress gateway, because Autopilot clusters don't
provision any nodes until you deploy a workload.
Deploy a mesh ingress gateway
In this section, you deploy a mesh ingress gateway to handle incoming traffic for the sample application. An ingress gateway is a load balancer that operates at the edge of the mesh and receives incoming HTTP/TCP connections.
You deploy the gateway to a dedicated namespace and label the deployment to ensure that your gateway can be securely managed and automatically upgraded by the Cloud Service Mesh control plane.
Download credentials so that you can access the cluster:
gcloud container clusters get-credentials asm-cluster --location=us-central1
Create a namespace for the gateway deployment:
kubectl create namespace bank-gateways
Add a label to the namespace so that the Cloud Service Mesh control plane automatically injects the gateway configuration into the deployment:
kubectl label namespace bank-gateways istio-injection=enabled
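To confirm that the label is in place before deploying the gateway, you can list the namespace with its labels:
kubectl get namespace bank-gateways --show-labels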
Deploy the ingress gateway to the namespace:
Helm
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update
helm install --wait --namespace bank-gateways \
    --set resources.requests.cpu=250m \
    --set resources.requests.memory=512Mi \
    --set resources.requests.ephemeral-storage=1Gi \
    --set resources.limits.cpu=250m \
    --set resources.limits.memory=512Mi \
    --set resources.limits.ephemeral-storage=1Gi \
    istio-ingressgateway istio/gateway
kubectl
git clone https://github.com/GoogleCloudPlatform/anthos-service-mesh-packages
kubectl apply -n bank-gateways \
    -f ./anthos-service-mesh-packages/samples/gateways/istio-ingressgateway
kubectl -n bank-gateways wait "deployment/istio-ingressgateway" \
    --for=condition=available --timeout=240s
Ensure that you set adequate resource requests when you deploy to a production environment. GKE Autopilot only considers the resource values set in requests, not those set in limits. The Istio project publishes information on performance and scalability.
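Once the gateway Deployment is available, the dataPlaneManagement state shown earlier should eventually move out of PROVISIONING. You can re-check it with the same command used to verify the control plane:
gcloud container fleet mesh describe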
Deploy the sample application
Create a Kubernetes namespace for the deployment:
kubectl create namespace bank-sample
Add a label to the namespace so that Cloud Service Mesh automatically injects sidecar proxies into the sample Pods:
kubectl label namespace bank-sample istio-injection=enabled
Deploy the sample application:
git clone https://github.com/GoogleCloudPlatform/bank-of-anthos.git
kubectl apply -n bank-sample -f bank-of-anthos/extras/jwt/jwt-secret.yaml
kubectl apply -n bank-sample -f bank-of-anthos/kubernetes-manifests/
Wait for the application to be ready. It will take several minutes.
watch kubectl -n bank-sample get pods
When the application is ready, the output is similar to the following:
NAME                                 READY   STATUS    RESTARTS   AGE
accounts-db-0                        2/2     Running   0          2m16s
balancereader-5c695f78f5-x4wlz       2/2     Running   0          3m8s
contacts-557fc79c5-5d7fg             2/2     Running   0          3m7s
frontend-7dd589c5d7-b4cgq            2/2     Running   0          3m7s
ledger-db-0                          2/2     Running   0          3m6s
ledgerwriter-6497f5cf9b-25c6x        2/2     Running   0          3m5s
loadgenerator-57f6896fd6-lx5df       2/2     Running   0          3m5s
transactionhistory-6c498965f-tl2sk   2/2     Running   0          3m4s
userservice-95f44b65b-mlk2p          2/2     Running   0          3m4s
Create Istio Gateway and VirtualService resources to expose the application behind the ingress gateway:
kubectl apply -n bank-sample -f bank-of-anthos/extras/istio/frontend-ingress.yaml
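The frontend-ingress.yaml manifest defines these two resources. As a rough sketch of the pattern (the host, port, and route values below are illustrative, not necessarily the exact contents of that file), a Gateway and VirtualService pair typically looks like the following:
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: frontend-gateway
spec:
  # Selects the ingress gateway Pods deployed in the bank-gateways namespace.
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: frontend
spec:
  hosts:
  - "*"
  gateways:
  - frontend-gateway
  http:
  - route:
    - destination:
        # Routes incoming traffic to the frontend Service.
        host: frontend
        port:
          number: 80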
Get a link to the sample application:
INGRESS_HOST=$(kubectl -n bank-gateways get service istio-ingressgateway \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "http://$INGRESS_HOST"
In a browser, follow the link to open the sample application. Log in with the default username and password to view the application.
Enforce mutual TLS
Make sure that STRICT mutual TLS (mTLS) mode is enabled. Apply a default PeerAuthentication policy for the mesh in the istio-system namespace.
Save the following manifest as mesh-peer-authn.yaml:
apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "default"
  namespace: "istio-system"
spec:
  mtls:
    mode: STRICT
Apply the manifest to the cluster:
kubectl apply -f mesh-peer-authn.yaml
You can override this configuration by creating PeerAuthentication resources in specific namespaces.
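For example, a namespace-scoped policy like the following sketch would relax the mode to PERMISSIVE for a single namespace; the namespace shown is only illustrative:
apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "default"
  # Illustrative namespace; the policy applies only to workloads in it.
  namespace: "bank-sample"
spec:
  mtls:
    mode: PERMISSIVE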
Explore the Cloud Service Mesh dashboards
In Google Cloud console, go to Cloud Service Mesh to view the dashboards for your mesh:
Select the project from the drop-down list on the menu bar.
You see an overview table with all of the microservices in your mesh and a graphical visualization of the connections between the microservices. For each microservice, the table shows three of the SRE "golden signals":
- Traffic - requests per second
- Error rate - a percentage
- Latency - milliseconds
These metrics are based on the actual traffic being handled by the microservices. Constant test traffic is automatically sent to the frontend service by a loadgenerator client deployed as part of the sample application. Cloud Service Mesh automatically sends metrics, logs, and (optionally) traces to Google Cloud Observability.
Click the frontend service in the table to see an overview dashboard for the service. You see additional metrics for the service and a visualization of inbound and outbound connections. You can also create a Service Level Objective (SLO) for monitoring and alerting on the service.
Verify that mTLS is enabled
Click the security link in the panel to see a security overview for the frontend service.
The table and the visualization show a green lock icon for each of the inbound
and outbound connections between microservices. This icon indicates that the
connection is using mTLS for authentication and encryption.