This guide explains how to join two clusters into a single Anthos Service Mesh using Mesh CA or Istio CA, and enable cross-cluster load balancing. You can easily extend this process to incorporate any number of clusters into your mesh.
A multi-cluster Anthos Service Mesh configuration can solve several crucial enterprise scenarios, such as scale, location, and isolation. For more information, see Multi-cluster use cases. In addition, you should optimize your applications to get the most benefit from a service mesh. For more information, see Preparing an application for Anthos Service Mesh.
Prerequisites
This guide assumes that you have two or more Google Cloud GKE clusters that meet the following requirements:
- Anthos Service Mesh version 1.6.8 or higher installed on the clusters.
- If your clusters are in the same project, see Installation overview to install or upgrade your clusters to the required version.
- If your clusters are in different projects, see Multi-project installation and migration to install or upgrade your clusters to the required version.
- If you join clusters that are not in the same project, they must be installed using the asm-gcp-multiproject profile, and the clusters must be in a shared VPC configuration together on the same network. In addition, we recommend that you have one project to host the shared VPC, and two service projects for creating clusters. For more information, see Setting up clusters with Shared VPC.
- If you use Istio CA, use the same custom root certificate for both clusters.
- If your Anthos Service Mesh is built on private clusters, we recommend creating a single subnet in the same VPC (you can verify the network configuration with the check shown after this list). Otherwise, you must ensure that:
- The control planes can reach the remote private cluster control planes via the cluster private IPs.
- The calling control planes' IP ranges are added to the remote private clusters' authorized networks. For more information, see Configure endpoint discovery between private clusters.
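As a quick check of these network requirements, you can inspect each cluster's VPC network and subnetwork. The following is a minimal sketch using placeholder values, because the environment variables used later in this guide are not set yet; both clusters should report the same network (and, for the private-cluster recommendation, the same subnetwork):
# Hypothetical check: print the VPC network and subnetwork of a cluster.
# Repeat for each cluster and compare the results.
gcloud container clusters describe CLUSTER_NAME --project PROJECT_ID --zone CLUSTER_LOCATION \
    --format "value(network,subnetwork)"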
Setting project and cluster variables
Set a working folder for convenience. This is the folder in which you downloaded and extracted the Anthos Service Mesh files in the prerequisite step, Preparing to install Anthos Service Mesh.
export PROJECT_DIR=YOUR_WORKING_FOLDER
Create the following environment variables for the project ID, cluster zone or region, cluster name, and context.
export PROJECT_1=PROJECT_ID_1
export LOCATION_1=CLUSTER_LOCATION_1
export CLUSTER_1=CLUSTER_NAME_1
export CTX_1="gke_${PROJECT_1}_${LOCATION_1}_${CLUSTER_1}"

export PROJECT_2=PROJECT_ID_2
export LOCATION_2=CLUSTER_LOCATION_2
export CLUSTER_2=CLUSTER_NAME_2
export CTX_2="gke_${PROJECT_2}_${LOCATION_2}_${CLUSTER_2}"
If these are newly created clusters, make sure to fetch credentials for each cluster with the following gcloud commands; otherwise, their associated contexts will not be available for use in the next steps of this guide:
gcloud container clusters get-credentials ${CLUSTER_1} --zone ${LOCATION_1} --project ${PROJECT_1}
gcloud container clusters get-credentials ${CLUSTER_2} --zone ${LOCATION_2} --project ${PROJECT_2}
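To confirm that both contexts are now available, you can list them and check that an entry exists for each cluster; for example:
# Both cluster contexts should appear in the output.
kubectl config get-contexts | grep -E "${CLUSTER_1}|${CLUSTER_2}"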
Create firewall rule
In some cases, you need to create a firewall rule to allow cross-cluster traffic. For example, you need to create a firewall rule if:
- You use different subnets for the clusters in your mesh.
- Your Pods open ports other than 443 and 15002.
GKE automatically adds firewall rules to each node to allow traffic within the same subnet. If your mesh contains multiple subnets, you must explicitly set up firewall rules to allow cross-subnet traffic. You must add a new firewall rule for each subnet to allow the source IP CIDR blocks and target ports of all incoming traffic.
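To see which ingress rules GKE has already created for your clusters before you add a new one, you can list them. This is a minimal sketch that assumes GKE's default gke- prefix for automatically created rule names:
# List GKE-created firewall rules and the source ranges they already allow.
gcloud compute firewall-rules list --project $PROJECT_1 \
    --filter="name~^gke-" \
    --format="table(name,network,direction,sourceRanges.list())"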
The following instructions allow communication between all clusters in your project or only between $CLUSTER_1 and $CLUSTER_2.
Gather information about your clusters' network.
All project clusters
If the clusters are in the same project, you can use the following command to allow communication between all clusters in your project. If there are clusters in your project that you don't want to expose, use the command in the Specific clusters tab.
function join_by { local IFS="$1"; shift; echo "$*"; }
ALL_CLUSTER_CIDRS=$(gcloud container clusters list --project $PROJECT_1 --format='value(clusterIpv4Cidr)' | sort | uniq)
ALL_CLUSTER_CIDRS=$(join_by , $(echo "${ALL_CLUSTER_CIDRS}"))
ALL_CLUSTER_NETTAGS=$(gcloud compute instances list --project $PROJECT_1 --format='value(tags.items.[0])' | sort | uniq)
ALL_CLUSTER_NETTAGS=$(join_by , $(echo "${ALL_CLUSTER_NETTAGS}"))
Specific clusters
The following command allows communication between $CLUSTER_1 and $CLUSTER_2 and doesn't expose other clusters in your project.
function join_by { local IFS="$1"; shift; echo "$*"; }
ALL_CLUSTER_CIDRS=$(for P in $PROJECT_1 $PROJECT_2; do gcloud --project $P container clusters list --filter="name:($CLUSTER_1,$CLUSTER_2)" --format='value(clusterIpv4Cidr)'; done | sort | uniq)
ALL_CLUSTER_CIDRS=$(join_by , $(echo "${ALL_CLUSTER_CIDRS}"))
ALL_CLUSTER_NETTAGS=$(for P in $PROJECT_1 $PROJECT_2; do gcloud --project $P compute instances list --filter="name:($CLUSTER_1,$CLUSTER_2)" --format='value(tags.items.[0])'; done | sort | uniq)
ALL_CLUSTER_NETTAGS=$(join_by , $(echo "${ALL_CLUSTER_NETTAGS}"))
Create the firewall rule.
GKE
gcloud compute firewall-rules create istio-multicluster-pods \
    --allow=tcp,udp,icmp,esp,ah,sctp \
    --direction=INGRESS \
    --priority=900 \
    --source-ranges="${ALL_CLUSTER_CIDRS}" \
    --target-tags="${ALL_CLUSTER_NETTAGS}" --quiet \
    --network=YOUR_NETWORK
Autopilot
TAGS=""
for CLUSTER in ${CLUSTER_1} ${CLUSTER_2}
do
    TAGS+=$(gcloud compute firewall-rules list --filter="Name:$CLUSTER*" --format="value(targetTags)" | uniq) && TAGS+=","
done
TAGS=${TAGS::-1}
echo "Network tags for pod ranges are $TAGS"

gcloud compute firewall-rules create asm-multicluster-pods \
    --allow=tcp,udp,icmp,esp,ah,sctp \
    --network=VPC_NAME \
    --direction=INGRESS \
    --priority=900 \
    --source-ranges="${ALL_CLUSTER_CIDRS}" \
    --target-tags=$TAGS
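Whichever tab you used, you can optionally confirm that the rule was created with the expected source ranges and target tags. This is a minimal sketch for the GKE rule name used above; substitute asm-multicluster-pods if you followed the Autopilot tab:
# Show the source ranges and target tags of the new cross-cluster rule.
gcloud compute firewall-rules describe istio-multicluster-pods \
    --format="value(sourceRanges.list(),targetTags.list())"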
Configure endpoint discovery between clusters
Configure endpoint discovery for cross-cluster load balancing by using the following commands. This step performs these tasks:
- The istioctl command creates a secret that grants access to the Kube API server for a cluster.
- The kubectl command applies the secret to another cluster, so that the second cluster can read service endpoints from the first.
istioctl x create-remote-secret --context=${CTX_1} --name=${CLUSTER_1} | \
    kubectl apply -f - --context=${CTX_2}

istioctl x create-remote-secret --context=${CTX_2} --name=${CLUSTER_2} | \
    kubectl apply -f - --context=${CTX_1}
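You can optionally confirm that each cluster now holds a remote secret for its peer. This check assumes the istio/multiCluster=true label that istioctl applies to the remote secrets it generates:
# Each cluster should list one secret pointing at the other cluster.
kubectl get secrets -n istio-system -l istio/multiCluster=true --context=${CTX_1}
kubectl get secrets -n istio-system -l istio/multiCluster=true --context=${CTX_2}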
Configure endpoint discovery between private clusters
When using private clusters, you must configure the remote clusters' private IPs instead of the public IPs because the public IPs are not accessible.
Write the secrets with public IPs into temporary files:
istioctl x create-remote-secret --context=${CTX_1} --name=${CLUSTER_1} > ${CTX_1}.secret
istioctl x create-remote-secret --context=${CTX_2} --name=${CLUSTER_2} > ${CTX_2}.secret
Retrieve the private IPs for the private clusters, and replace the public IPs with them in the secrets in the temporary files:
PRIV_IP=`gcloud container clusters describe "${CLUSTER_1}" --project "${PROJECT_1}" \
    --zone "${LOCATION_1}" --format "value(privateClusterConfig.privateEndpoint)"`

./istioctl x create-remote-secret --context=${CTX_1} --name=${CLUSTER_1} --server=https://${PRIV_IP} > ${CTX_1}.secret

PRIV_IP=`gcloud container clusters describe "${CLUSTER_2}" --project "${PROJECT_2}" \
    --zone "${LOCATION_2}" --format "value(privateClusterConfig.privateEndpoint)"`

./istioctl x create-remote-secret --context=${CTX_2} --name=${CLUSTER_2} --server=https://${PRIV_IP} > ${CTX_2}.secret
Apply the new secrets into the clusters:
kubectl apply -f ${CTX_1}.secret --context=${CTX_2}
kubectl apply -f ${CTX_2}.secret --context=${CTX_1}
Configuring authorized networks for private clusters
Follow this section only if all of the following conditions apply to your mesh:
- You are using private clusters.
- The clusters do not belong to the same subnet.
- The clusters have enabled authorized networks.
When you deploy multiple clusters in Anthos Service Mesh, the istiod control plane in each cluster needs to call the GKE control plane of the remote clusters. To allow this traffic, you need to add the Pod address range of the calling cluster to the authorized networks of the remote clusters.
Get the Pod IP CIDR block for each cluster:
POD_IP_CIDR_1=`gcloud container clusters describe ${CLUSTER_1} --project ${PROJECT_1} --zone ${LOCATION_1} \
    --format "value(ipAllocationPolicy.clusterIpv4CidrBlock)"`

POD_IP_CIDR_2=`gcloud container clusters describe ${CLUSTER_2} --project ${PROJECT_2} --zone ${LOCATION_2} \
    --format "value(ipAllocationPolicy.clusterIpv4CidrBlock)"`
Add the Kubernetes cluster Pod IP CIDR blocks to the remote clusters:
EXISTING_CIDR_1=`gcloud container clusters describe ${CLUSTER_1} --project ${PROJECT_1} --zone ${LOCATION_1} \
    --format "value(masterAuthorizedNetworksConfig.cidrBlocks.cidrBlock)"`

gcloud container clusters update ${CLUSTER_1} --project ${PROJECT_1} --zone ${LOCATION_1} \
    --enable-master-authorized-networks \
    --master-authorized-networks ${POD_IP_CIDR_2},${EXISTING_CIDR_1//;/,}

EXISTING_CIDR_2=`gcloud container clusters describe ${CLUSTER_2} --project ${PROJECT_2} --zone ${LOCATION_2} \
    --format "value(masterAuthorizedNetworksConfig.cidrBlocks.cidrBlock)"`

gcloud container clusters update ${CLUSTER_2} --project ${PROJECT_2} --zone ${LOCATION_2} \
    --enable-master-authorized-networks \
    --master-authorized-networks ${POD_IP_CIDR_1},${EXISTING_CIDR_2//;/,}
For more information, see Creating a cluster with authorized networks.
Verify that the authorized networks are updated:
gcloud container clusters describe ${CLUSTER_1} --project ${PROJECT_1} --zone ${LOCATION_1} \
    --format "value(masterAuthorizedNetworksConfig.cidrBlocks.cidrBlock)"

gcloud container clusters describe ${CLUSTER_2} --project ${PROJECT_2} --zone ${LOCATION_2} \
    --format "value(masterAuthorizedNetworksConfig.cidrBlocks.cidrBlock)"
Enable control plane global access
Follow this section only if all of the following conditions apply to your mesh:
- You are using private clusters.
- You use different regions for the clusters in your mesh.
You must enable control plane global access to allow istiod in each cluster to call the GKE control plane of the remote clusters.
Enable control plane global access:
gcloud container clusters update ${CLUSTER_1} --project ${PROJECT_1} --zone ${LOCATION_1} \
    --enable-master-global-access

gcloud container clusters update ${CLUSTER_2} --project ${PROJECT_2} --zone ${LOCATION_2} \
    --enable-master-global-access
Verify that control plane global access is enabled:
gcloud container clusters describe ${CLUSTER_1} --project ${PROJECT_1} --zone ${LOCATION_1}
gcloud container clusters describe ${CLUSTER_2} --project ${PROJECT_2} --zone ${LOCATION_2}
The privateClusterConfig section in the output displays the status of masterGlobalAccessConfig.
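If you prefer not to scan the full output, you can filter for just that field; a minimal sketch:
# Prints True for each cluster that has control plane global access enabled.
gcloud container clusters describe ${CLUSTER_1} --project ${PROJECT_1} --zone ${LOCATION_1} \
    --format "value(privateClusterConfig.masterGlobalAccessConfig.enabled)"
gcloud container clusters describe ${CLUSTER_2} --project ${PROJECT_2} --zone ${LOCATION_2} \
    --format "value(privateClusterConfig.masterGlobalAccessConfig.enabled)"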
Verify your deployment
This section explains how to deploy a sample HelloWorld service to your multi-cluster environment to verify that cross-cluster load balancing works.
Enable sidecar injection
Use the following command to locate the revision label value from the istiod service, which you use in later steps.
kubectl -n istio-system get pods -l app=istiod --show-labels
The output looks similar to the following:
NAME                                READY   STATUS    RESTARTS   AGE   LABELS
istiod-asm-173-3-5788d57586-bljj4   1/1     Running   0          23h   app=istiod,istio.io/rev=asm-173-3,istio=istiod,pod-template-hash=5788d57586
istiod-asm-173-3-5788d57586-vsklm   1/1     Running   1          23h   app=istiod,istio.io/rev=asm-173-3,istio=istiod,pod-template-hash=5788d57586
In the output, under the LABELS column, note the value of the istiod revision label, which follows the prefix istio.io/rev=. In this example, the value is asm-173-3. Use the revision value in the steps in the next section.
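If you prefer to capture the revision label in a variable instead of copying it by hand, a minimal sketch (the dots in the label key are escaped for jsonpath):
# Store the istio.io/rev label of the first istiod pod in a variable.
REVISION=$(kubectl -n istio-system get pods -l app=istiod \
    -o jsonpath='{.items[0].metadata.labels.istio\.io/rev}')
echo ${REVISION}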
Install the HelloWorld service
Create the sample namespace and the service definition in each cluster:
for CTX in ${CTX_1} ${CTX_2}
do
    kubectl create --context=${CTX} namespace sample
    kubectl label --context=${CTX} namespace sample \
        istio-injection- istio.io/rev=REVISION --overwrite
done
Replace REVISION with the istiod revision label that you noted in the previous step.
The output is:
label "istio-injection" not found.
namespace/sample labeled
You can safely ignore label "istio-injection" not found.
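To confirm that the revision label was applied in both clusters, you can check the namespace labels; for example:
# The sample namespace should show istio.io/rev=REVISION and no istio-injection label.
for CTX in ${CTX_1} ${CTX_2}
do
    kubectl get namespace sample --context=${CTX} --show-labels
done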
Create the HelloWorld service in both clusters:
kubectl create --context=${CTX_1} \
    -f ${PROJECT_DIR}/samples/helloworld/helloworld.yaml \
    -l service=helloworld -n sample

kubectl create --context=${CTX_2} \
    -f ${PROJECT_DIR}/samples/helloworld/helloworld.yaml \
    -l service=helloworld -n sample
Deploy HelloWorld v1 and v2 to each cluster
Deploy HelloWorld v1 to CLUSTER_1 and v2 to CLUSTER_2, which helps you verify cross-cluster load balancing later:
kubectl create --context=${CTX_1} \
    -f ${PROJECT_DIR}/samples/helloworld/helloworld.yaml \
    -l version=v1 -n sample
kubectl create --context=${CTX_2} \
    -f ${PROJECT_DIR}/samples/helloworld/helloworld.yaml \
    -l version=v2 -n sample
Confirm that HelloWorld v1 and v2 are running by using the following commands, and verify that the output is similar to that shown:
kubectl get pod --context=${CTX_1} -n sample
NAME                             READY   STATUS    RESTARTS   AGE
helloworld-v1-86f77cd7bd-cpxhv   2/2     Running   0          40s
kubectl get pod --context=${CTX_2} -n sample
NAME                             READY   STATUS    RESTARTS   AGE
helloworld-v2-758dd55874-6x4t8   2/2     Running   0          40s
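If the pods are still starting, you can wait for them to become ready before continuing; a minimal sketch using kubectl wait:
# Block until the HelloWorld pods report Ready in each cluster (up to 2 minutes).
kubectl wait --context=${CTX_1} -n sample --for=condition=Ready pod -l app=helloworld --timeout=120s
kubectl wait --context=${CTX_2} -n sample --for=condition=Ready pod -l app=helloworld --timeout=120s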
Deploy the Sleep service
Deploy the Sleep service to both clusters. This pod generates artificial network traffic for demonstration purposes:
for CTX in ${CTX_1} ${CTX_2}
do
    kubectl apply --context=${CTX} \
        -f ${PROJECT_DIR}/samples/sleep/sleep.yaml -n sample
done
Wait for the Sleep service to start in each cluster. Verify that the output is similar to that shown:
kubectl get pod --context=${CTX_1} -n sample -l app=sleep
NAME                     READY   STATUS    RESTARTS   AGE
sleep-754684654f-n6bzf   2/2     Running   0          5s
kubectl get pod --context=${CTX_2} -n sample -l app=sleep
NAME                     READY   STATUS    RESTARTS   AGE
sleep-754684654f-dzl9j   2/2     Running   0          5s
Verify cross-cluster load balancing
Call the HelloWorld service several times and check the output to verify alternating replies from v1 and v2:
Call the HelloWorld service:
kubectl exec --context="${CTX_1}" -n sample -c sleep \
    "$(kubectl get pod --context="${CTX_1}" -n sample -l \
    app=sleep -o jsonpath='{.items[0].metadata.name}')" \
    -- curl -sS helloworld.sample:5000/hello
The output is similar to that shown:
Hello version: v2, instance: helloworld-v2-758dd55874-6x4t8
Hello version: v1, instance: helloworld-v1-86f77cd7bd-cpxhv
...
Call the HelloWorld service again:
kubectl exec --context="${CTX_2}" -n sample -c sleep \
    "$(kubectl get pod --context="${CTX_2}" -n sample -l \
    app=sleep -o jsonpath='{.items[0].metadata.name}')" \
    -- curl -sS helloworld.sample:5000/hello
The output is similar to that shown:
Hello version: v2, instance: helloworld-v2-758dd55874-6x4t8
Hello version: v1, instance: helloworld-v1-86f77cd7bd-cpxhv
...
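To see the cross-cluster distribution more clearly, you can call the service in a loop and count the replies per version. This is a minimal sketch run from the first cluster; both v1 (served locally) and v2 (served from the remote cluster) should appear in the counts:
# Call HelloWorld 10 times from the Sleep pod in CLUSTER_1 and tally the responses.
SLEEP_POD=$(kubectl get pod --context="${CTX_1}" -n sample -l app=sleep \
    -o jsonpath='{.items[0].metadata.name}')
for i in $(seq 1 10)
do
    kubectl exec --context="${CTX_1}" -n sample -c sleep ${SLEEP_POD} \
        -- curl -sS helloworld.sample:5000/hello
done | sort | uniq -c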
Congratulations, you've verified your load-balanced, multi-cluster Anthos Service Mesh!
Clean up HelloWorld service
When you finish verifying load balancing, remove the HelloWorld and Sleep services from your clusters.
kubectl delete ns sample --context ${CTX_1}
kubectl delete ns sample --context ${CTX_2}