Upgrading Anthos Service Mesh on GKE

This guide explains how to upgrade Anthos Service Mesh from version 1.5.4+ or 1.6.4+ to version 1.6.14 on GKE. To upgrade from Anthos Service Mesh 1.4.5+, you first have to upgrade to Anthos Service Mesh 1.5. Direct upgrades from Anthos Service Mesh 1.4 to 1.6 aren't supported.

When upgrading, we recommend that you do a dual control plane upgrade (also referred to as a canary upgrade) where both the new and previous versions of the control plane are running as you test the new version with a small percentage of your workloads. This approach is safer than an in-place upgrade, where the new version of the control plane replaces the previous version. Note that the istio-ingressgateway is upgraded in place, so you should plan for some disruption on your cluster.

Redeploying the Anthos Service Mesh control plane components takes about 5 to 10 minutes to complete. Additionally, you need to inject new sidecar proxies in all of your workloads so they are updated with the current Anthos Service Mesh version. The time it takes to update the sidecar proxies depends on many factors, such as the number of pods, the number of nodes, deployment scaling settings, pod disruption budgets, and other configuration settings. A rough estimate of the time that it takes to update the sidecar proxies is 100 pods per minute.
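
For planning purposes, the rough figure above can be turned into a quick calculation. The following is a minimal sketch, not part of the official tooling: the function name estimate_sidecar_minutes is hypothetical, and the default rate of 100 pods per minute is only the rough estimate quoted above, so treat the result as a lower bound.

```shell
# Hypothetical helper: estimate minutes needed to update sidecar proxies.
# Assumes the rough rate of 100 pods per minute unless another rate is given.
estimate_sidecar_minutes() {
  est_pods="$1"
  est_rate="${2:-100}"
  # Integer ceiling division: round partial minutes up.
  echo $(( (est_pods + est_rate - 1) / est_rate ))
}

estimate_sidecar_minutes 1250   # prints 13
```

Pod disruption budgets and deployment scaling settings can slow the rollout considerably, so pass a lower rate as the second argument if you expect them to apply.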

Preparing for the upgrade

This section outlines the steps that you take to prepare to upgrade Anthos Service Mesh.

  1. Review the Supported features and this guide to become familiar with the features and the upgrade process.

  2. If you enabled optional features when you installed the previous version of Anthos Service Mesh, you need to enable the same features when you upgrade. You enable optional features by adding --set values flags or by specifying the -f flag with a YAML file when you run the istioctl install command.

  3. If you are installing Anthos Service Mesh on a private cluster, you must open port 15017 in the firewall to get the webhook used with automatic sidecar injection to work properly. For more information, see Opening a port on a private cluster.

  4. If you're upgrading from Anthos Service Mesh 1.5, do the following steps in case you need to roll back:

    1. Create a directory called asm-1-5.

    2. Download the 1.5 installation file to the asm-1-5 directory.

    3. Extract the contents of the file to the asm-1-5 directory.

    4. Ensure that you're in the Anthos Service Mesh 1.5 installation root directory.

    5. Download the 1.5 kpt package and configure the 1.5 istio-operator.yaml.
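
For the private cluster case in step 3, the following sketch shows one way to look up the firewall rule that has to allow port 15017. It rests on assumptions: that CLUSTER_NAME is already set, and that GKE named the control plane rule gke-CLUSTER_NAME-HASH-master, which might not hold in your project. The helper name master_rule_filter is hypothetical. See Opening a port on a private cluster for the authoritative steps.

```shell
# Hypothetical helper: build a gcloud filter for the control plane firewall rule.
# Assumes the rule follows the gke-CLUSTER_NAME-HASH-master naming pattern.
master_rule_filter() {
  printf 'name~gke-%s-[0-9a-z]*-master' "$1"
}

# Only query the project when gcloud is installed and CLUSTER_NAME is set.
if command -v gcloud >/dev/null 2>&1 && [ -n "${CLUSTER_NAME:-}" ]; then
  gcloud compute firewall-rules list \
    --filter="$(master_rule_filter "${CLUSTER_NAME}")"
fi
```

After you know the rule name, you would typically add tcp:15017 to its allowed ports with gcloud compute firewall-rules update. Because the --allow flag replaces the existing list, include the ports that the rule already allows.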

Setting up your environment

For installations on Google Kubernetes Engine, you can follow the installation guides using Cloud Shell, an in-browser command line interface to your Google Cloud resources, or your own computer running Linux or macOS.

Option A: Use Cloud Shell

Cloud Shell provisions a g1-small Compute Engine virtual machine (VM) running a Debian-based Linux operating system. The advantages to using Cloud Shell are:

  • Cloud Shell includes the gcloud, kubectl and helm command-line tools that you need.

  • Your Cloud Shell $HOME directory has 5GB persistent storage space.

  • You have your choice of text editors:

    • Code editor, which you access by clicking at the top of the Cloud Shell window.

    • Emacs, Vim, or Nano, which you access from the command line in Cloud Shell.

To use Cloud Shell:

  1. Go to the Google Cloud console.
  2. Select your Google Cloud project.
  3. Click the Activate Cloud Shell button at the top of the Google Cloud console window.

    A Cloud Shell session opens inside a new frame at the bottom of the Google Cloud console and displays a command-line prompt.

  4. Update the components:

    gcloud components update
    

    The command responds with output similar to the following:

    ERROR: (gcloud.components.update)
    You cannot perform this action because the gcloud CLI component manager
    is disabled for this installation. You can run the following command
    to achieve the same result for this installation:
    
    sudo apt-get update && sudo apt-get --only-upgrade install ...
  5. Copy the long command and paste it to update the components.

  6. Make sure that Git is in your path so that kpt can find it.

Option B: Use command-line tools locally

On your local machine, install and initialize the gcloud CLI.

If you already have the gcloud CLI installed:

  1. Authenticate with the gcloud CLI:

    gcloud auth login
    
  2. Update the components:

    gcloud components update
    
  3. Install kubectl:

    gcloud components install kubectl
    
  4. Install kpt:

    gcloud components install kpt
    
  5. Make sure that Git is in your path so that kpt can find it.

Setting environment variables

  1. Get the project ID for the project that the cluster was created in and the project number for the fleet host project.

    gcloud

    Run the following command:

    gcloud projects list
    

    Console

    1. Go to the Dashboard page in the Google Cloud console.

    2. Click the Select from drop-down list at the top of the page. In the Select from window that appears, select your project.

      The project ID is displayed on the project Dashboard Project info card.

  2. Create an environment variable for the project ID of the project that the cluster was created in:

    export PROJECT_ID=YOUR_PROJECT_ID

  3. Create an environment variable for the project number of the fleet host project:

    export FLEET_PROJECT_NUMBER=YOUR_FLEET_PROJECT_NUMBER

  4. Create the following environment variables:

    • Set the cluster name:

      export CLUSTER_NAME=YOUR_CLUSTER_NAME
    • Set the CLUSTER_LOCATION to either your cluster zone or cluster region:

      export CLUSTER_LOCATION=YOUR_ZONE_OR_REGION
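
Later commands fail in confusing ways if one of these variables is empty, so it can help to check them before continuing. The following is a minimal sketch; the function name missing_vars is hypothetical, and the variable list is taken from the steps above.

```shell
# Hypothetical helper: print the name of each required variable that is unset or empty.
missing_vars() {
  for mv_name in PROJECT_ID FLEET_PROJECT_NUMBER CLUSTER_NAME CLUSTER_LOCATION; do
    # Indirect lookup of the variable named by mv_name, defaulting to empty.
    eval "mv_value=\${$mv_name:-}"
    [ -n "$mv_value" ] || echo "$mv_name"
  done
}

missing="$(missing_vars)"
[ -z "$missing" ] || echo "Set these variables before continuing: $missing"
```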

Optionally change the mesh ID on the cluster

If your service mesh contains or will contain multiple clusters that are in different projects, all clusters must have the same mesh ID, which is based on the project number of the fleet host project. The mesh ID set on your cluster must match the mesh ID that you configure Anthos Service Mesh to use.

If you only have one cluster, or if your service mesh contains or will contain multiple clusters that are in the same project, skip the following steps and continue on to Setting credentials and permissions.

To set the new mesh ID label on the cluster:

  1. Create an environment variable for the mesh ID:

    export MESH_ID="proj-${FLEET_PROJECT_NUMBER}"

  2. If your cluster has existing labels that you want to keep, you must include those labels when adding the mesh_id label.

    1. To see if your cluster has existing labels:

      gcloud container clusters describe ${CLUSTER_NAME} \
        --project ${PROJECT_ID}

      Look for the resourceLabels field in the output. Each label is stored on a separate line under the resourceLabels field, for example:

      resourceLabels:
        csm: ''
        env: dev
        release: stable

      You don't need to preserve the existing mesh_id. Overwrite it with the new mesh_id label.

      For convenience, you can add the labels to an environment variable. In the following, replace YOUR_EXISTING_LABELS with a comma-separated list of the existing labels on your cluster in the format KEY=VALUE, for example: env=dev,release=stable

      export EXISTING_LABELS="YOUR_EXISTING_LABELS"
    2. Set the mesh_id label:

      • If your cluster has existing labels that you want to keep, update the cluster with the mesh_id and the existing labels:

        gcloud container clusters update ${CLUSTER_NAME} \
          --project=${PROJECT_ID} \
          --update-labels=mesh_id=${MESH_ID},${EXISTING_LABELS}
      • If your cluster doesn't have any existing labels, update the cluster with only the new mesh_id label:

        gcloud container clusters update ${CLUSTER_NAME} \
          --project=${PROJECT_ID} \
          --update-labels=mesh_id=${MESH_ID}
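
The two cases above differ only in the value passed to --update-labels. As a sketch, a small helper can build that value and avoid the trailing comma that results when EXISTING_LABELS is empty. The function name build_update_labels is hypothetical; MESH_ID and EXISTING_LABELS are the variables defined in the steps above.

```shell
# Hypothetical helper: build the --update-labels value from the mesh ID
# and an optional comma-separated list of existing KEY=VALUE labels.
build_update_labels() {
  bul_mesh_id="$1"
  bul_existing="${2:-}"
  if [ -n "$bul_existing" ]; then
    echo "mesh_id=${bul_mesh_id},${bul_existing}"
  else
    echo "mesh_id=${bul_mesh_id}"
  fi
}

build_update_labels "proj-123456789" "env=dev,release=stable"
# prints mesh_id=proj-123456789,env=dev,release=stable
```

You could then pass --update-labels="$(build_update_labels "${MESH_ID}" "${EXISTING_LABELS:-}")" to gcloud container clusters update in either case.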

Setting credentials and permissions

  1. Get authentication credentials to interact with the cluster:

    gcloud container clusters get-credentials ${CLUSTER_NAME} \
        --project=${PROJECT_ID}
    
  2. Grant cluster admin permissions to the current user. You need these permissions to create the necessary role-based access control (RBAC) rules for Anthos Service Mesh:

    kubectl create clusterrolebinding cluster-admin-binding \
      --clusterrole=cluster-admin \
      --user="$(gcloud config get-value core/account)"

If you see the "cluster-admin-binding" already exists error, you can safely ignore it and continue with the existing cluster-admin-binding.
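
If you script this step, you can avoid the error by checking for the binding first. The following is a sketch under the assumption that kubectl is already pointed at the right cluster; the function name ensure_cluster_admin_binding is hypothetical.

```shell
# Hypothetical helper: create cluster-admin-binding only when it doesn't exist yet.
ensure_cluster_admin_binding() {
  if kubectl get clusterrolebinding cluster-admin-binding >/dev/null 2>&1; then
    echo "cluster-admin-binding already exists; continuing with it"
  else
    kubectl create clusterrolebinding cluster-admin-binding \
      --clusterrole=cluster-admin \
      --user="$(gcloud config get-value core/account)"
  fi
}
```

Call ensure_cluster_admin_binding after you get the cluster credentials in step 1.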

Downloading the installation file

    Linux

  1. Download the Anthos Service Mesh installation file to your current working directory:
    curl -LO https://storage.googleapis.com/gke-release/asm/istio-1.6.14-asm.2-linux-amd64.tar.gz
  2. Download the signature file and use openssl to verify the signature:
    curl -LO https://storage.googleapis.com/gke-release/asm/istio-1.6.14-asm.2-linux-amd64.tar.gz.1.sig
    openssl dgst -verify /dev/stdin -signature istio-1.6.14-asm.2-linux-amd64.tar.gz.1.sig istio-1.6.14-asm.2-linux-amd64.tar.gz <<'EOF'
    -----BEGIN PUBLIC KEY-----
    MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEWZrGCUaJJr1H8a36sG4UUoXvlXvZ
    wQfk16sxprI2gOJ2vFFggdq3ixF2h4qNBt0kI7ciDhgpwS8t+/960IsIgw==
    -----END PUBLIC KEY-----
    EOF

    The expected output is: Verified OK

  3. Extract the contents of the file to any location on your file system. For example, to extract the contents to the current working directory:
    tar xzf istio-1.6.14-asm.2-linux-amd64.tar.gz

    The command creates an installation directory in your current working directory named istio-1.6.14-asm.2 that contains:

    • Sample applications in the samples directory.
    • The istioctl command-line tool that you use to install Anthos Service Mesh is in the bin directory.
    • The Anthos Service Mesh configuration profiles are in the manifests/profiles directory.

    Mac OS

  1. Download the Anthos Service Mesh installation file to your current working directory:
    curl -LO https://storage.googleapis.com/gke-release/asm/istio-1.6.14-asm.2-osx.tar.gz
  2. Download the signature file and use openssl to verify the signature:
    curl -LO https://storage.googleapis.com/gke-release/asm/istio-1.6.14-asm.2-osx.tar.gz.1.sig
    openssl dgst -sha256 -verify /dev/stdin -signature istio-1.6.14-asm.2-osx.tar.gz.1.sig istio-1.6.14-asm.2-osx.tar.gz <<'EOF'
    -----BEGIN PUBLIC KEY-----
    MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEWZrGCUaJJr1H8a36sG4UUoXvlXvZ
    wQfk16sxprI2gOJ2vFFggdq3ixF2h4qNBt0kI7ciDhgpwS8t+/960IsIgw==
    -----END PUBLIC KEY-----
    EOF

    The expected output is: Verified OK

  3. Extract the contents of the file to any location on your file system. For example, to extract the contents to the current working directory:
    tar xzf istio-1.6.14-asm.2-osx.tar.gz

    The command creates an installation directory in your current working directory named istio-1.6.14-asm.2 that contains:

    • Sample applications in the samples directory.
    • The istioctl command-line tool that you use to install Anthos Service Mesh is in the bin directory.
    • The Anthos Service Mesh configuration profiles are in the manifests/profiles directory.

    Windows

  1. Download the Anthos Service Mesh installation file to your current working directory:
    curl -LO https://storage.googleapis.com/gke-release/asm/istio-1.6.14-asm.2-win.zip
  2. Download the signature file and use openssl to verify the signature:
    curl -LO https://storage.googleapis.com/gke-release/asm/istio-1.6.14-asm.2-win.zip.1.sig
    openssl dgst -verify - -signature istio-1.6.14-asm.2-win.zip.1.sig istio-1.6.14-asm.2-win.zip <<'EOF'
    -----BEGIN PUBLIC KEY-----
    MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEWZrGCUaJJr1H8a36sG4UUoXvlXvZ
    wQfk16sxprI2gOJ2vFFggdq3ixF2h4qNBt0kI7ciDhgpwS8t+/960IsIgw==
    -----END PUBLIC KEY-----
    EOF

    The expected output is: Verified OK

  3. Extract the contents of the file to any location on your file system. For example, to extract the contents to the current working directory:
    tar xzf istio-1.6.14-asm.2-win.zip

    The command creates an installation directory in your current working directory named istio-1.6.14-asm.2 that contains:

    • Sample applications in the samples directory.
    • The istioctl command-line tool that you use to install Anthos Service Mesh is in the bin directory.
    • The Anthos Service Mesh configuration profiles are in the manifests/profiles directory.

  4. Ensure that you're in the Anthos Service Mesh installation's root directory.
    cd istio-1.6.14-asm.2
  5. For convenience, add the tools in the /bin directory to your PATH:
    export PATH=$PWD/bin:$PATH
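
If you automate the download, you can select the archive from the platform reported by uname. The following is a sketch only: the function name asm_archive_for is hypothetical, the Linux branch assumes an amd64 host, and Windows is left out because the uname output there depends on the shell you use. The file names and base URL are the ones listed in the steps above.

```shell
# Hypothetical helper: map the output of uname -s to the matching archive name.
asm_archive_for() {
  case "$1" in
    Linux)  echo "istio-1.6.14-asm.2-linux-amd64.tar.gz" ;;  # assumes amd64
    Darwin) echo "istio-1.6.14-asm.2-osx.tar.gz" ;;
    *)      echo "unsupported platform: $1" >&2; return 1 ;;
  esac
}

# Print the download URL for the current platform, if it is supported.
archive="$(asm_archive_for "$(uname -s)")" &&
  echo "https://storage.googleapis.com/gke-release/asm/${archive}"
```

Verify the signature as described above regardless of how you download the file.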

Preparing resource configuration files

When you run the istioctl install command, you specify -f istio-operator.yaml on the command line. This file contains information about your project and cluster that Anthos Service Mesh requires. You need to download a package that contains istio-operator.yaml and other resource configuration files so that you can set the project and cluster information.

To get started, choose a package to download based on the certificate authority (CA) that you want to use:

  • asm: This package enables Mesh CA, which we recommend for new installations.

  • asm-citadel: Optionally, you can enable Citadel as the CA. Before choosing this package, refer to Choosing a certificate authority for more information.

To prepare the resource configuration files:

  1. Create a new directory for the Anthos Service Mesh package resource configuration files. We recommend that you use the cluster name as the directory name.

  2. Change to the directory where you want to download the Anthos Service Mesh package.

  3. Download the package that you want to use, based on the CA:

    Mesh CA

    Download the asm package, which enables Mesh CA:

    kpt pkg get \
    https://github.com/GoogleCloudPlatform/anthos-service-mesh-packages.git/asm@release-1.6-asm asm
    

    Citadel

    Download the asm-citadel package, which enables Citadel as the CA:

    kpt pkg get \
    https://github.com/GoogleCloudPlatform/anthos-service-mesh-packages.git/asm-citadel@release-1.6-asm asm
    
  4. Set the project ID for the project that the cluster was created in:

    kpt cfg set asm gcloud.core.project ${PROJECT_ID}
    
  5. Set the project number for the fleet host project:

    kpt cfg set asm gcloud.project.environProjectNumber ${FLEET_PROJECT_NUMBER}
    
  6. Set the cluster name:

    kpt cfg set asm gcloud.container.cluster ${CLUSTER_NAME}
    
  7. Set the default zone or region:

    kpt cfg set asm gcloud.compute.location ${CLUSTER_LOCATION}
    
  8. Set the configuration profile that you plan to use:

    • If all of your clusters are in the same project, set the asm-gcp profile:

      kpt cfg set asm anthos.servicemesh.profile asm-gcp
      
    • If your service mesh contains or will contain multiple clusters that are in different projects, set the asm-gcp-multiproject profile (beta):

      kpt cfg set asm anthos.servicemesh.profile asm-gcp-multiproject
      
  9. If you set the asm-gcp-multiproject profile and downloaded the asm package, which enables Mesh CA, you need to configure the trust domain aliases for the other projects that will form the multi-cluster/multi-project service mesh. Otherwise, skip this step.

    1. Get the project ID of all clusters that will be in the multi-cluster/multi-project mesh.

    2. For each cluster's project ID, set the trust domain aliases. For example, if you have clusters in 3 projects, run the following command and replace PROJECT_ID_1, PROJECT_ID_2, and PROJECT_ID_3 with each cluster's project ID.

      kpt cfg set asm anthos.servicemesh.trustDomainAliases PROJECT_ID_1.svc.id.goog PROJECT_ID_2.svc.id.goog PROJECT_ID_3.svc.id.goog

      As you configure the clusters in the other projects, you can use the same command.

      The trust domain aliases enable Mesh CA to authenticate workloads on clusters in other projects. In addition to setting the trust domain aliases, after installing Anthos Service Mesh, you have to enable cross-cluster load balancing.

  10. Output the values of the kpt setters:

    kpt cfg list-setters asm
    

    In the output from the command verify that the values for the following setters are correct:

    • gcloud.compute.location
    • gcloud.container.cluster
    • gcloud.core.project
    • gcloud.project.environProjectNumber
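
For the trust domain aliases in step 9, each alias is a project ID followed by .svc.id.goog. As a sketch, a helper can build the argument list from any number of project IDs. The function name trust_domain_aliases is hypothetical; the alias format is the one shown in step 9.

```shell
# Hypothetical helper: turn project IDs into a space-separated list of
# trust domain aliases in the PROJECT_ID.svc.id.goog format.
trust_domain_aliases() {
  tda_aliases=""
  for tda_project in "$@"; do
    # Add a separating space only when the list is not empty.
    tda_aliases="${tda_aliases:+$tda_aliases }${tda_project}.svc.id.goog"
  done
  echo "$tda_aliases"
}

trust_domain_aliases PROJECT_ID_1 PROJECT_ID_2 PROJECT_ID_3
# prints PROJECT_ID_1.svc.id.goog PROJECT_ID_2.svc.id.goog PROJECT_ID_3.svc.id.goog
```

Pass the output unquoted to kpt cfg set asm anthos.servicemesh.trustDomainAliases so that each alias becomes a separate argument.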

Upgrading Anthos Service Mesh

To install a new version of Anthos Service Mesh, we recommend that you follow the dual control plane upgrade process (referred to as canary upgrades in the Istio documentation). With a dual control plane upgrade, you install a new version of the control plane alongside the existing control plane. When installing the new version, you include a revision label that identifies the version of the new control plane. Each revision is a full Anthos Service Mesh control plane implementation with its own Deployment and Service.

You then migrate to the new version by setting the same revision label on your workloads to point to the new control plane and performing a rolling restart to re-inject the proxies with the new Anthos Service Mesh version. With this approach, you can monitor the effect of the upgrade on a small percentage of your workloads. After testing your application, you can migrate all traffic to the new version. This approach is much safer than doing an in-place upgrade where a new control plane replaces the previous version of the control plane.

Updating the control plane

Run the following command to deploy the new control plane using the configuration profile that you set in the istio-operator.yaml file. If you want to enable a supported optional feature, include -f and the YAML filename on the following command line. See Enabling optional features for more information.

  istioctl install \
    -f asm/cluster/istio-operator.yaml \
    --set revision=asm-1614-2

The --set revision argument adds an istio.io/rev label to istiod. After running the command, you have two control plane Deployments and Services running side-by-side:

kubectl get pods -n istio-system

Example output:

NAME                                        READY   STATUS    RESTARTS   AGE
istio-ingressgateway-c56675fcd-86zdn        1/1     Running   0          2m9s
istio-ingressgateway-c56675fcd-vn4nv        1/1     Running   0          2m21s
istiod-asm-1614-2-6d5cfd4b89-xztlr          1/1     Running   0          3m44s
istiod-fb7f746f4-wcntn                      1/1     Running   0          50m
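
The revision label asm-1614-2 appears to be derived from the release name 1.6.14-asm.2. The following sketch reproduces that mapping so that the label and the installation file stay in sync in scripts. The naming rule is an assumption inferred from this single example, and the function name asm_revision_label is hypothetical.

```shell
# Hypothetical helper: derive a revision label from a release name such as
# 1.6.14-asm.2. The rule is inferred from this guide's one example.
asm_revision_label() {
  arl_version="${1%%-asm.*}"   # part before "-asm.", for example 1.6.14
  arl_patch="${1##*-asm.}"     # part after "-asm.", for example 2
  echo "asm-$(printf '%s' "$arl_version" | tr -d '.')-${arl_patch}"
}

asm_revision_label 1.6.14-asm.2   # prints asm-1614-2
```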

Redeploying workloads

Installing the new revision has no impact on the existing sidecar proxies. To upgrade these, you must configure them to point to the new control plane. This is controlled during sidecar injection based on the namespace label istio.io/rev.

  1. Update workloads to be injected with the new Anthos Service Mesh version:

    kubectl label namespace NAMESPACE istio-injection- istio.io/rev=asm-1614-2 --overwrite

    The istio-injection label must be removed because it takes precedence over the istio.io/rev label.

  2. Restart the Pods to trigger re-injection:

    kubectl rollout restart deployment -n NAMESPACE
  3. Verify that the Pods are configured to point to the istiod-asm-1614-2 control plane:

    kubectl get pods -n NAMESPACE -l istio.io/rev=asm-1614-2

  4. Test your application to verify that the workloads are working correctly.

  5. If you have workloads in other namespaces, repeat the previous steps for each namespace.

  6. If you are satisfied that your application is working as expected, skip to Complete the upgrade. Otherwise, do the following steps to roll back to the previous version.

    1. Update workloads to be injected with the previous version of the control plane:

       kubectl label namespace NAMESPACE istio.io/rev- istio-injection=enabled --overwrite

    2. Restart the Pods to trigger re-injection so the proxies have the previous version:

       kubectl rollout restart deployment -n NAMESPACE

    3. Roll back the control plane components:

      Rollback to previous 1.6

      1. Redeploy the previous version of the istio-ingressgateway:

        kubectl -n istio-system rollout undo deploy istio-ingressgateway
        
      2. Remove the new control plane:

        kubectl delete Service,Deployment,HorizontalPodAutoscaler,PodDisruptionBudget istiod-asm-1614-2 -n istio-system --ignore-not-found=true
        

      Rollback to 1.5

      1. Change to the directory where you downloaded the 1.5 Anthos Service Mesh installation file.

      2. Reinstall the previous version of Anthos Service Mesh. In the following command, if you enabled optional features, be sure to include the applicable --set values flags or the -f flag with the YAML file name.

        bin/istioctl install \
        -f asm/cluster/istio-operator.yaml
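
When you have several namespaces to migrate, steps 1 through 3 of the redeployment can be wrapped in a function and run one namespace at a time. The following is a sketch that only chains the three commands shown above; the function name migrate_namespace is hypothetical, and it assumes kubectl is already pointed at the right cluster.

```shell
# Hypothetical helper: relabel one namespace for the new revision, restart its
# Deployments, and list the Pods that carry the new revision label.
migrate_namespace() {
  mn_namespace="$1"
  mn_revision="${2:-asm-1614-2}"
  kubectl label namespace "$mn_namespace" istio-injection- \
    "istio.io/rev=${mn_revision}" --overwrite &&
  kubectl rollout restart deployment -n "$mn_namespace" &&
  kubectl get pods -n "$mn_namespace" -l "istio.io/rev=${mn_revision}"
}
```

For example, run migrate_namespace NAMESPACE for a single low-risk namespace first, test your application, and then continue with the remaining namespaces.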

Complete the upgrade

If you are satisfied that your application is working as expected, do the following steps to complete the upgrade:

  1. Remove the old control plane:

    kubectl delete Service,Deployment,HorizontalPodAutoscaler,PodDisruptionBudget istiod -n istio-system --ignore-not-found=true
    
  2. Run the following command to deploy the Canonical Service controller:

    kubectl apply -f asm/canonical-service/controller.yaml

    The command deploys the Canonical Service controller to your cluster. The Canonical Service controller groups workloads belonging to the same logical service, and it is required to unlock extra functionality in the Services dashboard in the Google Cloud console. For more information, refer to Enabling and disabling the Canonical Service controller.

What's next