Version 1.11

Configure managed Anthos Service Mesh

Overview

Managed Anthos Service Mesh is a Google-managed control plane and an optional data plane that you simply configure. Google handles their reliability, upgrades, scaling and security for you in a backward-compatible manner. This guide explains how to set up or migrate applications to managed Anthos Service Mesh in a single or multi-cluster configuration with asmcli. To learn how to use the preview experimental command asmcli x to install managed Anthos Service Mesh, see Configure managed Anthos Service Mesh with asmcli x.

To learn about the supported features and limitations of managed Anthos Service Mesh, see Managed Anthos Service Mesh supported features.

Prerequisites

As a starting point, this guide assumes that you have:

For a faster installation, enable Workload Identity on your clusters beforehand. If Workload Identity isn't enabled, the installation enables it automatically.

Requirements

  • One or more clusters with a supported version of GKE, in one of the supported regions.

  • Your clusters must be registered to a fleet. You can register them separately before the installation, or as part of the installation by passing the --enable-registration and --fleet_id flags.

  • Your project must have the Service Mesh Feature enabled. You can enable it as part of the installation by passing --enable-gcp-components, or by running the following command:

    gcloud beta container hub mesh enable --project=FLEET_PROJECT_ID
    

    where FLEET_PROJECT_ID is the project-id of the fleet host project.

  • Managed Anthos Service Mesh can use multiple GKE clusters in a single-project single-network or a multi-project single-network environment. If you join clusters that are not in the same project, they must be registered to the same fleet host project, and the clusters must be in a shared VPC configuration together on the same network. In addition, we recommend that you have one project to host the shared VPC, and separate service projects for creating clusters. For more information, see Setting up clusters with Shared VPC.

Limitations

We recommend that you review the list of managed Anthos Service Mesh supported features and limitations. In particular, note the following:

  • The IstioOperator API isn't supported since its main purpose is to control in-cluster components.

    You can use the migration tool included with asmcli to automatically convert other IstioOperator optional features to be compatible with Google-managed control plane. For more information, see Enabling optional features on managed Anthos Service Mesh and Migrate from IstioOperator.

  • Managed data plane limitations:

    • This Preview release of the managed data plane is available only for new deployments of the managed control plane. If you previously deployed the managed control plane, and you want to deploy the managed data plane, you must rerun the installation tool as described in Apply the Google-managed control plane.

    • The managed data plane is available on the Regular and Rapid release channels.

Download the installation tool

  1. Download the latest version of the tool that installs Anthos Service Mesh to the current working directory:

    curl https://storage.googleapis.com/csm-artifacts/asm/asmcli_1.11 > asmcli
    
  2. Make the tool executable:

    chmod +x asmcli
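
As a variation on the two steps above, the following sketch makes curl fail on an HTTP error instead of saving an error page as asmcli. The function names asmcli_url and download_asmcli are hypothetical, and reusing the same URL pattern for release lines other than 1.11 is an assumption, not something this guide states.

```shell
# Sketch only: helper names are illustrative, not part of asmcli.

# Print the download URL for a release line such as "1.11".
# Assumption: other release lines follow the same URL pattern as 1.11.
asmcli_url() {
  printf 'https://storage.googleapis.com/csm-artifacts/asm/asmcli_%s\n' "$1"
}

# Download the tool into the current directory and make it executable.
download_asmcli() {
  # -f: exit non-zero on HTTP errors; -sS: quiet but still show errors;
  # -L: follow redirects; -o: write to the file named asmcli.
  curl -fsSL "$(asmcli_url "$1")" -o asmcli && chmod +x asmcli
}

# Usage:
#   download_asmcli 1.11
```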
    

Configure each cluster

Use the following steps to configure managed Anthos Service Mesh for each cluster in your mesh.

Get cluster credentials

Retrieve the appropriate credentials. The following command will also point the kubectl context to the target cluster.

gcloud container clusters get-credentials CLUSTER_NAME \
    --zone LOCATION \
    --project PROJECT_ID
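
When you work with several clusters, it can help to confirm that kubectl points at the cluster you intend before running the installation tool. The sketch below compares the current context with the gke_PROJECT_LOCATION_CLUSTER name that get-credentials writes for GKE clusters. The function names are hypothetical, and the check assumes you have not renamed the context.

```shell
# Sketch only: helper names are illustrative.

# Context name written by "gcloud container clusters get-credentials".
# $1=PROJECT_ID  $2=LOCATION  $3=CLUSTER_NAME
expected_context() {
  printf 'gke_%s_%s_%s\n' "$1" "$2" "$3"
}

# Succeeds only if the current kubectl context matches the given cluster.
context_matches() {
  local want have
  want="$(expected_context "$1" "$2" "$3")"
  have="$(kubectl config current-context)"
  if [ "$have" = "$want" ]; then
    echo "kubectl context is $have"
  else
    echo "kubectl context is $have, expected $want" >&2
    return 1
  fi
}

# Usage:
#   context_matches PROJECT_ID LOCATION CLUSTER_NAME
```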

Apply the Google-managed control plane

Run the installation tool for each cluster that will use managed Anthos Service Mesh. We recommend that you include the following options:

  • --enable-registration --fleet_id FLEET_PROJECT_ID. These two flags register the cluster to a fleet, where FLEET_PROJECT_ID is the project ID of the fleet host project. In a single-project setup, FLEET_PROJECT_ID is the same as PROJECT_ID because the fleet host project and the cluster project are the same. In more complex configurations, such as multi-project, we recommend using a separate fleet host project.

  • --enable-all. This flag enables both required components and registration.

These options are required if you also want to deploy the Google-managed data plane. For a full list of options, see the asmcli reference page.

The asmcli tool configures the managed control plane directly using tools and logic inside of the CLI tool. Use the set of instructions below depending on your preferred CA.

Mesh CA

Run the following command to install the control plane with default features and Mesh CA. Enter your values in the provided placeholders.

  ./asmcli install \
      -p PROJECT_ID \
      -l LOCATION \
      -n CLUSTER_NAME \
      --fleet_id FLEET_PROJECT_ID \
      --managed \
      --verbose \
      --output_dir CLUSTER_NAME \
      --enable-all

CA Service

  1. Follow the steps in Configure Certificate Authority Service.
  2. Run the following command to install the control plane with default features and Certificate Authority Service. Enter your values in the provided placeholders.

  ./asmcli install \
      -p PROJECT_ID \
      -l LOCATION \
      -n CLUSTER_NAME \
      --fleet_id FLEET_PROJECT_ID \
      --managed \
      --verbose \
      --output_dir CLUSTER_NAME \
      --enable-all \
      --ca gcp_cas \
      --ca_pool pool_name

To install managed Anthos Service Mesh with the experimental asmcli x command, see Configure managed Anthos Service Mesh with asmcli x.

The tool will download all the files for configuring the managed control plane to the specified --output_dir, installing the istioctl tool and sample applications. The steps in this guide assume that you run istioctl from the root of the installation directory, with istioctl present in its /bin subdirectory.

If you rerun asmcli on the same cluster, it overwrites the existing control plane configuration. Be sure to specify the same options and flags if you want the same configuration.
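
One way to keep reruns consistent is to hold the flags in a small wrapper rather than retyping them. The following is a minimal sketch that assumes the Mesh CA flags shown earlier in this guide. The function name install_managed_asm and the ASMCLI variable are hypothetical conveniences, not part of asmcli.

```shell
# Sketch only: wraps the Mesh CA install command from this guide so that
# every run uses the same flags. ASMCLI is a hypothetical override for the
# path to the tool (defaults to ./asmcli).
# $1=PROJECT_ID  $2=LOCATION  $3=CLUSTER_NAME  $4=FLEET_PROJECT_ID
install_managed_asm() {
  "${ASMCLI:-./asmcli}" install \
    -p "$1" \
    -l "$2" \
    -n "$3" \
    --fleet_id "$4" \
    --managed \
    --verbose \
    --output_dir "$3" \
    --enable-all
}

# Usage:
#   install_managed_asm PROJECT_ID LOCATION CLUSTER_NAME FLEET_PROJECT_ID
# Dry run that only prints the arguments:
#   ASMCLI=echo install_managed_asm PROJECT_ID LOCATION CLUSTER_NAME FLEET_PROJECT_ID
```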

Note that an ingress gateway isn't automatically deployed with the control plane. Decoupling the deployment of the ingress gateway and control plane allows you to more easily manage your gateways in a production environment. If the cluster needs an ingress gateway or an egress gateway, see Deploy gateways. To enable other optional features, see Enabling optional features on managed Anthos Service Mesh.

A note on zero-touch upgrades

Once the Google-managed control plane is installed, Google will automatically upgrade it when new releases or patches become available.

You don't have to upgrade the data plane every time the control plane is upgraded, because the control plane continues to work with all proxies in the support window. However, we recommend upgrading to get the latest data plane features, fixes, and performance improvements. To upgrade to the latest published proxy image in your channel, either perform a rolling restart when convenient, or apply the Google-managed data plane, which upgrades the proxies automatically.
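
The rolling restart option can be sketched as a loop over the namespaces that carry a managed revision label. The function name below is hypothetical, and the sketch restarts only Deployments, using the same rollout restart command that appears later in this guide. Other workload kinds would need their own handling.

```shell
# Sketch only: restart Deployments in every namespace labeled with the
# given revision so that their Pods are re-injected with the current proxy.
# $1 = revision label value, for example asm-managed-rapid
restart_managed_namespaces() {
  local rev="$1" ns
  # Namespace names contain no whitespace, so word splitting is safe here.
  for ns in $(kubectl get namespaces -l "istio.io/rev=$rev" \
      -o jsonpath='{.items[*].metadata.name}'); do
    kubectl rollout restart deployment -n "$ns"
  done
}

# Usage:
#   restart_managed_namespaces asm-managed-rapid
```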

Apply the Google-managed data plane (optional)

If you want Google to manage upgrades of the proxies, enable the Google-managed data plane. If enabled, the sidecar proxies and injected gateways are automatically upgraded in conjunction with the managed control plane.

Note that the Google-managed data plane requires the Istio Container Network Interface (CNI) plugin, which is now enabled by default when you deploy the Google-managed control plane.

In the feature preview, managed data plane upgrades proxies by evicting Pods that are running older versions of the proxy. The evictions are done in an orderly manner honoring the Pod disruption budget and controlling the rate of change.

This Preview release of managed data plane doesn't manage the following:

  • Uninjected Pods
  • Pods that were manually injected using istioctl kube-inject
  • Jobs
  • StatefulSets
  • DaemonSets

The managed data plane is available on both the Rapid and Regular release channels.

To enable the Google-managed data plane:

  1. Enable data plane management:

    kubectl annotate --overwrite namespace NAMESPACE \
    mesh.cloud.google.com/proxy='{"managed":"true"}'
    

    Alternatively, you can enable the Google-managed data plane for a specific Pod by annotating it with the same annotation. When you annotate a specific Pod, that Pod uses the Google-managed sidecar proxy and the rest of the workloads use the unmanaged sidecar proxies.

  2. Repeat the previous step for each namespace in which you want a managed data plane.
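
If you have several namespaces, the two steps above can be sketched as a loop. The function name enable_managed_data_plane is hypothetical; the annotation is the one shown in step 1.

```shell
# Sketch only: apply the managed data plane annotation to each namespace
# passed as an argument.
enable_managed_data_plane() {
  local ns
  for ns in "$@"; do
    kubectl annotate --overwrite namespace "$ns" \
      mesh.cloud.google.com/proxy='{"managed":"true"}'
  done
}

# Usage:
#   enable_managed_data_plane NAMESPACE_1 NAMESPACE_2
```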

It could take up to ten minutes for the data plane controller to be ready to manage the proxies in the cluster. Run the following command to check the status:

if kubectl get dataplanecontrols -o custom-columns=REV:.spec.revision,STATUS:.status.state | grep rapid | grep -v none > /dev/null; then echo "Managed Data Plane is ready."; else echo "Managed Data Plane is NOT ready."; fi

When the data plane controller is ready, the command will output: Managed Data Plane is ready.

If the status for data plane controller doesn't become ready after waiting over ten minutes, see Managed data plane status for troubleshooting tips.
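
Rather than rerunning the status command by hand, you could poll it until it succeeds or a timeout expires. The following is a minimal sketch built on the same check. The function names, the 600-second default (matching the ten minutes mentioned above), and the 15-second interval are assumptions. Like the original command, it only looks at the rapid revision.

```shell
# Sketch only: helper names and default timings are illustrative.

# Same check as the one-line command in this guide: succeeds when a row for
# the rapid revision has a state other than "none" (kubectl prints <none>
# for a field that is not set).
managed_data_plane_ready() {
  kubectl get dataplanecontrols \
    -o custom-columns=REV:.spec.revision,STATUS:.status.state \
    | grep rapid | grep -v none > /dev/null
}

# Poll until ready. $1 = timeout in seconds, $2 = poll interval in seconds.
wait_for_managed_data_plane() {
  local timeout="${1:-600}" interval="${2:-15}" waited=0
  until managed_data_plane_ready; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "Managed Data Plane is NOT ready after ${waited}s." >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "Managed Data Plane is ready."
}

# Usage:
#   wait_for_managed_data_plane 600 15
```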

If you want to disable the Google-managed data plane and go back to managing the sidecar proxies yourself, change the annotation:

kubectl annotate --overwrite namespace NAMESPACE \
  mesh.cloud.google.com/proxy='{"managed":"false"}'

Configure endpoint discovery (only for multi-cluster installations)

Anthos Service Mesh managed control plane supports multi-project, single-network (shared VPC) multi-primary configuration, with the difference that the control plane is not installed in the cluster.

Before you continue, you should have already run the install tool on each cluster as described in the previous steps. There is no need to indicate that a cluster is a primary cluster because this is the default behavior.

To configure endpoint discovery between GKE clusters, you run asmcli create-mesh. This command:

  • Registers all clusters to the same fleet.
  • Configures the mesh to trust the fleet workload identity.
  • Creates remote secrets.

You can specify either the URI for each cluster or the path to its kubeconfig file.

Cluster URI

In the following command, replace FLEET_PROJECT_ID with the project ID of the fleet host project and the cluster URI with the cluster name, zone or region, and project ID for each cluster. This example only shows two clusters, but you can run the command to enable endpoint discovery on additional clusters, subject to the GKE Hub service limit.

./asmcli create-mesh \
    FLEET_PROJECT_ID \
    ${PROJECT_1}/${LOCATION_1}/${CLUSTER_1} \
    ${PROJECT_2}/${LOCATION_2}/${CLUSTER_2}
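
To show how the cluster URI arguments are assembled, the following sketch builds each PROJECT/LOCATION/CLUSTER string with a helper and passes any number of them to create-mesh. The function names and the ASMCLI variable are hypothetical; the argument order follows the command above.

```shell
# Sketch only: helper names are illustrative, not part of asmcli.

# Build a cluster URI. $1=project ID  $2=zone or region  $3=cluster name
cluster_uri() {
  printf '%s/%s/%s\n' "$1" "$2" "$3"
}

# Run create-mesh. $1 = fleet host project ID; remaining args = cluster URIs.
# ASMCLI is a hypothetical override for the tool path (defaults to ./asmcli).
create_mesh() {
  local fleet="$1"
  shift
  "${ASMCLI:-./asmcli}" create-mesh "$fleet" "$@"
}

# Usage:
#   create_mesh FLEET_PROJECT_ID \
#     "$(cluster_uri PROJECT_1 LOCATION_1 CLUSTER_1)" \
#     "$(cluster_uri PROJECT_2 LOCATION_2 CLUSTER_2)"
```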

kubeconfig file

In the following command, replace FLEET_PROJECT_ID with the project ID of the fleet host project and PATH_TO_KUBECONFIG with the path to each kubeconfig file. This example only shows two clusters, but you can run the command to enable endpoint discovery on additional clusters, subject to the GKE Hub service limit.

./asmcli create-mesh \
    FLEET_PROJECT_ID \
    PATH_TO_KUBECONFIG_1 \
    PATH_TO_KUBECONFIG_2

For an example application with two clusters, see HelloWorld service example.

Deploy applications

Before you deploy applications, remove any previous istio-injection labels from their namespaces, and set the istio.io/rev=asm-managed-rapid label instead. To use a different revision label, replace asm-managed-rapid with asm-managed for Regular or asm-managed-stable for Stable.

The revision label corresponds to a release channel:

Revision label                     Channel
istio.io/rev=asm-managed           Regular
istio.io/rev=asm-managed-rapid     Rapid
istio.io/rev=asm-managed-stable    Stable

kubectl label namespace NAMESPACE istio-injection- istio.io/rev=asm-managed-rapid --overwrite
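
The channel-to-label mapping above can be sketched as a small lookup, which avoids typing the label by hand when namespaces use different channels. The function names are hypothetical; the label values come from the table.

```shell
# Sketch only: helper names are illustrative.

# Print the revision label for a release channel (regular | rapid | stable).
revision_label() {
  case "$1" in
    regular) echo "istio.io/rev=asm-managed" ;;
    rapid)   echo "istio.io/rev=asm-managed-rapid" ;;
    stable)  echo "istio.io/rev=asm-managed-stable" ;;
    *) echo "unknown channel: $1" >&2; return 1 ;;
  esac
}

# Remove istio-injection and set the revision label for the channel.
# $1 = namespace, $2 = channel
label_namespace_for_channel() {
  local label
  label="$(revision_label "$2")" || return 1
  kubectl label namespace "$1" istio-injection- "$label" --overwrite
}

# Usage:
#   label_namespace_for_channel NAMESPACE regular
```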

At this point, you have successfully configured the Anthos Service Mesh managed control plane. You are now ready to deploy your applications, or you can deploy the Bookinfo sample application.

If you deploy an application in a multi-cluster setup, replicate the Kubernetes and control plane configuration in all clusters, unless you plan to limit that particular config to a subset of clusters. The configuration applied to a particular cluster is the source of truth for that cluster. In addition, if the cluster also runs Anthos Service Mesh or Certificate Authority Service with Mesh CA in other namespaces, verify the application can communicate with the other applications controlled by the in-cluster control plane.

Verify control plane metrics

You can view the version of the control plane and data plane in Metrics Explorer.

To verify that your configuration works correctly:

  1. In the Cloud Console, view the control plane metrics:

    Go to Metrics Explorer

  2. Choose your workspace and add a custom query using the following parameters:

    • Resource type: Kubernetes Container
    • Metric: Proxy Clients
    • Filter: container_name="cr-asm-managed-rapid"
    • Group By: revision label and proxy_version label
    • Aggregator: sum
    • Period: 1 minute

    When you run Anthos Service Mesh with both a Google-managed and an in-cluster control plane, you can tell the metrics apart by their container name. For example, managed metrics have container_name="cr-asm-managed", while unmanaged metrics have container_name="discovery". To display metrics from both, remove the Filter on container_name.

  3. Verify the control plane version and proxy version by inspecting the following fields in Metrics Explorer:

    • The revision field indicates the control plane version.
    • The proxy_version field indicates the proxy version.
    • The value field indicates the number of connected proxies.

    For the current channel to Anthos Service Mesh version mapping, see Anthos Service Mesh versions per channel.

Migrate applications to managed Anthos Service Mesh

To migrate to managed Anthos Service Mesh, perform the following steps:

  1. Run the tool as indicated in the Apply the Google-managed control plane section.

  2. (Optional) If you want to use the Google-managed data plane, enable data plane management:

    kubectl annotate --overwrite namespace NAMESPACE \
    mesh.cloud.google.com/proxy='{"managed":"true"}'
    
  3. (Optional) If you want to use the Google-managed data plane, enable Anthos Service Mesh in the fleet:

    gcloud alpha container hub mesh enable --project=PROJECT_ID
    
  4. Replace the current namespace label with the istio.io/rev=asm-managed-rapid label:

    kubectl label namespace NAMESPACE istio-injection- istio.io/rev=asm-managed-rapid \
        --overwrite
    
  5. Perform a rolling upgrade of deployments in the namespace:

    kubectl rollout restart deployment -n NAMESPACE
    
  6. Test your application to verify that the workloads function correctly.

  7. If you have workloads in other namespaces, repeat the previous steps for each namespace.

  8. If you deployed the application in a multi-cluster setup, replicate the Kubernetes and Istio configuration in all clusters, unless you want to limit that configuration to a subset of clusters. The configuration applied to a particular cluster is the source of truth for that cluster.

  9. Check that the metrics appear as expected by following the steps in Verify control plane metrics.

If you are satisfied that your application works as expected, you can remove the in-cluster istiod after you switch all namespaces to the Google-managed control plane, or keep it as a backup; istiod automatically scales down to use fewer resources. To remove it, skip to Delete old control plane.

If you encounter problems, you can identify and resolve them by using the information in Resolving managed control plane issues and if necessary, roll back to the previous version.

Delete old control plane

After you install and confirm that all namespaces use the Google-managed control plane, you can delete the old control plane.

kubectl delete Service,Deployment,HorizontalPodAutoscaler,PodDisruptionBudget istiod -n istio-system --ignore-not-found=true

If you used istioctl kube-inject instead of automatic injection, or if you installed additional gateways, check the metrics for the control plane, and verify that the number of connected endpoints is zero.
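
Before you run the delete command above, you might want to list any namespaces that still carry an injection label other than a managed revision. The following sketch is one way to do that. The function name is hypothetical, and treating any remaining istio-injection label as "not yet migrated" is an assumption based on the labeling steps in this guide.

```shell
# Sketch only: print namespaces that have an istio.io/rev label that does
# not start with asm-managed, or that still have any istio-injection label.
namespaces_not_on_managed() {
  # Each output row from kubectl: name <TAB> istio.io/rev <TAB> istio-injection
  kubectl get namespaces -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.istio\.io/rev}{"\t"}{.metadata.labels.istio-injection}{"\n"}{end}' \
    | awk -F '\t' '($2 != "" && $2 !~ /^asm-managed/) || $3 != "" { print $1 }'
}

# Usage: an empty result means no namespace matched.
#   namespaces_not_on_managed
```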

Roll back

Perform the following steps if you need to roll back to the previous control plane version:

  1. Update workloads to be injected with the previous version of the control plane. In the following command, the revision value asm-191-1 is used only as an example. Replace the example value with the revision label of your previous control plane.

    kubectl label namespace NAMESPACE istio-injection- istio.io/rev=asm-191-1 --overwrite
    
  2. Restart the Pods to trigger re-injection so the proxies have the previous version:

    kubectl rollout restart deployment -n NAMESPACE
    

The managed control plane automatically scales to zero and uses no resources when not in use. The mutating webhooks and provisioning remain in place and do not affect cluster behavior.

The gateway is now set to the asm-managed revision. To roll back, re-run the Anthos Service Mesh install command, which redeploys the gateway pointing back to your in-cluster control plane, or undo the gateway rollout with the following command:

kubectl -n istio-system rollout undo deploy istio-ingressgateway

Expect this output on success:

deployment.apps/istio-ingressgateway rolled back

Uninstall

The Google-managed control plane scales to zero automatically when no namespaces use it, so no uninstallation is required.

To unregister a cluster from the fleet, run the following commands:

  1. Get the MEMBERSHIP_NAME:

    gcloud container hub memberships list --project FLEET_PROJECT_ID
    
  2. Unregister the cluster:

    gcloud container hub memberships unregister MEMBERSHIP_NAME --gke-uri=GKE_URI --project FLEET_PROJECT_ID
    

where:

  • MEMBERSHIP_NAME is the name of the membership
  • GKE_URI is the URI of the GKE cluster that you want to unregister from the fleet. To obtain the URI, you can run 'gcloud container clusters list --uri'.

Troubleshooting

To identify and resolve problems when using managed control plane, see Resolving managed control plane issues.

What's next?