This page describes how control plane revisions work and the value of using them for safe service mesh upgrades (and rollbacks). Up until version 1.6.8, the default installation process for Anthos Service Mesh didn't use control plane revisions. Introducing revisions might require some effort and modifications to your installation procedures, but we highly recommend it since using revisions brings significant benefits.
Service mesh fundamentals
Anthos Service Mesh installation consists of two major parts: First you use the
istioctl command line tool and
YAML files to install the control plane and its configuration. The control
plane (also referred to as
istiod) consists of a set of system services that
are responsible for managing mesh configuration. Next, you deploy a special
sidecar proxy throughout your environment that intercepts network
communication to and from each workload. The proxies communicate with the
control plane to get their configuration, which allows you to direct and control
traffic (data plane traffic) around your mesh without making any changes to your
application code.
To deploy the proxies, you use a process called automatic sidecar injection (auto-injection) to run a proxy as an additional sidecar container in each of your workload Pods. You don't need to modify the Kubernetes manifests that you use to deploy your workloads, but you do need to add a label to your namespaces and restart the Pods.
Prior to Anthos Service Mesh 1.6, you upgraded by installing a new version of the control plane which immediately replaced the old version. This procedure is known as an in-place upgrade, and it is risky because if there are failures, rolling back can be difficult. To re-inject the proxies and have them communicate with the new control plane version, you had to restart all workloads in all of your namespaces. Depending on the number of workloads and namespaces in your mesh, the entire upgrade process could take an hour or more. In-place upgrades can lead to downtime and should be scheduled in maintenance windows.
How does auto-injection work?
Auto-injection uses a Kubernetes feature called admission control. A mutating admission webhook is registered to watch for newly created Pods. The webhook is configured with a namespace selector so that it only matches Pods that are being deployed to namespaces that have a particular label. When a Pod matches, the webhook consults an injection service provided by istiod to obtain a new, mutated configuration for the Pod, which contains the containers and volumes needed to run the sidecar.
- The webhook configuration is created during installation and registers the
webhook with the Kubernetes API server.
- The Kubernetes API server watches for Pod deployments in namespaces that
match the webhook namespaceSelector.
- A namespace is labeled so that it will be matched by the namespaceSelector.
- Pods deployed to the namespace trigger the webhook.
- The inject service provided by istiod mutates the Pod specifications to inject the sidecar.
What is a revision?
The label used for auto-injection is like any other user-defined Kubernetes label. A label is a key-value pair that can be used for tagging. Tagging is widely used to identify versions and revisions; examples include Git tags, Docker tags, and Knative revisions.
Up until Anthos Service Mesh version 1.6.8, the default installation procedures
established a convention for configuring the namespace selector in the webhook
to use the label istio-injection=enabled.
The current Anthos Service Mesh installation process lets you tag the installed
control plane with a revision string, both as a revision argument to istioctl
commands and as a field in the IstioOperator custom resource. The corresponding
label key for namespaces is istio.io/rev, and the value is typically set to
indicate the version of the mesh. For example, a control plane with the revision
asm-173-6 selects Pods in namespaces with the label istio.io/rev=asm-173-6 and
injects sidecars.
Why are revisions important?
The ability to control traffic is one of the principal benefits of using a service mesh. For example, you can gradually shift traffic to a new version of an application when you first deploy it to production. If you detect problems during the upgrade, you can shift traffic back to the original version, providing a simple and low risk means of rolling back. This procedure is known as a canary release, and it greatly reduces the risk associated with new deployments.
Similarly, you can minimize the risk associated with upgrading the service mesh
itself. Anthos Service Mesh 1.6 and later supports canary upgrades by using control
plane revisions. When you install a new version of the control plane, you supply
a revision string. The installer labels every control plane object with the
revision, including the
istiod Service and Deployment. The revision becomes
part of the service name, for example, istiod-asm-173-6.
With a canary upgrade, you roll out a new revision of the control plane alongside the existing version. You then gradually associate proxies with the new control plane by labeling your namespaces with the revision of the new control plane. If there are problems, you can roll back by associating the sidecar proxies with the previous revision again.
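As a rough illustration of the revision labels on control plane objects, the following sketch prints a kubectl command that lists the Deployments and Services belonging to one revision. It assumes that the revision is stored on those objects under the istio.io/rev label key and that the control plane runs in the istio-system namespace. The function name revision_objects_cmd is hypothetical, and the command is printed rather than executed so that you can review it first.

```shell
#!/bin/sh
# revision_objects_cmd REVISION
# Prints (does not run) a kubectl command that lists the Deployments and
# Services labeled with the given control plane revision.
# Assumptions: label key istio.io/rev, namespace istio-system.
revision_objects_cmd() {
  printf 'kubectl get deployments,services -n istio-system -l istio.io/rev=%s\n' "$1"
}
```

For example, `revision_objects_cmd asm-173-6` prints the command for the asm-173-6 revision, and piping the output to `sh` runs it against the current cluster context.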
The canary upgrade process
Revision labels make it possible to perform canary upgrades and easy rollbacks of the control plane.
The following steps describe how the process works:
- Start with an existing Anthos Service Mesh or open source Istio
installation. It doesn't matter whether the namespaces are using a revision
label or the istio-injection=enabled label.
- Use a revision string when you install the new version of the control
plane. Because of the revision string, the new control plane is installed
alongside the existing version. The new installation includes a new webhook
configuration with a namespaceSelector configured to watch for namespaces with
that specific revision label.
- You migrate sidecar proxies to the new control plane by removing the old
label from the namespace, adding the new revision label, and then
restarting the Pods. The webhook for the new control plane then injects
sidecars into the restarted Pods. If you use revisions with Anthos Service
Mesh, you must stop using the istio-injection=enabled label. A control plane
with a revision does not select Pods in namespaces that have an
istio-injection label, even if there is also a revision label.
- Carefully test the workloads associated with the upgraded control plane and either continue to roll out the upgrade or roll back to the original control plane.
After associating Pods with the new control plane, the existing control plane and webhook are still installed. The old webhook has no effect for Pods in namespaces that have been migrated to the new control plane. You can roll back the Pods in a namespace to the original control plane by removing the new revision label, adding back the original label and restarting the Pods. When you are certain that the upgrade is complete, you can remove the old control plane.
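The migration step described above (remove the old label, add the new revision label, restart the Pods) can be sketched as a small shell helper. This is an illustration under stated assumptions, not the official upgrade procedure: the function name migrate_namespace_cmds is hypothetical, the workloads are assumed to be managed by Deployments, and the commands are printed rather than executed so that you can review them before running anything.

```shell
#!/bin/sh
# migrate_namespace_cmds NAMESPACE REVISION
# Prints (does not run) the kubectl commands that point a namespace at a
# control plane revision and restart its workloads.
migrate_namespace_cmds() {
  ns="$1"
  rev="$2"
  # "istio-injection-" (trailing dash) removes that label if it is present;
  # --overwrite replaces any existing istio.io/rev value.
  printf 'kubectl label namespace %s istio-injection- istio.io/rev=%s --overwrite\n' "$ns" "$rev"
  # Assumption: workloads are Deployments. StatefulSets, DaemonSets, or bare
  # Pods would need their own restart step.
  printf 'kubectl rollout restart deployment -n %s\n' "$ns"
}
```

Calling `migrate_namespace_cmds default asm-173-6` prints the two commands for the default namespace. When the previous control plane also uses a revision label, calling the helper again with the previous revision string produces the commands for a rollback.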
For detailed steps on upgrading using revisions, see the Upgrade guides.
A closer look at a mutating webhook configuration
The best way to understand the mutating webhook for automatic sidecar injection is to inspect the configuration yourself. Use the following command:
kubectl -n istio-system get mutatingwebhookconfiguration -l app=sidecar-injector -o yaml
You should see a separate configuration for each control plane that you have installed. A namespace selector for a revision-based control plane looks like this:
    namespaceSelector:
      matchExpressions:
      - key: istio-injection
        operator: DoesNotExist
      - key: istio.io/rev
        operator: In
        values:
        - asm-173-6
The selector may vary depending on the version of Anthos Service Mesh or Istio
that you are running. This selector matches namespaces with a specific revision
label as long as they do not also have an istio-injection label.
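To make the matching rule concrete, the following sketch models the example selector in plain shell. It is only a local approximation of what the Kubernetes API server evaluates, and it does not talk to a cluster. The function name selector_matches is hypothetical, and namespace labels are passed as key=value arguments.

```shell
#!/bin/sh
# selector_matches REVISION [LABEL...]
# Returns 0 when the given namespace labels satisfy both expressions of the
# example selector:
#   istio-injection  DoesNotExist
#   istio.io/rev     In (REVISION)
# Returns 1 otherwise.
selector_matches() {
  wanted_rev="$1"
  shift
  rev_ok=1   # 1 = no matching istio.io/rev label seen yet
  for label in "$@"; do
    key="${label%%=*}"    # text before the first "="
    value="${label#*=}"   # text after the first "="
    case "$key" in
      istio-injection)
        # Any istio-injection label, whatever its value, blocks the match.
        return 1
        ;;
      istio.io/rev)
        if [ "$value" = "$wanted_rev" ]; then
          rev_ok=0
        fi
        ;;
    esac
  done
  return "$rev_ok"
}
```

For example, `selector_matches asm-173-6 istio.io/rev=asm-173-6` succeeds, while adding `istio-injection=enabled` to the same label set makes it fail, which mirrors the rule that a revision-based control plane ignores namespaces that carry an istio-injection label.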
When a Pod is deployed to a namespace matching the selector, its Pod specification is submitted to the injector service for mutation. The injector service to be called is specified as follows:
    service:
      name: istiod-asm-173-6
      namespace: istio-system
      path: /inject
      port: 443
The service is exposed by istiod on port 443 at the /inject URL path.
The rules section specifies that the webhook should apply to Pod creation:
    rules:
    - apiGroups:
      - ""
      apiVersions:
      - v1
      operations:
      - CREATE
      resources:
      - pods
      scope: '*'
Although the change over to using revision labels on your namespaces to enable auto-injection might take some getting used to, the benefits that revision labels provide for safe, canary upgrades are well worth the effort.