This page describes how control plane revisions work and the value of using them for safe service mesh upgrades (and rollbacks). Up until version 1.6.8, the default installation process for Anthos Service Mesh didn't use control plane revisions. Introducing revisions might require some effort and modifications to your installation procedures, but we highly recommend it since using revisions brings significant benefits.
Service mesh fundamentals
Anthos Service Mesh installation consists of two major parts: First you use the
istioctl command line tool or the
install_asm script to install an
in-cluster control plane or configure
the Google-managed control plane. The
control plane consists of a set of system services that are responsible for
managing mesh configuration. Next, you deploy a special sidecar proxy
throughout your environment that intercepts network communication to and from
each workload. The proxies communicate with the control plane to get their
configuration, which allows you to direct and control traffic (data plane
traffic) around your mesh without making any changes to your workloads.
To deploy the proxies, you use a process called automatic sidecar injection (auto-injection) to run a proxy as an additional sidecar container in each of your workload Pods. You don't need to modify the Kubernetes manifests that you use to deploy your workloads, but you do need to add a label to your namespaces and restart the Pods.
Prior to Anthos Service Mesh 1.6, you upgraded by installing a new version of the control plane which immediately replaced the old version. This procedure is known as an in-place upgrade, and it is risky because if there are failures, rolling back can be difficult. To re-inject the proxies and have them communicate with the new control plane version, you had to restart all workloads in all of your namespaces. Depending on the number of workloads and namespaces in your mesh, the entire upgrade process could take an hour or more. In-place upgrades can lead to downtime and should be scheduled in maintenance windows.
Use revisions to upgrade your mesh safely
The ability to control traffic is one of the principal benefits of using a service mesh. For example, you can gradually shift traffic to a new version of an application when you first deploy it to production. If you detect problems during the upgrade, you can shift traffic back to the original version, providing a simple and low risk means of rolling back. This procedure is known as a canary release, and it greatly reduces the risk associated with new deployments.
Similarly, you can minimize the risk associated with upgrading the service mesh itself. Anthos Service Mesh 1.6 and later supports canary upgrades by using control plane revisions. With a canary upgrade, you install a new and separate control plane and configuration alongside the existing control plane. The installer assigns a string called a revision to identify the new control plane. At first, the sidecar proxies continue to receive configuration from the previous version of the control plane. You gradually associate workloads with the new control plane by labelling their namespaces with the new control plane revision. Once you have labelled a namespace with the new revision, you restart the workload Pods so that new sidecars are injected, and they receive their configuration from the new control plane. If there are problems, you can easily roll back by associating the workloads with the original control plane.
How does auto-injection work?
Auto-injection uses a Kubernetes feature called admission control. A mutating admission webhook is registered to watch for newly created Pods. The webhook is configured with a namespace selector so that it only matches Pods that are being deployed to namespaces that have a particular label. When a Pod matches, the webhook consults an injection service provided by the control plane to obtain a new, mutated configuration for the Pod, which contains the containers and volumes needed to run the sidecar.
- The webhook configuration is created during installation and registers the webhook with the Kubernetes API server.
- The Kubernetes API server watches for Pod deployments in namespaces that
match the webhook's namespace selector.
- A namespace is labeled so that it will be matched by the namespace selector.
- Pods deployed to the namespace trigger the webhook.
- The inject service provided by the control plane mutates the Pod specifications to inject the sidecar.
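The flow above can be modeled in a few lines of Python. This is only a sketch of the decision the webhook makes, not the actual injector: the function names, the placeholder image names, and the simplified Pod dictionary are assumptions for illustration, and the real webhook is an HTTPS service called by the Kubernetes API server.

```python
import copy

# Hypothetical, simplified stand-in for the sidecar that the injector adds.
# The image name is a placeholder, not a real image.
SIDECAR = {"name": "istio-proxy", "image": "example/proxy:placeholder"}


def namespace_matches(namespace_labels, required_labels):
    """Return True when the namespace carries every required label."""
    return all(namespace_labels.get(k) == v for k, v in required_labels.items())


def admit_pod(pod_spec, namespace_labels, required_labels):
    """Return the Pod spec to store: mutated only if the namespace matches."""
    if not namespace_matches(namespace_labels, required_labels):
        return pod_spec  # the webhook is not called; the spec is unchanged
    mutated = copy.deepcopy(pod_spec)  # leave the submitted spec untouched
    mutated["containers"].append(dict(SIDECAR))
    return mutated


pod = {"containers": [{"name": "app", "image": "example/app:1.0"}]}
selector = {"istio.io/rev": "asm-1104-9"}

injected = admit_pod(pod, {"istio.io/rev": "asm-1104-9"}, selector)
untouched = admit_pod(pod, {}, selector)

print([c["name"] for c in injected["containers"]])   # ['app', 'istio-proxy']
print([c["name"] for c in untouched["containers"]])  # ['app']
```

The sketch makes one point from the list explicit: the workload manifest itself never changes. Only the namespace label decides whether the stored Pod specification gains the sidecar.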
What is a revision?
The label used for auto-injection is like any other user-defined Kubernetes label. A label is essentially a key-value pair which can be used to support the concept of tagging. Labels are widely used for tagging and for revisions—examples include Git tags, Docker tags, and Knative revisions.
Up until Anthos Service Mesh version 1.6.8, the default installation procedures
established a convention for configuring the namespace selector in the webhook
to use the label istio-injection=enabled.
The current Anthos Service Mesh installation process lets you tag the installed
control plane with a revision string. The installer labels every control plane
object with the revision. The key in the key-value pair is istio.io/rev, but
the value of the revision label differs for the Google-managed control plane and
the in-cluster control planes.
For in-cluster control planes, the
istiod Service and Deployment typically have a revision label similar to
istio.io/rev=asm-1104-9, where asm-1104-9 identifies the Anthos Service Mesh version. The revision becomes part of the service name, for example, istiod-asm-1104-9.
For the Google-managed control plane, the revision label corresponds to a release channel.
To enable auto-injection, you add a revision label to your namespaces that
matches the revision label on the control plane. For example, a control plane
with the revision label istio.io/rev=asm-1104-9 selects Pods in namespaces
labeled istio.io/rev=asm-1104-9 and injects sidecars.
The canary upgrade process
Revision labels make it possible to perform canary upgrades and easy rollbacks of the in-cluster control plane. The Google-managed control plane uses a similar process, but your cluster is automatically upgraded to the latest version within that channel.
The following steps describe how the process works:
- Start with an existing Anthos Service Mesh or open source Istio
installation. It doesn't matter whether the namespaces are using a revision
label or the istio-injection=enabled label.
- Use a revision string when you install the new version of the control
plane. Because of the revision string, the new control plane is installed
alongside the existing version. The new installation includes a new webhook
configuration with a
namespaceSelector configured to watch for namespaces with that specific revision label.
- You migrate sidecar proxies to the new control plane by removing the old
label from the namespace, adding the new revision label, and then
restarting the Pods. If you use revisions with Anthos Service Mesh, you
must stop using the
istio-injection=enabled label. A control plane with a revision does not select Pods in namespaces with an
istio-injection label, even if there is a revision label. The webhook for the new control plane injects sidecars into the Pods.
- Carefully test the workloads associated with the upgraded control plane and either continue to roll out the upgrade or roll back to the original control plane.
After associating Pods with the new control plane, the existing control plane and webhook are still installed. The old webhook has no effect for Pods in namespaces that have been migrated to the new control plane. You can roll back the Pods in a namespace to the original control plane by removing the new revision label, adding back the original label and restarting the Pods. When you are certain that the upgrade is complete, you can remove the old control plane.
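The label changes for migration and rollback can be sketched as plain dictionary operations. This is an illustrative model only: the function names and the `team` label are hypothetical, and the code does not talk to a cluster. In a real mesh you would apply the label changes to the namespace and then restart the Pods, because changing a label does not affect Pods that are already running.

```python
# Label keys that control sidecar injection, as described in this page.
INJECTION_KEYS = ("istio-injection", "istio.io/rev")


def migrate_namespace(labels, new_revision):
    """Remove the old injection label and add the new revision label."""
    updated = {k: v for k, v in labels.items() if k not in INJECTION_KEYS}
    updated["istio.io/rev"] = new_revision
    return updated


def rollback_namespace(labels, original_labels):
    """Remove the new revision label and restore the original injection label."""
    updated = {k: v for k, v in labels.items() if k not in INJECTION_KEYS}
    updated.update(
        {k: v for k, v in original_labels.items() if k in INJECTION_KEYS}
    )
    return updated


# Hypothetical namespace that still uses the older injection label.
original = {"team": "payments", "istio-injection": "enabled"}

migrated = migrate_namespace(original, "asm-1104-9")
print(migrated)     # {'team': 'payments', 'istio.io/rev': 'asm-1104-9'}

rolled_back = rollback_namespace(migrated, original)
print(rolled_back)  # {'team': 'payments', 'istio-injection': 'enabled'}
```

Both functions strip every injection label before adding one back, which mirrors the requirement that a namespace must not carry the istio-injection label together with a revision label.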
For detailed steps on upgrading using revisions, see the Upgrade guides.
A closer look at a mutating webhook configuration
The best way to understand the mutating webhook for automatic sidecar injection is to inspect the configuration yourself. Use the following command:
kubectl -n istio-system get mutatingwebhookconfiguration -l app=sidecar-injector -o yaml
You should see a separate configuration for each control plane that you have installed. A namespace selector for a revision-based control plane looks like this:
namespaceSelector:
  matchExpressions:
  - key: istio-injection
    operator: DoesNotExist
  - key: istio.io/rev
    operator: In
    values:
    - asm-173-6
The selector may vary depending on the version of Anthos Service Mesh or Istio
that you are running. This selector matches namespaces with a specific revision
label as long as they do not also have an istio-injection label.
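To see why a namespace with both labels is skipped, it can help to evaluate the selector by hand. The following is a minimal sketch of how Kubernetes label selector expressions are evaluated, written for illustration; the helper names are hypothetical, and the real evaluation happens inside the Kubernetes API server.

```python
def expression_matches(labels, expr):
    """Evaluate one matchExpressions entry against a namespace's labels."""
    key, op = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if op == "In":
        return key in labels and labels[key] in values
    if op == "NotIn":
        return key not in labels or labels[key] not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    raise ValueError(f"unsupported operator: {op}")


def selector_matches(labels, selector):
    """All expressions must match (they are combined with a logical AND)."""
    return all(
        expression_matches(labels, e)
        for e in selector.get("matchExpressions", [])
    )


# The namespace selector shown above, expressed as a Python dictionary.
selector = {
    "matchExpressions": [
        {"key": "istio-injection", "operator": "DoesNotExist"},
        {"key": "istio.io/rev", "operator": "In", "values": ["asm-173-6"]},
    ]
}

print(selector_matches({"istio.io/rev": "asm-173-6"}, selector))  # True
print(selector_matches(
    {"istio.io/rev": "asm-173-6", "istio-injection": "enabled"}, selector
))  # False
print(selector_matches({"istio.io/rev": "asm-1104-9"}, selector))  # False
```

Because the expressions are ANDed, the DoesNotExist expression acts as a veto: any istio-injection label on the namespace prevents this webhook from matching, regardless of the revision label.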
When a Pod is deployed to a namespace matching the selector, its Pod specification is submitted to the injector service for mutation. The injector service to be called is specified as follows:
service:
  name: istiod-asm-173-6
  namespace: istio-system
  path: /inject
  port: 443
The service is exposed by the control plane on port 443 at the /inject path. The
rules section specifies that the webhook should apply to Pod creation:
rules:
- apiGroups:
  - ""
  apiVersions:
  - v1
  operations:
  - CREATE
  resources:
  - pods
  scope: '*'
Although the change over to using revision labels on your namespaces to enable auto-injection might take some getting used to, the benefits that revision labels provide for safe, canary upgrades are well worth the effort.