What GKE users need to know about Kubernetes' new service account tokens

When you deploy an application on Kubernetes, it runs as a service account — a system user understood by the Kubernetes control plane. The service account is the basic tool for configuring what an application is allowed to do, analogous to the concept of an operating system user on a single machine. Within a Kubernetes cluster, you can use role-based access control to configure what a service account is allowed to do ("list pods in all namespaces", "read secrets in namespace foo"). When running on Google Kubernetes Engine (GKE), you can also use GKE Workload Identity and Cloud IAM to grant service accounts access to GCP resources ("read all objects in Cloud Storage bucket bar").

How does this work? How does the Kubernetes API, or Cloud Storage, know that an HTTP request is coming from your application, and not Bob's? It's all about tokens: Kubernetes service account tokens, to be specific. When your application uses a Kubernetes client library to make a call to the Kubernetes API, it attaches a token in the Authorization header, which the server then validates to check your application's identity.

How does your application get this token, and how does the authentication process work? Let's dive in and take a closer look at this process, at some changes that arrived in Kubernetes 1.21 that will enhance Kubernetes authentication, and at how to modify your applications to take advantage of the new security capabilities.

Legacy tokens: Kubernetes 1.20 and below

Let's spin up a pod and poke around. If you're following along, make sure that you are doing this on a 1.20 (or lower) cluster.

  (dev) $ kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: basic-debian-pod
  namespace: default
spec:
  serviceAccountName: default
  containers:
  - image: debian
    name: main
    command: ["sleep", "infinity"]
EOF

(dev) $ kubectl exec -ti basic-debian-pod -- /bin/bash

(pod) $ ls /var/run/secrets/kubernetes.io/serviceaccount
ca.crt
namespace
token

What are these files? Where did they come from? They certainly don't seem like something that ships in the Debian base image:

  • ca.crt is the trust anchor needed to validate the certificate presented by the Kubernetes API Server in this cluster. Typically, it will contain a single, PEM-encoded certificate.
  • namespace contains the namespace that the pod is running in — in our case, default.
  • token contains the service account token — a bearer token that you can attach to API requests. Eagle-eyed readers may notice that it has the tell-tale structure of a JSON Web Token (JWT): <base64>.<base64>.<base64>.

An aside for security hygiene: Do not post these tokens anywhere. They are bearer tokens, which means that anyone who holds the token has the power to authenticate as your application's service account.

To figure out where these files come from, we can inspect our pod object as it exists on the API server:

  (dev) $ kubectl get pods basic-debian-pod -o yaml
apiVersion: v1
kind: Pod
metadata:
  name: basic-debian-pod
  namespace: default
  # Lots of stuff omitted here…
spec:
  serviceAccountName: default
  containers:
  - image: debian
    name: main
    command:
    - sleep
    - infinity
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-g9ggg
      readOnly: true
    # Lots of stuff omitted here…
  volumes:
  - name: default-token-g9ggg
    secret:
      defaultMode: 420
      secretName: default-token-g9ggg
  # Lots of stuff omitted here…

The API server has added… a lot of stuff. But the relevant portion for us is:

  • When the pod was created, an admission controller injected a secret volume into each container in our pod.
  • The secret contains keys and data for each file we saw inside the pod.

Let's take a closer look at the token. Here's a real example, from a cluster that no longer exists.

  eyJhbGciOiJSUzI1NiIsImtpZCI6ImtUMHZXUGVVM1dXWEV6d09tTEpieE5iMmZrdm1KZkZBSkFMeXNHQXVFNm8ifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJkZWZhdWx0Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6ImRlZmF1bHQtdG9rZW4tZzlnZ2ciLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGVmYXVsdCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImFiNzFmMmIwLWFiY2EtNGJjNy05MDVhLWNjOWIyZDY4MzJjZiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpkZWZhdWx0OmRlZmF1bHQifQ.UiLY98ETEp5-JmpgxaJyyZcTvw8AkoGvqhifgGJCFC0pJHySDOp9Zoq-ShnFMOA2R__MYbkeS0duCx-hxDu8HIbZfhyFME15yrSvMHZWNUqJ9SKMlHrCLT3JjLBqX4RPHt-K_83fJfp4Qn2E4DtY6CYnsGUbcNUZzXlN7_uxr9o0C2u15X9QAATkZL2tSwAuPJFcuzLWHCPjIgtDmXczRZ72tD-wXM0OK9ElmQAVJCYQlAMGJHMxqfjUQoz3mbHYfOQseMg5TnEflWvctC-TJd0UBmZVKD-F71x_4psS2zMjJ2eVirLPEhmlh3l4jOxb7RNnP2N_EvVVLmfA9YZE5A

As mentioned earlier, this is a JWT. If we pop it in to our favorite JWT inspector, we can see that the token has the following claims:

{
  "iss": "kubernetes/serviceaccount",
  "kubernetes.io/serviceaccount/namespace": "default",
  "kubernetes.io/serviceaccount/secret.name": "default-token-g9ggg",
  "kubernetes.io/serviceaccount/service-account.name": "default",
  "kubernetes.io/serviceaccount/service-account.uid": "ab71f2b0-abca-4bc7-905a-cc9b2d6832cf",
  "sub": "system:serviceaccount:default:default"
}

Breaking them down:

  • iss ("issuer") is a standard JWT claim, meant to identify the party that issued the JWT. In Kubernetes legacy tokens, it's always hardcoded to the string "kubernetes/serviceaccount", which is technically compliant with the definition in the RFC, but not particularly useful.
  • sub ("subject") is a standard JWT claim that identifies the subject of the token (your service account, in this case). It's the standard string representation of your service account name (the one also used when referring to the serviceaccount in RBAC rules): system:serviceaccount:<namespace>:<name>. Note that this is technically not compliant with the definition in the RFC, since this is neither globally unique, nor is it unique in the scope of the issuer; two service accounts with the same namespace and name but from two unrelated clusters will have the same issuer and subject claims. This isn't a big problem in practice, though.
  • kubernetes.io/serviceaccount/namespace is a Kubernetes-specific claim; it contains the namespace of the serviceaccount.
  • kubernetes.io/serviceaccount/secret.name is a Kubernetes-specific claim; it names the Kubernetes secret that holds the token.
  • kubernetes.io/serviceaccount/service-account.name is a Kubernetes-specific claim; it names the service account.
  • kubernetes.io/serviceaccount/service-account.uid is a Kubernetes-specific claim; it contains the UID of the service account. This claim allows someone verifying the token to notice that a service account was deleted and then recreated with the same name. This can sometimes be important.
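If you don't have a JWT inspector handy, the claims can be read with a few lines of standard-library Python. This is a minimal sketch for inspection only: it does not verify the signature, so it must never be used to decide whether a token is trustworthy. The function names are illustrative, not part of any Kubernetes library.

```python
import base64
import json
from typing import Tuple


def decode_claims(jwt: str) -> dict:
    """Return the claims (second segment) of a JWT WITHOUT verifying the signature."""
    _header_b64, payload_b64, _signature_b64 = jwt.split(".")
    # JWT segments are base64url-encoded with the padding stripped; restore it.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def parse_subject(sub: str) -> Tuple[str, str]:
    """Split system:serviceaccount:<namespace>:<name> into (namespace, name)."""
    prefix, kind, namespace, name = sub.split(":", 3)
    if (prefix, kind) != ("system", "serviceaccount"):
        raise ValueError(f"not a service account subject: {sub!r}")
    return namespace, name
```

Calling decode_claims on the contents of the token file and passing the resulting "sub" value to parse_subject should give back the namespace and name of the pod's service account.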

When your application talks to the API server in its cluster, the Kubernetes client library loads this JWT from the container filesystem and sends it in the Authorization header of all API requests. The API Server then validates the JWT signature and uses the token's claims to determine your application's identity.
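To make that concrete, here is a rough sketch of what a client library does on your behalf: read the token file and attach it as a bearer credential. It only builds the request and does not send it; the URL you pass in is up to you, and build_request is a hypothetical helper, not a function from an official client.

```python
import urllib.request

# Path where the default service account token is mounted (see the listing above).
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def build_request(url: str, token_path: str = TOKEN_PATH) -> urllib.request.Request:
    """Build (but do not send) a request carrying the token as a bearer credential."""
    with open(token_path, encoding="utf-8") as f:
        token = f.read().strip()
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
```

In real applications, prefer an official Kubernetes client library, which also takes care of validating the API server's certificate against ca.crt.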

This also works for authenticating to other services. For example, a common pattern is to configure HashiCorp Vault to authenticate callers using service account tokens from your cluster. To make the task of the relying party (the service seeking to authenticate you) easier, Kubernetes provides the TokenReview API; the relying party just needs to call TokenReview, passing the token you provided. The return value indicates whether or not the token was valid; if so, it also contains the username of your service account (again, in the form system:serviceaccount:<namespace>:<name>).
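As a sketch of what a relying party exchanges with the cluster, the snippet below builds a TokenReview object and reads the username out of a response. It assumes the authentication.k8s.io/v1 version of the API, where the object is POSTed to /apis/authentication.k8s.io/v1/tokenreviews. Sending the request (and authenticating the relying party itself to the API server) is left out, and the helper names are illustrative.

```python
from typing import Optional


def token_review_request(token: str) -> dict:
    """Body of a TokenReview asking the API server to validate `token`."""
    return {
        "apiVersion": "authentication.k8s.io/v1",
        "kind": "TokenReview",
        "spec": {"token": token},
    }


def username_from_review(review: dict) -> Optional[str]:
    """Return the authenticated username, or None if the token was rejected."""
    status = review.get("status", {})
    if not status.get("authenticated", False):
        return None
    return status.get("user", {}).get("username")
```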

Great. So what's the catch? Why did I ominously title this section "legacy" tokens? Legacy tokens have downsides:

  1. Legacy tokens don't expire. If one gets stolen, or logged to a file, or committed to GitHub, or frozen in an unencrypted backup, it remains dangerous until the end of time (or the end of your cluster).

  2. Legacy tokens have no concept of an audience. If your application passes a token to service A, then service A can just forward the token to service B and pretend to be your application. Even if you trust service A to be trustworthy and competent today, because of point 1, the tokens you pass to service A are dangerous forever. If you ever stop trusting service A, you have no practical recourse but to rotate the root of trust for your cluster.

  3. Legacy tokens are distributed via Kubernetes secret objects, which tend not to be very strictly access-controlled and usually aren't encrypted at rest or in backups.

  4. Legacy tokens require extra effort for third-party services to integrate with; they generally need to explicitly build support for Kubernetes because of the custom token claims and the need to validate the token with the TokenReview API.

These issues motivated the design of a new Kubernetes token format, called bound service account tokens.

Bound tokens: Kubernetes 1.21 and up

Launched in Kubernetes 1.13, and becoming the default format in 1.21, bound tokens address all of the limitations of legacy tokens, and more:

  • The tokens themselves are much harder to steal and misuse; they are time-bound, audience-bound, and object-bound.

  • They adopt a standardized format: OpenID Connect (OIDC), with full OIDC Discovery, making it easier for service providers to accept them.

  • They are distributed to pods more securely, using a new Kubelet projected volume type.

Let's explore each of these properties in turn.

We'll repeat our earlier exercise and dissect a bound token. It's still a JWT, but the structure of the claims has changed:

{
  "aud": [
    "foobar.com"
  ],
  "exp": 1636151360,
  "iat": 1636147760,
  "iss": "https://container.googleapis.com/v1/projects/taahm-gke-dev/locations/us-central1-c/clusters/mesh-certs-test2",
  "kubernetes.io": {
    "namespace": "default",
    "pod": {
      "name": "basic-debian-pod-bound-token",
      "uid": "a593ded9-c93d-4ccf-b43f-bf33d2eb7635"
    },
    "serviceaccount": {
      "name": "default",
      "uid": "ab71f2b0-abca-4bc7-905a-cc9b2d6832cf"
    }
  },
  "nbf": 1636147760,
  "sub": "system:serviceaccount:default:default"
}

Time-binding is implemented by the exp ("expiration"), iat ("issued at"), and nbf ("not before") claims; these are standardized JWT claims. Any external service can use its own clock to evaluate these fields and reject tokens that have expired. Unless otherwise specified, bound tokens default to a one-hour lifetime. The Kubernetes TokenReview API automatically checks if a token is expired before deciding that it is valid.
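The time check an external service performs is small enough to sketch. The function below assumes the claims have already been extracted from a token whose signature was verified; the leeway parameter (to tolerate clock skew) and the decision to reject tokens without an exp claim are choices made for this illustration, not behavior mandated by Kubernetes.

```python
import time
from typing import Optional


def is_time_valid(claims: dict, now: Optional[float] = None, leeway: float = 0) -> bool:
    """True if `now` falls inside the token's [nbf, exp) validity window."""
    if now is None:
        now = time.time()
    if "exp" not in claims:
        # A token with no expiry is never accepted by this sketch.
        return False
    not_before = claims.get("nbf", 0)
    return not_before - leeway <= now < claims["exp"] + leeway
```

With the example token above, exp minus iat is 3600 seconds, which matches the one-hour default lifetime.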

Audience binding is implemented by the aud ("audience") claim; again, a standardized JWT claim. An audience strongly associates the token with a particular relying party. For example, if you send service A a token that is audience-bound to the string "service A", A can no longer forward the token to service B to impersonate you. If it tries, service B will reject the token because it expects an audience of "service B". The Kubernetes TokenReview API allows services to specify the audiences they accept when validating a token.
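For a service that validates tokens itself, the audience check might look like the sketch below. It assumes the claims come from a token whose signature has already been verified, and it handles both forms the JWT standard allows for aud: a single string or a list of strings. The function name is illustrative.

```python
from typing import Iterable


def audience_matches(claims: dict, accepted: Iterable[str]) -> bool:
    """True if at least one of the token's audiences is one this service accepts."""
    aud = claims.get("aud", [])
    if isinstance(aud, str):
        aud = [aud]
    accepted_set = set(accepted)
    return any(a in accepted_set for a in aud)
```

A token with no aud claim is rejected here, which is the conservative choice for a service that has picked its own audience.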

Object binding is implemented by the kubernetes.io group of claims. The legacy token only contained information about the service account, but the bound token contains information about the pod the token was issued to. In this case, we say that the token is bound to the pod (tokens can also be bound to secrets). The token will only be considered valid if the pod is still present and running according to the Kubernetes API server — sort of like a supercharged version of the expiration claim. This type of binding is more difficult for external services to check, since they don't have (and you don't want them to have) the level of access to your cluster necessary to check the condition. Fortunately, the Kubernetes TokenReview API also verifies these claims.

Bound service account tokens are valid OpenID Connect (OIDC) identity tokens. This has a number of implications, but the most consequential can be seen in the value of the iss ("issuer") claim. Not all implementations of Kubernetes surface this claim, but for those that do (including GKE), it points to a valid OIDC Discovery endpoint for the tokens issued by the cluster. The upshot of this is that the external services do not need to be Kubernetes-aware in order to authenticate clients using Kubernetes service accounts; they only need to support OIDC and OIDC Discovery. As an example of this type of integration, the OIDC Discovery endpoints underlie GKE Workload Identity, which integrates the Kubernetes and GCP identity systems.
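To illustrate what "support OIDC Discovery" means in practice: under the OpenID Connect Discovery standard, the discovery document lives at a well-known path under the issuer URL, and its jwks_uri field points at the keys used to validate token signatures. The sketch below only derives the URL and parses a document that has already been fetched; it makes no network calls, and the helper names are illustrative.

```python
import json


def discovery_url(issuer: str) -> str:
    """Location of the OIDC Discovery document for the given issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def jwks_uri(discovery_doc: str, expected_issuer: str) -> str:
    """Extract jwks_uri after checking the document is for the issuer we asked about."""
    doc = json.loads(discovery_doc)
    if doc.get("issuer") != expected_issuer:
        raise ValueError("discovery document issuer does not match the token issuer")
    return doc["jwks_uri"]
```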

As a final improvement, bound service account tokens are deployed to pods in a more scalable and secure way. Whereas legacy tokens are generated once per service account, stored in a secret, and mounted into pods via a secret volume, bound tokens are generated on-the-fly for each pod, and injected into pods using the new Kubelet serviceAccountToken volume type. To access them, you add the volume spec to your pod and mount it into the containers that need the token.

  (dev) $ kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: basic-debian-pod-bound-token
  namespace: default
spec:
  serviceAccountName: default
  containers:
  - image: debian
    name: main
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: my-bound-token
      mountPath: /var/run/secrets/my-bound-token
  volumes:
  - name: my-bound-token
    projected:
      sources:
      - serviceAccountToken:
          path: token
          audience: foobar.com
          expirationSeconds: 3600
EOF

Note that we have to choose an audience for the token up front, and that we also have control over the token's validity period. The audience requirement means that it's fairly common to mount multiple bound tokens into a single pod, one for each external party that the pod will be communicating with.

Internally, the serviceAccountToken projected volume is implemented directly in Kubelet (the primary Kubernetes host agent). Kubelet handles communicating with kube-apiserver to request the appropriate bound token before the pod is started, and periodically refreshes the token when its expiry is approaching.

To recap, bound tokens are:

  • Significantly more secure than legacy tokens due to time, audience, and object binding, as well as using a more secure distribution mechanism to pods.

  • Easier to integrate with for external parties, due to OIDC compatibility.

However, the way you integrate with them has changed. Whereas there was a single legacy token per service account, always accessible at /var/run/secrets/kubernetes.io/serviceaccount/token, each pod may have multiple bound tokens. Because the tokens expire and are refreshed by Kubelet, applications need to periodically reload them from the filesystem.

Bound tokens have been available since Kubernetes 1.13, but the default token issued to pods continued to be a legacy token, with all the security downsides that implied. In Kubernetes 1.21, this changes: the default token is a bound service account token. Kubernetes 1.22 finishes off the migration by promoting this new default to GA.

In the next sections, we will take a look at what these changes mean for users of Kubernetes service account tokens, first for clients, and then for service providers.

Impacts on clients

In Kubernetes 1.21, the default token available at /var/run/secrets/kubernetes.io/serviceaccount/token is changing from a legacy token to a bound service account token. If you use this token as a client, by sending it as a bearer token to an API, you may need to make changes to your application to keep it working.

For clients, there are two primary differences in the new default token:

  • The new default token has a cluster-specific audience that identifies the cluster's API server. In GKE, this audience is the URL https://container.googleapis.com/v1/projects/PROJECT/locations/LOCATION/clusters/NAME.

  • The new default token expires periodically, and must be refreshed from disk.

If you only ever use the default token to communicate with the Kubernetes API server of the cluster your application is deployed in, using up-to-date versions of the official Kubernetes client libraries (for example, using client-go and rest.InClusterConfig), then you do not need to make any changes to your application. The default token will carry an appropriate audience for communicating with the API server, and the client libraries handle automatically refreshing the token from disk.

If your application currently uses the default token to authenticate to an external service (common with HashiCorp Vault deployments, for example), you may need to make some changes, depending on the precise nature of the integration between the external service and your cluster.

First, if the service requires a unique audience on its access tokens, you will need to mount a dedicated bound token with the correct audience into your pod, and configure your application to use that token when authenticating to the service. Note that the default behavior of the Kubernetes TokenReview API is to accept the default Kubernetes API server audience, so if the external service hasn't chosen a unique audience, it might still accept the default token. This is not ideal from a security perspective — the purpose of the audience claim is to protect yourself by ensuring that tokens stolen from (or used nefariously by) the external service cannot be used to impersonate your application to other external services.

If you do need to mount a token with a dedicated audience, you will need to create a serviceAccountToken projected volume, and mount it to a new path in each container that needs it. Don't try to replace the default token. Then, update your client code to read the token from the new path.

Second, you must ensure that your application periodically reloads the token from disk. It's sufficient to just poll for changes every five minutes, and update your authentication configuration if the token has changed. Services that provide client libraries might already handle this task in their client libraries.
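One possible shape for that reload logic is sketched below: a small wrapper that re-reads the token file once its cached copy is older than a configurable age (five minutes by default, matching the polling interval suggested above). The class name and the injectable clock are illustrative choices that make the behavior easy to test; they are not part of any client library.

```python
import time
from typing import Callable, Optional


class ReloadingToken:
    """Serves a token from disk, re-reading the file once the cached copy is too old."""

    def __init__(self, path: str, max_age_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._path = path
        self._max_age = max_age_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._loaded_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now - self._loaded_at >= self._max_age:
            with open(self._path, encoding="utf-8") as f:
                self._token = f.read().strip()
            self._loaded_at = now
        return self._token
```

Call get() each time you build a request, rather than reading the token once at startup and holding on to it.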

Let's look at some concrete scenarios:

Your application uses an official Kubernetes client library to read and write Kubernetes objects in the local cluster: Ensure that your client libraries are up-to-date. No further changes are required; the default token already carries the correct audience, and the client libraries automatically handle reloading the token from disk.

Your application uses Google Cloud client libraries and GKE Workload Identity to call Google Cloud APIs: No changes are required. While Kubernetes service account tokens are required in the background, all of the necessary token exchanges are handled by gke-metadata-server.

Your application uses the default Kubernetes service account token to authenticate to Vault: Some changes are required. Vault integrates with your cluster by calling the Kubernetes TokenReview API, but performs an additional check on the issuer claim. By default, Vault expects the legacy token issuer of kubernetes/serviceaccount, and will reject the new default bound token. You will need to update your Vault configuration to specify the new issuer. On GKE, the issuer follows the pattern https://container.googleapis.com/v1/projects/PROJECT/locations/LOCATION/clusters/NAME.

Currently, Vault does not expect a unique audience on the token, so take care to protect the default token. If it is compromised, it can be used to retrieve your secrets from Vault.

Your application uses the default Kubernetes service account token to authenticate to an external service: In general, no immediate changes are required, beyond ensuring that your application periodically reloads the default token from disk. The default behavior of the Kubernetes TokenReview API ensures that authentication keeps working across the transition. Over time, the external service may update to require a unique audience on tokens, which will require you to mount a dedicated bound token as described above.

Impacts on services

Services that authenticate clients using the default service account token will continue to work as clients upgrade their clusters to Kubernetes 1.21, due to the default behavior of the Kubernetes TokenReview API. Your service will begin receiving bound tokens with the default audience, and your TokenReview requests will default to validating the default audience. However, bound tokens open up two new integration options for you.

First, you should coordinate with your clients to start requiring a unique audience on the tokens you accept. This benefits both you and your clients by limiting the power of stolen tokens:

  • Your clients no longer need to trust you with a token that can be used to authenticate to arbitrary third parties (for example, their bank or payment gateways).
  • You no longer need to worry about holding these powerful tokens, and potentially being held responsible for breaches. Instead, the tokens you accept can only be used to authenticate to your service.

To do this, you should first decide on a globally-unique audience value for your service. If your service is accessible at a particular DNS name, that's a good choice. Failing that, you can always generate a random UUID and use that. All that matters is that you and your clients agree on the value.

Once you have decided on the audience, you need to update your TokenReview calls to begin validating the audience. In order to give your clients time to migrate, you should conduct a phased migration:

  1. Update your TokenReview calls to specify both your new audience and the default audience in the spec.audiences list. Remember that the default audience is different for every cluster, so you will either need to obtain it from your client, or guess it based on the kube-apiserver endpoint they provide you. As a reminder, for GKE clusters, the default audience is https://container.googleapis.com/v1/projects/PROJECT/locations/LOCATION/clusters/NAME. At this point, your service will accept both the old and the new audience.

  2. Have your clients begin sending tokens with the new audience, by mounting a dedicated bound token into their pods and configuring their client code to use it.

  3. Update your TokenReview calls to specify only your new audience in the spec.audiences list.
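The three steps above could be sketched as follows on the service side. The snippet builds the TokenReview body with the spec.audiences list appropriate for each step; it assumes the authentication.k8s.io/v1 API version, and the function name and step numbering argument are illustrative. Sending the request is left out.

```python
from typing import List


def token_review_for_step(step: int, token: str, service_audience: str,
                          cluster_default_audience: str) -> dict:
    """TokenReview body for migration steps 1-3 described above."""
    if step in (1, 2):
        # Transitional: accept both the new audience and the cluster default.
        audiences: List[str] = [service_audience, cluster_default_audience]
    elif step == 3:
        # Final state: only tokens minted for this service are accepted.
        audiences = [service_audience]
    else:
        raise ValueError(f"unknown migration step: {step}")
    return {
        "apiVersion": "authentication.k8s.io/v1",
        "kind": "TokenReview",
        "spec": {"token": token, "audiences": audiences},
    }
```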

Second, if you have certain requirements, you can consider integrating with Kubernetes using the OpenID Connect Discovery standard, rather than the Kubernetes TokenReview API. This applies if instances of your service integrate with thousands of individual clusters, need to support high authentication rates, or aim to federate with many non-Kubernetes identity sources.

This approach has benefits and downsides. The benefits are:

  • You do not need to manage Kubernetes credentials for your service to authenticate to each federated cluster (in general, OpenID Discovery documents are served publicly).
  • Your service will cache the JWT validation keys for federated clusters, allowing you to authenticate clients even if kube-apiserver is down or overloaded in their clusters.
  • This cache also allows your service to handle higher call rates from clients, with lower latency, by taking the federated kube-apiservers off of the critical path for authentication.
  • Supporting OpenID Connect gives you the ability to federate with additional identity providers beyond Kubernetes clusters.

The downsides are:

  • You will need to operate a cache for the JWT validation keys for all federated clusters, including proper expiry of cached keys (clusters can change their keys without advance warning).
  • You lose some of the security benefits of the TokenReview API; in particular, you will likely not be able to validate the object binding claims.
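To give a feel for the key-cache bookkeeping mentioned above, here is a minimal sketch of a per-issuer cache with time-based expiry. The fetch callable (which would download the keys for an issuer), the default TTL, and the class name are all assumptions made for illustration; a production cache would also need to bound its size and handle fetch failures.

```python
import time
from typing import Callable, Dict, Tuple


class KeyCache:
    """Caches validation keys per issuer and refetches them after ttl_seconds."""

    def __init__(self, fetch: Callable[[str], dict], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, issuer: str) -> dict:
        now = self._clock()
        entry = self._entries.get(issuer)
        if entry is None or now - entry[0] >= self._ttl:
            entry = (now, self._fetch(issuer))
            self._entries[issuer] = entry
        return entry[1]

    def invalidate(self, issuer: str) -> None:
        """Drop cached keys, e.g. after seeing a token signed by an unknown key."""
        self._entries.pop(issuer, None)
```

Because clusters can change their keys without warning, calling invalidate and retrying once when a signature check fails against an unrecognized key is a reasonable complement to the time-based expiry.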

In general, if the TokenReview API can be made to work for your use case, you should prefer it; it's much simpler operationally, and sidesteps the deceptively difficult problem of properly acting as an OpenID Connect relying party.