Workload Identity


Workload Identity is the recommended way for workloads running on Google Kubernetes Engine (GKE) to access Google Cloud services securely and manageably.

For more information on how to enable and use Workload Identity in GKE, see Use Workload Identity.

You can use fleet workload identity to provide workload identity federation support for clusters registered in fleets, including Anthos clusters.

Terminology

This document distinguishes between Kubernetes service accounts and Identity and Access Management (IAM) service accounts.

Kubernetes service accounts
Kubernetes resources that provide an identity for processes running in your GKE pods.
IAM service accounts
Google Cloud resources that allow applications to make authorized calls to Google Cloud APIs.

What is Workload Identity?

Applications running on GKE might need access to Google Cloud APIs such as the Compute Engine API, the BigQuery Storage API, or Machine Learning APIs.

Workload Identity allows a Kubernetes service account in your GKE cluster to act as an IAM service account. Pods that use the configured Kubernetes service account automatically authenticate as the IAM service account when accessing Google Cloud APIs. Using Workload Identity allows you to assign distinct, fine-grained identities and authorization for each application in your cluster.

How Workload Identity works

When you enable Workload Identity on a cluster, GKE automatically creates a fixed workload identity pool for the cluster's Google Cloud project. A workload identity pool allows IAM to understand and trust Kubernetes service account credentials. The workload identity pool has the following format:

PROJECT_ID.svc.id.goog

GKE uses this pool for all clusters in the project that use Workload Identity.

When you configure a Kubernetes service account in a namespace to use Workload Identity, IAM authenticates the credentials using the following member name:

serviceAccount:PROJECT_ID.svc.id.goog[KUBERNETES_NAMESPACE/KUBERNETES_SERVICE_ACCOUNT]

In this member name:

  • PROJECT_ID: your Google Cloud project ID.
  • KUBERNETES_NAMESPACE: the namespace of the Kubernetes service account.
  • KUBERNETES_SERVICE_ACCOUNT: the name of the Kubernetes service account making the request.

The process of configuring Workload Identity includes using an IAM policy binding to bind the Kubernetes service account member name to an IAM service account that has the permissions your workloads need. Any Google Cloud API calls from workloads that use this Kubernetes service account are authenticated as the bound IAM service account.
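As a rough illustration of the member name and binding described above, the following Python sketch builds both as plain data. The project, namespace, and account names are hypothetical, and the role name roles/iam.workloadIdentityUser is an assumption that does not appear in this document. Confirm it against the setup guide before relying on it.

```python
def workload_identity_member(project_id: str, namespace: str, ksa_name: str) -> str:
    """Build the IAM member name for a Kubernetes service account."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


def workload_identity_binding(project_id: str, namespace: str, ksa_name: str) -> dict:
    """Build a policy binding for the IAM service account's allow policy.

    Assumption: the role below is the one used to let a Kubernetes service
    account act as an IAM service account; it is not named in this document.
    """
    return {
        "role": "roles/iam.workloadIdentityUser",
        "members": [workload_identity_member(project_id, namespace, ksa_name)],
    }


# Hypothetical values, for illustration only.
binding = workload_identity_binding("my-project", "backend", "back-ksa")
print(binding["members"][0])
```

The sketch only constructs the binding. Applying it to the IAM service account still requires your usual IAM tooling.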

Identity sameness

The member name that IAM uses to verify a Kubernetes service account with Workload Identity uses the following variables:

  • The Kubernetes service account name.
  • The namespace of the Kubernetes service account.
  • The Google Cloud project ID.

If your project has multiple clusters that have the same name and namespace for a Kubernetes service account, all the accounts resolve to the same member name. This common identity allows you to grant access to Google Cloud resources to the workload identity pool instead of individual clusters.

For example, consider the following diagram. Clusters A, B, and C belong to the same Google Cloud project, and therefore to the same workload identity pool. Applications in the backend namespace of both Cluster A and Cluster B can authenticate as the back IAM service account when accessing Google Cloud resources. IAM doesn't distinguish between the clusters making the calls.

[Figure: Identity sameness within a workload identity pool when accessing Google Cloud APIs with Workload Identity.]

This identity sameness also means that you must be able to trust every cluster in a specific workload identity pool. For example, if Cluster C in the previous example was owned by an untrusted team, they could create a backend namespace and access Google Cloud APIs using the back IAM service account, just like Cluster A and Cluster B.

To avoid untrusted access, place your clusters in separate projects to ensure that they get different workload identity pools, or ensure that the namespace names are distinct from each other to avoid a common member name.
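The identity sameness described in this section can be sketched in a few lines of Python. The cluster, project, namespace, and account names are hypothetical. The point of the sketch is that the cluster name never enters the member name, so only the project, the namespace, and the account name separate identities.

```python
from typing import NamedTuple


class KsaRef(NamedTuple):
    """A Kubernetes service account, located by project, cluster, and namespace."""
    project_id: str
    cluster: str
    namespace: str
    name: str


def member_name(ref: KsaRef) -> str:
    # The cluster is deliberately ignored: it is not part of the member name.
    return f"serviceAccount:{ref.project_id}.svc.id.goog[{ref.namespace}/{ref.name}]"


# Hypothetical accounts, for illustration only.
in_cluster_a = KsaRef("my-project", "cluster-a", "backend", "back-ksa")
in_cluster_b = KsaRef("my-project", "cluster-b", "backend", "back-ksa")
in_other_project = KsaRef("other-project", "cluster-a", "backend", "back-ksa")

same_pool = member_name(in_cluster_a) == member_name(in_cluster_b)
across_projects = member_name(in_cluster_a) == member_name(in_other_project)
print(same_pool, across_projects)  # True False
```

Moving a cluster to a separate project changes the pool prefix, which is why that mitigation yields a distinct member name.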

Understanding the GKE metadata server

Every node in a GKE cluster with Workload Identity enabled stores its metadata on the GKE metadata server. The GKE metadata server implements the subset of Compute Engine metadata server endpoints that Kubernetes workloads require.

The GKE metadata server runs as a DaemonSet, with one Pod on every Linux node or a native Windows service on every Windows node in the cluster. The metadata server intercepts HTTP requests to http://metadata.google.internal (169.254.169.254:80). For example, the GET /computeMetadata/v1/instance/service-accounts/default/token request retrieves a token for the IAM service account that the Pod is configured to impersonate. Traffic to the GKE metadata server never leaves the VM instance that hosts the Pod.
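The token request mentioned above might look like the following Python sketch, which uses only the standard library. Two details are assumptions not stated in this document: the Metadata-Flavor: Google request header, and the access_token field of the JSON response. Both follow Compute Engine metadata server conventions. The fetch function works only from inside a Pod that can reach the metadata server.

```python
import json
import urllib.request

METADATA_ROOT = "http://metadata.google.internal/computeMetadata/v1"


def build_token_request() -> urllib.request.Request:
    """Build the GET request for the default service account's access token."""
    return urllib.request.Request(
        f"{METADATA_ROOT}/instance/service-accounts/default/token",
        # Assumption: the metadata server expects this header on every request.
        headers={"Metadata-Flavor": "Google"},
    )


def fetch_access_token(timeout: float = 5.0) -> str:
    """Send the request and return the OAuth 2.0 access token string."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        payload = json.load(resp)
    # Assumption: the response is a JSON object with an "access_token" field.
    return payload["access_token"]
```

In practice, Google Cloud client libraries typically perform this exchange for you, so a direct call like this is mostly useful for debugging.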

The following tables describe the subset of Compute Engine metadata server endpoints available with the GKE metadata server. For a full list of endpoints available in the Compute Engine metadata server, see Default VM metadata values.

Instance metadata

Instance metadata is stored under the following directory. For a full list of Compute Engine instance metadata entries, see VM instance metadata.

http://metadata.google.internal/computeMetadata/v1/instance/

Entries and their descriptions:

hostname
The hostname of your node.

id
The unique ID of your node.

service-accounts/
A directory of service accounts associated with the node. For each service account, the following information is available:

  • aliases
  • email: the service account email address.
  • identity: a JSON Web Token (JWT) unique to the node. You must include the audience parameter in your request. For example, ?audience=http://www.example.com.
  • scopes: the access scopes assigned to the service account.
  • token: the OAuth 2.0 access token to authenticate your workloads.
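Because the identity entry requires an audience parameter, it can help to build the URL programmatically so that the audience is URL-encoded. The following Python sketch is one way to do that. The helper name is hypothetical, and the example audience is the one used in this document.

```python
from urllib.parse import urlencode

INSTANCE_ROOT = "http://metadata.google.internal/computeMetadata/v1/instance"


def identity_token_url(audience: str, service_account: str = "default") -> str:
    """Build the URL of the identity (JWT) entry for a service account."""
    query = urlencode({"audience": audience})  # percent-encodes ':' and '/'
    return f"{INSTANCE_ROOT}/service-accounts/{service_account}/identity?{query}"


url = identity_token_url("http://www.example.com")
print(url)
```

The sketch only constructs the URL. Sending the request is the same as for any other metadata entry.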

Instance attributes

Instance attributes are stored under the following directory. For a full list of Compute Engine instance attribute entries, see Instance attributes.

http://metadata.google.internal/computeMetadata/v1/instance/attributes/

Entries and their descriptions:

cluster-location
The Compute Engine zone or region of your cluster.

cluster-name
The name of your GKE cluster.

cluster-uid
The UID of your GKE cluster.

Project metadata

Cluster project metadata is stored under the following directory. For a full list of Compute Engine project metadata entries, see Project metadata.

http://metadata.google.internal/computeMetadata/v1/project/

Entries and their descriptions:

project-id
Your Google Cloud project ID.

numeric-project-id
Your Google Cloud project number.
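As a sketch of how a workload might read the instance attributes and project metadata entries listed above, the following Python example collects them into one dictionary. The function names and dictionary keys are hypothetical, and the Metadata-Flavor: Google header is an assumption based on Compute Engine metadata server conventions. The fetch function is injectable so that the logic can be exercised outside a cluster.

```python
import urllib.request
from typing import Callable, Dict

METADATA_ROOT = "http://metadata.google.internal/computeMetadata/v1"

# Result key -> metadata path, taken from the entries listed above.
ENTRIES = {
    "cluster_location": "instance/attributes/cluster-location",
    "cluster_name": "instance/attributes/cluster-name",
    "cluster_uid": "instance/attributes/cluster-uid",
    "project_id": "project/project-id",
    "project_number": "project/numeric-project-id",
}


def http_fetch(path: str) -> str:
    """Read one metadata entry as text; works only where the server is reachable."""
    req = urllib.request.Request(
        f"{METADATA_ROOT}/{path}",
        headers={"Metadata-Flavor": "Google"},  # assumed to be required
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.read().decode("utf-8")


def describe_cluster(fetch: Callable[[str], str] = http_fetch) -> Dict[str, str]:
    """Collect the cluster and project entries using the given fetch function."""
    return {key: fetch(path) for key, path in ENTRIES.items()}
```

Inside a Pod, calling describe_cluster() with no arguments would query the metadata server. In a test, pass a stub function that returns canned values.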

Alternatives to Workload Identity

You can use one of the following alternatives to Workload Identity to access Google Cloud APIs from GKE.

  • Export service account keys and store them as Kubernetes Secrets. Google service account keys do not expire and require manual rotation. An exported key can widen the scope of a security breach: if the key is stolen and the theft goes undetected, an attacker can use it to authenticate as that service account until you notice and manually revoke the key.

  • Use the Compute Engine default service account of your nodes. You can run node pools as any IAM service account in your project. If you do not specify a service account during node pool creation, GKE uses the Compute Engine default service account for the project. A node's service account is shared by all workloads deployed on that node. This can result in over-provisioning of permissions, which violates the principle of least privilege and is inappropriate for multi-tenant clusters.

What's next