Using Workload Identity

This document shows you how to enable and configure Workload Identity on your Google Kubernetes Engine (GKE) clusters. Workload Identity allows workloads in your GKE clusters to impersonate Identity and Access Management (IAM) service accounts to access Google Cloud services.

To learn more about how Workload Identity works, see Workload Identity.

Limitations

  • GKE creates a fixed workload identity pool for each Google Cloud project, with the format PROJECT_ID.svc.id.goog.

  • Workload Identity replaces the need to use Metadata Concealment. The sensitive metadata protected by Metadata Concealment is also protected by Workload Identity.

  • When GKE enables the GKE metadata server on a node pool, Pods can no longer access the Compute Engine metadata server. Instead, the GKE metadata server intercepts requests made from these Pods to metadata endpoints, with the exception of Pods running on the host network.

  • Workload Identity can't be used by Pods running on the host network. Requests made from these Pods to metadata endpoints are routed to the Compute Engine metadata server.

  • The GKE metadata server takes a few seconds to start accepting requests on a newly created Pod. Therefore, attempts to authenticate using Workload Identity within the first few seconds of a Pod's life might fail. Retrying the call will resolve the problem. See the Troubleshooting section for more details.

  • GKE built-in logging and monitoring agents continue to use the node's service account.

  • Workload Identity requires manual setup for Cloud Run for Anthos to continue releasing request metrics.

  • Workload Identity installs ip-masq-agent if the cluster is created without the --disable-default-snat flag.

  • Workload Identity sets a limit of 200 connections to the GKE metadata server for each node to avoid memory issues. You may experience timeouts if your nodes exceed this limit.

Before you begin

Before you start, make sure you have performed the following tasks:

  • Ensure that you have enabled the Google Kubernetes Engine API.
  • Ensure that you have installed the Cloud SDK.
  • Set up default gcloud command-line tool settings for your project by using one of the following methods:
    • Use gcloud init, if you want to be walked through setting project defaults.
    • Use gcloud config, to individually set your project ID, zone, and region.

    gcloud init

    1. Run gcloud init and follow the directions:

      gcloud init

      If you are using SSH on a remote server, use the --console-only flag to prevent the command from launching a browser:

      gcloud init --console-only
    2. Follow the instructions to authorize the gcloud tool to use your Google Cloud account.
    3. Create a new configuration or select an existing one.
    4. Choose a Google Cloud project.
    5. Choose a default Compute Engine zone.
    6. Choose a default Compute Engine region.

    gcloud config

    1. Set your default project ID:
      gcloud config set project PROJECT_ID
    2. Set your default Compute Engine region (for example, us-central1):
      gcloud config set compute/region COMPUTE_REGION
    3. Set your default Compute Engine zone (for example, us-central1-c):
      gcloud config set compute/zone COMPUTE_ZONE
    4. Update gcloud to the latest version:
      gcloud components update

    By setting default locations, you can avoid errors in the gcloud tool like the following: One of [--zone, --region] must be supplied: Please specify location.

  • Ensure that you have enabled the IAM Service Account Credentials API.


  • Ensure that you have the following IAM roles:

    • roles/container.admin
    • roles/iam.serviceAccountAdmin

Enabling Workload Identity on a cluster

You can enable Workload Identity on a new or existing GKE Standard cluster by using the gcloud tool. By default, Workload Identity is enabled on GKE Autopilot clusters.

  1. Ensure that you have enabled the IAM Service Account Credentials API.


  2. To create a new cluster with Workload Identity enabled, use the following command:

    gcloud container clusters create CLUSTER_NAME \
        --workload-pool=PROJECT_ID.svc.id.goog
    

    Replace the following:

    • CLUSTER_NAME: the name of your cluster.
    • PROJECT_ID: the ID of your Google Cloud project.
  3. To enable Workload Identity on an existing cluster, modify the cluster with the following command:

    gcloud container clusters update CLUSTER_NAME \
        --workload-pool=PROJECT_ID.svc.id.goog
    

    Existing node pools are unaffected; new node pools default to --workload-metadata=GKE_METADATA.

Migrate applications to Workload Identity

Select the migration strategy that is ideal for your environment. Node pools can be migrated in place or you can create new node pools with Workload Identity enabled. We recommend creating new node pools if you also need to modify your application to be compatible with this feature.

Option 1: Node pool creation

Add a new node pool to the cluster with Workload Identity enabled and manually migrate workloads to that pool. This succeeds only if Workload Identity is enabled on the cluster.

gcloud container node-pools create NODEPOOL_NAME \
    --cluster=CLUSTER_NAME \
    --workload-metadata=GKE_METADATA

If a cluster has Workload Identity enabled, you can selectively disable it on a specific node pool by explicitly specifying --workload-metadata=GCE_METADATA. See Protecting cluster metadata for more information.

Option 2: Node pool modification

Modify an existing node pool to use the GKE metadata server. This update succeeds only if Workload Identity is enabled on the cluster. Modifying the node pool immediately enables Workload Identity for workloads deployed to the node pool. This change prevents workloads from using the Compute Engine service account and must be carefully rolled out.

gcloud container node-pools update NODEPOOL_NAME \
    --cluster=CLUSTER_NAME \
    --workload-metadata=GKE_METADATA
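
Before migrating, it can help to see which node pools still use the Compute Engine metadata server. The following is a minimal sketch, not an official tool: pools_needing_migration is a hypothetical helper name, and the wiring in the comment assumes that a multi-field value() format in gcloud emits tab-separated columns.

```shell
# Reads "NAME<TAB>MODE" lines on stdin and prints the names of node pools
# whose metadata mode is anything other than GKE_METADATA.
# An empty mode is treated as "not migrated yet".
pools_needing_migration() {
  awk -F '\t' '$1 != "" && $2 != "GKE_METADATA" { print $1 }'
}

# Example wiring (requires gcloud and an authenticated session):
#   gcloud container node-pools list --cluster=CLUSTER_NAME \
#       --format="value(name,config.workloadMetadataConfig.mode)" \
#     | pools_needing_migration
```

Each name printed is a candidate for either of the two options above.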

Authenticating to Google Cloud

This section explains how an application can authenticate to Google Cloud using Workload Identity. Assign a Kubernetes service account to the application and configure the Kubernetes service account to act as an IAM service account.

  1. Configure kubectl to communicate with the cluster:

    gcloud container clusters get-credentials CLUSTER_NAME
    

    Replace CLUSTER_NAME with the name of the cluster you created in the previous step.

  2. Like most other resources, Kubernetes service accounts live in a namespace. Create the namespace to use for the Kubernetes service account.

    kubectl create namespace K8S_NAMESPACE
    
  3. Create the Kubernetes service account to use for your application:

    kubectl create serviceaccount KSA_NAME \
        --namespace K8S_NAMESPACE
    

    Replace the following:

    • KSA_NAME: the name of your new Kubernetes service account.
    • K8S_NAMESPACE: the name of the Kubernetes namespace you created in the previous step.
  4. Configure your application to use the Kubernetes service account:

    spec:
      serviceAccountName: KSA_NAME
    
  5. Create an IAM service account for your application or use an existing IAM service account instead. You can use any IAM service account in any project in your organization. For Config Connector, apply the IAMServiceAccount object for your selected service account.

    gcloud

    To create a new IAM service account using the gcloud tool, run the following command.

    gcloud iam service-accounts create GSA_NAME
    

    Replace GSA_NAME with the name of the new IAM service account.

    Config Connector

    To use a new or existing IAM service account with Config Connector, apply the following configuration file.

    Note: This step requires Config Connector. Follow the installation instructions to install Config Connector on your cluster.

    apiVersion: iam.cnrm.cloud.google.com/v1beta1
    kind: IAMServiceAccount
    metadata:
      name: [GSA_NAME]
    spec:
      displayName: [GSA_NAME]

    To deploy this manifest, download it to your machine as service-account.yaml.

    Use kubectl to apply the manifest:

    kubectl apply -f service-account.yaml
    

    For information on authorizing IAM service accounts to access Google Cloud APIs, see Understanding service accounts.

  6. Ensure that your IAM service account has the IAM roles you need. You can grant additional roles using the following command:

    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member "serviceAccount:GSA_NAME@PROJECT_ID.iam.gserviceaccount.com" \
        --role "ROLE_NAME"
    

    Replace the following:

    • PROJECT_ID: your Google Cloud project ID.
    • GSA_NAME: the name of your IAM service account.
    • ROLE_NAME: the IAM role to assign to your service account, like roles/spanner.viewer.
  7. Allow the Kubernetes service account to impersonate the IAM service account by creating an IAM policy binding between the two.

    gcloud

    gcloud iam service-accounts add-iam-policy-binding GSA_NAME@PROJECT_ID.iam.gserviceaccount.com \
        --role roles/iam.workloadIdentityUser \
        --member "serviceAccount:PROJECT_ID.svc.id.goog[K8S_NAMESPACE/KSA_NAME]"
    

    Config Connector

    Note: This step requires Config Connector. Follow the installation instructions to install Config Connector on your cluster.

    apiVersion: iam.cnrm.cloud.google.com/v1beta1
    kind: IAMPolicy
    metadata:
      name: iampolicy-workload-identity-sample
    spec:
      resourceRef:
        apiVersion: iam.cnrm.cloud.google.com/v1beta1
        kind: IAMServiceAccount
        name: [GSA_NAME]
      bindings:
        - role: roles/iam.workloadIdentityUser
          members:
            - serviceAccount:[PROJECT_ID].svc.id.goog[[K8S_NAMESPACE]/[KSA_NAME]]

    To deploy this manifest, download it to your machine as policy-binding.yaml. Replace GSA_NAME, PROJECT_ID, K8S_NAMESPACE, and KSA_NAME with the values for your environment. Then, run:

    kubectl apply -f policy-binding.yaml
    
  8. Add the iam.gke.io/gcp-service-account=GSA_NAME@PROJECT_ID.iam.gserviceaccount.com annotation to the Kubernetes service account, using the email address of the IAM service account.

    kubectl

    kubectl annotate serviceaccount KSA_NAME \
        --namespace K8S_NAMESPACE \
        iam.gke.io/gcp-service-account=GSA_NAME@PROJECT_ID.iam.gserviceaccount.com
    

    yaml

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      annotations:
        iam.gke.io/gcp-service-account: GSA_NAME@PROJECT_ID.iam.gserviceaccount.com
      name: KSA_NAME
      namespace: K8S_NAMESPACE
    
  9. Verify the service accounts are configured correctly by creating a Pod with the Kubernetes service account that runs the OS-specific container image, then connect to it with an interactive session.

    For Linux nodes

    1. Create a Pod with the Kubernetes service account that runs the cloud-sdk container image:

      Save the following configuration as wi-test.yaml:

      apiVersion: v1
      kind: Pod
      metadata:
        name: workload-identity-test
        namespace: K8S_NAMESPACE
      spec:
        containers:
        - image: google/cloud-sdk:slim
          name: workload-identity-test
          command: ["sleep","infinity"]
        serviceAccountName: KSA_NAME
      

      Create a Pod:

      kubectl apply -f wi-test.yaml
      

      Open an interactive session in the Pod:

      kubectl exec -it workload-identity-test \
          --namespace K8S_NAMESPACE -- /bin/bash
      

      The google/cloud-sdk image includes the gcloud command-line tool, which is a convenient way to consume Google Cloud APIs. It might take some time to download the image.

    2. You are now connected to an interactive shell within the created Pod. Run the following command inside the Pod:

      curl -H "Metadata-Flavor: Google" http://169.254.169.254/computeMetadata/v1/instance/service-accounts/
      

      If the service accounts are correctly configured, the IAM service account email address is listed as the active (and only) identity. This demonstrates that by default, the Pod uses the IAM service account's authority when calling Google Cloud APIs.

    For Windows Server nodes

    1. Create a Pod with the Kubernetes service account that runs the servercore container image:

      apiVersion: v1
      kind: Pod
      metadata:
        name: workload-identity-test
        namespace: K8S_NAMESPACE
      spec:
        containers:
        - image: IMAGE_NAME
          name: workload-identity-test
          command: ["powershell.exe", "sleep", "3600"]
        serviceAccountName: KSA_NAME
        nodeSelector:
          kubernetes.io/os: windows
          cloud.google.com/gke-os-distribution: windows_ltsc
      

      Replace IMAGE_NAME with one of the following container servercore image values:

      Windows Server node image: Container servercore image

      • WINDOWS_LTSC, WINDOWS_LTSC_CONTAINERD: mcr.microsoft.com/windows/servercore:ltsc2019
      • WINDOWS_SAC, WINDOWS_SAC_CONTAINERD: Check the version mapping between the GKE node version and the Windows SAC version. For Windows Server version 1909, specify mcr.microsoft.com/windows/servercore:1909; otherwise, specify mcr.microsoft.com/windows/servercore:20H2.

      Open an interactive session in the Pod:

      kubectl exec -it workload-identity-test \
          --namespace K8S_NAMESPACE -- powershell
      
    2. You are now connected to an interactive shell within the created Pod. Run the following powershell command inside the Pod:

      Invoke-WebRequest -Headers @{"Metadata-Flavor"="Google"} -Uri http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/email -UseBasicParsing
      

      If the service accounts are correctly configured, the IAM service account email address is listed as the active (and only) identity. This demonstrates that by default, the Pod uses the IAM service account's authority when calling Google Cloud APIs.
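
The binding and annotation steps above depend on two strings being spelled exactly right: the workload identity pool member and the IAM service account email. The sketch below builds both and prints the corresponding commands for review instead of running them. The function names are hypothetical, and it assumes the IAM service account lives in the same project as the cluster.

```shell
# Builds the IAM member string for a Kubernetes service account.
wi_member() {  # args: PROJECT_ID K8S_NAMESPACE KSA_NAME
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$1" "$2" "$3"
}

# Builds the IAM service account email address.
gsa_email() {  # args: GSA_NAME PROJECT_ID
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Prints (does not run) the policy binding and annotation commands.
print_wi_setup() {  # args: PROJECT_ID K8S_NAMESPACE KSA_NAME GSA_NAME
  email=$(gsa_email "$4" "$1")
  member=$(wi_member "$1" "$2" "$3")
  printf 'gcloud iam service-accounts add-iam-policy-binding %s --role roles/iam.workloadIdentityUser --member "%s"\n' \
    "$email" "$member"
  printf 'kubectl annotate serviceaccount %s --namespace %s iam.gke.io/gcp-service-account=%s\n' \
    "$3" "$2" "$email"
}
```

For example, print_wi_setup PROJECT_ID K8S_NAMESPACE KSA_NAME GSA_NAME prints two commands that can be inspected and then pasted into a shell.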

Using Workload Identity from your code

Authenticating to Google Cloud services from your code is the same process as authenticating using the Compute Engine metadata server. When you use Workload Identity, your requests to the instance metadata server are routed to the GKE metadata server. Existing code that authenticates using the instance metadata server (like code using the Google Cloud client libraries) should work without modification.
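
Client libraries handle this flow for you, but a hand-rolled request can make the mechanism easier to see. The following sketch fetches an access token from the metadata server and uses it as a bearer token. The function names are hypothetical, the sed-based parsing assumes the token response is a single-line JSON object with an access_token field, and the Cloud Storage bucket listing in the usage comment is only an illustrative API call that requires the IAM service account to hold a suitable role.

```shell
METADATA_TOKEN_URL='http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token'

# Extracts the access_token field from single-line JSON on stdin.
extract_access_token() {
  sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Fetches an access token from the metadata server.
# Only works from inside a Pod on a node pool that uses GKE_METADATA.
fetch_access_token() {
  curl -s -f -H 'Metadata-Flavor: Google' "$METADATA_TOKEN_URL" | extract_access_token
}

# Usage inside a Pod:
#   TOKEN=$(fetch_access_token)
#   curl -s -H "Authorization: Bearer ${TOKEN}" \
#       "https://storage.googleapis.com/storage/v1/b?project=PROJECT_ID"
```

In production code, prefer the Google Cloud client libraries, which also take care of token refresh.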

Revoking access

  1. Revoke access to the IAM service account:

    gcloud

    gcloud iam service-accounts remove-iam-policy-binding GSA_NAME@GSA_PROJECT_ID.iam.gserviceaccount.com \
        --role roles/iam.workloadIdentityUser \
        --member "serviceAccount:PROJECT_ID.svc.id.goog[K8S_NAMESPACE/KSA_NAME]"
    

    Replace the following:

    • PROJECT_ID: the project ID of the GKE cluster.
    • K8S_NAMESPACE: the name of the Kubernetes namespace where your Kubernetes service account is located.
    • KSA_NAME: the name of the Kubernetes service account that will have its access revoked.
    • GSA_NAME: the name of the IAM service account.
    • GSA_PROJECT_ID: the project ID of the IAM service account.

    Config Connector

    If you used Config Connector to create the service account, delete the service account with kubectl.

    kubectl delete -f service-account.yaml
    

    This action requires iam.serviceAccounts.setIamPolicy permissions on the service account.

    It can take up to 30 minutes for cached tokens to expire. You can check whether the cached tokens have expired with this command:

    gcloud auth list
    

    The cached tokens have expired if the output of that command no longer includes GSA_NAME@GSA_PROJECT_ID.iam.gserviceaccount.com.

  2. Remove the annotation from the Kubernetes service account. This step is optional because access has been revoked by IAM.

    kubectl annotate serviceaccount KSA_NAME \
        --namespace K8S_NAMESPACE iam.gke.io/gcp-service-account-
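
To confirm that the binding is gone, you can inspect the IAM service account's policy. The sketch below is deliberately simple: member_in_policy is a hypothetical helper that only checks whether the member string appears anywhere in the policy text, without checking which role it is bound to.

```shell
# Succeeds (exit 0) if the policy text on stdin mentions the given member.
# The match is a fixed-string match, so the brackets are taken literally.
member_in_policy() {  # arg: MEMBER
  grep -F -q -- "$1"
}

# Example wiring (requires gcloud):
#   gcloud iam service-accounts get-iam-policy \
#       GSA_NAME@GSA_PROJECT_ID.iam.gserviceaccount.com \
#     | member_in_policy "serviceAccount:PROJECT_ID.svc.id.goog[K8S_NAMESPACE/KSA_NAME]" \
#     && echo "binding still present" || echo "binding removed"
```

A removed binding does not invalidate tokens that are already cached, so access can persist until those tokens expire.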
    

Troubleshooting

Pod can't authenticate to Google Cloud

If your application can't authenticate to Google Cloud, ensure these settings are configured properly:

  1. Ensure that you have enabled the IAM Service Account Credentials API in the project containing the GKE cluster.


  2. Ensure that Workload Identity is enabled on the cluster by verifying that it has a workload identity pool set:

    gcloud container clusters describe CLUSTER_NAME \
        --format="value(workloadIdentityConfig.workloadPool)"
    

    If you haven't already specified a default zone or region for gcloud, you may also need to specify a --region or --zone flag when running this command.

  3. Ensure that GKE metadata server (GKE_METADATA) is configured on the node pool where your application is running:

    gcloud container node-pools describe NODEPOOL_NAME \
        --cluster=CLUSTER_NAME \
        --format="value(config.workloadMetadataConfig.mode)"
    
  4. Ensure that the Kubernetes service account is annotated correctly:

    kubectl describe serviceaccount \
        --namespace K8S_NAMESPACE KSA_NAME
    

    There should be an annotation in the following format:

    iam.gke.io/gcp-service-account: GSA_NAME@PROJECT_ID.iam.gserviceaccount.com
    
  5. Ensure the IAM service account is configured correctly:

    gcloud iam service-accounts get-iam-policy \
        GSA_NAME@PROJECT_ID.iam.gserviceaccount.com
    

    Verify that there is a binding in the following format:

    - members:
      - serviceAccount:PROJECT_ID.svc.id.goog[K8S_NAMESPACE/KSA_NAME]
      role: roles/iam.workloadIdentityUser
    
  6. If you have a cluster network policy, ensure that it allows egress to 127.0.0.1/32 on port 988 for clusters running GKE versions prior to 1.21.0-gke.1000, or to 169.254.169.252/32 on port 988 for clusters running GKE version 1.21.0-gke.1000 and later.

    kubectl describe networkpolicy NETWORK_POLICY_NAME
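
Checks 2 through 4 above can be combined into a single comparison. The following sketch takes the observed values as arguments so that the logic stays separate from the gcloud and kubectl calls. check_wi_settings is a hypothetical helper, and the kubectl jsonpath expression in the comment (with escaped dots in the annotation key) is an assumption about how to read the annotation value.

```shell
# Compares observed settings with the expected values, prints one line per
# check, and returns non-zero if any check fails.
# args: PROJECT_ID OBSERVED_POOL OBSERVED_MODE OBSERVED_ANNOTATION EXPECTED_GSA_EMAIL
check_wi_settings() {
  project=$1; pool=$2; mode=$3; annotation=$4; expected_email=$5
  status=0

  if [ "$pool" = "${project}.svc.id.goog" ]; then
    echo "ok: workload pool is ${pool}"
  else
    echo "FAIL: workload pool is '${pool}', expected '${project}.svc.id.goog'"
    status=1
  fi

  if [ "$mode" = "GKE_METADATA" ]; then
    echo "ok: node pool uses GKE_METADATA"
  else
    echo "FAIL: node pool metadata mode is '${mode}'"
    status=1
  fi

  if [ "$annotation" = "$expected_email" ]; then
    echo "ok: annotation matches ${expected_email}"
  else
    echo "FAIL: annotation is '${annotation}', expected '${expected_email}'"
    status=1
  fi

  return "$status"
}

# Example wiring:
#   POOL=$(gcloud container clusters describe CLUSTER_NAME \
#       --format="value(workloadIdentityConfig.workloadPool)")
#   MODE=$(gcloud container node-pools describe NODEPOOL_NAME --cluster=CLUSTER_NAME \
#       --format="value(config.workloadMetadataConfig.mode)")
#   ANNOTATION=$(kubectl get serviceaccount KSA_NAME --namespace K8S_NAMESPACE \
#       -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}')
#   check_wi_settings PROJECT_ID "$POOL" "$MODE" "$ANNOTATION" \
#       GSA_NAME@PROJECT_ID.iam.gserviceaccount.com
```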
    

Timeout errors at Pod start up

The GKE metadata server needs a few seconds before it can start accepting requests on a newly created Pod. Therefore, attempts to authenticate using Workload Identity within the first few seconds of a Pod's life may fail for applications and Google Cloud client libraries configured with a short timeout.

If you encounter timeout errors, you can change the application code to wait a few seconds and retry. Alternatively, you can deploy an initContainer that waits until the GKE metadata server is ready before running the Pod's main container.

Here is a Pod with an example initContainer:

apiVersion: v1
kind: Pod
metadata:
  name: pod-with-initcontainer
spec:
  serviceAccountName: KSA_NAME
  initContainers:
  - image: gcr.io/google.com/cloudsdktool/cloud-sdk:326.0.0-alpine
    name: workload-identity-initcontainer
    command:
    - '/bin/bash'
    - '-c'
    - |
      curl -s -H 'Metadata-Flavor: Google' 'http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token' --retry 30 --retry-connrefused --retry-max-time 30 > /dev/null || exit 1
  containers:
  - image: gcr.io/your-project/your-image
    name: your-main-application-container
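
For the other approach mentioned above, retrying from the application side, a generic retry wrapper may be enough. This is a minimal sketch rather than a recommended implementation: retry is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary values to tune for your workload.

```shell
# Runs a command until it succeeds, up to MAX_ATTEMPTS times, sleeping
# DELAY_SECONDS between attempts. Returns 1 if every attempt fails.
retry() {  # args: MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
  max=$1
  delay=$2
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage inside a container entrypoint:
#   retry 10 2 curl -s -f -o /dev/null -H 'Metadata-Flavor: Google' \
#       'http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token'
```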

Workload Identity fails due to control plane unavailability

The metadata server is not able to return the Workload Identity when the cluster control plane is unavailable. Calls to the metadata server return status code 500.

The log entry might appear similar to the following in the Logs Explorer:

dial tcp 35.232.136.58:443: connect: connection refused

While the control plane is unavailable, Workload Identity is expected to be unavailable as well.

On zonal clusters, the control plane might be unavailable during cluster maintenance such as rotating IPs, upgrading control plane VMs, or resizing clusters or node pools. See Choosing a regional or zonal control plane to learn about control plane availability. Switching to a regional cluster eliminates this issue.

Workload Identity fails

If the GKE metadata server is blocked for any reason, Workload Identity will fail.

If you are using Istio, you should add the following application-level annotation to all workloads that use Workload Identity:

"traffic.sidecar.istio.io/excludeOutboundIPRanges=169.254.169.254/32"

Alternatively, you can change the global.proxy.excludeIPRanges Istio ConfigMap key to do the same thing.
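
One way to apply the annotation to an existing workload is to patch its Pod template. The sketch below builds the patch body. istio_exclude_patch is a hypothetical helper, and DEPLOYMENT_NAME in the usage comment is a placeholder that does not appear elsewhere in this document.

```shell
# Builds a patch that adds the Istio exclusion annotation to a workload's
# Pod template. The argument is the IP range to exclude from the sidecar.
istio_exclude_patch() {  # arg: IP_RANGE
  printf '{"spec":{"template":{"metadata":{"annotations":{"traffic.sidecar.istio.io/excludeOutboundIPRanges":"%s"}}}}}\n' "$1"
}

# Usage (requires kubectl):
#   kubectl patch deployment DEPLOYMENT_NAME --namespace K8S_NAMESPACE \
#       --patch "$(istio_exclude_patch 169.254.169.254/32)"
```

Because the patch changes the Pod template, applying it rolls out new Pods for the workload.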

Disabling Workload Identity on a cluster

You can only disable Workload Identity on GKE Standard clusters.

  1. Disable Workload Identity on each node pool:

    gcloud container node-pools update NODEPOOL_NAME \
        --cluster=CLUSTER_NAME \
        --workload-metadata=GCE_METADATA
    

    Repeat this command for every node pool in the cluster.

  2. Disable Workload Identity in the cluster:

    gcloud container clusters update CLUSTER_NAME --disable-workload-identity
    

    This action requires container.clusters.update permissions on the cluster.
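
For clusters with many node pools, the two steps above can be generated rather than typed by hand. The following sketch prints the commands instead of running them, so that they can be reviewed first. print_disable_commands is a hypothetical helper name.

```shell
# Reads node pool names on stdin (one per line) and prints one node pool
# update command per pool, followed by the cluster-level disable command.
print_disable_commands() {  # arg: CLUSTER_NAME
  while IFS= read -r pool; do
    [ -n "$pool" ] || continue
    printf 'gcloud container node-pools update %s --cluster=%s --workload-metadata=GCE_METADATA\n' \
      "$pool" "$1"
  done
  printf 'gcloud container clusters update %s --disable-workload-identity\n' "$1"
}

# Example wiring (requires gcloud):
#   gcloud container node-pools list --cluster=CLUSTER_NAME --format="value(name)" \
#     | print_disable_commands CLUSTER_NAME
```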

Disabling Workload Identity in your organization

From a security perspective, Workload Identity allows GKE to assert Kubernetes service account identities that can be authenticated and authorized to Google Cloud resources. Administrators who have taken actions to isolate workloads from Google Cloud resources, like disabling service account creation or disabling service account key creation, might also want to disable Workload Identity for your organization.

See these instructions for disabling Workload Identity for your organization.

What's next