Set up and use cluster keyless mode

This document describes how to set up and use the cluster keyless mode capability for Google Distributed Cloud (software only) on bare metal. Instead of service account keys, keyless mode uses short-lived tokens and Workload Identity Federation to create and secure your clusters. The short-lived credentials for the service account are in the form of OAuth 2.0 access tokens. The access tokens expire after 1 hour by default, except for image pull tokens, which expire after 12 hours.

Keyless mode is available for version 1.30 and higher clusters only.

By contrast, keyed mode, the standard method for creating and securing clusters, uses downloaded service account keys. When you create a self-managed (admin, hybrid, or standalone) cluster, you specify the path to the downloaded keys. The keys are then stored as Secrets in the cluster and any managed user clusters. By default, service account keys don't expire and are a security risk if not managed correctly.

Keyless mode provides two main benefits over using service account keys:

  • Improved security: Service account keys are a security risk if not managed correctly. OAuth 2.0 tokens and Workload Identity Federation are considered best practice alternatives to service account keys. For more information on service account tokens, see Short-lived service account credentials. For more information about Workload Identity Federation, see Workload Identity Federation.

  • Reduced maintenance: Service account keys require more maintenance. Regularly rotating and securing these keys can be a significant administrative burden.

While this feature is in Preview, there are some known limitations.

Before you begin

In the following sections, you create service accounts and grant roles needed for keyless mode. The setup instructions in this document aren't a replacement for the instructions in Set up Google Cloud resources, they are required in addition to the standard Google Distributed Cloud software-only installation prerequisites. The service accounts required for keyless mode are similar to the service accounts described in Set up Google Cloud resources, but they are uniquely named, so they don't interfere with clusters that use the default service account keys.

This page is for Admins and architects and Operators who set up, monitor, and manage the lifecycle of the underlying tech infrastructure. To learn more about common roles and example tasks that we reference in Google Cloud content, see Common GKE Enterprise user roles and tasks.

The following table describes the service accounts required for keyless mode:

Service account Purpose Roles
ADMIN_SA You use this service account to generate tokens. Each token has the privileges associated with the service account roles. roles/gkehub.admin
roles/logging.admin
roles/monitoring.admin
roles/monitoring.dashboardEditor
roles/iam.serviceAccountAdmin
roles/iam.serviceAccountTokenCreator
baremetal-controller Connect Agent uses this service account to maintain a connection between your cluster and Google Cloud and to register your clusters with a fleet. This service account also refreshes tokens for the baremetal-gcr service account. roles/gkehub.admin
roles/monitoring.dashboardEditor
roles/serviceusage.serviceUsageViewer
baremetal-cloud-ops Stackdriver Agent uses this service account to export logs and metrics from clusters to Cloud Logging and Cloud Monitoring. roles/logging.logWriter
roles/monitoring.metricWriter
roles/stackdriver.resourceMetadata.writer
roles/opsconfigmonitoring.resourceMetadata.writer
roles/monitoring.dashboardEditor
roles/monitoring.viewer
roles/serviceusage.serviceUsageViewer
roles/kubernetesmetadata.publisher
baremetal-gcr Google Distributed Cloud uses this service account to download container images from Container Registry. None

Create and configure service accounts for keyless mode

The following sections contain instructions to create the required service accounts and grant them the necessary roles for keyless mode. For a list of the service accounts and their required roles, see the table in the preceding section.

Create service accounts

To create the service accounts for keyless mode, use the following steps:

  1. On your admin workstation, log in to Google Cloud CLI:

    gcloud auth login
    
  2. Optionally, create the administrative service account:

    The name of the ADMIN_SA service account is arbitrary. You can even use an existing service account, if it has the roles identified in the table in the previous section, but that's not recommended because it goes against the principle of least privilege.

    gcloud iam service-accounts create ADMIN_SA \
        --project=PROJECT_ID
    

    Replace PROJECT_ID with the ID of your Google Cloud project.

  3. Create the standard service accounts for the keyless feature:

    The standard service accounts for the keyless feature have predetermined names that can be customized, if you want.

    gcloud iam service-accounts create baremetal-controller \
        --project=PROJECT_ID
    
    gcloud iam service-accounts create baremetal-cloud-ops \
        --project=PROJECT_ID
    
    gcloud iam service-accounts create baremetal-gcr \
        --project=PROJECT_ID
    

    Replace PROJECT_ID with the ID of your Google Cloud project.

Add Identity and Access Management policy bindings for service accounts

  1. Add IAM policy bindings for required roles for the ADMIN_SA service account:

    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:ADMIN_SA@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/gkehub.admin
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:ADMIN_SA@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/logging.admin
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:ADMIN_SA@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/monitoring.admin
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:ADMIN_SA@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/monitoring.dashboardEditor
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:ADMIN_SA@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/iam.serviceAccountAdmin
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:ADMIN_SA@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/iam.serviceAccountTokenCreator
    
  2. Add IAM policy bindings for required roles for the baremetal-controller service account:

    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:baremetal-controller@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/gkehub.admin
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:baremetal-controller@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/monitoring.dashboardEditor
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:baremetal-controller@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/serviceusage.serviceUsageViewer
    
  3. Add IAM policy bindings for required roles for the baremetal-cloud-ops service account:

    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:baremetal-cloud-ops@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/logging.logWriter
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:baremetal-cloud-ops@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/monitoring.dashboardEditor
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:baremetal-cloud-ops@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/monitoring.metricWriter
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:baremetal-cloud-ops@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/opsconfigmonitoring.resourceMetadata.writer
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:baremetal-cloud-ops@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/stackdriver.resourceMetadata.writer
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:baremetal-cloud-ops@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/monitoring.viewer
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:baremetal-cloud-ops@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/serviceusage.serviceUsageViewer
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:baremetal-cloud-ops@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/kubernetesmetadata.publisher
    
  4. Grant the baremetal-controller service account the ability to generate access tokens on behalf of the baremetal-gcr service account:

    gcloud iam service-accounts add-iam-policy-binding \
        baremetal-gcr@PROJECT_ID.iam.gserviceaccount.com \
        --member=serviceAccount:baremetal-controller@PROJECT_ID.iam.gserviceaccount.com \
        --role=roles/iam.serviceAccountTokenCreator
    

Configure Workload Identity Federation for your clusters

To provide Google Cloud access with Workload Identity Federation for GKE, you create an IAM allow policy that grants access on a specific Google Cloud resource to a principal that corresponds to your application's identity. In this case, Workload Identity Federation grants access to specific operators in the cluster. For more information on Workload Identity Federation for GKE, see Workload Identity Federation in the IAM documentation.

Add IAM policy bindings for the cluster operator

The following commands grant the anthos-cluster-operator Kubernetes service account the ability to impersonate the baremetal-controller service account and interact with Google Cloud resources on behalf of the cluster:

  1. For each keyless cluster (or planned keyless cluster), including the bootstrap cluster, grant anthos-cluster-operator in the cluster the ability to impersonate the baremetal-controller service account:

    In the following command, the principalSet consists of the workload identity pool and a Kubernetes service account, anthos-cluster-operator, in the kube-system namespace.

    gcloud iam service-accounts add-iam-policy-binding \
        baremetal-controller@PROJECT_ID.iam.gserviceaccount.com \
        --member=principalSet://iam.googleapis.com/projects/PROJECT_NUM/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/attribute.fleetclusteridentity/projects/PROJECT_ID/locations/REGION/memberships/CLUSTER_NAME/ns/kube-system/sa/anthos-cluster-operator \
        --role=roles/iam.workloadIdentityUser \
        --project=PROJECT_ID
    

    Replace the following:

    • PROJECT_NUM: the automatically generated unique identifier for your project.

    • REGION: the fleet membership location for your cluster, which is global, by default. For more information, see Fleet membership location.

    • CLUSTER_NAME: the name of the cluster. By default, the bootstrap cluster name is bmctl-MACHINE_NAME.

  2. Verify the policy bindings for the baremetal-controller service account:

    gcloud iam service-accounts get-iam-policy \
        baremetal-controller@PROJECT_ID.iam.gserviceaccount.com
    

    The response should look similar to the following:

    bindings:
    - members:
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/bmctl-admin-ws/kube-system/anthos-cluster-operator
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/admin-cluster/kube-system/anthos-cluster-operator
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/user-cluster/kube-system/anthos-cluster-operator
      role: roles/iam.workloadIdentityUser
    etag: BwYoN3QLig0=
    version: 1
    

Add IAM policy bindings for the Google Cloud Observability operators

The following commands grant the following Google Cloud Observability Kubernetes service accounts the ability to impersonate the baremetal-cloud-ops service account and interact with Google Cloud resources on behalf of the cluster:

  • cloud-audit-logging
  • gke-metrics-agent
  • kubestore-collector
  • metadata-agent
  • stackdriver-log-forwarder
  1. For each keyless cluster (or planned keyless cluster), including the bootstrap cluster, grant the Google Cloud Observability operators in the cluster the ability to impersonate the baremetal-cloud-ops service account:

    In each of the following commands, the principalSet consists of the workload identity pool and a Kubernetes service account, such as cloud-audit-logging, in the kube-system namespace.

    gcloud iam service-accounts add-iam-policy-binding \
        baremetal-cloud-ops@PROJECT_ID.iam.gserviceaccount.com \
        --member=principalSet://iam.googleapis.com/projects/PROJECT_NUM/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/attribute.fleetclusteridentity/projects/PROJECT_ID/locations/REGION/memberships/CLUSTER_NAME/ns/kube-system/sa/cloud-audit-logging \
        --role=roles/iam.workloadIdentityUser \
        --project=PROJECT_ID
    
    gcloud iam service-accounts add-iam-policy-binding \
        baremetal-cloud-ops@PROJECT_ID.iam.gserviceaccount.com \
        --member=principalSet://iam.googleapis.com/projects/PROJECT_NUM/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/attribute.fleetclusteridentity/projects/PROJECT_ID/locations/REGION/memberships/CLUSTER_NAME/ns/kube-system/sa/gke-metrics-agent \
        --role=roles/iam.workloadIdentityUser \
        --project=PROJECT_ID
    
    gcloud iam service-accounts add-iam-policy-binding \
        baremetal-cloud-ops@PROJECT_ID.iam.gserviceaccount.com \
        --member=principalSet://iam.googleapis.com/projects/PROJECT_NUM/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/attribute.fleetclusteridentity/projects/PROJECT_ID/locations/REGION/memberships/CLUSTER_NAME/ns/kube-system/sa/kubestore-collector \
        --role=roles/iam.workloadIdentityUser \
        --project=PROJECT_ID
    
    gcloud iam service-accounts add-iam-policy-binding \
        baremetal-cloud-ops@PROJECT_ID.iam.gserviceaccount.com \
        --member=principalSet://iam.googleapis.com/projects/PROJECT_NUM/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/attribute.fleetclusteridentity/projects/PROJECT_ID/locations/REGION/memberships/CLUSTER_NAME/ns/kube-system/sa/metadata-agent \
        --role=roles/iam.workloadIdentityUser \
        --project=PROJECT_ID
    
    gcloud iam service-accounts add-iam-policy-binding \
        baremetal-cloud-ops@PROJECT_ID.iam.gserviceaccount.com \
        --member=principalSet://iam.googleapis.com/projects/PROJECT_NUM/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/attribute.fleetclusteridentity/projects/PROJECT_ID/locations/REGION/memberships/CLUSTER_NAME/ns/kube-system/sa/stackdriver-log-forwarder \
        --role=roles/iam.workloadIdentityUser \
        --project=PROJECT_ID
    
  2. Verify the policy bindings for the baremetal-cloud-ops service account:

    gcloud iam service-accounts get-iam-policy \
        baremetal-cloud-ops@PROJECT_ID.iam.gserviceaccount.com
    

    The response should look similar to the following:

    bindings:
    - members:
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/bmctl-admin-ws/kube-system/cloud-audit-logging
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/bmctl-admin-ws/kube-system/gke-metrics-agent
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/bmctl-admin-ws/kube-system/kubestore-collector
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/bmctl-admin-ws/kube-system/metadata-agent
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/bmctl-admin-ws/kube-system/stackdriver-log-forwarder
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/admin-cluster/kube-system/cloud-audit-logging
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/admin-cluster/kube-system/gke-metrics-agent
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/admin-cluster/kube-system/kubestore-collector
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/admin-cluster/kube-system/metadata-agent
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/admin-cluster/kube-system/stackdriver-log-forwarder
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/user-cluster/kube-system/cloud-audit-logging
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/user-cluster/kube-system/gke-metrics-agent
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/user-cluster/kube-system/kubestore-collector
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/user-cluster/kube-system/metadata-agent
      - principalSet://iam.googleapis.com/projects/112233445566/locations/global/workloadIdentityPools/my-project.svc.id.goog/attribute.fleetclusteridentity/user-cluster/kube-system/stackdriver-log-forwarder
      role: roles/iam.workloadIdentityUser
    etag: BwYhT4gL-dY=
    version: 1
    

Cluster configuration

The most obvious cluster configuration difference for clusters that use keyless mode is that you don't specify paths to downloaded service account keys.

  1. When you fill in your cluster settings in the configuration file, leave the service account key paths in the credential section blank as shown in the following example:

    gcrKeyPath:
    sshPrivateKeyPath: /home/USERNAME/.ssh/id_rsa
    gkeConnectAgentServiceAccountKeyPath:
    gkeConnectRegisterServiceAccountKeyPath:
    cloudOperationsServiceAccountKeyPath:
    ---
    apiVersion: v1
    kind: Namespace
    metadata:
      name: cluster-CLUSTER_NAME
    ---
    apiVersion: baremetal.cluster.gke.io/v1
    kind: Cluster
    metadata:
      name: CLUSTER_NAME
      namespace: cluster-CLUSTER_NAME
    spec:
      type: admin
      profile: default
      anthosBareMetalVersion: 1.30.0-gke.1930
      ...
    
  2. Optionally, set custom names for the keyless mode service accounts:

    Specifying custom names lets you use existing service accounts. By specifying the same custom name for more than one service account, you can consolidate to fewer service accounts.

    apiVersion: baremetal.cluster.gke.io/v1
    kind: Cluster
    metadata:
      name: CLUSTER_NAME
      namespace: cluster-CLUSTER_NAME
      annotations:
        baremetal.cluster.gke.io/controller-service-account: "CUSTOM_CONTROLLER_GSA"
        baremetal.cluster.gke.io/cloud-ops-service-account: "CUSTOM_CLOUD_OPS_GSA"
        baremetal.cluster.gke.io/gcr-service-account: "CUSTOM_GCR_GSA"
    spec:
      type: admin
      profile: default
      anthosBareMetalVersion: 1.30.0-gke.1930
        ...
    

Cluster operation

When you're ready to create, upgrade, or delete a keyless mode cluster, use the following steps:

  1. Log in to Google Cloud CLI:

    gcloud auth login
    
  2. On your admin workstation, create and download a key for the ADMIN_SA service account:

    gcloud iam service-accounts keys create TMP_KEY_FILE_PATH \
        --iam-account=ADMIN_SA@PROJECT_ID.iam.gserviceaccount.com
    

    Replace TMP_KEY_FILE_PATH with the path, including the filename, of the downloaded key file.

  3. Authorize access to Google Cloud with the ADMIN_SA service account:

    gcloud auth activate-service-account ADMIN_SA@PROJECT_ID.iam.gserviceaccount.com \
        --key-file=TMP_KEY_FILE_PATH
    
  4. Delete the downloaded JSON key file:

    rm TMP_KEY_FILE_PATH
    
  5. On your admin workstation, create a GCP_ACCESS_TOKEN environment variable with the value of an access token created by the ADMIN_SA service account:

    export GCP_ACCESS_TOKEN=$(gcloud auth print-access-token \
        --impersonate-service-account=ADMIN_SA@PROJECT_ID.iam.gserviceaccount.com)
    

    By default, the access token has a lifetime of 1 hour.

  6. Verify that the token is generated by the ADMIN_SA service account with the correct expiration:

    curl "https://oauth2.googleapis.com/tokeninfo?access_token=$GCP_ACCESS_TOKEN"
    

    The response should include lines that look similar to the following:

    ...
    "expires_in": "3582",
    "email": "ADMIN_SA@PROJECT_ID.iam.gserviceaccount.com)",
    ...
    

    The expiration value is in seconds and should be less than 3600, indicating that the token expires in less than an hour.

  7. Run a bmctl command to create, upgrade, or delete a keyless mode cluster:

    If bmctl detects that the GCP_ACCESS_TOKEN environment variable has been set, it performs token validation. If the token is valid, bmctl uses it for keyless mode cluster operations.

    For clusters that use keyless mode, the following commands require that the GCP_ACCESS_TOKEN environment variable be set to a valid, active access token:

    • bmctl create cluster -c CLUSTER_NAME
    • bmctl reset cluster -c CLUSTER_NAME
    • bmctl upgrade cluster -c CLUSTER_NAME

Limitations

While keyless mode is in Preview, the following features aren't support for use with clusters running in keyless mode:

  • Using a proxy server
  • VPC Service Controls

What's next