Best practices for working with service accounts

Service accounts represent non-human users. They're intended for scenarios where a workload, such as a custom application, needs to access resources or perform actions without end-user involvement.

Service accounts differ from normal user accounts in multiple ways:

  • They don't have a password and can't be used for browser-based sign-in.
  • They're created and managed as a resource that belongs to a Google Cloud project. In contrast, users are managed in a Cloud Identity or Google Workspace account.
  • They're specific to Google Cloud. In contrast, the users managed in Cloud Identity or Google Workspace work across a multitude of Google products and services.
  • They're both a resource and a principal:
    • As a principal, a service account can be granted access to resources, like a Cloud Storage bucket.
    • As a resource, a service account can be accessed and possibly impersonated by other principals, like a user or group.

Although service accounts are a useful tool, there are several ways in which a service account can be abused:

  • Privilege escalation: A bad actor might gain access to resources they otherwise wouldn't have access to by impersonating the service account.
  • Spoofing: A bad actor might use service account impersonation to obscure their identity.
  • Non-repudiation: A bad actor might conceal their identity and actions by using a service account to carry out operations on their behalf. In some cases, it might not be possible to trace these actions to the bad actor.
  • Information disclosure: A bad actor might derive information about your infrastructure, applications, or processes from the existence of certain service accounts.

To help secure service accounts, consider their dual nature:

  • Because a service account is a principal, you must limit its privileges to reduce the potential harm that can be done by a compromised service account.
  • Because a service account is a resource, you must protect it from being compromised.

This guide presents best practices for managing, using, and securing service accounts.

Choose when to use service accounts

Service accounts can be used for many different purposes, but they aren't always the best choice. The following section provides guidance about when to use service accounts, and when to avoid them.

Use service accounts for unattended scenarios

Not every application interacts with human users. Instead, an application might run in the background unattended. Examples of unattended applications include batch jobs, worker processes that dispatch messages read from a queue, and resource-monitoring agents.

Whenever an unattended application needs to access a resource, like a Cloud Storage bucket, it must act on its own behalf, not on behalf of any end user. To act on its own behalf, an application needs its own identity that's unrelated to any end-user identity.

To equip an application with its own identity, create a service account for the application, and grant the service account access to the resources that the application needs. By letting an application use its own service account, you help ensure that the application can work without user interaction. You also help ensure that any resource access initiated by the application can be attributed back to that application.

Use service accounts to perform a transition between principals

An application that interacts with end users can use Google Sign-In to authenticate those users, but it isn't obliged to do so: Instead, an application might rely on a third-party identity provider or might implement a custom authentication scheme to authenticate its users.

If an application uses third-party or custom identities and needs to access a resource, such as a BigQuery dataset or a Cloud Storage bucket, it must perform a transition between principals. Because Google Cloud APIs don't recognize third-party or custom identities, the application can't propagate the end-user's identity to BigQuery or Cloud Storage. Instead, the application has to perform the access by using a different Google identity.

To let an application perform a transition between principals, create a service account for the application and grant it access to the resources that the application needs to access. Whenever the application needs to access a Google Cloud resource, make sure that it has authenticated the end user first, then let it use the service account to access the resource.

Transitions between principals can limit the usefulness of Cloud Audit Logs of affected Google Cloud resources: Because the application uses a service account to access resources, Cloud Audit Logs might not contain a clear indication of whether or not an action was done on behalf of a particular end user. To help ensure non-repudiability, extend your application to write a custom log record whenever a transition between principals occurs. That way, you can trace back which end user triggered a resource access.
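The custom log record described above could look roughly like the following. This is a minimal sketch, not a prescribed format: the field names, the example identities, and the helper principal_transition_record are all hypothetical, and a real application would send the record to its own logging pipeline.

```python
import json
from datetime import datetime, timezone


def principal_transition_record(end_user, service_account, resource, action):
    """Build a structured record that links an end user to a service account access."""
    return {
        "event": "principal_transition",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "end_user": end_user,                # identity from the third-party or custom scheme
        "service_account": service_account,  # Google identity used for the actual access
        "resource": resource,
        "action": action,
    }


# Hypothetical values, for illustration only.
record = principal_transition_record(
    end_user="alice@idp.example.com",
    service_account="travel-app@my-project.iam.gserviceaccount.com",
    resource="//storage.googleapis.com/projects/_/buckets/expense-reports",
    action="storage.objects.get",
)

# One JSON object per line is straightforward to ingest and query later.
print(json.dumps(record, sort_keys=True))
```

Writing the record before the resource access, rather than after, helps ensure that a failed or interrupted access still leaves a trace.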

An application might require access to sensitive or personal user data. Examples of such data include a user's mailbox or calendar, documents stored on Drive, or a BigQuery dataset that contains sensitive data.

Using a service account to access user data can be appropriate if the application performs unattended background tasks, such as indexing or data loss prevention (DLP) scans, or if the end user hasn't authenticated with a Google identity. In all other scenarios where an application acts on an end user's behalf, it's best to avoid using service accounts.

Instead of using a service account to access user data (possibly performing a principal transition), use the OAuth consent flow to request the end user's consent. Then let the application act under the end user's identity. By using OAuth instead of a service account, you help ensure that:

  • Users can review which resources they're about to grant the application access to, and can explicitly express or deny their consent.
  • Users can revoke their consent on their My Account page at any time.
  • You don't need a service account that has unfettered access to all users' data.

By letting the application use end-user credentials, you defer permission checks to Google Cloud APIs. This approach limits the risk of accidentally exposing data that the user shouldn't be allowed to access because of a coding error (confused deputy problem).

Don't use service accounts during development

During your daily work, you might use tools such as the Google Cloud CLI, gsutil, or Terraform. Don't use a service account to run these tools. Instead, let them use your credentials by first running gcloud auth login (for the gcloud CLI and gsutil) or gcloud auth application-default login (for Terraform and other third-party tools).

You can use a similar approach for developing and debugging applications that you plan to deploy on Google Cloud. Once deployed, the application might require a service account—but if you run it on your local workstation, you can let it use your personal credentials.

To help ensure that your application supports both personal credentials and service account credentials, use the Cloud Client Libraries to find credentials automatically.
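The following sketch models, in simplified form, the order in which the client libraries look for credentials. It is an illustration only: the function adc_source and its parameters are hypothetical, and the real lookup is performed inside the client libraries (in Python, for example, through google.auth.default()), so application code normally doesn't implement it.

```python
def adc_source(env, user_credentials_file_exists, has_attached_service_account):
    """Return the credential source a client library would pick (simplified model)."""
    # 1. An explicitly configured credential file takes precedence.
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        return "file named by GOOGLE_APPLICATION_CREDENTIALS"
    # 2. Personal credentials created by `gcloud auth application-default login`.
    if user_credentials_file_exists:
        return "user credentials from gcloud auth application-default login"
    # 3. The service account attached to the compute resource.
    if has_attached_service_account:
        return "attached service account"
    return "no credentials found"


# On a developer workstation after running gcloud auth application-default login:
workstation = adc_source({}, user_credentials_file_exists=True,
                         has_attached_service_account=False)

# On a deployed compute resource with an attached service account:
deployed = adc_source({}, user_credentials_file_exists=False,
                      has_attached_service_account=True)

print(workstation)
print(deployed)
```

Because the same application code resolves to different credentials in each environment, no code change is needed between local debugging and deployment.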

Choose how to authenticate with service accounts

The typical way for users to authenticate to Google Cloud is to sign in by using a username and password or to use single sign-on (SSO). Service accounts do not have a password and can't use SSO. Instead, service accounts support a different set of authentication methods. The following section provides best practices for selecting an authentication method.

How to use service accounts

Use attached service accounts when possible

To allow an application deployed on Google Cloud to use a service account, attach the service account to the underlying compute resource. By attaching the service account, you enable the application to obtain tokens for the service account and to use the tokens to access Google Cloud APIs and resources.

To obtain access tokens in the application, use the client libraries if possible. The client libraries automatically detect if the application is running on a compute resource with an attached service account.

In situations when using the client libraries isn't practical, adjust your application to programmatically obtain tokens from the metadata server. Compute resources that support access to the metadata server include:

For a full list of compute resources that let you attach a service account, see Managing service account impersonation.
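As a rough illustration of obtaining a token from the metadata server without the client libraries, the sketch below uses only the Python standard library. The helper names are hypothetical, and fetch_access_token only works when it runs on a compute resource that has an attached service account.

```python
import json
import urllib.request

METADATA_TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def build_token_request():
    """Build the request for an access token of the attached service account."""
    # The metadata server rejects requests that lack this header.
    return urllib.request.Request(
        METADATA_TOKEN_URL, headers={"Metadata-Flavor": "Google"}
    )


def fetch_access_token(timeout=2.0):
    """Return (access_token, expires_in_seconds); call only on Google Cloud."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as response:
        payload = json.load(response)
    return payload["access_token"], payload["expires_in"]
```

Tokens obtained this way are short-lived, so an application should request a fresh token when the current one is close to expiry rather than caching it indefinitely.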

Use Workload Identity to attach service accounts to Kubernetes pods

If you use Google Kubernetes Engine, then you might be running a combination of different applications on a single GKE cluster. The individual applications are likely to differ in which resources and APIs they need to access.

If you attach a service account to a GKE cluster or one of its node pools, then, by default, all pods running on the cluster or node pool can impersonate the service account. Sharing a single service account across different applications makes it difficult to assign the correct set of privileges to the service account:

  • If you only grant access to resources that all applications require, then some applications might fail to work because they lack access to certain resources.
  • If you grant access to all resources that any particular application needs, then you might be over-granting access.

A better approach to manage access to resources in a GKE environment is to use Workload Identity:

  1. Don't attach service accounts to GKE clusters or node pools.
  2. Create a dedicated service account for each Kubernetes pod that requires access to Google APIs or resources.
  3. Create a Kubernetes service account for each Kubernetes pod that requires access to Google APIs or resources and attach it to the pod.
  4. Use Workload Identity to create a mapping between the service accounts and their corresponding Kubernetes service accounts.
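The mapping in step 4 consists of two parts: an IAM binding on the service account and an annotation on the Kubernetes service account. The sketch below only builds those two pieces of data so that their shape is visible; the function name and all example values are hypothetical, and applying them to IAM and to the cluster is left out.

```python
def workload_identity_mapping(project_id, namespace, ksa_name, gsa_email):
    """Return the IAM binding and the Kubernetes annotation that link the two accounts."""
    # Member that represents the Kubernetes service account in the workload pool.
    member = f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"

    # Binding to add to the allow policy of the (Google) service account.
    iam_binding = {"role": "roles/iam.workloadIdentityUser", "members": [member]}

    # Annotation to set on the Kubernetes service account.
    ksa_annotation = {"iam.gke.io/gcp-service-account": gsa_email}
    return iam_binding, ksa_annotation


binding, annotation = workload_identity_mapping(
    project_id="my-project",
    namespace="billing",
    ksa_name="invoice-worker",
    gsa_email="wi-invoice-worker@my-project.iam.gserviceaccount.com",
)
print(binding)
print(annotation)
```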

Use workload identity federation to let applications running on-premises or on other cloud providers use a service account

If your application runs on-premises or on another cloud provider, then you can't attach a service account to the underlying compute resources. However, the application might have access to environment-specific credentials such as:

  • AWS temporary credentials
  • Azure Active Directory access tokens
  • OpenID access tokens or ID tokens issued by an on-premises identity provider like Active Directory Federation Services (AD FS) or KeyCloak

If your application has access to one of these credentials and needs access to Google Cloud APIs or resources, use workload identity federation.

Workload identity federation lets you create a one-way trust relationship between a Google Cloud project and an external identity provider. Once you've established the trust, applications can use credentials issued by the trusted identity provider to impersonate a service account by following a three-step process:

  1. Obtain a credential from the trusted identity provider, for example an OpenID Connect ID token.
  2. Use the Security Token Service (STS) API to exchange the credential for a short-lived Google STS token.
  3. Use the STS token to authenticate to the IAM Service Account Credentials API and obtain short-lived Google access tokens for a service account.
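Steps 2 and 3 can be sketched as two HTTP requests. The code below only constructs the requests and doesn't send them; the helper names, pool, provider, and service account values are hypothetical, and the subject token type shown assumes that the external credential is an OpenID Connect ID token. In practice, the client libraries can perform this exchange for you.

```python
import json
import urllib.request

STS_URL = "https://sts.googleapis.com/v1/token"
CLOUD_PLATFORM_SCOPE = "https://www.googleapis.com/auth/cloud-platform"


def build_sts_exchange(id_token, project_number, pool_id, provider_id):
    """Step 2: exchange the external credential for a short-lived STS token."""
    audience = (
        f"//iam.googleapis.com/projects/{project_number}/locations/global/"
        f"workloadIdentityPools/{pool_id}/providers/{provider_id}"
    )
    body = {
        "grantType": "urn:ietf:params:oauth:grant-type:token-exchange",
        "audience": audience,
        "scope": CLOUD_PLATFORM_SCOPE,
        "requestedTokenType": "urn:ietf:params:oauth:token-type:access_token",
        "subjectToken": id_token,
        "subjectTokenType": "urn:ietf:params:oauth:token-type:jwt",
    }
    return urllib.request.Request(
        STS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_impersonation_request(sts_token, service_account_email):
    """Step 3: use the STS token to request an access token for the service account."""
    url = (
        "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/"
        f"{service_account_email}:generateAccessToken"
    )
    body = {"scope": [CLOUD_PLATFORM_SCOPE]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {sts_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Neither request contains a long-lived secret: the only inputs are the external credential and the short-lived STS token derived from it.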

By using workload identity federation, you can let applications use the authentication mechanisms that the external environment provides and you avoid having to store and manage service account keys.

Use the IAM Credentials API to broker credentials

Some applications only require access to certain resources at specific times or under specific circumstances. For example:

  • An application might require access to configuration data during startup, but might not require that access once it's initialized.
  • A supervisor application might periodically start background jobs where each job has different access requirements.

In such scenarios, using a single service account and granting it access to all resources goes against the principle of least privilege: At any point in time, the application is likely to have access to more resources than it actually needs.

To help ensure that the different parts of your application only have access to the resources they need, use the IAM Credentials API to broker short-lived credentials:

  • Create dedicated service accounts for each part of the application or use case and only grant the service account access to the necessary resources.
  • Create another service account that acts as the supervisor. Grant the supervisor service account the Service Account Token Creator role on the other service accounts so that it can request short-lived access tokens for these service accounts.
  • Split your application so that one part of the application serves as the token broker, and only let this part of the application use the supervisor service account.
  • Use the token broker to issue short-lived access tokens for the dedicated service accounts to the other parts of the application.
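A token broker along these lines could map each part of the application to its dedicated service account and request a token with a short lifetime. The following is a sketch under stated assumptions: the task names, service account emails, default lifetime, and the function broker_token_request are hypothetical, and the code only builds the generateAccessToken request instead of sending it with the supervisor's credentials.

```python
CLOUD_PLATFORM_SCOPE = "https://www.googleapis.com/auth/cloud-platform"

# Hypothetical mapping from application part to its dedicated service account.
TASK_SERVICE_ACCOUNTS = {
    "load-config": "config-reader@my-project.iam.gserviceaccount.com",
    "nightly-export": "exporter@my-project.iam.gserviceaccount.com",
}


def broker_token_request(task, lifetime_seconds=300):
    """Return (url, body) of the generateAccessToken call for the given task."""
    # Unknown tasks raise KeyError, so nothing is issued by default.
    service_account_email = TASK_SERVICE_ACCOUNTS[task]
    url = (
        "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/"
        f"{service_account_email}:generateAccessToken"
    )
    # A short lifetime limits how long a leaked token remains useful.
    body = {"scope": [CLOUD_PLATFORM_SCOPE], "lifetime": f"{lifetime_seconds}s"}
    return url, body


url, body = broker_token_request("load-config")
print(url)
print(body)
```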

Use service account keys if there's no viable alternative

Occasionally, you might encounter a situation where attaching a service account isn't possible, and neither Workload Identity nor workload identity federation is a viable option. For example, one of your on-premises applications might need access to Google Cloud resources, but your on-premises identity provider isn't compatible with OpenID Connect and therefore can't be used for workload identity federation.

In situations where other authentication approaches aren't viable, create a service account key for the application. A service account key lets an application authenticate as a service account, similar to how a user might authenticate with a username and password. Service account keys are a type of secret and must be protected from unauthorized access. It's best to store them in a secure location, like a key vault, and to rotate them frequently.

Manage service accounts

Service accounts differ from normal user accounts, not only in how they're used, but also in how they must be managed. The following sections provide best practices for managing service accounts.

Manage service accounts as resources

Regular user accounts are typically managed according to an organization's joiner-mover-leaver processes: When an employee joins, a new user account is created for them. When they move departments, their user account is updated. And when they leave the company, their user account is suspended or deleted.

In contrast, service accounts aren't associated with any particular employee. Instead, it's best to think of service accounts as resources that belong to—or are part of—another resource, such as a particular VM instance or an application.

To effectively manage service accounts, don't look at service accounts in isolation. Instead, consider them in the context of the resource they're associated with and manage the service account and its associated resource as one unit: Apply the same processes, same lifecycle, and same diligence to the service account and its associated resource, and use the same tools to manage them.

Create single-purpose service accounts

Sharing a single service account across multiple applications can complicate the management of the service account:

  • The applications might have different life cycles. If an application is decommissioned, it might not be clear whether the service account can be decommissioned as well or whether it's still needed.
  • Over time, the access requirements of applications might diverge. If applications use the same service account, then you might need to grant the service account access to an increasing number of resources, which in turn increases the overall risk.
  • Cloud Audit Logs include the name of the service account that performed a change or accessed data, but they don't show the name of the application that used the service account. If multiple applications share a service account, you might not be able to trace activity back to the correct application.

To avoid these complications, create dedicated service accounts for each application and avoid using default service accounts.

Follow a naming and documentation convention

To help track the association between a service account and an application or resource, follow a naming convention when creating new service accounts:

  • Add a prefix to the service account email address that identifies how the account is used. For example:
    • vm- for service accounts attached to a VM instance.
    • wi- for service accounts used by Workload Identity.
    • wif- for service accounts used by workload identity federation.
    • onprem- for service accounts used by on-premises applications.
  • Embed the name of the application in the service account email address, for example: vm-travelexpenses@ if the VM runs a travel expenses application.
  • Use the description field to add a contact person, links to relevant documentation, or other notes.

Don't embed sensitive information or terms in the email address of a service account.
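A naming convention is easier to follow when it's enforced by tooling. The sketch below builds a service account ID from a prefix and an application name and rejects anything that doesn't fit. The helper service_account_id is hypothetical, and the pattern encodes an assumption about the ID format (6 to 30 characters, lowercase letters, digits, and hyphens, starting with a letter), which you should verify against the current IAM documentation.

```python
import re

# Prefixes taken from the convention described above.
ALLOWED_PREFIXES = ("vm", "wi", "wif", "onprem")

# Assumed ID format: 6-30 characters, starts with a lowercase letter,
# ends with a letter or digit, and otherwise contains letters, digits, or hyphens.
SERVICE_ACCOUNT_ID = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def service_account_id(prefix, app_name):
    """Return an ID such as 'vm-travelexpenses', or raise ValueError."""
    if prefix not in ALLOWED_PREFIXES:
        raise ValueError(f"unknown prefix: {prefix!r}")
    account_id = f"{prefix}-{app_name}".lower()
    if not SERVICE_ACCOUNT_ID.fullmatch(account_id):
        raise ValueError(f"invalid service account ID: {account_id!r}")
    return account_id


print(service_account_id("vm", "travelexpenses"))
```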

Identify and disable unused service accounts

When a service account isn't used anymore, disable the service account. By disabling unused service accounts, you reduce the risk of the service accounts being abused for lateral movement or for privilege escalation by a bad actor.

For single-purpose service accounts that are associated with a particular resource, such as a VM instance, disable the service account as soon as the associated resource is disabled or deleted.

For service accounts that are used for multiple purposes or shared across multiple resources, it can be more difficult to identify whether the service account is still used. In these cases, use one of the tools for understanding service account usage to identify unused service accounts.

Disable unused service accounts before deleting them

If you delete a service account and then create a new service account with the same name, the new service account is assigned a different identity. As a result, none of the original IAM bindings apply to the new service account. In contrast, if you disable and re-enable a service account, all IAM bindings stay intact.

To avoid inadvertently losing IAM bindings, it's best to not delete service accounts immediately. Instead, disable a service account if it isn't needed anymore and only delete it after a certain period has elapsed.
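The disable-first approach can be expressed as a small decision rule. In this sketch, the 30-day grace period and the function next_action are hypothetical; choose a period that matches how long it would take your organization to notice that a disabled service account was still needed.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical grace period between disabling and deleting a service account.
GRACE_PERIOD = timedelta(days=30)


def next_action(disabled_at, now):
    """Decide what to do with an unused service account.

    disabled_at is None if the account is still enabled, otherwise the time
    at which it was disabled.
    """
    if disabled_at is None:
        return "disable"
    if now - disabled_at >= GRACE_PERIOD:
        return "delete"
    return "wait"


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(next_action(None, now))
print(next_action(datetime(2024, 5, 20, tzinfo=timezone.utc), now))
print(next_action(datetime(2024, 4, 1, tzinfo=timezone.utc), now))
```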

Never delete default service accounts such as the App Engine or Compute Engine default service account. These service accounts can't be recreated without disabling and re-enabling the respective API, which might break your existing deployment. If you don't use the default service accounts, disable them instead.

Limit service account privileges

Service accounts are principals and can be granted access to a resource like a regular user account.

Don't use automatic role grants for default service accounts

Some Google Cloud services create default service accounts when you first enable their API in a Google Cloud project. By default, these service accounts are granted the Editor role (roles/editor) on your Cloud project, which allows them to read and modify all resources in the Cloud project. The role is granted for your convenience, but isn't essential for the services to work: To access resources in your Cloud project, Google Cloud services use service agents, not the default service accounts.

To prevent default service accounts from automatically being granted the Editor role, apply the Disable Automatic IAM Grants for Default Service Accounts (constraints/iam.automaticIamGrantsForDefaultServiceAccounts) constraint to your organization. To apply the constraint to multiple Cloud projects, configure it on the folder or the organization node. Applying the constraint doesn't remove the Editor role from existing default service accounts.

If you apply this constraint, then default service accounts in new projects will not have any access to your Google Cloud resources. You must grant appropriate roles to the default service accounts so that they can access your resources.

Don't rely on access scopes when attaching a service account to a VM instance

When you attach a service account to a VM instance, you can specify one or more access scopes. Access scopes let you restrict which services the VM can access. The restrictions are applied in addition to allow policies.

Access scopes are coarse-grained. For example, by using the https://www.googleapis.com/auth/devstorage.read_only scope, you can restrict access to Cloud Storage read-only operations, but you can't restrict access to specific buckets. Therefore, access scopes aren't a suitable replacement for fine-grained allow policies.

Instead of relying on access scopes, create a dedicated service account and use fine-grained allow policies to restrict which resources the service account has access to.

Avoid using groups for granting service accounts access to resources

In an organization, it's common that multiple employees perform similar or overlapping job functions and therefore require similar access to resources. By using groups, you can take advantage of these similarities to reduce administrative overhead.

Service accounts are intended to be used by applications. It's rare that multiple applications perform the same function and therefore have similar or identical access requirements. Instead, applications tend to be unique and the resources they require access to are typically different for each application.

Using groups to grant service accounts access to resources can lead to a few bad outcomes:

  • A proliferation of groups, with each group containing only one or a few service accounts.
  • Permission creep: Over time, a group is granted access to an increasing number of resources although each member of the group only requires access to a subset of the resources.

Unless the purpose of a group is narrowly defined, it's best to avoid using groups. Instead, directly grant service accounts access to the resources they need.

Avoid using domain-wide delegation

Domain-wide delegation enables a service account to impersonate any user in a Cloud Identity or Google Workspace account. Domain-wide delegation enables a service account to perform certain administrative tasks in Google Workspace and Cloud Identity, or to access Google APIs that don't support service accounts from outside of Google Cloud.

Domain-wide delegation doesn't restrict a service account to impersonating a particular user, but allows it to impersonate any user in a Cloud Identity or Google Workspace account, including super-admins. Allowing a service account to use domain-wide delegation can therefore make the service account an attractive target for privilege escalation attacks.

Avoid using domain-wide delegation if you can accomplish your task directly with a service account or by using the OAuth consent flow.

If you can't avoid using domain-wide delegation, restrict the set of OAuth scopes that the service account can use. Although OAuth scopes don't restrict which users the service account can impersonate, they restrict the types of user data that the service account can access.

Use Credential Access Boundaries to downscope access tokens

Google access tokens are bearer tokens, which means that their use isn't tied to any particular application. If your application passes an access token to a different application, then that other application can use the token in the same way your application can. Similarly, if an access token is leaked to a bad actor, they can use the token to gain access.

Because access tokens are bearer tokens, you must protect them from being leaked or becoming visible to unauthorized parties. You can limit the potential damage a leaked access token can cause by restricting the resources it grants access to. This process is called downscoping.

Use Credential Access Boundaries to downscope access tokens whenever you pass an access token to a different application, or to a different component of your application. Set the access boundary so that the token grants enough access to the required resources, but no more.
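As an illustration, a Credential Access Boundary that limits a token to read access on a single Cloud Storage bucket could be built as follows. The function credential_access_boundary, the bucket name, and the object prefix are hypothetical; the code only constructs the boundary and doesn't perform the token exchange that applies it.

```python
def credential_access_boundary(bucket, role, object_prefix=None):
    """Build a boundary that limits a token to one bucket and one role."""
    rule = {
        "availableResource": f"//storage.googleapis.com/projects/_/buckets/{bucket}",
        # Upper bound on permissions, expressed as a role.
        "availablePermissions": [f"inRole:{role}"],
    }
    if object_prefix is not None:
        # Optionally narrow access further to objects under a name prefix.
        rule["availabilityCondition"] = {
            "expression": (
                "resource.name.startsWith("
                f"'projects/_/buckets/{bucket}/objects/{object_prefix}')"
            )
        }
    return {"accessBoundary": {"accessBoundaryRules": [rule]}}


boundary = credential_access_boundary(
    bucket="expense-reports",
    role="roles/storage.objectViewer",
    object_prefix="2024/",
)
print(boundary)
```

The boundary can only reduce what a token allows. The service account itself still needs the underlying access to the bucket.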

Use role recommendations to identify unused permissions

When you first deploy an application, you might be unsure about which roles and permissions the application really needs. As a result, you might grant the application's service account more permissions than it requires.

Similarly, an application's access requirements might evolve over time, and some of the roles and permissions you granted initially might not be needed.

Use role recommendations to identify which permissions an application is actually using, and which permissions might be unused. Adjust the allow policies of affected resources to help ensure that an application isn't granted more access than it actually needs.

Use lateral movement insights to limit lateral movement

Lateral movement is when a service account in one project has permission to impersonate a service account in another project. For example, a service account might have been created in project A, but have permissions to impersonate a service account in project B.

These permissions can result in a chain of impersonations across projects that gives principals unintended access to resources. For example, a principal could impersonate a service account in project A, and then use that service account to impersonate a service account in project B. If the service account in project B has permission to impersonate other service accounts in other projects in your organization, the principal could continue to use service account impersonation to move from project to project, gaining permissions as they go.
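To make the chain of impersonations concrete, the sketch below follows "can impersonate" relationships from one service account and lists the other projects that become reachable. The input data and function names are hypothetical, and the code assumes service account emails of the form NAME@PROJECT_ID.iam.gserviceaccount.com. It illustrates the problem and isn't a replacement for lateral movement insights.

```python
from collections import deque


def project_of(service_account_email):
    """Extract PROJECT_ID from NAME@PROJECT_ID.iam.gserviceaccount.com."""
    domain = service_account_email.split("@", 1)[1]
    return domain.split(".iam.gserviceaccount.com")[0]


def reachable_projects(start, can_impersonate):
    """Return other projects reachable from `start` through impersonation chains."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in can_impersonate.get(current, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted({project_of(sa) for sa in seen} - {project_of(start)})


# Hypothetical impersonation permissions: A can impersonate B, B can impersonate C.
edges = {
    "app@project-a.iam.gserviceaccount.com": ["etl@project-b.iam.gserviceaccount.com"],
    "etl@project-b.iam.gserviceaccount.com": ["admin@project-c.iam.gserviceaccount.com"],
}
print(reachable_projects("app@project-a.iam.gserviceaccount.com", edges))
```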

Recommender provides lateral movement insights to help you mitigate this issue. Lateral movement insights identify roles that allow a service account in one project to impersonate a service account in another project. To learn how to view and manage lateral movement insights directly, see Manage lateral movement insights.

Some lateral movement insights are associated with role recommendations. You can apply those recommendations to reduce lateral movement across your projects. To learn how, see Review and apply recommendations.

Protect against privilege-escalation threats

A service account that hasn't been granted any roles, does not have access to any resources, and isn't associated with any firewall rules is typically of limited value. After you grant a service account access to resources, the value of the service account increases: The service account becomes more useful to you, but it also becomes a more attractive target for privilege-escalation attacks.

As an example, consider a service account that has full access to a Cloud Storage bucket which contains sensitive information. In this situation, the service account is effectively as valuable as the Cloud Storage bucket itself: Instead of trying to access the bucket directly, a bad actor might attempt to take control of the service account. If that attempt is successful, the bad actor can escalate their privileges by impersonating the service account, which in turn gives them access to the sensitive information in the bucket.

Privilege-escalation techniques involving service accounts typically fall into these categories:

  • Direct impersonation: You might inadvertently grant a user permission to impersonate a service account or to create a service account key for a service account. If the service account is more privileged than the user itself, then the user can use that capability to escalate their privileges and gain access to resources they otherwise couldn't access.
  • Indirect impersonation: If a user can't directly impersonate a service account, they might be able to do so indirectly if the service account is used by a CI/CD pipeline, VM instance, or a different automation system they can access: If the system doesn't implement appropriate access restrictions, the user could let the system carry out operations they wouldn't be allowed to perform themselves, escalating their privileges.
  • Allow policy, group, or custom role modifications: A user who doesn't have access to a privileged service account might still have permission to modify the allow policies of the service account, enclosing Cloud project, or folder. The user could then extend one of these allow policies to grant themselves permission to (directly or indirectly) impersonate the service account.

The following sections provide best practices for protecting service accounts from privilege-escalation threats.

Avoid letting users impersonate service accounts that are more privileged than the users themselves

By impersonating a service account, a user gains access to some or all of the resources the service account can access. If the service account has more extensive access than the user, then it's effectively more privileged than the user.

Granting a user permission to impersonate a more privileged service account can be a way to deliberately let users temporarily elevate their privileges – similar to using the sudo tool on Linux, or using process elevation on Windows. Unless you're dealing with a scenario where such temporary elevation of privilege is necessary, it's best to avoid letting users impersonate a more privileged service account.

Permissions that enable a user to impersonate a service account include:

  • iam.serviceAccounts.getAccessToken
  • iam.serviceAccounts.getOpenIdToken
  • iam.serviceAccounts.actAs
  • iam.serviceAccounts.implicitDelegation
  • iam.serviceAccounts.signBlob
  • iam.serviceAccounts.signJwt
  • iam.serviceAccountKeys.create
  • deploymentmanager.deployments.create
  • cloudbuild.builds.create

Roles that contain some of these permissions include (but aren't limited to):

  • Owner (roles/owner)
  • Editor (roles/editor)
  • Service Account User (roles/iam.serviceAccountUser)
  • Service Account Token Creator (roles/iam.serviceAccountTokenCreator)
  • Service Account Key Admin (roles/iam.serviceAccountKeyAdmin)
  • Service Account Admin (roles/iam.serviceAccountAdmin)
  • Workload Identity User (roles/iam.workloadIdentityUser)
  • Deployment Manager Editor (roles/deploymentmanager.editor)
  • Cloud Build Editor (roles/cloudbuild.builds.editor)

Before you assign any of these roles to a user, ask yourself:

  • Which resources inside and outside the current Cloud project could the user gain access to by impersonating the service account?
  • Is this level of access justified?
  • Are there sufficient protections in place that control under which circumstances the user can impersonate the service account?

Don't assign the role if you can't answer all of these questions satisfactorily. Instead, consider giving the user a different, less privileged service account.
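When you review a custom role, a simple check against the permissions listed above can flag roles that enable impersonation. This is a minimal sketch: the function name and the example custom role are hypothetical, and the permission set only covers the permissions named in this guide.

```python
# Permissions from the list above that enable service account impersonation.
IMPERSONATION_PERMISSIONS = frozenset({
    "iam.serviceAccounts.getAccessToken",
    "iam.serviceAccounts.getOpenIdToken",
    "iam.serviceAccounts.actAs",
    "iam.serviceAccounts.implicitDelegation",
    "iam.serviceAccounts.signBlob",
    "iam.serviceAccounts.signJwt",
    "iam.serviceAccountKeys.create",
    "deploymentmanager.deployments.create",
    "cloudbuild.builds.create",
})


def impersonation_permissions_in(role_permissions):
    """Return the impersonation-enabling permissions contained in a role."""
    return sorted(IMPERSONATION_PERMISSIONS.intersection(role_permissions))


# Hypothetical custom role under review.
custom_role = [
    "storage.objects.get",
    "iam.serviceAccounts.actAs",
    "iam.serviceAccounts.list",
    "cloudbuild.builds.create",
]
print(impersonation_permissions_in(custom_role))
```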

Avoid letting users change the allow policies of more-privileged service accounts

Which users are allowed to use or impersonate a service account is captured by the service account's allow policy. The allow policy can be modified or extended by users who have the iam.serviceAccounts.setIamPolicy permission on the particular service account. Roles that contain that permission include:

  • Owner (roles/owner)
  • Security Admin (roles/iam.securityAdmin)
  • Service Account Admin (roles/iam.serviceAccountAdmin)

Roles that include the iam.serviceAccounts.setIamPolicy permission give a user full control over a service account:

  • The user can grant themselves permission to impersonate the service account, which gives the user the ability to access the same resources as the service account.
  • The user can grant other users the same or a similar level of access to the service account.

Before you assign any of these roles to a user, ask yourself which resources inside and outside the current Cloud project the user could gain access to by impersonating the service account. Don't let a user change the allow policy of a service account if the service account has more privileges than the user.

Don't let users create or upload service account keys

Service account keys let applications or users authenticate as a service account. Unlike other forms of service account impersonation, using a service account key doesn't require any previous form of authentication – anyone who possesses a service account key can use it.

The net effect of using a service account key to authenticate is similar to other forms of impersonation. If you give a user access to a service account key, or give them permission to create a new service account key, the user can impersonate the service account and access all resources that service account can access.

Creating or uploading a service account key requires the iam.serviceAccountKeys.create permission, which is included in the Service Account Key Admin (roles/iam.serviceAccountKeyAdmin) and Editor (roles/editor) roles.

Before you assign any role that includes the iam.serviceAccountKeys.create permission to a user, ask yourself which resources inside and outside the current Cloud project the user could gain access to by impersonating the service account. Don't let a user create service account keys for service accounts that have more privileges than they do.

If your Cloud project doesn't require service account keys at all, apply the Disable service account key creation and Disable service account key upload organization policy constraints to the Cloud project or the enclosing folder. These constraints prevent all users from creating and uploading service account keys, including those with iam.serviceAccountKeys.create permission on a service account.
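
As a rough illustration, the two constraints can be expressed as enforced boolean policies. The Python sketch below only builds the policy documents; it assumes the Organization Policy v2 layout (a name plus a spec with an enforce rule) and uses a placeholder project ID. How you apply the documents (console, CLI, or infrastructure-as-code tooling) is left open.

```python
import json

# Constraint IDs for the two constraints named above.
KEY_CONSTRAINTS = (
    "iam.disableServiceAccountKeyCreation",
    "iam.disableServiceAccountKeyUpload",
)


def key_constraint_policies(parent):
    """Build one enforced boolean-constraint policy per key constraint.

    parent is a resource name such as "projects/my-project" or "folders/123".
    """
    return [
        {
            "name": f"{parent}/policies/{constraint}",
            "spec": {"rules": [{"enforce": True}]},
        }
        for constraint in KEY_CONSTRAINTS
    ]


# "projects/example-project" is a placeholder, not a real project.
for policy in key_constraint_policies("projects/example-project"):
    print(json.dumps(policy, indent=2))
```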

Don't grant access to service accounts at the Cloud project or folder level

Service accounts are resources and part of the resource hierarchy. You can therefore manage access to service accounts at any of the following levels:

  • The individual service account
  • The enclosing Cloud project
  • A folder in the Cloud project's ancestry
  • The organization node

Managing access at the Cloud project level or a higher level of the resource hierarchy can help reduce administrative overhead, but can also lead to over-granting of privileges. For example, if you grant a user the Service Account Token Creator role in a Cloud project, the user can impersonate any service account in the Cloud project. Being able to impersonate any service account implies that the user can potentially gain access to all resources that those service accounts can access, including resources outside that Cloud project.

To avoid such over-granting, don't manage access to service accounts at the Cloud project or folder level. Instead, individually manage access for each service account.
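
To find existing grants of this kind, you could scan the allow policy of a Cloud project (or folder) for roles that permit impersonation. The following Python sketch assumes an allow policy exported as JSON; the role set is an illustrative selection rather than a complete list, and the sample policy is hypothetical.

```python
# Illustrative selection of roles that let a principal impersonate or act as
# service accounts. Other predefined or custom roles may grant similar access.
IMPERSONATION_ROLES = {
    "roles/iam.serviceAccountTokenCreator",
    "roles/iam.serviceAccountUser",
    "roles/iam.workloadIdentityUser",
}


def broad_impersonation_grants(project_policy):
    """Return sorted (role, member) pairs that apply to every service account
    in the scope that project_policy belongs to."""
    grants = []
    for binding in project_policy.get("bindings", []):
        if binding.get("role") in IMPERSONATION_ROLES:
            for member in binding.get("members", []):
                grants.append((binding["role"], member))
    return sorted(grants)


# Hypothetical project-level allow policy.
project_policy = {
    "bindings": [
        {"role": "roles/viewer", "members": ["group:devs@example.com"]},
        {
            "role": "roles/iam.serviceAccountTokenCreator",
            "members": ["user:carol@example.com"],
        },
    ]
}

for role, member in broad_impersonation_grants(project_policy):
    print(f"{member} holds {role} for all service accounts in this scope")
```

Each pair that the function reports is a candidate for removal and replacement with a binding on the individual service accounts that the principal actually needs.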

Don't run code from less protected sources on compute resources that have a privileged service account attached

When you attach a service account to a compute resource, such as a VM instance or a Cloud Run application, processes running on that resource can use the metadata server to request access tokens and ID tokens. These tokens let the process impersonate the service account and access resources on its behalf.
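
To illustrate how little a process needs in order to obtain a token, the following Python sketch builds the HTTP request for an access token of the attached service account. It assumes the standard metadata server hostname and the default service account alias; the function names are hypothetical. Note that the request carries no credentials, only a fixed header.

```python
import json
import urllib.request

METADATA_TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def build_token_request():
    """Build (but don't send) the request for the attached account's token."""
    return urllib.request.Request(
        METADATA_TOKEN_URL,
        headers={"Metadata-Flavor": "Google"},  # required by the metadata server
    )


def fetch_access_token():
    """Send the request. This only succeeds on a compute resource that has a
    service account attached; elsewhere the hostname doesn't resolve."""
    with urllib.request.urlopen(build_token_request(), timeout=5) as response:
        return json.load(response)["access_token"]


request = build_token_request()
print(request.get_method(), request.full_url)
```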

By default, access to the metadata server isn't restricted to specific processes or users. Instead, any code that is executed on the compute resource can access the metadata server and obtain an access token. Such code might include:

  • The code of your application.
  • Code submitted by end users, if your application permits any server-side script evaluation.
  • Code read from a remote source repository, if the compute resource is part of a CI/CD system.
  • Startup and shutdown scripts served by a Cloud Storage bucket.
  • Guest policies distributed by VM Manager.

If code is submitted by users or is read from a remote storage location, you must ensure that it's trustworthy and that the remote storage locations are at least as well secured as the attached service account. If a remote storage location is less well protected than the service account, a bad actor might be able to escalate their privileges. They could do so by injecting malicious code that uses the service account's privileges into that location.

Limit shell access to VMs that have a privileged service account attached

Some compute resources support interactive access and allow users to obtain shell access to the system. For example:

  • Compute Engine lets you use SSH or RDP to log in to a VM instance.
  • Google Kubernetes Engine lets you use kubectl exec to run a command or start a shell in a Kubernetes container.

If a VM instance has a privileged service account attached, then any user with shell access to the system can impersonate the service account and access resources on its behalf. To prevent users from abusing this capability to escalate their privileges, you must ensure that shell access is at least as well secured as the attached service account.

For Linux instances, you can enforce that SSH access is more restrictive than access to the attached service account by using OS Login: To connect to a VM instance that has OS Login enabled, a user must not only be allowed to use OS Login, but must also have the iam.serviceAccounts.actAs permission on the attached service account.

The same level of access control doesn't apply to VM instances that use metadata-based keys or to Windows instances: Publishing an SSH key to metadata or requesting Windows credentials requires access to the VM instance's metadata and the iam.serviceAccounts.actAs permission on the attached service account. However, after the SSH key has been published or the Windows credentials have been obtained, subsequent logins are not subject to any additional IAM permission checks.

Similarly, if a VM instance uses a custom Linux pluggable authentication module for authentication, or is a member of an Active Directory domain, it's possible that users who wouldn't otherwise have permission to impersonate the service account are allowed to log in.

Particularly for VM instances that don't use OS Login, consider gating shell access by Identity-Aware Proxy. Only grant the IAP-Secured Tunnel User role to users who should be allowed to impersonate the service account attached to the VM instance.

Limit metadata server access to selected users and processes

When you attach a service account to a VM instance, workloads deployed on the VM can access the metadata server to request tokens for the service account. By default, access to the metadata server isn't limited to any specific process or user on the VM: Even processes running as a low-privilege user, such as nobody on Linux or LocalService on Windows, have full access to the metadata server and can obtain tokens for the service account.

To limit metadata server access to specific users, configure the guest operating system's host firewall to only allow these users to open outbound connections to the metadata server.

On Linux, you can use the --uid-owner and --gid-owner options to set up an iptables rule that only applies to specific users or groups. On Windows, the Set-NetFirewallSecurityFilter command lets you customize a firewall rule so that it applies to selected users or groups.
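
On Linux, such rules could be generated as follows. This Python sketch only prints iptables commands; it doesn't run them. It assumes the well-known metadata server address 169.254.169.254, and the user names are hypothetical. Including root is an assumption made here because system agents on the VM typically need metadata access; adjust the list to your environment.

```python
import shlex

METADATA_SERVER_IP = "169.254.169.254"


def metadata_firewall_commands(allowed_users):
    """Return iptables commands (as argument lists) that accept outbound
    metadata server traffic from allowed_users and reject it for everyone else."""
    commands = []
    for user in allowed_users:
        commands.append([
            "iptables", "-A", "OUTPUT", "-d", METADATA_SERVER_IP,
            "-m", "owner", "--uid-owner", user, "-j", "ACCEPT",
        ])
    # Rules are evaluated in order, so this catch-all must come last.
    commands.append([
        "iptables", "-A", "OUTPUT", "-d", METADATA_SERVER_IP, "-j", "REJECT",
    ])
    return commands


# "app-user" is a hypothetical account that runs the workload.
for command in metadata_firewall_commands(["root", "app-user"]):
    print(shlex.join(command))
```

Test rules like these on a non-production VM first, because blocking the wrong user can break guest agents or login mechanisms that depend on the metadata server.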

Protect against spoofing threats

Workload identity federation lets you create a one-way trust relationship between a Cloud project and an external identity provider. Once you've established this trust relationship, you can exchange credentials obtained from the external identity provider for a Security Token Service (STS) token. You can then use that STS token to access or impersonate a service account.

STS tokens differ from access tokens in that they don't correspond to one specific Google identity. STS tokens also don't necessarily correspond to one specific identity in the external identity provider. Instead, an STS token represents a set of attributes. Depending on what these attributes are, they might correspond to a single or to multiple identities in the external identity provider.

If the attributes represented by an STS token are ambiguous or used improperly, they might allow bad actors to spoof their identity. The following sections describe best practices that help you protect against such spoofing threats.

Don't allow attribute mappings to be modified

Workload identity federation uses attribute mappings to select which of the attributes provided by the external identity provider should be embedded into an STS token, and how the attribute names should translate. Configuring attribute mappings is a key step to setting up the trust relationship between the external identity provider and Google Cloud.

Attribute mappings are also crucial to the security of workload identity federation: If you've granted a federated principal or principal set permission to impersonate a service account and later change the attribute mapping, you might change which users have access to the service account.

Modifying attribute mappings requires the iam.workloadIdentityPoolProviders.update permission. Roles containing this permission include:

  • Owner (roles/owner)
  • IAM Workload Identity Pool Admin (roles/iam.workloadIdentityPoolAdmin)

If a bad actor has permission to modify attribute mappings, they might be able to change the mapping rules in a way that allows them to spoof their identity and gain access to a service account. To prevent such malicious modifications, make sure only a few administrative users have the permission to modify attribute mappings.

Consider creating a dedicated Cloud project for managing workload identity pools to limit the risk of users inadvertently being assigned one of these roles at a higher level in the resource hierarchy.

Don't rely on attributes that aren't stable or authoritative

When you use workload identity federation, you're trusting an external identity provider to authenticate users and to report accurate information about the authenticated user back to Google Cloud.

An identity provider uses attributes to communicate information about authenticated users. While some of these attributes are typically guaranteed to be authoritative, others might not be: For example, an external identity provider might embed both a username and a user ID in an OpenID Connect ID token. Both attributes uniquely identify a user and might seem interchangeable. However, the external identity provider might allow users to change their username while it guarantees their user ID to be stable and authoritative.

If your attribute mappings rely on attributes that aren't stable or authoritative, then a bad actor might be able to spoof their identity by modifying their user profile in the external identity provider. For example, they might change their username to that of a user who was recently deleted in the external identity provider but still has access to a service account on Google Cloud.

To prevent such spoofing attacks, make sure your attribute mappings only rely on attributes that the external identity provider guarantees to be stable and authoritative.
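
A simple review aid is to check which claims an attribute mapping references. The following Python sketch assumes mappings in the google.subject=assertion.sub style and treats only the sub claim as stable; that set is an assumption that you'd replace with whatever your identity provider actually guarantees. The regular expression is a rough heuristic, not a full expression parser.

```python
import re

# Assumption: only the "sub" claim is stable and authoritative.
# Replace this set based on your identity provider's documented guarantees.
STABLE_CLAIMS = {"sub"}


def unstable_mappings(attribute_mapping, stable_claims=STABLE_CLAIMS):
    """Return {target attribute: [claims]} for mappings that reference
    assertion claims outside stable_claims."""
    flagged = {}
    for target, expression in attribute_mapping.items():
        claims = set(re.findall(r"assertion\.(\w+)", expression))
        risky = claims - set(stable_claims)
        if risky:
            flagged[target] = sorted(risky)
    return flagged


# Hypothetical mapping that derives the subject from a user-editable name.
mapping = {
    "google.subject": "assertion.preferred_username",
    "attribute.user_id": "assertion.sub",
}

print(unstable_mappings(mapping))
```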

Protect against information disclosure threats

Avoid disclosing confidential information in service account email addresses

To grant a service account access to a resource in another Cloud project, you can add a role binding to the resource's allow policy. Like the resource itself, the allow policy is part of the other Cloud project and its visibility is also controlled by that other Cloud project.

Viewing allow policies is typically not considered a privileged operation. Many roles include the required *.getIamPolicy permission, including the basic Viewer role.

A user who can view an allow policy can also see the email addresses of principals who have been granted access to the resource. In the case of service accounts, email addresses can provide hints to bad actors.

For example, an allow policy might include a binding for a service account with the email address jenkins@deployment-project-123.gserviceaccount.com. To a bad actor, this email address not only reveals that there is a Cloud project with ID deployment-project-123, but also that the Cloud project runs a Jenkins server. By choosing a more generic name such as deployer@deployment-project-123.gserviceaccount.com, you avoid disclosing information about the type of software that you're running in deployment-project-123.

If you grant a service account access to resources in a Cloud project that has less tightly controlled access (such as a sandbox or a development Cloud project), make sure that the service account's email address doesn't disclose any information. In particular, don't disclose information that is confidential or that could provide hints to attackers.
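
A naming check along these lines can be automated. The following Python sketch flags service account names that contain terms from a denylist; the denylist is hypothetical and would need to reflect what your organization considers sensitive. It inspects only the part of the address before the @ sign.

```python
# Hypothetical denylist of terms that reveal software or environment details.
REVEALING_TERMS = ("jenkins", "gitlab", "terraform", "prod")


def revealing_terms_in(email):
    """Return the denylisted terms found in the local part of email."""
    local_part = email.split("@", 1)[0].lower()
    return [term for term in REVEALING_TERMS if term in local_part]


for address in (
    "jenkins@deployment-project-123.gserviceaccount.com",
    "deployer@deployment-project-123.gserviceaccount.com",
):
    print(address, "->", revealing_terms_in(address) or "no revealing terms found")
```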

Protect against non-repudiation threats

Whenever you notice suspicious activity affecting one of your resources on Google Cloud, Cloud Audit Logs are an important source of information to find out when the activity happened and which users were involved.

Whenever Cloud Audit Logs indicate that activity was performed by a service account, that information alone might not be sufficient to reconstruct the full chain of events: You must also be able to find out which user or application caused the service account to perform the activity.

This section contains best practices that can help you maintain a non-repudiable audit trail.

Enable data access logs for IAM APIs

To help you identify and understand service account impersonation scenarios, services such as Compute Engine include a serviceAccountDelegationInfo section in Cloud Audit Logs. This section indicates whether the service account was being impersonated, and by which user.
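
When you process exported log entries, the delegation details can be read from the authentication information of each entry. The following Python sketch assumes entries exported as JSON with the section located under protoPayload.authenticationInfo; the sample entry is a trimmed, made-up illustration, not real log output.

```python
def impersonation_chain(log_entry):
    """Return (acting principal, list of delegating principals) for a
    Cloud Audit Logs entry that was exported as a JSON-style dict."""
    auth = log_entry.get("protoPayload", {}).get("authenticationInfo", {})
    chain = []
    for delegate in auth.get("serviceAccountDelegationInfo", []):
        first_party = delegate.get("firstPartyPrincipal")
        if first_party:
            chain.append(first_party.get("principalEmail", "<unknown>"))
        else:
            # Third-party principals are described by claims, not an email.
            chain.append("<third-party principal>")
    return auth.get("principalEmail"), chain


# Made-up, heavily trimmed entry for illustration only.
sample_entry = {
    "protoPayload": {
        "authenticationInfo": {
            "principalEmail": "deployer@example-project.iam.gserviceaccount.com",
            "serviceAccountDelegationInfo": [
                {"firstPartyPrincipal": {"principalEmail": "alice@example.com"}}
            ],
        }
    }
}

actor, delegators = impersonation_chain(sample_entry)
print(f"{actor} was impersonated by: {delegators}")
```

An empty list means that the entry carries no delegation details, which doesn't by itself prove that no impersonation took place.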

Not all services include impersonation details in their Cloud Audit Logs. To record all impersonation events, you must also enable data access logs for the following APIs:

  • Identity and Access Management (IAM) API in all Cloud projects that contain service accounts
  • Security Token Service API in all Cloud projects that contain workload identity pools

By enabling these logs, you make sure that an entry is added to the Cloud Audit Logs whenever a user requests an access token or an ID token for a service account.
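
Data access logs are configured through the auditConfigs section of an allow policy. The following Python sketch adds such a section for the two APIs to a policy that has been exported as JSON. It assumes the service names iam.googleapis.com and sts.googleapis.com and enables all three log types; you might choose a narrower set. The function name is hypothetical.

```python
import copy

ALL_LOG_TYPES = ("ADMIN_READ", "DATA_READ", "DATA_WRITE")


def with_data_access_logs(policy, services):
    """Return a copy of policy in which data access logs are enabled for
    each service, replacing any existing audit config for those services."""
    updated = copy.deepcopy(policy)
    configs = [
        config
        for config in updated.get("auditConfigs", [])
        if config.get("service") not in services
    ]
    for service in services:
        configs.append({
            "service": service,
            "auditLogConfigs": [{"logType": t} for t in ALL_LOG_TYPES],
        })
    updated["auditConfigs"] = configs
    return updated


# Assumed service names for the IAM API and the Security Token Service API.
policy = with_data_access_logs(
    {"bindings": []}, ["iam.googleapis.com", "sts.googleapis.com"]
)
print([config["service"] for config in policy["auditConfigs"]])
```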

Ensure that CI/CD history can be correlated with Cloud Audit Logs

Service accounts are commonly used by CI/CD systems to perform deployments after a code change has been successfully verified and approved for deployment. Typically, CI/CD systems maintain a history of events that lead to a deployment. This history might include the IDs of the corresponding code reviews, commits, and pipeline runs, and information about who approved the deployment.

If a deployment modifies any resources on Google Cloud, then these changes are tracked in the Cloud Audit Logs of the respective resources. Cloud Audit Logs contain information about the user or service account that initiated the change. But in a deployment triggered by a CI/CD system, the identity of the service account alone is often insufficient to reconstruct the entire chain of events that led to the change.

To establish a consistent audit trail across your CI/CD system and Google Cloud, you must ensure that Cloud Audit Logs records can be correlated with events in the CI/CD system's history. If you encounter an unexpected event in the Cloud Audit Logs, you can then use this correlation to determine whether the change was indeed performed by the CI/CD system, why it was performed, and who approved it.

Ways to establish a correlation between Cloud Audit Logs records and events in the CI/CD system's history include:

  • Log API requests performed by each CI/CD pipeline run.
  • Whenever the API returns an operation ID, record the ID in the CI/CD system's logs.
  • Add an X-Goog-Request-Reason HTTP header to API requests and pass the ID of the CI/CD pipeline run. Terraform can automatically add this header if you specify a request reason.

    Alternatively, embed the information in the User-Agent header so that it is captured in Cloud Audit Logs.
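
For pipelines that call APIs directly, the headers described above could be assembled as in the following Python sketch. The function name, the pipeline-run prefix, and the base user agent string are hypothetical; only the two header names come from the list above.

```python
def correlation_headers(pipeline_run_id, base_user_agent="example-deployer/1.0"):
    """Return HTTP headers that tie an API request to a CI/CD pipeline run."""
    if not pipeline_run_id or any(ch in pipeline_run_id for ch in "\r\n"):
        # Reject empty IDs and line breaks, which could inject extra headers.
        raise ValueError("pipeline_run_id must be a non-empty single-line string")
    return {
        "X-Goog-Request-Reason": f"pipeline-run:{pipeline_run_id}",
        "User-Agent": f"{base_user_agent} pipeline-run/{pipeline_run_id}",
    }


# "run-4711" stands in for the ID that your CI/CD system assigns to a run.
for name, value in correlation_headers("run-4711").items():
    print(f"{name}: {value}")
```

Record the same ID in the CI/CD system's own logs so that a search for the ID finds matching records on both sides.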

To help ensure non-repudiability, configure log files and commit histories so that they are immutable and a bad actor can't retroactively conceal their traces.

What's next