Security & Identity

Preventing lateral movement in Google Compute Engine

July 30, 2020
Iulia Ion

Software Engineer, Google Cloud

Josh Strom

Software Engineer, Google Cloud

When organizations move to Google Cloud, a question we often hear from security operations teams is, “How can we prevent compromises in our deployments and better defend against lateral movement?” With lateral movement, an attacker can move within a system after an initial compromise and gain access to even more sensitive data. 

To prevent lateral movement and keep your organization secure, we recommend taking a “defense in depth” approach, which helps protect users and data with multiple layers of security that build upon and reinforce one another. In the event that one layer is circumvented or compromised, many more are in place to prevent potential attackers from accomplishing their objectives.

To implement a defense in depth approach for Compute Engine, there are a few things you should do:

  • Isolate your production resources from the internet

  • Disable the use of default service accounts

  • Limit access to service account credentials

  • Use OS Login to manage access to your VMs 

  • Apply the principle of least-privilege

  • Collect logs and monitor your system

Let’s explore each of these recommended actions in more depth.

Isolate your production resources from the internet 

The most effective way to ensure that Compute Engine instances don’t get compromised is to minimize their exposure to the public internet. Compute Engine VMs can have internal or external IP addresses. To minimize your attack surface, assign your VMs only internal IP addresses, use Cloud NAT to give them outbound internet access, and use Identity-Aware Proxy (IAP) to allow curated access from the internet.

[Image: Using IAP to protect a VM]

When you do have to directly expose a VM with an external IP address, ensure that your firewall rules restrict network access to only the ports and IP addresses that your application needs. 
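When a VM must be exposed, a narrowly scoped firewall rule can be sketched like this (the network name, rule name, tag, and source range below are illustrative placeholders, not values from this post):

```shell
# Sketch: allow only HTTPS, only from a known corporate range, and only
# to VMs carrying the "web" network tag. All names here are illustrative.
gcloud compute firewall-rules create allow-https-from-corp \
    --network=my-vpc \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:443 \
    --source-ranges=203.0.113.0/24 \
    --target-tags=web
```

Scoping the rule to a target tag keeps the exposure limited to the specific VMs that serve that traffic, rather than every instance in the network.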

Don’t

  • Assign external IP addresses to your VMs.

  • Configure permissive firewall rules that allow anyone on the internet to connect to your VMs.

Do

  • Assign private IP addresses to your VMs; don’t give them public IP addresses at all. Use IAP TCP forwarding to connect to your VMs for administration and Cloud NAT to allow your VMs to access the internet. IAP works by verifying a user’s identity and the context of the request to determine if a user should be allowed to access an application or a VM. Follow these instructions to set up IAP for your Compute Engine instances. 

  • Use Organization Policies to define allowed external IPs for VM instances, so new VM instances don’t get created or configured with an external IP address.

  • Use Security Health Analytics to detect VMs that have external IP addresses and firewall rules that are too permissive. (Note: Some Security Health Analytics features are available only in the Premium edition of Security Command Center.) Identify and resolve the following security findings:

    • PUBLIC_IP_ADDRESS: Indicates that a Compute Engine instance is assigned an external IP address.

    • OPEN_FIREWALL: Indicates that a firewall rule is configured to allow access from any IP address or on any port.

  • Use VPC Service Controls to configure security perimeters that isolate Google Cloud resources and prevent sensitive data exfiltration—even by authorized clients.

  • Disable legacy Compute Engine instance metadata APIs and migrate your application to the v1 metadata API to help protect it against Server-Side Request Forgery (SSRF) attacks. 
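The internal-IP-plus-IAP pattern above can be sketched with a few gcloud commands (instance name, zone, and rule name are illustrative; 35.235.240.0/20 is IAP's documented TCP forwarding source range):

```shell
# Sketch: create a VM with no external IP address at all.
gcloud compute instances create internal-only-vm \
    --zone=us-central1-a \
    --no-address

# Allow IAP's TCP forwarding range to reach SSH on your VMs.
gcloud compute firewall-rules create allow-ssh-from-iap \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:22 \
    --source-ranges=35.235.240.0/20

# Connect by tunneling through IAP instead of a public address.
gcloud compute ssh internal-only-vm \
    --zone=us-central1-a \
    --tunnel-through-iap
```

With this setup, administrative access flows through IAP's identity checks, and the VM itself presents no public endpoint to scan or attack.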

But what if, despite your preventative efforts, an attacker still manages to compromise a VM? Let’s look at some ways to minimize the impact and ensure that the attacker isn’t able to move laterally and gain access to more resources.

Disable the default service accounts

Configuring identity and API access is a critical step in creating a VM. This configuration includes specifying which service account should be used by applications running on the VM. Google Cloud offers two approaches for granting privileges to your application: using a Compute Engine default service account or a user-created service account.

The Compute Engine default service account is created automatically for each project and has an auto-generated name and email address. To simplify customer onboarding, it is automatically granted the Project Editor IAM role, which gives it read and write access to almost all resources in the project (including the ability to impersonate other service accounts). Because these privileges are so permissive, we recommend using Organization Policies to disable the automatic granting of this role. Instead, remove any access grants given to the Compute Engine default service account, and use a new service account that’s granted only the permissions your VM needs. 

Don’t

  • Use the Compute Engine default service account with the primitive Editor role.

  • Use the same service account for different applications running on different VMs.

Do

  • Revoke the Editor role for the Compute Engine default service account and create a new service account for your VM that has only the needed permissions.

  • Disable the Compute Engine default service account.

  • Use Organization Policies to “Disable Automatic IAM Grants for Default Service Accounts” so the Compute Engine Default Service Account is not granted the Editor role by default.

  • Use Security Health Analytics to detect and resolve the following misconfigurations: 

    • FULL_API_ACCESS: Indicates that a VM instance is configured to use the default service account with full access to all Google Cloud APIs.
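The steps above can be sketched as follows (the service account name, project ID, organization ID, and the specific role are illustrative; pick the role your workload actually needs):

```shell
# Sketch: a dedicated, minimally privileged service account for one app.
gcloud iam service-accounts create my-app-sa \
    --display-name="Service account for my-app"

# Grant only the narrow role the workload needs, e.g. writing logs.
gcloud projects add-iam-policy-binding my-project \
    --member="serviceAccount:my-app-sa@my-project.iam.gserviceaccount.com" \
    --role="roles/logging.logWriter"

# Attach it at VM creation time instead of the default service account.
gcloud compute instances create my-app-vm \
    --zone=us-central1-a \
    --service-account=my-app-sa@my-project.iam.gserviceaccount.com \
    --scopes=cloud-platform

# Enforce the org policy that stops default service accounts from
# automatically receiving the Editor role.
gcloud resource-manager org-policies enable-enforce \
    iam.automaticIamGrantsForDefaultServiceAccounts \
    --organization=123456789
```

Note that with a dedicated service account, IAM roles (not access scopes) become the effective permission boundary, which is why the broad cloud-platform scope is paired with a narrow role grant.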

Limit access to service account credentials

To limit an attacker’s ability to impersonate your service accounts, you should avoid creating service account keys whenever possible and protect access to existing keys. Service account keys are intended to allow external, non-Google Cloud workloads to authenticate as the service account, but when you’re operating inside Google Cloud, it’s almost never necessary to use a service account key.

Don’t

  • Generate and download private keys for your service accounts. Your Compute Engine instance can automatically assume the identity of the configured service account. 

  • Grant the Service Account User or Service Account Token Creator roles at the project level. Instead, you should grant these roles on individual service accounts, when needed. The Service Account User role allows a user to start a long-running job on behalf of a service account. The Service Account Token Creator role allows a user to directly impersonate (or assert) the identity of that service account.

Do

  • Use Organization Policies to:

    • “Disable service account key creation” and ensure that users can’t create and download user-managed private keys for service accounts.

    • “Disable service account creation” for projects that shouldn’t host service accounts.

  • Use the service account key upload feature to authenticate on-premises services with Google Cloud using a key pair whose private key never leaves a Hardware Security Module. This minimizes the possibility that the service account private key will be exposed.

  • Use the --impersonate-service-account flag to execute gcloud commands as a service account instead of using an exported service account key. This can be configured per-request or for all gcloud commands by running “gcloud config set auth/impersonate_service_account {service_account_email}”

  • Use Security Health Analytics to detect broad use of the Service Account User role and existing service account keys. Look for:

    • SERVICE_ACCOUNT_KEY_USER_MANAGED: Indicates that a user-managed private key for service accounts exists.

    • SERVICE_ACCOUNT_KEY_NOT_ROTATED: Indicates that a user-managed private key for a service account has not been rotated in 90 days.

    • OVER_PRIVILEGED_SERVICE_ACCOUNT_USER: Indicates that an IAM member has the Service Account User role at the project level, instead of for a specific service account.
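The impersonation pattern above can be sketched like this (the service account email, user, and organization ID are illustrative):

```shell
# Sketch: grant impersonation on one specific service account rather
# than at the project level.
gcloud iam service-accounts add-iam-policy-binding \
    my-app-sa@my-project.iam.gserviceaccount.com \
    --member="user:alice@example.com" \
    --role="roles/iam.serviceAccountTokenCreator"

# Run a single command as that service account; no downloaded key needed.
gcloud compute instances list \
    --impersonate-service-account=my-app-sa@my-project.iam.gserviceaccount.com

# Enforce the org policy that blocks service account key creation.
gcloud resource-manager org-policies enable-enforce \
    iam.disableServiceAccountKeyCreation \
    --organization=123456789
```

Because impersonation issues short-lived tokens, there is no long-lived credential for an attacker to steal, and every use is attributable to the human identity that requested it.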

Use OS Login to manage access to your VMs 

One way for an attacker to escalate privileges and gain access to additional VMs is by looking for SSH keys stored in the project’s metadata. Manually managing SSH keys used for VM access is time-consuming and risky. Instead, you should use OS Login to grant access to your VMs based on IAM identities. If OS Login is enabled, an attacker can’t obtain access to new VMs by uploading SSH keys to the instance metadata, because those keys get ignored. To preserve backwards compatibility for workflows that rely on configuring their own users and SSH keys, OS Login is not enabled by default. You can learn more about managing access to your VM instances here.

Don’t

  • Manually manage SSH keys in VM instance metadata.

  • Allow project-wide SSH keys that can be used to connect to all VMs in a project.

Do

  • Use OS Login to manage access to your VMs based on IAM identities. 

  • Use Organization Policies to Require OS Login: All VM instances created in new projects will have OS Login enabled. On new and existing projects, this constraint prevents metadata updates that disable OS Login at the project or instance level.

  • Use Security Health Analytics to ensure that OS Login is enabled and that project-wide SSH keys are not used. Look for the following misconfiguration types:

    • OS_LOGIN_DISABLED: Indicates that OS Login is disabled on a VM instance.

    • COMPUTE_PROJECT_WIDE_SSH_KEYS_ALLOWED: Indicates that project-wide SSH keys are used, allowing login to all Compute Engine instances in a project.

    • ADMIN_SERVICE_ACCOUNT: Indicates that a service account is configured with Owner access or administrator roles such as roles/compute.osAdminLogin, which may allow it to change OS Login settings or gain sudo access.
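Enabling OS Login and granting access through IAM can be sketched as follows (project ID, user, and organization ID are illustrative):

```shell
# Sketch: enable OS Login for all VMs in the project via metadata.
gcloud compute project-info add-metadata \
    --metadata=enable-oslogin=TRUE

# Grant non-admin SSH access through IAM; use roles/compute.osAdminLogin
# instead only for users who genuinely need sudo.
gcloud projects add-iam-policy-binding my-project \
    --member="user:alice@example.com" \
    --role="roles/compute.osLogin"

# Enforce OS Login across the organization via org policy.
gcloud resource-manager org-policies enable-enforce \
    compute.requireOsLogin \
    --organization=123456789
```

Revoking the IAM role then immediately revokes SSH access everywhere, with no stale keys left behind in instance metadata.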

Apply the principle of least privilege

Ensuring that every VM instance, service account, or user is able to access only the information and resources that are necessary for legitimate business operations can be a challenge, especially if you have a lot of VMs. Google Cloud’s IAM Recommender uses machine learning to help organizations right-size privileges by monitoring a project’s actual permission usage and recommending specific, constrained roles to replace overly permissive ones. It identifies permissions that are safe to remove, and also predicts future access needs.

[Image: Reviewing permissions in IAM Recommender]

To learn more about how to review and apply such recommendations, check out our page on enforcing least privilege with recommendations. For more tips on how to apply the principle of least privilege read this blog post.

Don’t

  • Use the primitive IAM roles: Owner, Editor, Viewer.

Do

  • Set up an Organization Policy for “Domain Restricted Sharing” to prevent identities outside your approved organizations from receiving IAM policy grants.

  • Consider creating and using a custom role if the predefined IAM roles are broader than what you need.
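Reviewing recommendations and creating a narrower custom role can be sketched like this (the project ID, role ID, and permission set are illustrative examples, not prescriptions):

```shell
# Sketch: list IAM role recommendations generated from observed usage.
gcloud recommender recommendations list \
    --project=my-project \
    --location=global \
    --recommender=google.iam.policy.Recommender

# Sketch: a custom role granting only the read permissions an operator
# actually needs, instead of a broad predefined role.
gcloud iam roles create instanceViewer \
    --project=my-project \
    --title="Instance Viewer" \
    --permissions=compute.instances.get,compute.instances.list
```

Custom roles can then be granted in place of primitive roles, shrinking what a compromised identity could reach.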

Collect logs and monitor your system

Strong preventative controls need to be complemented with effective detection of malicious activity. Ensuring the right logs are collected is fundamental for security investigations and forensics. Make sure to turn on Data Access logs, which are part of Cloud Audit Logs and can help you answer questions like, "Who did what, where, and when?" within your Google Cloud resources.

[Image: Reviewing recommendations in Security Health Analytics]

Do

  • Turn on Data Access logs. Data Access audit logs record API calls that read the configuration or metadata of resources, as well as user-driven API calls that create, modify, or read user-provided resource data. They are disabled by default because they can generate large volumes of log data.

  • Enable VPC Service Controls in dry run mode. Even if you can’t enforce service perimeters in your organization yet, you can enable dry run mode to log such requests. This will give you the ability to log information about cross-org or cross-project data movement. 

  • Use Security Command Center Premium capabilities, including:

    • Security Health Analytics, to ensure that logging is properly turned on across your organization. Look for:

      • AUDIT_LOGGING_DISABLED: Indicates that Data Access logs are not enabled, or that certain users are exempted.

      • FLOW_LOGS_DISABLED: Indicates that a VPC subnetwork has flow logs disabled.

    • Security Health Analytics, to detect if users outside of your organization, such as those using Gmail addresses, have been granted access to your projects. Look for:

      • NON_ORG_IAM_MEMBER: Indicates that an IAM member user isn’t using organizational credentials.

    • Event Threat Detection, to automatically scan logs and be alerted on suspicious activity such as overly permissive IAM grants to users outside your organization, cryptomining, connections to known bad IP addresses or domains, and outgoing DDoS attacks.
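Turning on Data Access logs is done through the project's IAM policy; a sketch (the project ID is illustrative, and the auditConfigs stanza shown in the comment is the fragment you would add to the downloaded policy):

```shell
# Sketch: enable Data Access audit logs for all services by editing the
# project IAM policy.
gcloud projects get-iam-policy my-project --format=json > policy.json

# Add an auditConfigs stanza to policy.json, for example:
#   "auditConfigs": [{
#     "service": "allServices",
#     "auditLogConfigs": [{"logType": "ADMIN_READ"},
#                         {"logType": "DATA_READ"},
#                         {"logType": "DATA_WRITE"}]
#   }]

gcloud projects set-iam-policy my-project policy.json
```

Scoping the stanza to specific services instead of allServices, or adding exemptedMembers, can help keep log volume manageable.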

Additional resources

We hope that these suggestions will help you defend against and detect lateral movement in your cloud environment. To learn more about security best practices on Google Cloud, follow the Center for Internet Security’s CIS Google Cloud Platform Foundation Benchmark and check out further recommendations provided by Security Health Analytics. For more information on how to secure Kubernetes workloads, see the CIS Google Kubernetes Engine (GKE) Benchmarks. If you use Terraform to manage your deployments, set up Config Validator to detect security misconfigurations at pre-deployment time. 

Finally, to better understand how we protect our infrastructure and to learn more about our security solutions, check out our security page.
