This document describes how to implement a secrets management system that you can use to manage Kubernetes Secrets across your Anthos clusters, and it describes the controls that you use for this task.
The document is part of a series of security blueprints that provide prescriptive guidance for working with Anthos.
Working with applications usually requires you to manage secrets. A secret is an object that contains a small amount of sensitive data such as a password, a token, or a key.
When you work with Anthos clusters, you need to manage Kubernetes Secrets. In Anthos, Kubernetes Secrets are used for accessing the Kubernetes API. They're also used at the application layer—for example, to allow your services to communicate with a backend database.
To help prevent unauthorized access to your applications and to sensitive data, you can use a secrets management system. A secrets management system stores secrets for use by your applications, and it manages the permissions to access the secrets.
For managing secrets and keys, you need to consider the following:
- Whether you're managing a hybrid environment that runs clusters on some combination of Google Cloud, on-premises, and another cloud.
- Whether you're running Anthos only on Google Cloud.
- What your threat model is and what the encryption requirements are based on that threat model—for example, do you require two layers of encryption, like full-disk encryption and application-layer encryption?
- How to rotate keys regularly to limit the impact of a key compromise.
- How to separate key management from secrets management and maintain a root of trust.
- How to audit key usage.
- What the implications are if your secrets management system or key management system is unavailable and whether you therefore need to implement a highly available secrets management system.
The GitHub repository that's associated with this blueprint contains a managing-secrets directory. The content of that directory provides instructions on how to configure a highly available HashiCorp Vault cluster on GKE as a secrets management system that you can use with your Anthos clusters.
Understanding the security controls you need
This section discusses the controls that you need in order to implement a secrets management system. The approach discussed in this document uses HashiCorp Vault for secrets management and Cloud Key Management Service (Cloud KMS) for key management. The approach accommodates the considerations listed in the previous section. This section also discusses the controls that you can use to manage service accounts by using the HashiCorp Google Cloud secrets engine. A secrets engine is a component that can store and read data, generate credentials, and encrypt data.
The following diagram illustrates how Vault and Cloud KMS interact to provide secrets management and key management.
The diagram shows how Cloud KMS and Vault work together to do the following:
- Cloud KMS acts as the key management service provider that manages a key encryption key (KEK).
- Vault acts as a secrets engine for keys and encryption.
- Vault acts as a secrets engine for service account management.
- Applications integrate secrets directly with HashiCorp Vault.
Centrally managing secrets
HashiCorp Vault is an open source secrets manager that can work with Kubernetes to manage your Kubernetes Secrets. Vault comes with multiple secrets engines, including secrets engines for Google Cloud and for Cloud KMS.
Vault operates in a client-server model in which a central cluster of Vault servers stores and maintains secrets, and clients access that data through an API, a command-line interface, or a web interface. Vault encrypts all data in transit by using TLS 1.2 or later and encrypts all data at rest by using 256-bit AES encryption. If you have an enterprise license, Vault can also be configured to help your organization comply with the US government FIPS 140-2 standard.
We recommend that you run Vault in its own dedicated GKE cluster in a dedicated project.
When you use Vault with your Anthos cluster applications, your applications can use the Kubernetes Auth Method in the following way:
- Pods submit their Kubernetes service account token.
- Vault verifies the token against the Kubernetes control plane.
- Vault maps the identity to a policy and to permissions.
- Vault returns a Vault token to the application.
The application can then use the token to authenticate with appropriate permissions.
The following diagram illustrates the process described in the preceding list.
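The same flow can be sketched with the Vault CLI. The role, policy, and service account names below are illustrative placeholders, not part of the blueprint:

```shell
# Enable the Kubernetes auth method.
vault auth enable kubernetes

# Configure Vault to verify service account tokens against the
# cluster's API server (host, CA cert, and reviewer JWT are placeholders).
vault write auth/kubernetes/config \
    kubernetes_host="https://$KUBERNETES_SERVICE_HOST:443" \
    kubernetes_ca_cert=@ca.crt \
    token_reviewer_jwt=@reviewer-token.jwt

# Map a Kubernetes service account to a Vault policy.
vault write auth/kubernetes/role/my-app \
    bound_service_account_names=my-app \
    bound_service_account_namespaces=default \
    policies=my-app-policy \
    ttl=1h

# A pod logs in with its service account token and receives a
# Vault token scoped to the permissions in my-app-policy.
vault write auth/kubernetes/login role=my-app \
    jwt=@/var/run/secrets/kubernetes.io/serviceaccount/token
```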
Managing short-lived credentials for your applications and for local development
When you enable the HashiCorp Google Cloud secrets engine as the secrets engine for your clusters, you can dynamically generate Google Cloud service account keys and OAuth tokens that are based on Identity and Access Management (IAM) policies. Applications that are deployed on your Anthos clusters can then use these keys or tokens as short-term credentials.
In addition, you can use the HashiCorp Google Cloud secrets engine to automatically manage Google Cloud service accounts and the associated service account keys that your developers can use when they work on their local workstations. Developers might need access to service account JSON key files so that they can write and debug applications that use Google Cloud resources. (When people authenticate to Google Cloud services interactively, they should always use their own accounts rather than service accounts, which are designed for applications.)
Using HashiCorp Google Cloud secrets engine to manage service accounts and OAuth tokens provides the following benefits:
- Because each service account is associated with a Vault lease, the service account key is automatically revoked when the lease expires.
- Users do not need to create service accounts for short-term access credentials. Vault takes care of creating OAuth tokens and the associated service account.
- In hybrid configurations, users can authenticate to Vault by using their identity provider, and Vault can generate Google Cloud credentials for the user without needing to create or manage a new service account for that user.
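As a sketch of how this works, the following commands enable the Google Cloud secrets engine and define a roleset that issues short-lived OAuth tokens. The project ID, roleset name, and IAM bindings are placeholders:

```shell
# Enable the Google Cloud secrets engine.
vault secrets enable gcp

# Define a roleset; Vault creates and manages the backing
# service account for you.
vault write gcp/roleset/my-token-roleset \
    project="my-project" \
    secret_type="access_token" \
    token_scopes="https://www.googleapis.com/auth/cloud-platform" \
    bindings=-<<EOF
resource "//cloudresourcemanager.googleapis.com/projects/my-project" {
  roles = ["roles/viewer"]
}
EOF

# Generate a short-lived OAuth token; the token is tied to a
# Vault lease and is revoked when the lease expires.
vault read gcp/roleset/my-token-roleset/token
```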
Managing keys for your cloud services
Cloud Key Management Service (Cloud KMS) is a cloud-hosted key management service that lets you manage cryptographic keys for your cloud services the same way that you do on-premises. You can generate, use, and rotate cryptographic keys that use popular key algorithms. Cloud KMS is integrated with IAM and Cloud Audit Logs so that you can manage permissions on individual keys and monitor how they are used. You can implement a key hierarchy by using a local data encryption key (DEK) that is in turn protected by a key encryption key (KEK) in Cloud KMS.
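The DEK/KEK hierarchy can be illustrated as follows. The key ring and key names are placeholders, and the local encryption step uses openssl purely for illustration:

```shell
# Generate a 256-bit data encryption key (DEK) locally.
openssl rand -out dek.key 32

# Encrypt application data with the local DEK.
openssl enc -aes-256-cbc -pbkdf2 -in secrets.txt -out secrets.enc \
    -pass file:./dek.key

# Wrap the DEK with the key encryption key (KEK) held in Cloud KMS.
gcloud kms encrypt \
    --location=global --keyring=vault-keyring --key=vault-kek \
    --plaintext-file=dek.key --ciphertext-file=dek.key.enc

# Discard the plaintext DEK; store only the wrapped copy
# alongside the encrypted data.
rm dek.key
```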
When you first configure HashiCorp Vault, you specify the physical storage for the data (secrets), but the software isn't yet configured to decrypt that data. In this state, Vault is referred to as sealed. To decrypt the data, Vault needs the master encryption key, which is used to decrypt the encryption key that in turn decrypts the data that's stored in Vault. The process of getting access to this master encryption key is referred to as unsealing. In this blueprint, the unseal keys are encrypted by using Cloud KMS and are stored in Cloud Storage.
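For reference, Vault also supports auto-unseal with Cloud KMS through a `gcpckms` seal stanza in the server configuration. This is a related mechanism rather than the blueprint's exact flow (which stores KMS-encrypted unseal keys in Cloud Storage), and all values below are placeholders:

```hcl
# Auto-unseal Vault with a Cloud KMS key (placeholder values).
seal "gcpckms" {
  project    = "my-vault-project"
  region     = "global"
  key_ring   = "vault-keyring"
  crypto_key = "vault-unseal-key"
}
```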
Kubernetes uses the etcd key-value store as a backing store for a cluster's configuration data, state data, and metadata. After you follow the hardening guidance for how to configure your Anthos cluster, storage-layer encryption is enabled. However, by default, etcd data is not encrypted at the application layer. If a user could get unauthorized access to an offline copy of the etcd store, the data would be accessible in unencrypted form. Therefore, a best practice to help protect against this attack vector is to add a further layer of security at the application layer. To do this, you encrypt secrets by using an external key management service.
The HashiCorp Google KMS secrets engine provides encryption and key management that uses Cloud KMS. The secrets engine supports key management, including creation, rotation, and revocation. It also supports encrypting and decrypting data by using managed keys. The Google KMS secrets engine lets you manage KMS keys through Vault policies and through IAM.
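The following commands sketch how the Google KMS secrets engine is enabled and used. The key name and key ring path are placeholders, and depending on the key type you might also need to pass a key version when decrypting:

```shell
# Enable the Google KMS secrets engine.
vault secrets enable gcpkms

# Create a Cloud KMS key that Vault manages, with automatic rotation.
vault write gcpkms/keys/app-secrets \
    key_ring=projects/my-project/locations/global/keyRings/vault-keyring \
    rotation_period=72h

# Encrypt and decrypt data through the managed key.
vault write gcpkms/encrypt/app-secrets plaintext="db-password=example"
vault write gcpkms/decrypt/app-secrets ciphertext="<ciphertext-from-encrypt>"
```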
Cloud KMS is integrated with Cloud Logging. By default, Cloud KMS writes Admin Activity audit logs, which record operations that modify the configuration or metadata of a resource. For example, Admin Activity audit logs record when a key ring is created or destroyed, or when the IAM access control policy for a project is set. Admin Activity audit logs are always enabled; you can't disable them.
Cloud KMS can also write Data Access audit logs. Data Access audit logs record API calls that read the configuration or metadata of resources, as well as user-driven API calls that create, modify, or read user-provided resource data. For example, Data Access audit logs record when a key is decrypted or when a call is made to read the IAM policy of a project. Data Access audit logs are disabled by default and are not written unless you explicitly enable them.
For more information about what operations can be audited for Cloud KMS, see audited operations in the Cloud KMS documentation.
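You enable Data Access audit logs for Cloud KMS through the project's IAM policy. The following is a sketch of one way to do this; the project ID is a placeholder:

```shell
# Export the current IAM policy.
gcloud projects get-iam-policy my-vault-project --format=yaml > policy.yaml

# Add an auditConfigs entry to policy.yaml:
#
#   auditConfigs:
#   - service: cloudkms.googleapis.com
#     auditLogConfigs:
#     - logType: ADMIN_READ
#     - logType: DATA_READ
#     - logType: DATA_WRITE

# Apply the updated policy.
gcloud projects set-iam-policy my-vault-project policy.yaml
```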
Storing secrets on the backend
HashiCorp Vault has a pluggable storage system that lets you specify where you want to persist data at rest. For the Vault backend, we recommend Cloud Storage, Google's scalable and highly durable object storage service, because of its high performance, low cost, and support for high availability.
Cloud Spanner is Google's globally distributed, ACID-compliant database that automatically handles replicas, sharding, and transaction processing. If you have demanding workload requirements, we recommend Cloud Spanner as an alternative storage backend to Cloud Storage for your Vault configuration.
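In Vault's server configuration, the storage backend is selected with a `storage` stanza. The bucket and database names below are placeholders:

```hcl
# Recommended: Cloud Storage backend with high availability enabled.
storage "gcs" {
  bucket     = "my-vault-data"
  ha_enabled = "true"
}

# Alternative for demanding workloads: Cloud Spanner.
# storage "spanner" {
#   database   = "projects/my-project/instances/my-instance/databases/vault"
#   ha_enabled = "true"
# }
```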
Encryption at rest for your clusters
Your Anthos clusters are deployed to VMs that are configured to provide storage-layer encryption. GKE VMs are encrypted at the storage layer by default, which includes the contents of the etcd key-value store.
For Anthos clusters on VMware, we recommend that you configure clusters to use vSphere encryption. This provides encryption-at-rest protection for secrets.
By default, secrets are not further encrypted. You must use a secrets manager to get application-level encryption.
Bringing it all together
You can integrate the controls described earlier to achieve a highly available system that does the following:
- Separates key management from secrets management.
- Maintains a root of trust.
- Provides a way to rotate keys regularly to limit the impact of a key compromise.
- Provides a way to encrypt data at the application layer.
The following diagram shows the architecture for the system.
In this architecture, Vault servers are deployed to a dedicated GKE cluster, and a Cloud Storage bucket stores the encrypted unseal keys. Cloud KMS provides key management for the keys that encrypt secrets at the application layer and for the key that protects the material used to seal and unseal Vault.
To implement this architecture, you do the following:
- Create a Google Cloud project where you configure Vault. This project is separate from the project for your applications.
- Create a Cloud KMS key ring and cryptographic key that's suitable for encrypting and decrypting Vault master keys and root tokens.
- Create a GKE cluster to deploy Vault by using the guidance in the applicable cluster hardening guide (GKE or Anthos clusters on VMware).
- Deploy Vault to the GKE cluster.
- Enable the HashiCorp Google Cloud secrets engine in order to manage short-lived credentials.
- Enable the HashiCorp Google KMS secrets engine to provide a system that can enable envelope encryption for application-layer encryption of your Kubernetes cluster secrets.
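The steps above can be sketched with the gcloud and vault CLIs. All project, key, and cluster names are placeholders, and the commands omit flags that you would tune for your environment:

```shell
# Step 1: a dedicated project for Vault.
gcloud projects create my-vault-project

# Step 2: key ring and key for Vault's unseal material.
gcloud kms keyrings create vault-keyring \
    --project=my-vault-project --location=global
gcloud kms keys create vault-unseal-key \
    --project=my-vault-project --location=global \
    --keyring=vault-keyring --purpose=encryption

# Step 3: a GKE cluster for Vault (apply the hardening guide's
# settings in addition to the flags shown here).
gcloud container clusters create vault-cluster \
    --project=my-vault-project --region=us-central1 \
    --enable-shielded-nodes

# Steps 5-6: after deploying Vault, enable the secrets engines.
vault secrets enable gcp
vault secrets enable gcpkms
```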