Security overview

This page describes the security architecture of GKE on Azure, including encryption and node configuration.

GKE clusters offer features to help secure your workloads, including the contents of your container image, the container runtime, the cluster network, and access to the cluster API server.

When you use GKE clusters, you agree to take on certain responsibilities for your clusters. For more information, see GKE clusters shared responsibilities.

At-rest data encryption

At-rest data encryption is the encryption of stored data, as distinct from data in transit. By default, GKE on Azure encrypts data in etcd and storage volumes at rest using Azure platform-managed keys.

GKE on Azure clusters store data in Azure Disk volumes. These volumes are always encrypted at rest using Azure Key Vault keys. When you create clusters and node pools, you can provide a customer-managed Key Vault key to encrypt the cluster's underlying disk volumes. If you don't specify a key, Azure uses the default Azure-managed key within the Azure region where the cluster runs.

In addition, all GKE clusters enable Application-layer Secrets Encryption for sensitive data, such as Kubernetes Secret objects, which are stored in etcd. Even if attackers gain access to the underlying volume where etcd data is stored, the data remains encrypted.

When you create a cluster, you can provide an Azure Key Vault key in the --database-encryption-key-id parameter. This key is used to encrypt your application's data. If you don't provide a key during cluster creation, GKE on Azure creates one for your cluster automatically. This field is immutable: you can't change it after the cluster is created.

You can also manually create a Key Vault key or bring your own key (BYOK) with a hardware security module (HSM). For more information, see Bring your own key.

How application-level encryption works

Kubernetes offers application-level encryption with a technique known as envelope encryption. A local key, commonly called a data encryption key (DEK), is used to encrypt a Secret. The DEK itself is then encrypted with a second key called the key encryption key (KEK). The KEK is not stored by Kubernetes. When you create a new Kubernetes Secret, your cluster does the following:

  1. The Kubernetes API server generates a unique DEK for the Secret using a random number generator.

  2. The Kubernetes API server encrypts the Secret locally with the DEK.

  3. The Kubernetes API server sends the DEK to Azure Key Vault for encryption.

  4. Azure Key Vault uses a pre-generated KEK to encrypt the DEK and returns the encrypted DEK to the Kubernetes API server's Azure Key Vault plugin.

  5. The Kubernetes API server saves the encrypted Secret and the encrypted DEK to etcd. The plaintext DEK is not saved to disk.

  6. The Kubernetes API server creates an in-memory cache entry to map the encrypted DEK to the plaintext DEK. This lets the API server decrypt recently accessed Secrets without querying Azure Key Vault.

When a client requests a Secret from the Kubernetes API server, here's what happens:

  1. The Kubernetes API server retrieves the encrypted Secret and the encrypted DEK from etcd.

  2. The Kubernetes API server checks the cache for an existing map entry for the encrypted DEK and, if one is found, decrypts the Secret with the cached plaintext DEK.

  3. If there is no matching cache entry, the API server sends the encrypted DEK to Azure Key Vault for decryption using the KEK. The decrypted DEK is then used to decrypt the Secret.

  4. Finally, the Kubernetes API server returns the decrypted Secret to the client.
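
The write and read paths above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Kubernetes or Azure Key Vault implementation: FakeKeyVault, ApiServer, seal, and unseal are hypothetical names, etcd is modeled as a dictionary, and the cipher is a toy SHA-256 keystream that stands in for a real algorithm such as AES and must not be used to protect data.

```python
import hashlib
import secrets


def _toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. Stand-in only, NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt with a fresh random nonce and prepend the nonce to the output."""
    nonce = secrets.token_bytes(16)
    return nonce + _toy_cipher(key, nonce, plaintext)


def unseal(key: bytes, blob: bytes) -> bytes:
    """Split off the nonce and reverse seal()."""
    nonce, body = blob[:16], blob[16:]
    return _toy_cipher(key, nonce, body)


class FakeKeyVault:
    """Hypothetical stand-in for the external key service. Only it holds the KEK."""

    def __init__(self) -> None:
        self._kek = secrets.token_bytes(32)  # pre-generated KEK
        self.unwrap_calls = 0                # counts decryption requests

    def wrap(self, dek: bytes) -> bytes:
        return seal(self._kek, dek)

    def unwrap(self, wrapped_dek: bytes) -> bytes:
        self.unwrap_calls += 1
        return unseal(self._kek, wrapped_dek)


class ApiServer:
    """Hypothetical model of the API server's envelope encryption logic."""

    def __init__(self, vault: FakeKeyVault) -> None:
        self.vault = vault
        self.etcd = {}       # name -> (encrypted Secret, encrypted DEK)
        self.dek_cache = {}  # encrypted DEK -> plaintext DEK, in memory only

    def create_secret(self, name: str, value: bytes) -> None:
        dek = secrets.token_bytes(32)                      # 1. unique random DEK
        encrypted_secret = seal(dek, value)                # 2. encrypt locally
        wrapped_dek = self.vault.wrap(dek)                 # 3-4. KEK encrypts DEK
        self.etcd[name] = (encrypted_secret, wrapped_dek)  # 5. persist both
        self.dek_cache[wrapped_dek] = dek                  # 6. cache the mapping

    def read_secret(self, name: str) -> bytes:
        encrypted_secret, wrapped_dek = self.etcd[name]    # 1. fetch from etcd
        dek = self.dek_cache.get(wrapped_dek)              # 2. try the cache
        if dek is None:
            dek = self.vault.unwrap(wrapped_dek)           # 3. ask the key service
        return unseal(dek, encrypted_secret)               # 4. return plaintext
```

In this model, reading a Secret right after creating it never contacts the key service, because the plaintext DEK is still cached. Clearing dek_cache, as an API server restart would, forces one unwrap call on the next read.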

Config Encryption with Key Vault Firewall

If you pass a public key for encryption, the service principal doesn't need the permission to encrypt, but it does need the permission to manage role assignments. The easiest way to do this is to assign the Azure User Access Administrator built-in role to your service principal.

To further secure your Azure Key Vault, you can enable Azure Key Vault firewall. GKE on Azure can then use a public key for encryption and avoid network access to the key vault.

To configure the firewall, download the Key Vault key with the Azure CLI. Then pass the key to the --config-encryption-public-key parameter when you create a cluster with the Google Cloud CLI.

You still need to enable service endpoints for Key Vault in all the subnets used for your cluster. For more information, see Virtual network service endpoints for Azure Key Vault.

Key Rotation

In contrast to certificate rotation, key rotation is the act of changing the underlying cryptographic material contained in a key encryption key (KEK). It can be triggered automatically as part of a scheduled rotation, or manually, usually after a security incident where keys might have been compromised. Key rotation replaces only the single field in the key that contains the raw encryption/decryption key data.
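
As a rough conceptual model of that last point, the sketch below treats a key as a record whose metadata stays fixed while only the raw key material is replaced. The KeyEncryptionKey class, its field names, and the rotate function are hypothetical and chosen for illustration; they do not reflect how Azure Key Vault represents or versions keys.

```python
import secrets
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class KeyEncryptionKey:
    """Hypothetical key record: metadata plus the raw key material."""
    name: str
    purpose: str
    material: bytes


def rotate(key: KeyEncryptionKey) -> KeyEncryptionKey:
    """Return a copy of the key in which only the raw material is new."""
    return replace(key, material=secrets.token_bytes(32))
```

Calling rotate on a key returns a record with the same name and purpose but fresh 32-byte material, which is the sense in which rotation touches a single field.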

For more information, see Key rotation.

Cluster trust

All cluster communication uses Transport Layer Security (TLS). Each cluster is provisioned with the following main self-signed root certificate authorities (CAs):

  • The cluster root CA is used to validate requests sent to the API server.
  • The etcd root CA is used to validate requests sent to etcd replicas.

Each cluster has a unique root CA. If one cluster's CA is compromised, no other cluster's CA is affected. All root CAs have a validity period of 30 years.

Node security

GKE on Azure deploys your workloads onto node pools of Azure VM instances. The following section explains security features of nodes.

Ubuntu

Your clusters use an optimized version of the Ubuntu OS to run the Kubernetes control plane and nodes. For more information, see security features in the Ubuntu documentation.

GKE clusters implement several security features, including the following:

Additional security guides are available for Ubuntu, such as the following:

Secure your workloads

Kubernetes allows users to quickly provision, scale, and update container-based workloads. This section describes tactics that you can use to limit the side effects of running containers, both on your cluster and on Google Cloud services.

Limit Pod container process privileges

Limiting the privileges of containerized processes is important for your cluster's security. You can set security-related options with a security context. These settings let you change the security settings of your processes such as the following:

  • User and group running the process
  • Available Linux capabilities
  • Privilege escalation
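
As an illustration of the settings listed above, the following sketch builds a Pod manifest that sets each of them. The securityContext field names are standard Kubernetes fields; the hardened_pod helper, the UID and GID values, and the image name are assumptions made for this example. The manifest is emitted as JSON, which kubectl accepts in place of YAML.

```python
import json


def hardened_pod(name: str, image: str) -> dict:
    """Build a Pod manifest with a restrictive security context (illustrative values)."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Pod-level settings: user and group running the process.
            "securityContext": {
                "runAsNonRoot": True,
                "runAsUser": 1000,   # assumed non-root UID
                "runAsGroup": 3000,  # assumed GID
            },
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Container-level settings: capabilities and escalation.
                    "securityContext": {
                        "allowPrivilegeEscalation": False,
                        "capabilities": {"drop": ["ALL"]},
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical image name; replace with your own.
    print(json.dumps(hardened_pod("demo", "example.com/demo:1.0"), indent=2))
```

Dropping all capabilities and then adding back only those a workload needs is usually easier to audit than removing capabilities one at a time.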

The default GKE on Azure node operating system, Ubuntu, uses default Docker AppArmor security policies for all containers. You can view the profile's template on GitHub. Among other things, this profile denies containers the following abilities:

  • Writing to files directly in /proc/
  • Writing to files that are not in a process ID directory (/proc/<number>)
  • Writing to files in /proc/sys other than /proc/sys/kernel/shm*
  • Mounting file systems

Restrict the ability for workloads to self-modify

Certain Kubernetes workloads, especially system workloads, have permission to self-modify. For example, some workloads vertically autoscale themselves. While convenient, this can allow an attacker who has already compromised a node to escalate further in the cluster. For example, an attacker could have a workload on the node change itself to run as a more privileged service account that exists in the same namespace.

Ideally, workloads should not be granted the permission to modify themselves in the first place. When self-modification is necessary, you can limit permissions by installing Policy Controller or Gatekeeper in your cluster and applying constraints, such as NoUpdateServiceAccount from the open source Gatekeeper library, which provides several useful security policies.

When you deploy policies, you usually need to allow the controllers that manage the cluster lifecycle to bypass them, so that the controllers can make changes to the cluster, such as applying cluster upgrades. For example, if you deploy the NoUpdateServiceAccount policy on GKE on Azure, you must set the following parameters in the Constraint:

parameters:
  allowedGroups: []
  allowedUsers:
  - service-PROJECT_NUMBER@gcp-sa-gkemulticloud.iam.gserviceaccount.com

Replace PROJECT_NUMBER with the number (not ID) of the project that hosts the cluster.
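
Because a project ID is easy to paste where the project number belongs, a small helper can generate these parameters and reject non-numeric input. This is a minimal sketch: the function names are hypothetical, and the only values taken from this page are the service account pattern and the two parameter keys.

```python
def gke_multicloud_service_agent(project_number: str) -> str:
    """Return the service account email for a numeric project number."""
    if not (project_number.isascii() and project_number.isdigit()):
        raise ValueError("expected the numeric project number, not the project ID")
    return f"service-{project_number}@gcp-sa-gkemulticloud.iam.gserviceaccount.com"


def constraint_parameters(project_number: str) -> dict:
    """Build the parameters block for the NoUpdateServiceAccount Constraint."""
    return {
        "allowedGroups": [],
        "allowedUsers": [gke_multicloud_service_agent(project_number)],
    }
```

For example, constraint_parameters("123456789012") yields a single allowed user, while constraint_parameters("my-project") raises ValueError.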
