This document describes how to harden the security of your GKE on Bare Metal clusters.
Secure your containers using SELinux
You can secure your containers by enabling SELinux, which is supported for Red Hat Enterprise Linux (RHEL) and CentOS. If your host machines are running RHEL or CentOS and you want to enable SELinux for your cluster, you must enable SELinux in all of your host machines. See secure your containers using SELinux for details.
Use seccomp to restrict containers
Secure computing mode (seccomp) is available in version 1.11 of GKE on Bare Metal. Running containers with a seccomp profile improves the security of your cluster because it restricts the system calls that containers are allowed to make to the kernel. This reduces the chance of kernel vulnerabilities being exploited. A seccomp profile contains a list of system calls that a container is allowed to make. Any system calls not on the list are disallowed.
seccomp is enabled by default in version 1.11 of GKE on Bare Metal. This means that all system containers and customer workloads are run with the container runtime's seccomp profile. Even containers and workloads that don't specify a seccomp profile in their configuration files are subject to seccomp restrictions.
How to disable seccomp cluster-wide or on particular workloads
You can disable seccomp during cluster creation or cluster upgrade only. bmctl update can't be used to disable this feature. If you want to disable seccomp within a cluster, add the following clusterSecurity section to the cluster's configuration file:
apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: example
  namespace: cluster-example
spec:
  ...
  clusterSecurity:
    enableSeccomp: false
  ...
In the unlikely event that some of your workloads need to execute system calls that seccomp blocks by default, you don't have to disable seccomp on the whole cluster. Instead, you can single out particular workloads to run in unconfined mode. Running a workload in unconfined mode frees that workload from the restrictions that the seccomp profile imposes on the rest of the cluster.
To run a container in unconfined mode, add the following securityContext section to the Pod manifest:
apiVersion: v1
kind: Pod
....
spec:
  securityContext:
    seccompProfile:
      type: Unconfined
....
Don't run containers as root user
By default, processes in containers execute as root. This poses a potential security problem, because if a process breaks out of the container, that process runs as root on the host machine. It's therefore advisable to run all your workloads as a non-root user.
The following sections describe two ways of running containers as a non-root user.
Method #1: add USER instruction in Dockerfile
This method uses a Dockerfile to ensure that containers don't run as a root user. In a Dockerfile, you can specify which user the process inside a container should be run as. The following snippet from a Dockerfile shows how to do this:
....
# Add a user with userid 8877 and name nonroot
RUN useradd -u 8877 nonroot
# Run Container as nonroot
USER nonroot
....
In this example, the Linux command useradd -u creates a user called nonroot inside the container. This user has a user ID (UID) of 8877.
The next line in the Dockerfile runs the command USER nonroot. This command specifies that from this point on in the image, commands are run as the user nonroot.
Grant permissions to UID 8877 so that the container processes can execute properly.
Method #2: add securityContext fields in Kubernetes manifest file
This method uses a Kubernetes manifest file to ensure that containers don't run as a root user. Security settings are specified for a Pod, and those security settings are in turn applied to all containers within the Pod.
The following example shows an excerpt of a manifest file for a given Pod:
apiVersion: v1
kind: Pod
metadata:
  name: name-of-pod
spec:
  securityContext:
    runAsUser: 8877
    runAsGroup: 8877
....
The runAsUser field specifies that for any containers in the Pod, all processes run with user ID 8877. The runAsGroup field specifies that these processes have a primary group ID (GID) of 8877. Remember to grant the necessary and sufficient permissions to UID 8877 so that the container processes can execute properly.
This ensures that processes within a container are run as UID 8877, which has fewer privileges than root.
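As a rough illustration of how the Pod-level setting reaches each container, the following Python sketch resolves the UID that a manifest requests for a given container. It assumes the manifest has already been loaded into a plain dictionary. The function name effective_run_as_user, the container names, and the image names are hypothetical and used only for illustration. It relies on the standard Kubernetes rule that a container-level securityContext takes precedence over the Pod-level one.

```python
from typing import Optional


def effective_run_as_user(pod: dict, container_name: str) -> Optional[int]:
    """Return the UID the manifest requests for a container.

    Returns None when neither the Pod nor the container sets runAsUser,
    in which case the user baked into the image applies (often root).
    """
    spec = pod.get("spec") or {}
    pod_uid = (spec.get("securityContext") or {}).get("runAsUser")
    for container in spec.get("containers") or []:
        if container.get("name") == container_name:
            # A container-level securityContext overrides the Pod-level value.
            ctx = container.get("securityContext") or {}
            return ctx.get("runAsUser", pod_uid)
    raise KeyError(f"no container named {container_name!r} in the Pod spec")


# Hypothetical manifest: the Pod-level UID is 8877, but one container overrides it.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "name-of-pod"},
    "spec": {
        "securityContext": {"runAsUser": 8877, "runAsGroup": 8877},
        "containers": [
            {"name": "app", "image": "example/app"},
            {
                "name": "sidecar",
                "image": "example/sidecar",
                "securityContext": {"runAsUser": 0},
            },
        ],
    },
}

for name in ("app", "sidecar"):
    print(name, effective_run_as_user(pod, name))
```

A check like this can flag containers that quietly override the Pod-level setting and run as UID 0 despite the Pod's securityContext.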
System containers in GKE on Bare Metal help install and manage clusters. The UIDs and GIDs used by these containers can be controlled by the startUIDRangeRootlessContainers field in the cluster specification. startUIDRangeRootlessContainers is an optional field that defaults to 2000 if not specified. Allowed values for startUIDRangeRootlessContainers are 1000–57000. The startUIDRangeRootlessContainers value can be changed during upgrades only. The system containers use the UIDs and GIDs in the range startUIDRangeRootlessContainers to startUIDRangeRootlessContainers + 2999.
The following example shows an excerpt of a manifest file for a Cluster:
apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: name-of-cluster
spec:
  clusterSecurity:
    startUIDRangeRootlessContainers: 5000
  ...
Choose the value for startUIDRangeRootlessContainers so that the UID and GID space used by the system containers does not overlap with the UIDs and GIDs assigned to user workloads.
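The range arithmetic can be checked before you pick a value. The following Python sketch assumes only what is stated here: the block of system IDs runs from the start value to the start value plus 2999, and the start value must lie between 1000 and 57000. The names system_id_range and conflicting_ids are hypothetical helpers, not part of bmctl or any GKE tooling.

```python
from typing import Iterable, List

SYSTEM_ID_COUNT = 3000                  # start .. start + 2999, inclusive
MIN_START, MAX_START = 1000, 57000      # allowed values for the field


def system_id_range(start: int = 2000) -> range:
    """Return the UIDs/GIDs reserved for system containers for a given start value."""
    if not MIN_START <= start <= MAX_START:
        raise ValueError(
            f"start must be between {MIN_START} and {MAX_START}, got {start}"
        )
    return range(start, start + SYSTEM_ID_COUNT)


def conflicting_ids(start: int, workload_ids: Iterable[int]) -> List[int]:
    """Return the workload UIDs/GIDs that fall inside the system range."""
    reserved = system_id_range(start)
    return sorted(i for i in set(workload_ids) if i in reserved)


default_range = system_id_range()
print(default_range[0], default_range[-1])

# With a start value of 5000, a workload running as UID 8877 does not conflict.
print(conflicting_ids(5000, [8877]))
```

With the default start value, the first and last reserved IDs are 2000 and 4999, which matches the range quoted for rootless mode.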
How to disable rootless mode
Starting with GKE on Bare Metal release 1.10, Kubernetes control plane containers and system containers run as non-root users by default. GKE on Bare Metal assigns these users UIDs and GIDs in the range 2000–4999. However, this assignment can cause problems if those UIDs and GIDs have already been allocated to processes running inside your environment.
Starting with GKE on Bare Metal release 1.11, you can disable rootless mode when you upgrade your cluster. When rootless mode is disabled, Kubernetes control plane containers and system containers run as the root user.
To disable rootless mode, perform the following steps:
Add the following clusterSecurity section to the cluster's configuration file:
apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: example
  namespace: cluster-example
spec:
  ...
  clusterSecurity:
    enableRootlessContainers: false
  ...
Upgrade your cluster. For details, see Upgrade clusters.
Restrict the ability for workloads to self-modify
Certain Kubernetes workloads, especially system workloads, have permission to self-modify. For example, some workloads vertically autoscale themselves. While convenient, this can allow an attacker who has already compromised a node to escalate further in the cluster. For example, an attacker could have a workload on the node change itself to run as a more privileged service account that exists in the same namespace.
Ideally, workloads should not be granted the permission to modify themselves in the first place. When self-modification is necessary, you can limit permissions by applying Gatekeeper or Policy Controller constraints, such as NoUpdateServiceAccount from the open source Gatekeeper library, which provides several useful security policies.
When you deploy policies, you usually need to allow the controllers that manage the cluster lifecycle to bypass them, so that the controllers can make changes to the cluster, such as applying cluster upgrades. For example, if you deploy the NoUpdateServiceAccount policy on GKE on Bare Metal, you must set the following parameters in the constraint:
parameters:
  allowedGroups:
  - system:masters
  allowedUsers:
Disable kubelet read-only port
Starting with release 1.15.0, GKE on Bare Metal disables port 10255, the kubelet read-only port, by default. Any customer workloads that are configured to read data from this insecure port should migrate to the secure kubelet port 10250.
Only clusters created with version 1.15.0 or higher have this port disabled by default. The kubelet read-only port 10255 remains accessible for clusters created with a version lower than 1.15.0, even after a cluster upgrade to version 1.15.0 or higher.
This change was made because the kubelet leaks low sensitivity information over port 10255, which is unauthenticated. The information includes the full configuration information for all Pods running on a Node, which can be valuable to an attacker. It also exposes metrics and status information, which can provide business-sensitive insights.
Disabling the kubelet read-only port is recommended by the CIS Kubernetes Benchmark.
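To see whether the read-only port still accepts connections on a node, for example on a cluster created before 1.15.0, a plain TCP connection test is one option. The following Python sketch uses only the standard library. The function name port_is_open and the NODE_IP placeholder are assumptions for illustration. A successful connection only shows that the port is reachable from the machine running the check.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    node = "NODE_IP"  # placeholder: replace with the address of a cluster node
    for port in (10255, 10250):
        state = "open" if port_is_open(node, port) else "closed or unreachable"
        print(f"{node}:{port} is {state}")
```

On a cluster created with version 1.15.0 or higher, you would expect port 10255 to be reported as closed and port 10250 as open, provided network policy allows the connection from where you run the check.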
Monitoring security bulletins and upgrading your clusters are important security measures to take once your clusters are up and running.
Monitor security bulletins
The GKE Enterprise security team publishes security bulletins for high and critical severity vulnerabilities.
These bulletins follow a common Google Cloud vulnerability numbering
scheme and are linked from the main Google Cloud bulletins page and the
GKE on Bare Metal release notes. Use this XML feed to subscribe to security
bulletins for GKE on Bare Metal and related products:
When customer action is required to address these high and critical vulnerabilities, Google contacts customers by email. In addition, Google might also contact customers with support contracts through support channels.
For more information about how Google manages security vulnerabilities and patches for GKE and GKE Enterprise, see Security patching.
Kubernetes regularly introduces new security features and provides security patches. GKE on Bare Metal releases incorporate Kubernetes security enhancements that address security vulnerabilities that may affect your clusters.
You are responsible for keeping your GKE clusters up to date. For each release, review the release notes. To minimize security risks to your GKE clusters, plan to update to new patch releases every month and minor versions every three months.
One of the many advantages of upgrading a cluster is that it automatically
refreshes the cluster's kubeconfig file. The kubeconfig file authenticates a
user to a cluster. The kubeconfig file is added to your cluster directory when
you create a cluster with
bmctl. The default name and path is
When you upgrade a cluster, that cluster's kubeconfig file is automatically
renewed. Otherwise, the kubeconfig file expires one year after it was created.
For information about how to upgrade your clusters, see upgrade your clusters.
Use VPC Service Controls with Cloud Interconnect or Cloud VPN
Cloud Interconnect provides low latency, high availability connections that let you transfer data reliably between your on-premises bare metal machines and Google Cloud Virtual Private Cloud (VPC) networks. To learn more about Cloud Interconnect, see Dedicated Interconnect provisioning overview.
VPC Service Controls works with either Cloud Interconnect or Cloud VPN to provide additional security for your clusters. VPC Service Controls helps to mitigate the risk of data exfiltration. Using VPC Service Controls, you can add projects to service perimeters that protect resources and services from requests that originate outside the perimeter. To learn more about service perimeters, see Service perimeter details and configuration.
To fully protect GKE on Bare Metal, you need to use Restricted VIP and add the following APIs to the service perimeter:
- Artifact Registry API (
- Cloud Resource Manager API (
- Compute Engine API (
- Connect Gateway API (
- Google Container Registry API (
- GKE Connect API (
- GKE Hub API (
- GKE On-Prem API (
- Identity and Access Management (IAM) API (
- Cloud Logging API (
- Cloud Monitoring API (
- Config Monitoring for Ops API (
- Service Control API (
- Cloud Storage API (
When you use bmctl to create or upgrade a cluster, use the --skip-api-check flag to bypass calling Service Usage API (serviceusage.googleapis.com). Service Usage API isn't supported by VPC Service Controls.