This page describes how to use GKE Sandbox to protect the host kernel on your nodes when containers in the Pod execute unknown or untrusted code, or need extra isolation from the node.
Enabling GKE Sandbox
You can enable GKE Sandbox on a new cluster or an existing cluster.
Before you begin
Before you start, make sure you have performed the following tasks:
- Ensure that you have enabled the Google Kubernetes Engine API.
- Ensure that you have installed the Cloud SDK.
- Set up default gcloud settings using one of the following methods:
  - Using gcloud init, if you want to be walked through setting defaults.
  - Using gcloud config, to individually set your project ID, zone, and region.
Using gcloud init
If you receive the error One of [--zone, --region] must be supplied: Please specify location, complete this section.
- Run gcloud init and follow the directions:
gcloud init
If you are using SSH on a remote server, use the --console-only flag to prevent the command from launching a browser:
gcloud init --console-only
- Follow the instructions to authorize gcloud to use your Google Cloud account.
- Create a new configuration or select an existing one.
- Choose a Google Cloud project.
- Choose a default Compute Engine zone.
Using gcloud config
- Set your default project ID:
gcloud config set project PROJECT_ID
- If you are working with zonal clusters, set your default compute zone:
gcloud config set compute/zone COMPUTE_ZONE
- If you are working with regional clusters, set your default compute region:
gcloud config set compute/region COMPUTE_REGION
- Update gcloud to the latest version:
gcloud components update
- GKE Sandbox requires GKE version 1.13.5-gke.15 or later, for the cluster control plane and nodes.
- Ensure that the gcloud command is version 243.0.0 or later.
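To check both requirements from the command line, you can compare the installed gcloud version and your cluster's versions against the minimums above; a quick sketch, where cluster-name and compute-zone are placeholders for your own values:
# Check the gcloud version (should be 243.0.0 or later).
gcloud version

# Check the control plane and node versions of an existing cluster
# (should be 1.13.5-gke.15 or later).
gcloud container clusters describe cluster-name \
    --zone=compute-zone \
    --format="value(currentMasterVersion,currentNodeVersion)"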
On a new cluster
To enable GKE Sandbox, you configure a node pool. The default node pool (the first node pool in your cluster, created when the cluster is created) cannot use GKE Sandbox. To enable GKE Sandbox during cluster creation, you must add a second node pool when you create the cluster.
Console
Visit the Google Kubernetes Engine menu in Cloud Console.
Click Create.
Optional but recommended: Enable Google Cloud's operations suite for GKE, so that gVisor messages are logged.
Click Add node pool.
Configure the following settings for the node pool:
- In the Nodes section, for Image type, select Container-Optimized OS with Containerd (cos_containerd).
- In the Security section, select the Enable sandbox with gVisor checkbox.
Configure other node pool settings as required.
Save the node pool settings and continue configuring your cluster.
gcloud
GKE Sandbox can't be enabled for the default node pool, and it isn't possible to create additional node pools at the same time as you create a new cluster using the gcloud command. Instead, create your cluster as you normally would. It is optional but recommended that you enable Stackdriver Logging and Stackdriver Monitoring by adding the --enable-stackdriver-kubernetes flag, so that gVisor messages are logged.
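For example, a cluster creation command with logging and monitoring enabled might look like the following sketch; cluster-name and compute-zone are placeholders, and any other flags you normally use still apply:
gcloud container clusters create cluster-name \
    --zone=compute-zone \
    --enable-stackdriver-kubernetes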
Next, use the gcloud container node-pools create command and set the --sandbox flag to type=gvisor. Replace the placeholder values with your own, and specify a node version that meets the minimum requirement (1.13.5-gke.15 or later).
gcloud container node-pools create node-pool-name \
--cluster=cluster-name \
--node-version=node-version \
--machine-type=machine-type \
--image-type=cos_containerd \
--sandbox type=gvisor
The gvisor RuntimeClass is instantiated during node creation, before any workloads are scheduled onto the node. You can check for the existence of the gvisor RuntimeClass using the following command:
kubectl get runtimeclasses
NAME AGE
gvisor 19s
If you are running a version earlier than 1.17.9-gke.1500, or a 1.18 version
earlier than 1.18.6-gke.600, you must also wait for gvisor.config.common-webhooks.networking.gke.io
to be instantiated. To check, use the following command:
kubectl get mutatingwebhookconfiguration gvisor.config.common-webhooks.networking.gke.io
NAME CREATED AT
gvisor.config.common-webhooks.networking.gke.io 2020-04-06T17:07:17Z
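If you automate cluster setup, you can poll for the webhook instead of checking it manually; a minimal sketch:
# Wait until the gVisor webhook exists; add a timeout for unattended use.
until kubectl get mutatingwebhookconfiguration \
    gvisor.config.common-webhooks.networking.gke.io > /dev/null 2>&1; do
  echo "Waiting for the gVisor webhook..."
  sleep 5
done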
On an existing cluster
You can enable GKE Sandbox on an existing cluster by adding a new node pool and enabling the feature for that node pool, or by modifying an existing non-default node pool.
Console
Visit the Google Kubernetes Engine menu in Cloud Console.
Click the cluster's Edit button, which looks like a pencil.
If necessary, add an additional node pool by clicking Add node pool. To edit an existing node pool, click the node pool's Edit button. Do not enable Sandbox with gVisor on the default node pool.
Enable Sandbox with gVisor, then click Done.
If necessary, make additional configuration changes to the cluster, then click Save.
gcloud
To create a new node pool with GKE Sandbox enabled, use a command like the following:
gcloud container node-pools create node-pool-name \
--cluster=cluster-name \
--machine-type=machine-type \
--image-type=cos_containerd \
--sandbox type=gvisor
To enable GKE Sandbox on an existing node pool, use a command like the following. Do not enable --sandbox type=gvisor on the default node pool.
gcloud container node-pools update node-pool-name \
--sandbox type=gvisor
The gvisor RuntimeClass is instantiated during node creation, before any workloads are scheduled onto the node. You can check for the existence of the gvisor RuntimeClass using the following command:
kubectl get runtimeclasses
NAME AGE
gvisor 19s
If you are running a version earlier than 1.17.9-gke.1500, or a 1.18 version
earlier than 1.18.6-gke.600, you must also wait for gvisor.config.common-webhooks.networking.gke.io
to be instantiated. To check, use the following command:
kubectl get mutatingwebhookconfiguration gvisor.config.common-webhooks.networking.gke.io
NAME CREATED AT
gvisor.config.common-webhooks.networking.gke.io 2020-04-06T17:07:17Z
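You can also confirm from gcloud that the node pool has the sandbox configured. The field path used here (config.sandboxConfig.type) is an assumption about the node pool description output; check it against the full output of gcloud container node-pools describe for your gcloud version:
# Inspect the node pool's sandbox configuration (field path is assumed).
gcloud container node-pools describe node-pool-name \
    --cluster=cluster-name \
    --format="value(config.sandboxConfig.type)"
The output should indicate gvisor when GKE Sandbox is enabled on the node pool.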
Optional: Enable Stackdriver Logging and Stackdriver Monitoring
It is optional but recommended that you enable Stackdriver Logging and Stackdriver Monitoring on the cluster, so that gVisor messages are logged. You must use Google Cloud Console to enable these features on an existing cluster.
Visit the Google Kubernetes Engine menu in Cloud Console.
Click the cluster's Edit button, which looks like a pencil.
Enable Stackdriver Logging and Stackdriver Monitoring.
If necessary, make additional configuration changes to the cluster, then click Save.
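If you want to verify these settings from the command line, the cluster's logging and monitoring services appear in its description; a sketch, where cluster-name is a placeholder:
gcloud container clusters describe cluster-name \
    --format="value(loggingService,monitoringService)"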
Working with GKE Sandbox
Running an application in a sandbox
To force a Deployment to run on a node with GKE Sandbox enabled, set its spec.template.spec.runtimeClassName to gvisor, as shown by this manifest for a Deployment:
# httpd.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpd
  labels:
    app: httpd
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpd
  template:
    metadata:
      labels:
        app: httpd
    spec:
      runtimeClassName: gvisor
      containers:
      - name: httpd
        image: httpd
To create the Deployment, use the kubectl create command:
kubectl create -f httpd.yaml
The Pod is deployed to a node in a node pool with GKE Sandbox enabled. To verify the deployment, use the following command to find the node where the Pod is deployed:
kubectl get pods
The output is similar to:
NAME READY STATUS RESTARTS AGE
httpd-db5899bc9-dk7lk 1/1 Running 0 24s
From the output, find the name of the Pod, and then run the following command to check its value for RuntimeClass:
kubectl get pods pod-name -o jsonpath='{.spec.runtimeClassName}'
The output is:
gvisor
Alternatively, you can list the RuntimeClass of each Pod, and look for the ones where it is set to gvisor:
kubectl get pods -o jsonpath=$'{range .items[*]}{.metadata.name}: {.spec.runtimeClassName}\n{end}'
Output is:
pod-name: gvisor
This method of verifying that the Pod is running in a sandbox is trustworthy because it does not rely on any data within the sandbox itself. Anything reported from within the sandbox is untrustworthy, because it could be defective or malicious.
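As a convenience, you can list each Pod's name, RuntimeClass, and node in a single command using kubectl's custom-columns output; a sketch:
kubectl get pods -o custom-columns=NAME:.metadata.name,RUNTIMECLASS:.spec.runtimeClassName,NODE:.spec.nodeName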
Running a regular Pod along with sandboxed Pods
After enabling GKE Sandbox on a node pool, you can run trusted applications on those nodes without using a sandbox by using node taints and tolerations. These Pods are referred to as "regular Pods" to distinguish them from sandboxed Pods.
Regular Pods, just like sandboxed Pods, are prevented from accessing other Google Cloud services or cluster metadata. This prevention is part of the node's configuration. If your regular Pods or sandboxed Pods require access to Google Cloud services, use Workload Identity.
GKE Sandbox adds the following label and taint to nodes that can run sandboxed Pods:
labels:
  sandbox.gke.io/runtime: gvisor
taints:
- effect: NoSchedule
  key: sandbox.gke.io/runtime
  value: gvisor
In addition to any node affinity and toleration settings in your Pod manifest, GKE Sandbox applies the following node affinity and toleration to all Pods with RuntimeClass set to gvisor:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: sandbox.gke.io/runtime
          operator: In
          values:
          - gvisor
tolerations:
- effect: NoSchedule
  key: sandbox.gke.io/runtime
  operator: Equal
  value: gvisor
To schedule a regular Pod on a node with GKE Sandbox enabled, manually apply the node affinity and toleration above in your Pod manifest.
- If your Pod can run on nodes with GKE Sandbox enabled, add the toleration (see the fragment after this list).
- If your Pod must run on nodes with GKE Sandbox enabled, add both the node affinity and the toleration.
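For the first case, a toleration alone is enough; a minimal fragment for the Pod spec, matching the toleration shown above:
spec:
  tolerations:
  - effect: NoSchedule
    key: sandbox.gke.io/runtime
    operator: Equal
    value: gvisor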
For example, the following manifest modifies the manifest used in Running an application in a sandbox so that it runs as a regular Pod on a node with sandboxed Pods, by removing the runtimeClassName and adding both the node affinity and toleration above.
# httpd-no-sandbox.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpd-no-sandbox
  labels:
    app: httpd
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpd
  template:
    metadata:
      labels:
        app: httpd
    spec:
      containers:
      - name: httpd
        image: httpd
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: sandbox.gke.io/runtime
                operator: In
                values:
                - gvisor
      tolerations:
      - effect: NoSchedule
        key: sandbox.gke.io/runtime
        operator: Equal
        value: gvisor
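To create this Deployment, apply the manifest (assuming you saved it as httpd-no-sandbox.yaml, as in the comment above):
kubectl create -f httpd-no-sandbox.yaml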
First, verify that the Deployment is not running in a sandbox:
kubectl get pods -o jsonpath=$'{range .items[*]}{.metadata.name}: {.spec.runtimeClassName}\n{end}'
The output is similar to:
httpd-db5899bc9-dk7lk: gvisor
httpd-no-sandbox-5bf87996c6-cfmmd:
The httpd Deployment created earlier is running in a sandbox, because its runtimeClass is gvisor. The httpd-no-sandbox Deployment has no value for runtimeClass, so it is not running in a sandbox.
Next, verify that the non-sandboxed Deployment is running on a node with GKE Sandbox by running the following command:
kubectl get pod -o jsonpath=$'{range .items[*]}{.metadata.name}: {.spec.nodeName}\n{end}'
The name of the node pool is embedded in the value of nodeName. Verify that the Pod is running on a node in a node pool with GKE Sandbox enabled.
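One way to cross-check is to list the nodes that carry the label GKE Sandbox applies (shown in the previous section) and confirm that the Pod's node appears among them; a sketch:
kubectl get nodes -l sandbox.gke.io/runtime=gvisor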
Verifying metadata protection
To validate the assertion that metadata is protected from nodes that can run sandboxed Pods, you can run a test:
Create a sandboxed Deployment from the following manifest, using kubectl apply -f. It uses the fedora image, which includes the curl command. The Pod runs the /bin/sleep command to ensure that the Deployment runs for 10000 seconds.
# sandbox-metadata-test.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fedora
  labels:
    app: fedora
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fedora
  template:
    metadata:
      labels:
        app: fedora
    spec:
      runtimeClassName: gvisor
      containers:
      - name: fedora
        image: fedora
        command: ["/bin/sleep","10000"]
Get the name of the Pod using kubectl get pods, then use kubectl exec to connect to the Pod interactively:
kubectl exec -it pod-name /bin/sh
You are connected to a container running in the Pod, in a /bin/sh session.
Within the interactive session, attempt to access a URL that returns cluster metadata:
curl -s "http://metadata.google.internal/computeMetadata/v1/instance/attributes/kube-env" -H "Metadata-Flavor: Google"
The command hangs and eventually times out, because the packets are silently dropped.
Press Ctrl+C to terminate the curl command, and type exit to disconnect from the Pod.
Remove the runtimeClassName line from the YAML manifest and redeploy the Pod using kubectl apply -f filename. The sandboxed Pod is terminated and recreated on a node without GKE Sandbox.
Get the new Pod name, connect to it using kubectl exec, and run the curl command again. This time, results are returned. This example output is truncated.
ALLOCATE_NODE_CIDRS: "true"
API_SERVER_TEST_LOG_LEVEL: --v=3
AUTOSCALER_ENV_VARS: kube_reserved=cpu=60m,memory=960Mi,ephemeral-storage=41Gi;...
...
Type exit to disconnect from the Pod.
Remove the deployment:
kubectl delete deployment fedora
Disabling GKE Sandbox
It isn't currently possible to update a node pool to disable GKE Sandbox. To disable GKE Sandbox on an existing node pool, you can do one of the following:
- Delete the node pool where GKE Sandbox was enabled. Before you do, delete the previously-sandboxed Pods; otherwise, those Pods run as regular Pods if no available nodes have GKE Sandbox enabled.
- Resize the node pool to zero nodes.
- Recreate the Pods without specifying a value for runtimeClassName.
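For the first two options, the gcloud commands might look like the following sketch; node-pool-name and cluster-name are placeholders:
# Resize the sandbox-enabled node pool to zero nodes.
gcloud container clusters resize cluster-name \
    --node-pool=node-pool-name \
    --num-nodes=0

# Or delete the node pool entirely.
gcloud container node-pools delete node-pool-name \
    --cluster=cluster-name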
What's next
- Learn more about managing node pools.
- Read the security overview.