This page shows you how to use Image streaming in Google Kubernetes Engine (GKE) to pull container images by streaming the image data as your applications need it.
New Autopilot clusters that run GKE version 1.25.5-gke.1000 and later automatically use Image streaming to pull eligible images. The instructions on this page only apply to Standard clusters.
Overview
Image streaming is a method of pulling container images in which GKE streams data from eligible images as requested by your applications. You can use Image streaming to allow your workloads to initialize without waiting for the entire image to download, which leads to significant improvements in initialization times. The shortened pull time provides you with benefits including the following:
- Faster autoscaling
- Reduced latency when pulling large images
- Faster Pod startup
With Image streaming, GKE uses a remote filesystem as the root filesystem for any containers that use eligible container images. GKE streams image data from the remote filesystem as needed by your workloads. Without Image streaming, GKE downloads the entire container image onto each node and uses it as the root filesystem for your workloads.
While streaming the image data, GKE downloads the entire container image onto the local disk in the background and caches it. GKE then serves future data read requests from the cached image.
When you deploy workloads that need to read specific files in the container image, the Image streaming backend serves only those requested files.
Requirements
You must meet the following requirements to use Image streaming in GKE Autopilot and Standard clusters:
You must enable the Container File System API.
New Autopilot clusters must run GKE version 1.25.5-gke.1000 or later to have Image streaming automatically enabled. For instructions, refer to Set the version and release channel of a new Autopilot cluster.
New and existing GKE Standard clusters must run version 1.18.6-gke.4801 or later.
You must use the Container-Optimized OS with containerd node image. Autopilot nodes always use this node image.
Your container images must be stored in Artifact Registry.
If you enable private nodes on your cluster, you must enable Private Google Access on the subnet for your nodes to access the Image streaming Service.
If VPC Service Controls protects your container images and you use Image streaming, you must also include the Image streaming API (containerfilesystem.googleapis.com) in the service perimeter.
If the GKE nodes in the cluster don't use the default service account, you must ensure that your custom service account has the Service Usage Consumer (roles/serviceusage.serviceUsageConsumer) IAM role in the project that hosts the container image.
Limitations
- You can't use a Secret to pull container images on GKE versions prior to 1.23.5-gke.1900.
- Container images that use the V2 Image Manifest, schema version 1 are not eligible.
- Container images encrypted with customer-managed encryption keys (CMEK) are eligible for Image streaming on GKE version 1.25.3-gke.1000 or later. In previous versions, GKE downloads these images without streaming the data. You can still use CMEK to protect attached persistent disks and custom boot disks in clusters that use Image streaming.
- Container images with duplicate layers are not supported. GKE downloads these images without streaming the data. Check your container image for empty layers or duplicate layers.
- The Artifact Registry repository must be in the same region as your GKE nodes, or in a multi-region that corresponds with the region where your nodes are running. For example:
  - If your nodes are in us-east1, Image streaming is available for repositories in the us-east1 region or the us multi-region since both GKE and Artifact Registry are running in data center locations within the United States.
  - If your nodes are in the northamerica-northeast1 region, the nodes are running in Canada. In this situation, Image streaming is only available for repositories in the same region.
- If your workloads read many files in an image during initialization, you might notice increased initialization times because of the latency added by the remote file reads.
- You might not notice the benefits of Image streaming during the first pull of an eligible image. However, after Image streaming caches the image, future image pulls on any cluster benefit from Image streaming.
- GKE Standard clusters use the cluster-level configuration to determine whether to enable Image streaming on new node pools created using node auto-provisioning. However, you cannot use workload separation to create node pools with Image streaming enabled when Image streaming is disabled at the cluster level.
- Linux file capabilities such as CAP_NET_RAW are supported with Image streaming in GKE version 1.22.6-gke.300 and later. For previous GKE versions, these capabilities are not available when the image file is streamed, or when the image is saved to the local disk. To avoid potential disruptions, do not use Image streaming for containers with these capabilities in GKE versions prior to 1.22.6-gke.300. If your container relies on Linux file capabilities, it might fail to start with permission denied errors when running with Image streaming enabled.
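The version thresholds in the preceding limitations can be checked before you roll out a workload. The following Python sketch is illustrative only: the function and dictionary names are hypothetical, and it assumes that GKE version strings have the form MAJOR.MINOR.PATCH-gke.BUILD and are ordered by comparing those four numbers.

```python
import re

# Minimum node versions copied from the limitations listed above.
MIN_VERSIONS = {
    "image_pull_secrets": "1.23.5-gke.1900",
    "cmek_images": "1.25.3-gke.1000",
    "linux_file_capabilities": "1.22.6-gke.300",
}

_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-gke\.(\d+)$")


def parse_gke_version(version):
    """Split 'MAJOR.MINOR.PATCH-gke.BUILD' into a tuple of four integers."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized GKE version: {version!r}")
    return tuple(int(part) for part in match.groups())


def supported_features(node_version):
    """Report which version-gated features work with Image streaming."""
    current = parse_gke_version(node_version)
    return {
        feature: current >= parse_gke_version(minimum)
        for feature, minimum in MIN_VERSIONS.items()
    }


if __name__ == "__main__":
    print(supported_features("1.24.9-gke.2000"))
```

A result of False for cmek_images means that GKE downloads those images without streaming; False for the other two entries means that the feature is not available with Image streaming on that version.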
Before you begin
Before you start, make sure you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
- Enable the Container File System API.
Enable Image streaming on clusters
You can enable Image streaming on new or existing Standard clusters by using the gcloud CLI --enable-image-streaming flag, or using the Google Cloud console. By default, node pools in the cluster inherit the Image streaming setting at the cluster level. You can change this behavior by enabling or disabling Image streaming on node pools in the cluster.
All new Autopilot clusters that run GKE version 1.25.5-gke.1000 and later use Image streaming to pull eligible images. For instructions, refer to Set the version and release channel of a new Autopilot cluster. The following instructions only apply to GKE Standard clusters.
On a new cluster
You can enable Image streaming on new clusters using the gcloud CLI or the Google Cloud console.
gcloud
To create a new cluster with Image streaming enabled, run the following command:
gcloud container clusters create CLUSTER_NAME \
--zone=COMPUTE_ZONE \
--image-type="COS_CONTAINERD" \
--enable-image-streaming
Replace the following:
- CLUSTER_NAME: the name of your new cluster.
- COMPUTE_ZONE: the Compute Engine zone for your new cluster. For regional clusters, use the --region=COMPUTE_REGION flag instead. Ensure that the zone or region is the same region or is within the multi-region of the Artifact Registry repository that contains the image.
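The location rule for COMPUTE_ZONE can be expressed as a small check. This Python sketch is an illustration, not an official tool: the function names are hypothetical, and the region-to-multi-region mapping contains only the us-east1 and us pairing stated in the limitations, so extend it from the Artifact Registry locations documentation before relying on it. It also assumes that a zone name is its region name followed by a hyphen and a zone suffix.

```python
def zone_to_region(zone):
    """Drop the trailing zone suffix: 'us-east1-b' -> 'us-east1'."""
    region, _, _suffix = zone.rpartition("-")
    if not region:
        raise ValueError(f"not a zone name: {zone!r}")
    return region


# Assumption: only the pairing given in the limitations is listed here.
REGION_TO_MULTI_REGION = {"us-east1": "us"}


def repository_location_ok(node_location, repo_location, is_zone=True):
    """Return True if the repository location can serve nodes in node_location."""
    region = zone_to_region(node_location) if is_zone else node_location
    if repo_location == region:
        return True
    return REGION_TO_MULTI_REGION.get(region) == repo_location


if __name__ == "__main__":
    print(repository_location_ok("us-east1-b", "us"))
    print(repository_location_ok("northamerica-northeast1", "us", is_zone=False))
```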
Console
Go to the Google Kubernetes Engine page in the Google Cloud console.
Click Create.
In the GKE Standard section, click Configure.
From the navigation pane, under Cluster, click Features.
In the Other section, select the Enable Image streaming checkbox.
Configure the cluster as needed, and then click Create.
On an existing cluster
You can enable Image streaming on existing clusters that meet the requirements using either the gcloud CLI or the Google Cloud console.
gcloud
To update an existing cluster to use Image streaming, run the following command using the gcloud CLI:
gcloud container clusters update CLUSTER_NAME \
--enable-image-streaming
Console
Go to the Google Kubernetes Engine page in the Google Cloud console.
Click the name of the cluster you want to modify.
On the Clusters page, in the Features section, click edit next to Image streaming.
In the Edit Image streaming dialog box, select the Enable Image streaming checkbox.
Click Save changes.
After you modify the cluster, GKE enables Image streaming on your existing node pools automatically by default. If you explicitly enabled or disabled Image streaming on individual node pools, those node pools do not inherit the changes to the cluster-level setting.
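The inheritance rule described above can be modeled in a few lines. This Python sketch is only a mental model with a hypothetical function name: it represents a node pool that was never explicitly configured as None, which is an assumption about how to encode "no explicit setting" rather than a field in the GKE API.

```python
from typing import Optional


def effective_image_streaming(cluster_enabled: bool,
                              pool_setting: Optional[bool] = None) -> bool:
    """Return whether a node pool uses Image streaming.

    pool_setting is None when the node pool was never explicitly
    enabled or disabled, so it inherits the cluster-level value.
    """
    if pool_setting is None:
        return cluster_enabled
    return pool_setting


if __name__ == "__main__":
    # A pool that was explicitly disabled ignores a later cluster-level enable.
    print(effective_image_streaming(cluster_enabled=True, pool_setting=False))
```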
Verify Image streaming is enabled on a cluster
You can check whether Image streaming is enabled at the cluster level using either the gcloud CLI or the Google Cloud console.
gcloud
Run the following command:
gcloud container clusters describe CLUSTER_NAME \
--flatten "nodePoolDefaults.nodeConfigDefaults"
The setting is enabled if the output is similar to the following:
gcfsConfig:
  enabled: true
...
The setting is disabled if the output is similar to the following:
gcfsConfig: {}
...
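If you script this check, you can parse the cluster description instead of reading it by eye. The following Python sketch assumes that you captured the output of gcloud container clusters describe with the gcloud --format=json option, and that the JSON field names match the YAML keys shown above (nodePoolDefaults, nodeConfigDefaults, gcfsConfig, enabled). The function name is hypothetical.

```python
import json


def cluster_image_streaming_enabled(describe_json):
    """Return True if the cluster-level gcfsConfig reports enabled: true."""
    cluster = json.loads(describe_json)
    defaults = cluster.get("nodePoolDefaults") or {}
    node_defaults = defaults.get("nodeConfigDefaults") or {}
    gcfs_config = node_defaults.get("gcfsConfig") or {}
    # An empty gcfsConfig ({}) means that the setting is disabled.
    return bool(gcfs_config.get("enabled", False))


if __name__ == "__main__":
    sample = '{"nodePoolDefaults": {"nodeConfigDefaults": {"gcfsConfig": {"enabled": true}}}}'
    print(cluster_image_streaming_enabled(sample))
```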
Console
Go to the Google Kubernetes Engine page in the Google Cloud console.
Click the name of the cluster you want to check.
On the Clusters page, in the Features section, the Image streaming field shows whether the setting is enabled.
Enable Image streaming on node pools
By default, node pools inherit the Image streaming setting at the cluster level. You can enable or disable Image streaming on specific node pools using the gcloud CLI.
On a new node pool
To create a new node pool with Image streaming enabled, run the following command:
gcloud container node-pools create NODE_POOL_NAME \
--cluster=CLUSTER_NAME \
--zone=COMPUTE_ZONE \
--image-type="COS_CONTAINERD" \
--enable-image-streaming
Replace the following:
- NODE_POOL_NAME: the name of your new node pool.
- CLUSTER_NAME: the name of the cluster for the node pool.
- COMPUTE_ZONE: the Compute Engine zone of your cluster. For regional clusters, use the --region=COMPUTE_REGION flag instead.
On an existing node pool
You can enable Image streaming on existing node pools that meet the requirements.
To update an existing node pool to use Image streaming, run the following command:
gcloud container node-pools update POOL_NAME \
--cluster=CLUSTER_NAME \
--enable-image-streaming
Verify Image streaming is enabled on a node pool
Check whether Image streaming is enabled for a node pool:
gcloud container node-pools describe POOL_NAME \
--cluster=CLUSTER_NAME
The setting is enabled if the output is similar to the following:
gcfsConfig:
  enabled: true
...
The setting is disabled if the output is similar to the following:
gcfsConfig: {}
...
Schedule a workload using Image streaming
After you enable Image streaming on your cluster, GKE automatically uses Image streaming when pulling eligible container images from Artifact Registry without requiring further configuration.
GKE adds the cloud.google.com/gke-image-streaming: "true" label to nodes in node pools with Image streaming enabled. On GKE Standard, if you enable or disable Image streaming on specific node pools so that your cluster has a mix of nodes that use Image streaming and nodes that don't, you can use node selectors in your deployments to control whether GKE schedules your workloads on nodes that use Image streaming.
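As a sketch of the node selector approach, the following Python snippet adds a nodeSelector for the Image streaming label to a Deployment manifest held as a dictionary. The function name and the sample manifest are hypothetical; only the label key and value come from the text above. The result is printed as JSON, which kubectl accepts in place of YAML.

```python
import copy
import json

IMAGE_STREAMING_LABEL = "cloud.google.com/gke-image-streaming"


def require_image_streaming(deployment):
    """Return a copy of a Deployment that only schedules on streaming nodes."""
    pinned = copy.deepcopy(deployment)
    pod_spec = pinned["spec"]["template"]["spec"]
    # Label values are strings, so the value is "true" rather than a boolean.
    pod_spec.setdefault("nodeSelector", {})[IMAGE_STREAMING_LABEL] = "true"
    return pinned


# Hypothetical minimal manifest used only to demonstrate the function.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "frontend"},
    "spec": {
        "selector": {"matchLabels": {"app": "guestbook"}},
        "template": {
            "metadata": {"labels": {"app": "guestbook"}},
            "spec": {"containers": [{"name": "php-redis", "image": "IMAGE_NAME"}]},
        },
    },
}

pinned = require_image_streaming(deployment)
print(json.dumps(pinned, indent=2))
```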
In the following example, you schedule a Deployment that uses a large container image on a cluster with Image streaming enabled. You can then optionally compare the performance to an image pull without Image streaming enabled.
Create a new cluster with Image streaming enabled:
gcloud container clusters create CLUSTER_NAME \
--zone=COMPUTE_ZONE \
--enable-image-streaming \
--image-type="COS_CONTAINERD"
Get credentials for the cluster:
gcloud container clusters get-credentials CLUSTER_NAME \
--zone=COMPUTE_ZONE
Save the following manifest as frontend-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: guestbook
      tier: frontend
  template:
    metadata:
      labels:
        app: guestbook
        tier: frontend
    spec:
      containers:
      - name: php-redis
        image: us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5
        env:
        - name: GET_HOSTS_FROM
          value: "dns"
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        ports:
        - containerPort: 80
The gb-frontend container image is 327 MB in size.
Apply the manifest to your cluster:
kubectl apply -f frontend-deployment.yaml
Verify that GKE created the Deployment:
kubectl get pods -l app=guestbook
The output is similar to the following:
NAMESPACE   NAME                        READY   STATUS      RESTARTS   AGE
default     frontend-64bcc69c4b-pgzgm   1/1     Completed   0          3s
Get the Kubernetes event log to see image pull events:
kubectl get events --all-namespaces
The output is similar to the following:
NAMESPACE   LAST SEEN   TYPE     REASON           OBJECT                                                 MESSAGE
default     11m         Normal   Pulling          pod/frontend-64bcc69c4b-pgzgm                          Pulling image "us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5"
default     11m         Normal   Pulled           pod/frontend-64bcc69c4b-pgzgm                          Successfully pulled image "us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5" in 1.536908032s
default     11m         Normal   ImageStreaming   node/gke-riptide-cluster-default-pool-f1552ec4-0pjv    Image us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5 is backed by image streaming.
...
In this output:
- The Pulled event shows the time taken for Image streaming to pull the image.
- The ImageStreaming event shows that the node uses Image streaming to serve the container image.
Compare performance with standard image pulls
In this optional example, you create a new cluster with Image streaming disabled and deploy the frontend Deployment to compare performance with Image streaming.
Create a new cluster with Image streaming disabled:
gcloud container clusters create CLUSTER2_NAME \
--zone=COMPUTE_ZONE \
--image-type="COS_CONTAINERD"
Get credentials for the cluster:
gcloud container clusters get-credentials CLUSTER2_NAME \
--zone=COMPUTE_ZONE
Deploy the frontend Deployment from the previous example:
kubectl apply -f frontend-deployment.yaml
Get the Kubernetes event log:
kubectl get events --all-namespaces
The output is similar to the following:
NAMESPACE   LAST SEEN   TYPE     REASON   OBJECT                          MESSAGE
default     87s         Normal   Pulled   pod/frontend-64bcc69c4b-qwmfp   Successfully pulled image "us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5" in 23.929723476s
Notice the time GKE took to pull the entire image. In this example output, GKE needed almost 24 seconds. With Image streaming enabled, GKE only needed 1.5 seconds to pull the image data that the workload required to start.
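To compare the two runs numerically, you can extract the durations from the Pulled event messages. The following Python sketch uses a hypothetical helper and assumes that the message reports the duration in plain seconds, as in the example outputs above; messages that use other units (such as milliseconds or minutes) are rejected rather than guessed at.

```python
import re

_PULLED_RE = re.compile(r'Successfully pulled image "[^"]+" in ([0-9]+(?:\.[0-9]+)?)s')


def pull_seconds(event_message):
    """Extract the pull duration in seconds from a Pulled event message."""
    match = _PULLED_RE.search(event_message)
    if match is None:
        raise ValueError("no pull duration in seconds found in message")
    return float(match.group(1))


IMAGE = "us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5"
streamed = pull_seconds(f'Successfully pulled image "{IMAGE}" in 1.536908032s')
standard = pull_seconds(f'Successfully pulled image "{IMAGE}" in 23.929723476s')
speedup = standard / streamed

print(f"Image streaming pull was {speedup:.1f}x faster in this example")
```

A single pair of pulls is a small sample, so treat the ratio as indicative for this image only.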
Clean up
To avoid charges, delete the clusters you created in the previous examples:
gcloud container clusters delete CLUSTER_NAME CLUSTER2_NAME
Disable Image streaming
If you use GKE Autopilot, you can't disable Image streaming on individual clusters. You can disable the Container File System API, which disables Image streaming for the entire project.
If you use GKE Standard clusters, you can disable Image streaming on individual clusters or specific node pools, as described in the following sections.
Disable Image streaming on a GKE Standard cluster
You can disable Image streaming on existing GKE Standard clusters using the gcloud CLI or the Google Cloud console.
gcloud
To disable Image streaming on an existing cluster, run the following command:
gcloud container clusters update CLUSTER_NAME \
--no-enable-image-streaming
Console
Go to the Google Kubernetes Engine page in the Google Cloud console.
Click the name of the cluster you want to modify.
On the Clusters page, under Features, click edit next to Image streaming.
In the Edit Image streaming dialog box, clear the Enable Image streaming checkbox.
Click Save changes.
On a new node pool
To disable Image streaming when creating a new node pool, specify the --no-enable-image-streaming flag, such as in the following command:
gcloud container node-pools create NODE_POOL_NAME \
--cluster=CLUSTER_NAME \
--zone=COMPUTE_ZONE \
--no-enable-image-streaming
On an existing node pool
To disable Image streaming on an existing node pool, run the following command:
gcloud container node-pools update NODE_POOL_NAME \
--cluster=CLUSTER_NAME \
--no-enable-image-streaming
Memory reservation for Image streaming
GKE reserves memory resources for Image streaming in addition to the memory that is reserved for node system components to run. GKE does not reserve additional CPU resources for Image streaming. In GKE Standard clusters, this reservation changes the memory resources that are available for you to request in your Pods. In GKE Autopilot, GKE manages system allocations, so there's no impact to scheduling your workloads.
For details about the memory reservations GKE makes for node components, see Standard cluster architecture.
In nodes that use Image streaming, GKE makes the following additional memory reservations:
- No additional memory for machines with less than 1 GiB of memory
- 1% of the first 4 GiB of memory
- 0.8% of the next 4 GiB of memory (up to 8 GiB)
- 0.4% of the next 8 GiB of memory (up to 16 GiB)
- 0.24% of the next 112 GiB of memory (up to 128 GiB)
- 0.08% of any memory above 128 GiB
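The tiers above can be turned into a rough estimate of the extra reservation for a given machine size. This Python sketch is an interpretation, not GKE's implementation: it assumes that each percentage applies only to the memory inside its own band, that machines with less than 1 GiB reserve nothing, and that no rounding is applied (the text does not say how GKE rounds). The function name is hypothetical.

```python
import math

# (upper bound of the band in GiB, fraction reserved within the band)
BANDS = [
    (4, 0.01),           # 1% of the first 4 GiB
    (8, 0.008),          # 0.8% of the next 4 GiB
    (16, 0.004),         # 0.4% of the next 8 GiB
    (128, 0.0024),       # 0.24% of the next 112 GiB
    (math.inf, 0.0008),  # 0.08% of any memory above 128 GiB
]


def image_streaming_reservation_gib(total_gib):
    """Estimate the additional memory (GiB) reserved for Image streaming."""
    if total_gib < 1:
        return 0.0
    reserved = 0.0
    lower = 0.0
    for upper, rate in BANDS:
        if total_gib <= lower:
            break
        reserved += (min(total_gib, upper) - lower) * rate
        lower = upper
    return reserved


if __name__ == "__main__":
    for size in (4, 16, 64, 256):
        print(f"{size} GiB node: {image_streaming_reservation_gib(size):.4f} GiB")
```

Under these assumptions, a 16 GiB node reserves 0.04 + 0.032 + 0.032 = 0.104 GiB for Image streaming, in addition to the standard node reservations.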
Troubleshooting
GKE doesn't use the Image streaming filesystem
If your GKE event log doesn't show the Image streaming events, your image is not backed by the remote filesystem. If GKE previously pulled the image on the node, this is expected behavior because GKE uses the local cache of the image for subsequent pulls instead of using Image streaming. You can verify this by looking for Container image IMAGE_NAME already present on machine in the Message field for the Pod Pulled event.
If you don't see the Image streaming event during the first image pull on the node, ensure that you meet the requirements for Image streaming. If you meet the requirements, you can diagnose the issue by checking the logs of the Image streaming Service (named gcfsd):
Go to the Logs Explorer page in the Google Cloud console:
In the Query field, specify the following query:
logName="projects/PROJECT_ID/logs/gcfsd"
resource.labels.cluster_name="CLUSTER_NAME"
Replace the following:
- PROJECT_ID: your Google Cloud project ID.
- CLUSTER_NAME: the name of your cluster.
Click Run query.
You can also check the gcfsd logs using Logs Explorer:
Go to the Logs Explorer in the Google Cloud console:
In the Query field, specify the following query:
logName="projects/PROJECT_ID/logs/gcfsd"
Replace PROJECT_ID with your Google Cloud project ID.
PermissionDenied
If the gcfsd logs display an error message similar to the following, the node doesn't have the correct API scope. GKE pulls container images for workloads without using Image streaming.
level=fatal msg="Failed to create a Container File System client: rpc error:
code = PermissionDenied desc = failed to probe endpoint: rpc error: code = PermissionDenied
desc = Request had insufficient authentication scopes."
You can fix this by granting the correct scope to the node to allow it to use Image streaming. Add the devstorage.read_only scope to the cluster or node pool, similar to the following command:
gcloud container node-pools create NODE_POOL_NAME \
--cluster=CLUSTER_NAME \
--zone=COMPUTE_ZONE \
--image-type="COS_CONTAINERD" \
--enable-image-streaming \
--scopes="https://www.googleapis.com/auth/devstorage.read_only"
FailedPrecondition
If you notice an error message with code = FailedPrecondition, the image wasn't imported to the Image streaming remote filesystem.
You might notice this error if you tried to use Image streaming with an existing node pool. If a node in the node pool already has the container image on-disk, GKE uses the local image instead of using Image streaming to get the image.
To fix this, try the following:
- Wait a few minutes and try to deploy your workload again.
- Add new nodes or a new node pool and schedule the workload on those nodes.
InvalidArgument
If you notice an error message with code=InvalidArgument, the container image your workload uses is not eligible for Image streaming. Ensure that the image meets the requirements. If your image is not on Artifact Registry, try migrating to Artifact Registry.
backend.FileContent failed
The following error might appear when reading container files with Image streaming enabled:
level=error msg="backend.FileContent failed" error="rpc error: code = ResourceExhausted desc = Quota exceeded for quota metric 'Content requests per project per region' and limit 'Content requests per project per region per minute per region' of service 'containerfilesystem.googleapis.com' for consumer 'project_number:PROJECT_NUMBER'." layer_id="sha256:1234567890" module=gcfs_backend offset=0 path=etc/passwd size=4096
This error indicates that your project has exceeded the quota required to read files from the remote container file system service. To resolve this issue, increase the following quotas:
- Content requests per project per region per minute per region
- Content requests per project per region
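The error codes covered in the preceding troubleshooting sections can be mapped to their suggested fixes when you triage gcfsd logs in bulk. The following Python sketch is a convenience wrapper with hypothetical names: it assumes that each log message contains the status in the form code = NAME or code=NAME, as in the examples above, and it only knows the four codes described on this page.

```python
import re

# Remedies summarized from the troubleshooting sections above.
REMEDIES = {
    "PermissionDenied": "Add the devstorage.read_only scope to the cluster or node pool.",
    "FailedPrecondition": "Wait a few minutes and redeploy, or schedule on new nodes.",
    "InvalidArgument": "Check image eligibility; store the image in Artifact Registry.",
    "ResourceExhausted": "Increase the Content requests quotas for the project.",
}

_CODE_RE = re.compile(r"code\s*=\s*([A-Za-z]+)")


def suggest_remedy(log_message):
    """Return the documented remedy for the first status code in a log message."""
    match = _CODE_RE.search(log_message)
    if match is None:
        return "No status code found in the message."
    code = match.group(1)
    return REMEDIES.get(code, f"No remedy documented here for {code}.")


if __name__ == "__main__":
    sample = ('level=fatal msg="Failed to create a Container File System client: '
              'rpc error: code = PermissionDenied desc = failed to probe endpoint"')
    print(suggest_remedy(sample))
```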
GKE downloads the image without streaming the data
Container images using customer-managed encryption keys (CMEK) are only eligible for Image streaming on GKE version 1.25.3-gke.1000 or later. Container images with duplicate layers are not eligible for Image streaming. See the Limitations for more information.
Checking for empty layers or duplicate layers
To check the container image for empty layers or duplicate layers, run the following command:
docker inspect IMAGE_NAME
Replace IMAGE_NAME with the name of the container image.
In the output of the command, inspect the entries under "Layers".
If one of the entries exactly matches the following "sha256" output, the container image has an empty layer and is not eligible for Image streaming.
"Layers": [
...
"sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4",
...
]
If there are duplicate entries like in the following example, the container image has duplicate layers and is not eligible for Image streaming.
"Layers": [
"sha256:28699c71935fe3ffa56533db44ad93e5a30322639f7be70d5d614e06a1ae6d9b",
...
"sha256:28699c71935fe3ffa56533db44ad93e5a30322639f7be70d5d614e06a1ae6d9b",
...
]
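For images with many layers, the two checks described above are easier to automate than to read by eye. The following Python sketch assumes that you saved the output of docker inspect IMAGE_NAME as JSON text and that the layer list is found under RootFS.Layers of the first array element; the function name is hypothetical, and the empty-layer digest is the one given above.

```python
import json
from collections import Counter

# Digest that the text above identifies as an empty layer.
EMPTY_LAYER = "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"


def layer_problems(inspect_json):
    """Find empty and duplicate layers in `docker inspect` JSON output."""
    images = json.loads(inspect_json)
    layers = images[0]["RootFS"]["Layers"]
    counts = Counter(layers)
    return {
        "has_empty_layer": EMPTY_LAYER in counts,
        "duplicates": sorted(digest for digest, n in counts.items() if n > 1),
    }


if __name__ == "__main__":
    duplicate = "sha256:28699c71935fe3ffa56533db44ad93e5a30322639f7be70d5d614e06a1ae6d9b"
    sample = json.dumps([{"RootFS": {"Type": "layers", "Layers": [duplicate, duplicate]}}])
    print(layer_problems(sample))
```

If has_empty_layer is true or duplicates is not empty, expect GKE to download the image without streaming it.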
mv command and renameat2 system calls fail on symlink files
For GKE nodes running version 1.25 and later, when Image streaming is enabled, the mv command and renameat2 system call might fail on symlink files in container images with the error message "No such device or address". The issue is caused by a regression on recent Linux kernels.
These system calls are not common, so the majority of images are not affected by this problem. The issue typically happens during container initialization, when an application is being prepared to run and moves files around. It is not possible to test the image locally, so GKE recommends using Image streaming in test environments to find the issue before the image is used in production.
The fix is available in the following GKE patch versions:
- 1.25: 1.25.14-gke.1351000 and later
- 1.26: 1.26.9-gke.1345000 and later
- 1.27: 1.27.6-gke.100 and later
- 1.28: 1.28.1-gke.1157000 and later
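To check a fleet of node versions against this list, you can compare each version with the first fixed patch for its minor release. This Python sketch uses hypothetical names and assumes that versions have the form MAJOR.MINOR.PATCH-gke.BUILD and order numerically. Because the list above stops at 1.28, the sketch reports later minor releases as "unknown" instead of assuming that they contain the fix.

```python
import re

# First fixed version for each affected minor release, from the list above.
FIXED_IN = {
    (1, 25): (1, 25, 14, 1351000),
    (1, 26): (1, 26, 9, 1345000),
    (1, 27): (1, 27, 6, 100),
    (1, 28): (1, 28, 1, 1157000),
}

_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-gke\.(\d+)$")


def parse_gke_version(version):
    """Split 'MAJOR.MINOR.PATCH-gke.BUILD' into a tuple of four integers."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized GKE version: {version!r}")
    return tuple(int(part) for part in match.groups())


def symlink_rename_status(node_version):
    """Classify a node version as 'not affected', 'affected', 'fixed', or 'unknown'."""
    version = parse_gke_version(node_version)
    minor = version[:2]
    if minor < (1, 25):
        return "not affected"  # the issue applies to 1.25 and later
    if minor not in FIXED_IN:
        return "unknown"  # not covered by the list above
    return "fixed" if version >= FIXED_IN[minor] else "affected"


if __name__ == "__main__":
    for v in ("1.24.1-gke.100", "1.27.5-gke.200", "1.28.1-gke.1157000"):
        print(v, symlink_rename_status(v))
```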
Alternatively, to mitigate this issue for any affected workloads, you can try replacing the code leading to the renameat2 system call. If you cannot modify the code, you must disable Image streaming on the node pool to mitigate the issue.