About Local SSD for GKE


Local solid-state drives (SSDs) are fixed-size SSD drives that can be attached to a single Compute Engine VM. You can use Local SSD on GKE to get highly performant, ephemeral (non-persistent) storage that is attached to every node in your cluster. Local SSDs also provide higher throughput and lower latency than standard disks.

In version 1.25.3-gke.1800 and later, you can configure a GKE node pool to use Local SSD with an NVMe interface for local ephemeral storage or raw block storage.

If you are using GKE with Standard clusters, you can attach Local SSD volumes to nodes when creating node pools. This improves the performance of ephemeral storage for your workloads that use emptyDir volumes or local PersistentVolumes. You can create node pools with Local SSD within your cluster's machine type limits and your project's quotas.

A Local SSD disk is attached to only one node, and nodes themselves are ephemeral. A workload can be scheduled onto a different node at any point in time.

For more information on the benefits and limitations of Local SSD, such as performance and the allowed number of SSD disks per machine type, see Local SSDs in the Compute Engine documentation.

Why choose Local SSD for GKE

Local SSD is a good choice if your workloads have any of the following requirements:

  • You want to run applications that download and process data, such as AI or machine learning, analytics, batch processing, local caching, and in-memory databases.
  • Your applications have specialized storage needs and you want raw block access on high-performance ephemeral storage.
  • You want to run specialized data applications and use Local SSD volumes as a node-level cache for your Pods. You can use this approach to drive better performance for in-memory database applications like Aerospike or Redis.

Ephemeral storage

We recommend using the --ephemeral-storage-local-ssd option to provision Local SSD for node ephemeral storage (this is the default if you are using a third generation machine series). This approach places the emptyDir volumes, container-writable layers, and images that together constitute your node ephemeral storage on Local SSD. The benefits include higher I/O bandwidth than the default Persistent Disk and faster Pod startup times.
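For example, a minimal sketch of creating a cluster with this option; the cluster name, zone, machine type, and count are placeholders, and the count you pass must be valid for the machine type you choose (see Machine requirements later on this page):

# Hedged sketch: default node pool with Local SSD-backed ephemeral storage.
# CLUSTER_NAME, the zone, the machine type, and count=2 are illustrative only.
gcloud container clusters create CLUSTER_NAME \
    --zone=us-central1-a \
    --machine-type=n2-standard-8 \
    --ephemeral-storage-local-ssd count=2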

Using Local SSD for node ephemeral storage means that those Local SSD volumes are not available for other uses. Don't modify them directly by using a privileged DaemonSet, hostPath volume, or another mechanism; otherwise, the node may fail. If you need fine-grained control over the Local SSD volumes, use the raw block approach instead.

Block device storage

Block device storage enables random access to data in fixed-size blocks. Some specialized applications require direct access to block device storage because, for example, the file system layer introduces unneeded overhead.

Common scenarios for using block device storage include the following:

  • Databases where data is organized directly on the underlying storage.
  • Software which itself implements some kind of storage service (software-defined storage systems).

In GKE version 1.25.3-gke.1800 and later, you can create clusters and node pools with raw block local NVMe SSDs attached by using the --local-nvme-ssd-block option. You can then customize the block storage by formatting a file system of your choice (such as ZFS or HDFS) and configuring RAID. This option is suitable if you need additional control to run workloads that specifically require access to block storage backed by Local SSD.
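For example, a minimal sketch of adding such a node pool; the cluster name, node pool name, zone, machine type, and count are placeholders:

# Hedged sketch: node pool whose Local SSD volumes are exposed as raw block devices.
# POOL_NAME, CLUSTER_NAME, the zone, the machine type, and count=2 are illustrative only.
gcloud container node-pools create POOL_NAME \
    --cluster=CLUSTER_NAME \
    --zone=us-central1-a \
    --machine-type=n2-standard-16 \
    --local-nvme-ssd-block count=2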

You can also use the block access approach with Local SSD if you want a unified data cache to share data across Pods, and the data is tied to a node lifecycle. To do so, install a DaemonSet that configures RAID and formats a file system, then use a local PersistentVolume to share data across Pods.
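As an illustration, the following commands sketch what such a DaemonSet's startup script might run on each node to assemble the raw block volumes into a RAID 0 array and format them. The device path pattern and mount point are assumptions; verify them for your node image before relying on this:

# Hedged sketch of the per-node setup a privileged DaemonSet might perform.
# The device path pattern and the mount point are assumptions for this sketch.
DEVICES=$(ls /dev/disk/by-id/google-local-nvme-ssd-*)
mdadm --create /dev/md0 --level=0 --force \
    --raid-devices=$(echo "$DEVICES" | wc -w) $DEVICES   # RAID 0 across all Local SSD volumes
mkfs.ext4 -F /dev/md0                                    # format a file system of your choice
mkdir -p /mnt/disks/raid0
mount /dev/md0 /mnt/disks/raid0                          # mount point for a local PersistentVolume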

Machine requirements

The way you provision Local SSD for your GKE clusters and node pools depends on your underlying machine type. GKE supports Local SSD volumes on Compute Engine first, second, and third generation machine series. Local SSD volumes require machine type n1-standard-1 or larger. The default machine type, e2-medium, is not supported. To identify your machine series, use the number in the machine type name. For example, N1 machines are first generation and N2 machines are second generation. For a list of available machine series and types, see Machine families resource and comparison guide in the Compute Engine documentation.
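If you prefer the CLI, you can also list machine types directly; the zone and name filter here are only illustrative:

# Illustrative only: list second generation (N2) standard machine types in one zone.
gcloud compute machine-types list \
    --zones=us-central1-a \
    --filter="name~'^n2-standard'"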

First and second generation machine series requirements

To use a first- or second-generation machine series with Local SSD, your cluster or node pool must be running GKE version 1.25.3-gke.1800 or later.

To provision Local SSD on a first or second generation machine series, you specify the number of Local SSD volumes to attach to each VM. For a list of machine series and the corresponding allowable number of Local SSDs, see Choosing a valid number of Local SSDs in the Compute Engine documentation.

Third generation machine series requirements

If you want to use a third generation machine series with Local SSD, your cluster or node pool must be running one of the following GKE versions or later:

  • 1.25.13-gke.200 to 1.26
  • 1.26.8-gke.200 to 1.27
  • 1.27.5-gke.200 to 1.28
  • 1.28.1-gke.200 to 1.29

For third generation machine series, each machine type is pre-configured with either no Local SSD or a fixed amount of Local SSD volumes. You don't specify the number of Local SSD volumes to include. Instead, the Local SSD capacity available to your clusters is implicitly defined as part of the VM shape.
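For example, a minimal sketch of a node pool on a third generation machine type; the -lssd machine type shown is illustrative, and no Local SSD count is passed because the machine shape already fixes it (as noted earlier, ephemeral storage on Local SSD is the default for third generation machine series):

# Hedged sketch: third generation machine type whose shape includes Local SSD.
# POOL_NAME, CLUSTER_NAME, the zone, and the machine type are placeholders.
gcloud container node-pools create POOL_NAME \
    --cluster=CLUSTER_NAME \
    --zone=us-central1-a \
    --machine-type=c3-standard-8-lssd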

For a list of machine series and the corresponding allowable number of Local SSDs, see Choosing a valid number of Local SSDs in the Compute Engine documentation.

Usage pattern for Local SSD volumes

To use Local SSD volumes in your clusters, follow these general steps:

  1. Provision a node pool with Local SSD attached: To create a GKE node pool with attached Local SSDs, pass in either the ephemeral storage or raw block storage parameter when you call the create cluster command. Setting the Local SSD parameters creates a GKE node pool with Local SSD attached and configured as local ephemeral storage or raw block storage depending on which parameters you choose. To learn more about options for provisioning Local SSD, see Local SSD parameters.
  2. Access data from Local SSD volumes: To use the data from Local SSD volumes, you can use a Kubernetes construct such as emptyDir or local persistent volume. To learn more about these options, see Local SSD access.

Local SSD parameters on GKE

The following summarizes the recommended parameters that GKE provides for provisioning Local SSD storage on clusters. You can use the gcloud CLI to pass in these parameters.

Ephemeral Storage Local SSD

  • gcloud CLI command: gcloud container clusters create --ephemeral-storage-local-ssd
  • GKE availability: v1.25.3-gke.1800 or later
  • Local SSD profile:
      • Storage technology: NVMe
      • Data shared across pods: No
      • Data lifecycle: Pod
      • Size and need for RAID configuration: Up to 9 TiB. GKE automatically configures RAID under the hood.
      • Format: File system (Kubernetes emptyDir)
      • Kubernetes scheduler integration: Fully integrated by default. The Kubernetes scheduler ensures space on the node before placement and scales nodes if needed.
  • To learn how to use this API parameter, see Provision and use Local SSD-backed ephemeral storage.

Local NVMe SSD Block

  • gcloud CLI command: gcloud container clusters create --local-nvme-ssd-block
  • GKE availability: v1.25.3-gke.1800 or later
  • Local SSD profile:
      • Storage technology: NVMe
      • Data shared across pods: Yes, via Local PVs.
      • Data lifecycle: Node
      • Size and need for RAID configuration: Up to 9 TiB. You need to manually configure RAID for larger sizes.
      • Format: Raw block
      • Kubernetes scheduler integration: No by default. You need to ensure capacity on nodes and handle noisy neighbors. If you opt in to local PVs, scheduling is integrated.
  • To learn how to use this API parameter, see Provision and use Local SSD-backed raw block storage.

Support for existing Local SSD parameters

The following summarizes these existing Local SSD parameters and their recommended substitutes:

Local SSD Count parameter

  • gcloud CLI command: gcloud container clusters create --local-ssd-count
  • Local SSD profile:
      • Storage technology: SCSI
      • Data shared across pods: Yes, via Local PVs
      • Data lifecycle: Node
      • Size and need for RAID configuration: 375 GiB. You need to manually configure RAID for larger sizes.
      • Format: File system (ext-4)
      • Kubernetes scheduler integration: No by default. You need to ensure capacity on nodes and handle noisy neighbors. If you opt in to local PVs, scheduling is integrated.
  • Recommended GA version of the parameter: gcloud container clusters create --ephemeral-storage-local-ssd

Ephemeral Storage parameter (beta)

  • gcloud CLI command: gcloud beta container clusters create --ephemeral-storage
  • Local SSD profile:
      • Storage technology: NVMe
      • Data shared across pods: No
      • Data lifecycle: Pod
      • Size and need for RAID configuration: Up to 9 TiB. GKE automatically configures RAID under the hood.
      • Format: File system (Kubernetes emptyDir)
      • Kubernetes scheduler integration: Fully integrated by default. The Kubernetes scheduler ensures space on nodes before placement and scales nodes if needed.
  • Recommended GA version of the parameter: gcloud container clusters create --ephemeral-storage-local-ssd

Local SSD Volumes parameter (alpha)

  • gcloud CLI command: gcloud alpha container clusters create --local-ssd-volumes
  • Local SSD profile:
      • Storage technology: NVMe or SCSI
      • Data shared across pods: No
      • Data lifecycle: Node
      • Size and need for RAID configuration: 375 GiB. You need to manually configure RAID for larger sizes.
      • Format: File system (ext-4) or raw block
      • Kubernetes scheduler integration: No by default. You need to ensure capacity on nodes and handle noisy neighbors.
  • Recommended GA version of the parameter: gcloud container clusters create --local-nvme-ssd-block

Local SSD access

You can access Local SSD volumes with one of the following methods.

emptyDir volume

In GKE version 1.25.3-gke.1800 and later, you can use ephemeral storage as an emptyDir volume backed by Local SSD, via the --ephemeral-storage-local-ssd option. We recommend this approach for most cases, including applications that need high-performance ephemeral scratch space.

GKE lets you configure a node pool to mount node ephemeral storage on Local SSD with an NVMe interface.

To learn more, see this example.
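For illustration, here is a minimal Pod sketch that requests an emptyDir volume; on a node pool created with --ephemeral-storage-local-ssd, this scratch space is transparently backed by Local SSD. The Pod name, image, and the 10Gi sizes are placeholders:

# Hedged sketch: emptyDir scratch space on Local SSD-backed ephemeral storage.
# The Pod name, image, and the 10Gi sizes are placeholders.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: scratch-example
spec:
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "dd if=/dev/zero of=/scratch/testfile bs=1M count=100 && sleep 3600"]
    volumeMounts:
    - name: scratch
      mountPath: /scratch
    resources:
      requests:
        ephemeral-storage: 10Gi   # lets the scheduler account for the space
  volumes:
  - name: scratch
    emptyDir:
      sizeLimit: 10Gi
EOF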

Local persistent volume

A local persistent volume represents a local disk that is attached to a single node. Local persistent volumes allow you to share Local SSD resources across Pods. Because the local disk is a Local SSD disk, the data is still ephemeral.

We recommend this approach if any of the following run on your cluster:

  • Workloads using StatefulSets and volumeClaimTemplates.
  • Workloads that share node pools. Each Local SSD volume can be reserved through a PersistentVolumeClaim, and specific HostPaths are not encoded directly in the Pod specification.
  • Pods that require data gravity to the same Local SSD. A Pod is always scheduled to the same node as its local PersistentVolume.

To learn more, see this example and the open-source Kubernetes Volumes documentation.
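As a minimal sketch (not the full provisioning flow), the manifest below hand-creates a StorageClass, a local PersistentVolume pointing at a Local SSD mount path on one node, and a claim that Pods on that node can share. The names, node name, path, and capacity are assumptions; in practice, a static provisioner DaemonSet typically creates the PersistentVolume objects for you:

# Hedged sketch: hand-written StorageClass, local PersistentVolume, and claim.
# NODE_NAME, the path, the 375Gi capacity, and all object names are placeholders.
kubectl apply -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-scratch
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-ssd-pv
spec:
  storageClassName: local-scratch
  capacity:
    storage: 375Gi
  accessModes: ["ReadWriteOnce"]
  persistentVolumeReclaimPolicy: Retain
  local:
    path: /mnt/disks/raid0
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values: ["NODE_NAME"]
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-ssd-claim
spec:
  storageClassName: local-scratch
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 375Gi
EOF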

Restrictions

  • Your application must gracefully handle losing access to data on Local SSD volumes. Data written to a Local SSD disk does not persist when the Pod or node is deleted, repaired, upgraded, or experiences an unrecoverable error.

    The ephemeral storage Local SSD parameter configures Local SSD volumes to have a Pod-based data lifecycle, and the NVMe Local SSD block parameter configures Local SSD volumes to have a Node-based data lifecycle.

    If you need persistent storage, we recommend you use a durable storage option (such as Persistent Disk, Filestore, or Cloud Storage). You can also use regional replicas to minimize the risk of data loss during cluster lifecycle or application lifecycle operations.

  • Local SSD configuration settings cannot be modified once the node pool is created. You cannot enable, disable, or update the Local SSD configuration for an existing node pool. To make any changes, you must delete the node pool and create a new one.

  • Pods using emptyDir volumes transparently use Local SSD; however, this applies to all Pods on all nodes in that node pool. GKE does not support having, in the same node pool, some Pods that use emptyDir volumes backed by Local SSD and other Pods that use emptyDir volumes backed by the node boot disk. If you have workloads that use emptyDir volumes backed by a node boot disk, schedule those workloads on a different node pool (see the sketch after this list).

  • Autopilot clusters and node auto-provisioning are not supported for Local SSDs.

  • We recommend using Local SSD as ephemeral storage for workloads running on storage-optimized (Z3) VMs. Z3 nodes are terminated during maintenance events, so data on the Local SSD volumes of these nodes might not be available during maintenance events, and data recovery isn't assured after maintenance.
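For the emptyDir restriction above, the following sketch pins a workload that needs boot disk-backed emptyDir volumes to a different node pool by its name label; the node pool name is a placeholder:

# Hedged sketch: schedule boot disk-backed emptyDir workloads onto a node pool
# without Local SSD by selecting the pool by name. POOL_WITHOUT_LOCAL_SSD is a placeholder.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: boot-disk-scratch
spec:
  nodeSelector:
    cloud.google.com/gke-nodepool: POOL_WITHOUT_LOCAL_SSD
  containers:
  - name: app
    image: busybox:1.36
    command: ["sleep", "3600"]
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    emptyDir: {}
EOF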

What's next