Dataproc Persistent Solid State Drive (PD-SSD) Boot Disks

This feature lets you select a Persistent Disk Solid State Drive (PD-SSD) as the boot disk for your cluster's master nodes, primary worker nodes, secondary (preemptible) worker nodes, or any combination of them.

Using PD-SSD

gcloud command

You can create a cluster with PD-SSD or PD-Balanced boot disks by running the gcloud dataproc clusters create command with one or more of the following flags: --master-boot-disk-type (master nodes), --worker-boot-disk-type (primary worker nodes), and --secondary-worker-boot-disk-type (secondary worker nodes).

The default value for each flag is "pd-standard", which selects PD-Standard (Hard Disk Drive) as the boot disk. To select PD-SSD or PD-Balanced instead, pass "pd-ssd" or "pd-balanced", respectively. Each flag applies only to its own node role, so the flags can be used independently or in combination.

Example:
gcloud dataproc clusters create cluster-name \
    --region=region \
    --master-boot-disk-type=pd-ssd \
    --worker-boot-disk-type=pd-ssd \
    --secondary-worker-boot-disk-type=pd-ssd \
    other args ...
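
Because each flag applies to one node role, the disk types can also be mixed. The following is a minimal sketch rather than an official recipe: it assembles a create command that uses pd-balanced for the master and pd-ssd for primary workers, leaves secondary workers at the default, and only prints the command instead of running it. The cluster name example-cluster, the region us-central1, and the helper check_disk_type are placeholders introduced for illustration and are not part of gcloud.

```shell
#!/bin/sh
# Sketch: mix boot disk types per node role. All names below are illustrative.

CLUSTER_NAME="example-cluster"   # hypothetical cluster name
REGION="us-central1"             # hypothetical region

# Hypothetical helper: accept only the three disk types named in this document.
check_disk_type() {
  case "$1" in
    pd-standard|pd-balanced|pd-ssd) return 0 ;;
    *) echo "unsupported boot disk type: $1" >&2; return 1 ;;
  esac
}

MASTER_DISK="pd-balanced"
WORKER_DISK="pd-ssd"
check_disk_type "$MASTER_DISK" || exit 1
check_disk_type "$WORKER_DISK" || exit 1

# --secondary-worker-boot-disk-type is omitted, so secondary workers
# keep the default boot disk type (pd-standard).
CREATE_CMD="gcloud dataproc clusters create ${CLUSTER_NAME} --region=${REGION} --master-boot-disk-type=${MASTER_DISK} --worker-boot-disk-type=${WORKER_DISK}"

# Print the assembled command for review instead of executing it.
echo "$CREATE_CMD"
```

After reviewing the printed command, you can paste it into a shell to create the cluster. Once the cluster exists, gcloud dataproc clusters describe with the same cluster name and region should report the boot disk type of each instance group in its output.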

REST API

You can create a cluster with PD-SSD or PD-Balanced boot disks by setting the bootDiskType field to "pd-ssd" or "pd-balanced" in the masterConfig, workerConfig, or secondaryWorkerConfig InstanceGroupConfig object of your clusters.create API request (the field belongs to each object's diskConfig). If the field is not specified, bootDiskType defaults to "pd-standard", which selects PD-Standard (Hard Disk Drive) as the boot disk. Each config is set independently, so master, primary worker, and secondary (preemptible) worker nodes can use different boot disk types.
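
As an illustration of such a request, the sketch below builds a request body that sets bootDiskType to "pd-ssd" for all three instance groups and wraps the HTTP call in a function. It rests on several assumptions: the project ID example-project, the region us-central1, and the cluster name example-cluster are placeholders; bootDiskType is nested under diskConfig in each InstanceGroupConfig; and all other cluster settings are omitted for brevity. The function create_cluster is a hypothetical wrapper that is defined but not called, so running the script only prints the body.

```shell
#!/bin/sh
# Sketch: clusters.create request body with PD-SSD boot disks.
# All identifiers below are illustrative placeholders.

PROJECT_ID="example-project"     # hypothetical project ID
REGION="us-central1"             # hypothetical region
CLUSTER_NAME="example-cluster"   # hypothetical cluster name

# bootDiskType is set per instance group; omitting it for a group
# would leave that group at the default, "pd-standard".
REQUEST_BODY=$(cat <<EOF
{
  "projectId": "${PROJECT_ID}",
  "clusterName": "${CLUSTER_NAME}",
  "config": {
    "masterConfig": { "diskConfig": { "bootDiskType": "pd-ssd" } },
    "workerConfig": { "diskConfig": { "bootDiskType": "pd-ssd" } },
    "secondaryWorkerConfig": { "diskConfig": { "bootDiskType": "pd-ssd" } }
  }
}
EOF
)

# Hypothetical wrapper around the HTTP call. It is only defined here,
# never invoked, so this script has no side effects.
create_cluster() {
  curl -sS -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d "${REQUEST_BODY}" \
    "https://dataproc.googleapis.com/v1/projects/${PROJECT_ID}/regions/${REGION}/clusters"
}

# Print the body so it can be reviewed before sending.
printf '%s\n' "${REQUEST_BODY}"
```

To send the request, call create_cluster after replacing the placeholders with your own values. To use PD-Balanced for a given group, change that group's bootDiskType to "pd-balanced".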

Console

You can create a cluster and select PD-SSD as the boot disk, along with a per-node boot disk size, for master, primary worker, and secondary worker nodes from the Configure nodes panel of the Dataproc Create a cluster page in the Google Cloud console.