Dataproc Persistent Boot Disks

You can select standard, SSD, or balanced persistent disks as boot disks for Dataproc cluster nodes.

Select persistent boot disk types for cluster nodes

You can select persistent boot disks when you create a cluster using the Google Cloud console, Google Cloud CLI, or Dataproc API.

Console

You can create a cluster and select a standard, SSD, or balanced persistent boot disk for master, primary worker, and secondary worker cluster nodes from the Configure nodes panel on the Dataproc Create a cluster page of the Google Cloud console.

Google Cloud CLI

You can create a cluster and select a standard, SSD, or balanced persistent boot disk for master, primary worker, and secondary worker cluster nodes using the gcloud dataproc clusters create command with the --master-boot-disk-type, --worker-boot-disk-type, and --secondary-worker-boot-disk-type flags.

The default persistent boot disk type for Dataproc cluster master and primary worker nodes is pd-standard. The default persistent boot disk type for cluster secondary worker nodes is the persistent boot disk type of the primary worker nodes. You can pass a value of pd-standard, pd-ssd, or pd-balanced to the --master-boot-disk-type, --worker-boot-disk-type, and --secondary-worker-boot-disk-type flags. Any valid disk type value can be set on any cluster node type.

Example:
gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    --master-boot-disk-type=pd-ssd \
    --worker-boot-disk-type=pd-ssd \
    --secondary-worker-boot-disk-type=pd-standard \
    other args ...
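As a minimal sketch of the flag rules described above, the following shell helper checks each value against the three valid disk types and leaves out the secondary worker flag when no third value is given, so that secondary workers fall back to the primary worker boot disk type. The function names valid_boot_disk_type and boot_disk_flags are hypothetical and are not part of the gcloud CLI.

```shell
# Hypothetical helper (not part of gcloud): succeeds only for the three
# boot disk type values named in this section.
valid_boot_disk_type() {
  case "$1" in
    pd-standard|pd-ssd|pd-balanced) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: prints boot disk flags for
#   boot_disk_flags MASTER_TYPE WORKER_TYPE [SECONDARY_TYPE]
boot_disk_flags() {
  # Reject any value that is not a valid boot disk type.
  for disk_type in "$@"; do
    if ! valid_boot_disk_type "$disk_type"; then
      echo "invalid boot disk type: $disk_type" >&2
      return 1
    fi
  done

  flags="--master-boot-disk-type=$1 --worker-boot-disk-type=$2"

  # With no third argument the secondary flag is omitted, so secondary
  # workers use the primary worker boot disk type by default.
  if [ -n "${3:-}" ]; then
    flags="$flags --secondary-worker-boot-disk-type=$3"
  fi

  echo "$flags"
}
```

The output could then be spliced into a create command, for example gcloud dataproc clusters create CLUSTER_NAME --region=REGION $(boot_disk_flags pd-ssd pd-balanced).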

REST API

The default boot disk type for Dataproc cluster master and primary worker nodes is pd-standard. The default boot disk type for secondary worker nodes is the primary worker node boot disk type. You can set a value of pd-standard, pd-ssd, or pd-balanced in the InstanceGroupConfig.DiskConfig.bootDiskType field of the masterConfig, workerConfig, and secondaryWorkerConfig objects as part of a clusters.create API request. Any valid boot disk type value can be set on any cluster node type.
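The following sketch shows one way such a request might look, mirroring the gcloud example: SSD boot disks on master and primary worker nodes, and a standard boot disk on secondary workers. The project, region, and cluster name values are hypothetical placeholders, the body is trimmed to the boot disk fields only, and the create_cluster wrapper is an illustrative name that assumes curl and an authenticated gcloud installation are available.

```shell
# Hypothetical placeholder values; replace with your own.
PROJECT_ID="my-project"
REGION="us-central1"
CLUSTER_NAME="example-cluster"

# Request body for clusters.create, limited to the boot disk type of each
# instance group. A real request would usually carry more configuration.
REQUEST_BODY=$(cat <<EOF
{
  "clusterName": "${CLUSTER_NAME}",
  "config": {
    "masterConfig": { "diskConfig": { "bootDiskType": "pd-ssd" } },
    "workerConfig": { "diskConfig": { "bootDiskType": "pd-ssd" } },
    "secondaryWorkerConfig": { "diskConfig": { "bootDiskType": "pd-standard" } }
  }
}
EOF
)

# Illustrative wrapper: sends the request when called. It is defined but not
# invoked here, so nothing is created until you run create_cluster yourself.
create_cluster() {
  curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d "${REQUEST_BODY}" \
    "https://dataproc.googleapis.com/v1/projects/${PROJECT_ID}/regions/${REGION}/clusters"
}
```

Leaving secondaryWorkerConfig.diskConfig.bootDiskType out of the body would let secondary workers take the primary worker boot disk type, as described above.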