Cloud Dataproc Local SSDs

In addition to the boot disk, you can attach local solid-state drives (local SSDs) to the master, primary worker, and secondary worker nodes in your cluster. Local SSDs can provide faster read and write times than persistent disks. The size of each local SSD is fixed, but you can attach multiple local SSDs to increase SSD storage (see Adding Local SSDs). Each local SSD is mounted at /mnt/&lt;id&gt; on Cloud Dataproc cluster nodes. By default, local SSDs are used for reading and writing Apache Hadoop and Apache Spark scratch files, such as shuffle output.

Using local SSDs

gcloud command

Use the gcloud dataproc clusters create command with the --num-master-local-ssds, --num-worker-local-ssds, and --num-preemptible-worker-local-ssds flags to attach local SSDs to the cluster's master, primary worker, and secondary (preemptible) worker nodes, respectively.

Example:

gcloud dataproc clusters create cluster-name \
    --num-master-local-ssds=1 \
    --num-worker-local-ssds=1 \
    --num-preemptible-worker-local-ssds=1 \
    ... other args ...

REST API

Set the numLocalSsds field in the diskConfig of the masterConfig, workerConfig, and secondaryWorkerConfig InstanceGroupConfig objects in a cluster.create API request to attach local SSDs to the cluster's master, primary worker, and secondary (preemptible) worker nodes, respectively.
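
As a sketch, a cluster.create request body that attaches one local SSD to each node group might look like the following (the cluster name and project/region values in the request URL are placeholders; other required fields such as machine types are omitted for brevity):

```json
{
  "clusterName": "cluster-name",
  "config": {
    "masterConfig": {
      "diskConfig": { "numLocalSsds": 1 }
    },
    "workerConfig": {
      "diskConfig": { "numLocalSsds": 1 }
    },
    "secondaryWorkerConfig": {
      "diskConfig": { "numLocalSsds": 1 }
    }
  }
}
```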

Console

Create a cluster and attach local SSDs to the primary worker node(s) from the Cloud Dataproc Create a cluster page of the Google Cloud Platform Console.
