[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Translation issue","translationIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated: 2025-09-04 (UTC)."],[[["\u003cp\u003eDataproc utilizes the Hadoop Distributed File System (HDFS) for storage and integrates with Cloud Storage.\u003c/p\u003e\n"],["\u003cp\u003eData can be moved into and out of Dataproc clusters via upload and download to HDFS or Cloud Storage.\u003c/p\u003e\n"],["\u003cp\u003eHDFS data and intermediate shuffle data are stored on VM boot disks by default, unless local SSDs are configured.\u003c/p\u003e\n"],["\u003cp\u003ePersistent disk size and type influence performance and VM size, regardless of whether HDFS or Cloud Storage is utilized.\u003c/p\u003e\n"],["\u003cp\u003eVM boot disks are deleted when the cluster is deleted.\u003c/p\u003e\n"]]],[],null,["Dataproc integrates with Apache Hadoop and the Hadoop Distributed\nFile System (HDFS). The following features and considerations can be important\nwhen selecting compute and data storage options for Dataproc\nclusters and jobs:\n\n- HDFS with Cloud Storage: Dataproc uses the Hadoop Distributed File System (HDFS) for storage. Additionally, Dataproc automatically installs the HDFS-compatible [Cloud Storage connector](/dataproc/docs/concepts/connectors/cloud-storage), which enables the use of Cloud Storage in parallel with HDFS. 
Data can be moved into and out of a cluster through upload and download to HDFS or Cloud Storage.\n- VM disks:\n - By default, when no local SSDs are provided, HDFS data and intermediate shuffle data are stored on VM boot disks, which are [Persistent Disks](https://cloud.google.com/persistent-disk/).\n - If you use [local SSDs](/dataproc/docs/concepts/compute/dataproc-local-ssds), HDFS data and intermediate shuffle data are stored on the SSDs.\n - Persistent disk (PD) size and type affect performance and VM size, whether you use HDFS or Cloud Storage for data storage.\n - **VM boot disks are deleted when the cluster is deleted.**"]]