This document describes how to create a Dataproc zero-scale cluster.
Dataproc zero-scale clusters provide a cost-effective way to use Dataproc clusters. Unlike standard Dataproc clusters, which require at least two primary workers, Dataproc zero-scale clusters use only secondary workers, which can be scaled down to zero.
Dataproc zero-scale clusters are ideal for use as long-running clusters that experience idle periods, such as a cluster that hosts a Jupyter notebook. They provide improved resource utilization through the use of zero-scale autoscaling policies.
Characteristics and limitations
A Dataproc zero-scale cluster shares similarities with a standard cluster, but has the following unique characteristics and limitations:
Requires image version 2.2.53 or later.
Supports only secondary workers, not primary workers.
Includes services such as YARN, but doesn't support the HDFS file system.
To use Cloud Storage as the default file system, set the core:fs.defaultFS cluster property to a Cloud Storage bucket location (gs://BUCKET_NAME).
If you disable a component during cluster creation, also disable HDFS.
Can't be converted to or from a standard cluster.
Requires an autoscaling policy for ZERO_SCALE cluster types.
Requires selecting flexible VMs as the machine type.
Optional: Configure an autoscaling policy
You can configure an autoscaling policy to define secondary worker scaling for a zero-scale cluster. When doing so, note the following:
Set the cluster type to ZERO_SCALE.
Configure the autoscaling policy for the secondary worker config only.
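The two policy requirements above can be sketched as a policy body built in Python. This is a minimal illustration, not a definitive policy: the helper name `zero_scale_policy`, the policy ID, and all numeric values are hypothetical, and the JSON field names (in particular `clusterType`) are assumed to follow the Dataproc `AutoscalingPolicy` REST resource, so check them against the API reference before use.

```python
import json


def zero_scale_policy(policy_id: str, max_secondary: int) -> dict:
    """Build an autoscaling policy body that scales only secondary workers.

    Hypothetical helper; field names are assumed from the AutoscalingPolicy
    REST resource and the values are illustrative.
    """
    if max_secondary < 1:
        raise ValueError("max_secondary must be at least 1")
    return {
        "id": policy_id,
        # Assumed field name for the policy's cluster type.
        "clusterType": "ZERO_SCALE",
        # No "workerConfig" entry: only the secondary worker config is set.
        "secondaryWorkerConfig": {
            "minInstances": 0,  # allow scaling down to zero workers
            "maxInstances": max_secondary,
        },
        "basicAlgorithm": {
            "cooldownPeriod": "240s",
            "yarnConfig": {
                "scaleUpFactor": 1.0,
                "scaleDownFactor": 1.0,
                "gracefulDecommissionTimeout": "3600s",
            },
        },
    }


policy = zero_scale_policy("zero-scale-policy", max_secondary=10)
print(json.dumps(policy, indent=2))
```

The printed JSON is only a starting point for a policy definition; the scaling factors and timeouts should be tuned to the workload.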
For more information, see [Create an autoscaling policy](/dataproc/docs/concepts/configuring-clusters/autoscaling#create_an_autoscaling_policy).

Create a Dataproc zero-scale cluster

Create a zero-scale cluster using the gcloud CLI or the Dataproc API.

Note: Zero-scale clusters can't be created from the Google Cloud console and don't support the Oozie component. When selecting a machine type for zero-scale clusters, use [flexible VMs](/dataproc/docs/concepts/configuring-clusters/flexible-vms#how_to_request_flexible_vms).

gcloud

Run the [`gcloud dataproc clusters create`](/sdk/gcloud/reference/dataproc/clusters/create) command locally in a terminal window or in [Cloud Shell](https://console.cloud.google.com/?cloudshell=true).

    gcloud dataproc clusters create CLUSTER_NAME \
        --region=REGION \
        --cluster-type=zero-scale \
        --autoscaling-policy=AUTOSCALING_POLICY \
        --properties=core:fs.defaultFS=gs://BUCKET_NAME \
        --secondary-worker-machine-types="type=MACHINE_TYPE1[,type=MACHINE_TYPE2...][,rank=RANK]"
        ...other args

Replace the following:

- CLUSTER_NAME: name of the Dataproc zero-scale cluster.
- REGION: an [available Compute Engine region](/compute/docs/regions-zones#available).
- AUTOSCALING_POLICY: the ID or resource URI of the autoscaling policy.
- BUCKET_NAME: name of your Cloud Storage bucket.
- MACHINE_TYPE1, MACHINE_TYPE2: specific Compute Engine machine types, such as `n1-standard-4` or `e2-standard-8`.
- RANK: defines the priority of a list of machine types.

REST

Create a zero-scale cluster using a Dataproc REST API [cluster.create](/dataproc/docs/reference/rest/v1/projects.regions.clusters/create) request:

- Set [`ClusterConfig.ClusterType`](/dataproc/docs/reference/rest/v1/ClusterConfig#ClusterType.ENUM_VALUES.ZERO_SCALE) for the `secondaryWorkerConfig` to `ZERO_SCALE`.
- Set [`AutoscalingConfig.policyUri`](/dataproc/docs/reference/rest/v1/ClusterConfig#AutoscalingConfig.FIELDS.policy_uri) to the ID of the `ZERO_SCALE` autoscaling policy.
- Add the `core:fs.defaultFS:gs://BUCKET_NAME` [SoftwareConfig.property](/static/dataproc/docs/reference/rest/v1/ClusterConfig#SoftwareConfig.FIELDS.properties). Replace BUCKET_NAME with the name of your Cloud Storage bucket.

What's next

- Learn more about [Dataproc autoscaling](/dataproc/docs/concepts/configuring-clusters/autoscaling).

Preview: This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the [Service Specific Terms](/terms/service-terms#1). Pre-GA products and features are available "as is" and might have limited support. For more information, see the [launch stage descriptions](/products#product-launch-stages).

Last updated 2025-09-08 UTC.
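As a companion to the REST steps in the "Create a Dataproc zero-scale cluster" section, the three settings can be sketched as a request body built in Python. This is an illustration under assumptions, not the definitive request: the helper name `zero_scale_cluster_body` and the sample argument values are hypothetical, the JSON field names are assumed to mirror the `ClusterConfig` reference (with `clusterType` placed directly on `config`), and the flexible VM machine-type settings that a real request also needs are omitted.

```python
import json


def zero_scale_cluster_body(cluster_name: str, policy_uri: str, bucket_name: str) -> dict:
    """Build a cluster.create request body with the three zero-scale settings.

    Hypothetical helper; field names are assumed from the ClusterConfig
    REST reference. Flexible VM machine-type settings are not included.
    """
    return {
        "clusterName": cluster_name,
        "config": {
            # Assumed placement of the cluster type enum value.
            "clusterType": "ZERO_SCALE",
            # ID or resource URI of the ZERO_SCALE autoscaling policy.
            "autoscalingConfig": {"policyUri": policy_uri},
            # HDFS is unavailable, so Cloud Storage is the default file system.
            "softwareConfig": {
                "properties": {"core:fs.defaultFS": f"gs://{bucket_name}"},
            },
        },
    }


body = zero_scale_cluster_body("my-cluster", "my-policy", "my-bucket")
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the body of the cluster.create request with whatever HTTP client and credentials you already use.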