[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information / samples I need","missingTheInformationSamplesINeed","thumb-down"],["Translation issue","translationIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-28 UTC."],[[["\u003cp\u003eDataproc clusters can be rotated at regular intervals to adhere to security policies and compliance rules, enabling the provisioning of new clusters with updated image versions while retaining configurations.\u003c/p\u003e\n"],["\u003cp\u003eRotated clusters are set up by assigning unique, timestamp-suffixed names and attaching labels like \u003ccode\u003ecluster-pool\u003c/code\u003e and \u003ccode\u003ecluster-state=active\u003c/code\u003e to distinguish and identify them within a pool.\u003c/p\u003e\n"],["\u003cp\u003eJobs can be submitted to active clusters within a cluster pool by using cluster labels to ensure that the job is directed to a cluster that is currently accepting new submissions.\u003c/p\u003e\n"],["\u003cp\u003eClusters are rotated by updating their labels to indicate they are no longer active, for example, by changing \u003ccode\u003ecluster-state=active\u003c/code\u003e to \u003ccode\u003ecluster-state=pendingfordeletion\u003c/code\u003e, which prevents them from receiving new jobs.\u003c/p\u003e\n"],["\u003cp\u003eClusters marked as ready for deletion can be removed after they have completed their current jobs, which can be automated using a monitoring script.\u003c/p\u003e\n"]]],[],null,["Organization security policies, regulatory compliance rules, and other\nconsiderations can prompt you to \"rotate\" your Dataproc clusters\nat regular intervals by deleting, then recreating clusters on a schedule.\nAs part of cluster rotation, new clusters can be provisioned with the latest\nDataproc image versions while retaining the 
configuration settings\nof the replaced clusters.\n\nThis page shows you how to set up clusters that you plan to rotate (\"rotated\nclusters\"), submit jobs to them, and then rotate the clusters as needed.\n\n[Custom image](/dataproc/docs/guides/dataproc-images) cluster rotation:\nYou can apply previous or new customizations to a previous or new\nDataproc base image when recreating the custom image cluster.\n\nSet up rotated clusters\n\nTo set up rotated clusters, create unique, timestamp-suffixed cluster names\nto distinguish previous from new clusters, and then attach labels to clusters\nthat indicate whether a cluster is part of a rotated cluster pool and actively\nreceiving new job submissions. This example uses `cluster-pool` and\n`cluster-state=active` labels for these purposes, but you can use\nyour own label names.\n\n1. Set environment variables:\n\n ```\n PROJECT_ID=project-id\n REGION=region\n CLUSTER_POOL=cluster-pool-name\n CLUSTER_NAME=${CLUSTER_POOL}-$(date '+%Y%m%d%H%M')\n BUCKET=bucket-name\n ```\n\n Notes:\n - \u003cvar translate=\"no\"\u003eproject-id\u003c/var\u003e: Your Google Cloud project ID.\n - \u003cvar translate=\"no\"\u003eregion\u003c/var\u003e: An [available Compute Engine region](/compute/docs/regions-zones#available).\n - \u003cvar translate=\"no\"\u003ecluster-pool-name\u003c/var\u003e: The name of the cluster pool associated with one or more clusters. This name is used in the cluster name and with the `cluster-pool` label attached to the cluster to identify the cluster as part of the pool.\n - \u003cvar translate=\"no\"\u003ebucket-name\u003c/var\u003e: The name of your Cloud Storage bucket.\n2. Create the cluster. You can add arguments and use different labels.\n\n ```\n gcloud dataproc clusters create ${CLUSTER_NAME} \\\n --project=${PROJECT_ID} \\\n --region=${REGION} \\\n --bucket=${BUCKET} \\\n --labels=\"cluster-pool=${CLUSTER_POOL},cluster-state=active\"\n ```\n\nSubmit jobs to clusters\n\nThe following Google Cloud CLI and\n[Apache Airflow directed acyclic graph (DAG)](/composer/docs/how-to/using/writing-dags)\nexamples submit an Apache Pig job to a cluster. Cluster labels are\nused to submit the job to an active cluster within a cluster pool. 
\n\ngcloud\n\nSubmit an Apache Pig job located in Cloud Storage. Pick the cluster using labels.\n\n```\ngcloud dataproc jobs submit pig \\\n --region=${REGION} \\\n --file=gs://${BUCKET}/scripts/script.pig \\\n --cluster-labels=\"cluster-pool=${CLUSTER_POOL},cluster-state=active\"\n```\n\nAirflow\n\nSubmit an Apache Pig job located in Cloud Storage using Airflow.\nPick the cluster using labels.\n\n```\nfrom datetime import datetime\n\nfrom airflow import DAG\nfrom airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator\n\n# Declare variables. Replace the placeholder values with your own.\nproject_id = \"my-project\"\nregion = \"us-central1\"\ndag_id = \"pig_wordcount\"\n# Labels that select an active cluster in the cluster pool.\ncluster_labels = {\n    \"cluster-pool\": \"cluster-pool-name\",\n    \"cluster-state\": \"active\",\n}\nwordcount_script = \"gs://bucket-name/scripts/wordcount.pig\"\n\n# Define the DAG. It has no schedule, so it runs only when triggered.\ndag = DAG(\n    dag_id,\n    schedule_interval=None,\n    start_date=datetime(2023, 8, 16),\n    catchup=False,\n)\n\n# Place the job by cluster labels instead of by cluster name.\nPIG_JOB = {\n    \"reference\": {\"project_id\": project_id},\n    \"placement\": {\"cluster_labels\": cluster_labels},\n    \"pig_job\": {\"query_file_uri\": wordcount_script},\n}\n\n# Submit the Pig job to a matching cluster.\nwordcount_task = DataprocSubmitJobOperator(\n    task_id=\"wordcount\",\n    region=region,\n    project_id=project_id,\n    job=PIG_JOB,\n    dag=dag,\n)\n```\n\nRotate clusters\n\n1. Update the cluster labels attached to the clusters you are rotating out. 
This\n example uses the `cluster-state=pendingfordeletion` label to signify that\n the cluster is not receiving new job submissions and is being rotated out,\n but you can use your own label for this purpose.\n\n ```\n gcloud dataproc clusters update ${CLUSTER_NAME} \\\n --region=${REGION} \\\n --update-labels=\"cluster-state=pendingfordeletion\"\n ```\n\n After the cluster label is updated, the cluster does not receive new jobs,\n because jobs are submitted only to clusters in the cluster pool\n that have the `cluster-state=active` label (see\n [Submit jobs to clusters](#submit_jobs_to_clusters)).\n2. Delete the clusters you are rotating out after they finish running jobs.\n\n | **Note:** You can automate this step with a monitoring script that fetches clusters with the `cluster-state=pendingfordeletion` label (or another label that you added with the previous command), checks that no jobs are running on the cluster, and then deletes the cluster."]]
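The monitoring script described in the note above could be sketched in Python as follows. This is a minimal illustration, not an official Dataproc utility: the function names (`gcloud_lines`, `pending_clusters`, `active_jobs`, `rotate_out`) and the `dry_run` parameter are hypothetical, and the sketch assumes that the gcloud CLI is installed and authenticated and that clusters carry the `cluster-pool` and `cluster-state` labels used on this page. Verify the `--filter` expression against your gcloud version before relying on it.

```python
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], List[str]]


def gcloud_lines(args: List[str]) -> List[str]:
    """Run a gcloud command and return its non-empty output lines."""
    result = subprocess.run(
        ["gcloud", *args], check=True, capture_output=True, text=True
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def pending_clusters(region: str, pool: str, run: Runner = gcloud_lines) -> List[str]:
    """List clusters in the pool that are labeled as pending deletion."""
    label_filter = (
        f"labels.cluster-pool={pool} AND "
        "labels.cluster-state=pendingfordeletion"
    )
    return run([
        "dataproc", "clusters", "list",
        f"--region={region}",
        f"--filter={label_filter}",
        "--format=value(clusterName)",
    ])


def active_jobs(region: str, cluster: str, run: Runner = gcloud_lines) -> List[str]:
    """List IDs of jobs that have not yet finished on the cluster."""
    return run([
        "dataproc", "jobs", "list",
        f"--region={region}",
        f"--cluster={cluster}",
        "--state-filter=active",
        "--format=value(reference.jobId)",
    ])


def rotate_out(region: str, pool: str, run: Runner = gcloud_lines,
               dry_run: bool = True) -> List[str]:
    """Delete idle pending-deletion clusters; return the names selected."""
    selected = []
    for cluster in pending_clusters(region, pool, run):
        if active_jobs(region, cluster, run):
            continue  # Jobs are still running; check again on the next pass.
        if not dry_run:
            run(["dataproc", "clusters", "delete", cluster,
                 f"--region={region}", "--quiet"])
        selected.append(cluster)
    return selected
```

Calling `rotate_out("us-central1", "cluster-pool-name")` only reports which clusters would be deleted; pass `dry_run=False` to delete them. Running the function on a schedule (for example, from cron) lets clusters with long-running jobs be picked up on a later pass. The `run` parameter exists so that the selection logic can be exercised without invoking gcloud.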