Recreate and update a cluster
You can use the gcloud command-line tool or the Dataproc API to copy the configuration of an existing cluster, update the copied configuration, and then create a new cluster with the updated configuration.

Note: Dataproc prevents the creation of clusters with image versions prior to 1.3.95, 1.4.77, 1.5.53, and 2.0.27, which were affected by [Apache Log4j security vulnerabilities](https://logging.apache.org/log4j/2.x/security.html). Dataproc also prevents cluster creation for image versions 0.x, 1.0.x, 1.1.x, and 1.2.x. When possible, create Dataproc clusters with the latest sub-minor image versions.

| Image version | log4j version | Customer guidance |
|---|---|---|
| 2.0.29, 1.5.55, and 1.4.79, or later of each | log4j 2.17.1 | Advised |
| 2.0.28, 1.5.54, and 1.4.78 | log4j 2.17.0 | Advised |
| 2.0.27, 1.5.53, and 1.4.77 | log4j 2.16.0 | Strongly recommended |
| 2.0.26, 1.5.52, and 1.4.76, or earlier of each | Older version | Discontinue use |

See the [Dataproc release notes](/dataproc/docs/release-notes) for specific image and log4j update information.
gcloud CLI
The example instructions show how to update the image version setting in a cluster configuration. You can change the example to update different cluster configuration settings.

Note: The recommended practice is to specify the `major.minor` image version for production environments or when compatibility with specific component versions is important. The sub-minor version and OS distribution are automatically set to the latest weekly release.
1. Set the variables.

   ```
   export PROJECT=project-id
   export REGION=region
   export OLD_CLUSTER=old-cluster-name
   export NEW_CLUSTER=new-cluster-name
   export NEW_IMAGE_VERSION=image-version (for example, '2.2-debian12')
   ```
2. Export the existing (old) cluster configuration to a YAML file.

   ```
   gcloud dataproc clusters export $OLD_CLUSTER \
       --project=$PROJECT \
       --region=$REGION > "${OLD_CLUSTER}-config.yaml"
   ```

3. Update the configuration. The following example uses `sed` to update the image version.

   ```
   sed -E "s|(^[[:blank:]]+)imageVersion: .+|\1imageVersion: ${NEW_IMAGE_VERSION}|g" "${OLD_CLUSTER}-config.yaml" | sed -E '/^[[:blank:]]+imageUri: /d' > "${NEW_CLUSTER}-config-updated.yaml"
   ```

4. Create a new cluster with a new name and the updated configuration.

   ```
   gcloud dataproc clusters import $NEW_CLUSTER \
       --project=$PROJECT \
       --region=$REGION \
       --source="${NEW_CLUSTER}-config-updated.yaml"
   ```
5. After confirming that your workloads run in the new cluster without issues, delete the existing (old) cluster. IMPORTANT: This step deletes all data stored in HDFS and on local disk in your cluster.

   ```
   gcloud dataproc clusters delete $OLD_CLUSTER \
       --project=$PROJECT \
       --region=$REGION
   ```
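One way to update the image version in an exported YAML config is a `sed` rewrite that replaces the `imageVersion` value and drops pinned `imageUri` lines (the exported config pins exact image URIs, which would otherwise conflict with the new image version). The sketch below rehearses such a rewrite on a throwaway snippet; the file names and version values are hypothetical, not a real cluster export:

```shell
# Minimal, hypothetical stand-in for an exported cluster config.
cat > demo-config.yaml <<'EOF'
config:
  softwareConfig:
    imageVersion: 2.0-debian10
  masterConfig:
    imageUri: https://example.com/old-image
EOF

NEW_IMAGE_VERSION='2.2-debian12'

# Replace the imageVersion value and delete any imageUri lines.
sed -E "s|(^[[:blank:]]+)imageVersion: .+|\1imageVersion: ${NEW_IMAGE_VERSION}|g" demo-config.yaml \
  | sed -E '/^[[:blank:]]+imageUri: /d' > demo-config-updated.yaml

# demo-config-updated.yaml now carries the new imageVersion, with imageUri removed.
cat demo-config-updated.yaml
```

Rehearsing on a snippet like this is a cheap way to confirm the regular expressions match your export's indentation before touching a real config.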
REST API

The example instructions show how to update the cluster name and the image version settings in a cluster configuration. You can change the example variables to update different cluster configuration settings.

Note: The recommended practice is to specify the `major.minor` image version for production environments or when compatibility with specific component versions is important. The sub-minor version and OS distribution are automatically set to the latest weekly release.
1. Set the variables.

   ```
   export PROJECT=project-id
   export REGION=region
   export OLD_CLUSTER=old-cluster-name
   export NEW_CLUSTER=new-cluster-name
   export NEW_IMAGE_VERSION=image-version (for example, '2.2-debian12')
   ```
2. Export the existing (old) cluster configuration to a JSON file.

   ```
   curl -X GET -H "Authorization: Bearer $(gcloud auth print-access-token)" "https://dataproc.googleapis.com/v1/projects/${PROJECT}/regions/${REGION}/clusters/${OLD_CLUSTER}?alt=json" > "${OLD_CLUSTER}-config.json"
   ```

3. Update the configuration. The following example uses `jq` to update the cluster name and the image version.

   ```
   jq ".clusterName = \"${NEW_CLUSTER}\" | .config.softwareConfig.imageVersion=\"${NEW_IMAGE_VERSION}\" | del(.config.workerConfig.imageUri) | del(.config.masterConfig.imageUri)" "${OLD_CLUSTER}-config.json" > "${NEW_CLUSTER}-config-updated.json"
   ```

4. Import the updated cluster configuration to create a new cluster.

   ```
   curl -i -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json; charset=utf-8" -d "@${NEW_CLUSTER}-config-updated.json" "https://dataproc.googleapis.com/v1/projects/${PROJECT}/regions/${REGION}/clusters?alt=json"
   ```
5. After confirming that your workloads run in the new cluster without issues, delete the existing (old) cluster. IMPORTANT: This step deletes all data stored in HDFS and on local disk in your cluster.

   ```
   curl -X DELETE -H "Authorization: Bearer $(gcloud auth print-access-token)" "https://dataproc.googleapis.com/v1/projects/${PROJECT}/regions/${REGION}/clusters/${OLD_CLUSTER}"
   ```
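An exported JSON config can likewise be edited with `jq` before re-importing it: rename the cluster, bump the image version, and delete the pinned `imageUri` fields so they don't conflict with the new version. The sketch below rehearses that transformation on a throwaway snippet; the JSON is a hypothetical, heavily trimmed stand-in, not a real API response:

```shell
# Minimal, hypothetical stand-in for an exported cluster config.
cat > demo-config.json <<'EOF'
{
  "clusterName": "old-cluster",
  "config": {
    "softwareConfig": { "imageVersion": "2.0-debian10" },
    "masterConfig": { "imageUri": "https://example.com/old-image" },
    "workerConfig": { "imageUri": "https://example.com/old-image" }
  }
}
EOF

NEW_CLUSTER='new-cluster'
NEW_IMAGE_VERSION='2.2-debian12'

# Rename the cluster, set the new image version, and drop the pinned image URIs.
jq ".clusterName = \"${NEW_CLUSTER}\"
    | .config.softwareConfig.imageVersion = \"${NEW_IMAGE_VERSION}\"
    | del(.config.workerConfig.imageUri)
    | del(.config.masterConfig.imageUri)" demo-config.json > demo-config-updated.json

cat demo-config-updated.json
```

Because `jq` edits the parsed structure rather than raw text, this approach is less sensitive to formatting differences than a regex rewrite.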
Console

The console does not support recreating a cluster by importing a cluster configuration.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-09-04 (UTC).