Create a Dataproc cluster by using the gcloud CLI
This page shows you how to use the Google Cloud CLI gcloud command-line tool to create a Dataproc cluster, run an Apache Spark job in the cluster, and then modify the number of workers in the cluster.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Verify that billing is enabled for your Google Cloud project.
- Enable the Dataproc API.
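Create a cluster
To create a cluster called example-cluster, run the command gcloud dataproc clusters create example-cluster --region=REGION.
The command output confirms cluster creation: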
Waiting for cluster creation operation...done.
Created [... example-cluster]
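To double-check the result beyond the creation message, the cluster's state can be read back with gcloud dataproc clusters describe. The following is a minimal sketch and not part of the original quickstart: the helper name cluster_state is hypothetical, REGION stands for whichever region was used at creation, and the status.state field path is assumed from the Dataproc cluster resource.

```shell
# Hypothetical helper (not from the quickstart): print the state of a cluster.
# Usage: cluster_state CLUSTER_NAME REGION
cluster_state() {
  local cluster="$1" region="$2"
  # --format='value(...)' prints only the requested field of the resource.
  gcloud dataproc clusters describe "$cluster" \
    --region="$region" \
    --format='value(status.state)'
}
```

For example, cluster_state example-cluster REGION would be expected to report a running state once creation has finished.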
For information on selecting a region, see Available regions & zones.
To see a list of available regions, you can run the gcloud compute regions list command.
To learn about regional endpoints, see Regional endpoints.
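Because each Dataproc command on this page takes a --region flag, one option is to store the region once in the gcloud configuration. The following is a minimal sketch under that assumption: it relies on the dataproc/region configuration property, and the helper name use_region is hypothetical and not from the original page.

```shell
# Hypothetical helper: remember a default Dataproc region for later commands.
# Usage: use_region REGION
use_region() {
  local region="$1"
  if [ -z "$region" ]; then
    echo "usage: use_region REGION" >&2
    return 1
  fi
  # dataproc/region is the gcloud property consulted by `gcloud dataproc`
  # commands when --region is not given on the command line.
  gcloud config set dataproc/region "$region"
}
```

After calling use_region with a region name taken from gcloud compute regions list, later gcloud dataproc commands can omit --region; passing the flag explicitly, as the rest of this page does, still works.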
Submit a job
To submit a sample Spark job that calculates a rough value for pi, run the following command:
```
gcloud dataproc jobs submit spark --cluster example-cluster \
  --region=REGION \
  --class org.apache.spark.examples.SparkPi \
  --jars file:///usr/lib/spark/examples/jars/spark-examples.jar -- 1000
```

This command specifies the following:

- You want to run a spark job on the example-cluster cluster in the specified region
- The class containing the main method for the job's pi-calculating application
- The location of the jar file containing your job's code
- Any parameters you want to pass to the job; in this case the number of tasks, which is 1000

Note: Parameters passed to the job must follow a double dash (--). For more information, see the Google Cloud CLI documentation.

The job's running and final output is displayed in the terminal window:

```
Waiting for job output...
...
Pi is roughly 3.14118528
...
Job finished successfully.
```

Update a cluster

To change the number of workers in the cluster to five, run the following command:

```
gcloud dataproc clusters update example-cluster \
  --region=REGION \
  --num-workers 5
```

The command output displays your cluster's details. For example:

```
workerConfig:
...
  instanceNames:
  - example-cluster-w-0
  - example-cluster-w-1
  - example-cluster-w-2
  - example-cluster-w-3
  - example-cluster-w-4
  numInstances: 5
statusHistory:
...
- detail: Add 3 workers.
```

To decrease the number of worker nodes to the original value, use the same command:

```
gcloud dataproc clusters update example-cluster \
  --region=REGION \
  --num-workers 2
```

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

1. To delete your example-cluster, run the clusters delete command:

   ```
   gcloud dataproc clusters delete example-cluster \
     --region=REGION
   ```

2. To confirm and complete the cluster deletion, press y and then press Enter when prompted.

What's next

- Learn how to write and run a Spark Scala job.

Last updated 2025-09-04 UTC.