Before you begin

- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Verify that billing is enabled for your Google Cloud project.
- Enable the Dataproc API.
Create a cluster

1. In the Google Cloud console, go to the Dataproc **Clusters** page.
2. Click **Create cluster**.
3. In the **Create Dataproc cluster** dialog, click **Create** in the **Cluster on Compute Engine** row.
4. In the **Cluster name** field, enter `example-cluster`.
5. In the **Region** and **Zone** lists, select a region and zone.

   Select a region (for example, `us-east1` or `europe-west1`) to isolate the resources that Dataproc uses, such as virtual machine (VM) instances and Cloud Storage and metadata storage locations, in that region. For more information, see Available regions and zones and Regional endpoints.
6. For all other options, use the default settings.
7. To create the cluster, click **Create**.

Your new cluster appears in the list on the **Clusters** page. Its status is **Provisioning** until the cluster is ready to use, and then it changes to **Running**. Provisioning the cluster might take a couple of minutes.
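If you prefer the command line, the same cluster can be created with the Google Cloud CLI. A minimal sketch, assuming an authenticated `gcloud` installation; the region is an illustrative choice and should match the one you selected in the console:

```shell
# Create a Dataproc cluster named example-cluster with default settings.
# The --region value here is an assumption; use your own region.
gcloud dataproc clusters create example-cluster \
    --region=us-east1
```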
Submit a Spark job

Submit a Spark job that estimates a value of Pi:

1. In the Dataproc navigation menu, click **Jobs**.
2. On the **Jobs** page, click **Submit job**, and then do the following:
   1. In the **Job ID** field, use the default setting, or provide an ID that is unique to your Google Cloud project.
   2. In the **Cluster** drop-down, select **example-cluster**.
   3. For **Job type**, select **Spark**.
   4. In the **Main class or jar** field, enter `org.apache.spark.examples.SparkPi`.
   5. In the **Jar files** field, enter `file:///usr/lib/spark/examples/jars/spark-examples.jar`.
   6. In the **Arguments** field, enter `1000` to set the number of tasks.

      Note: The Spark job estimates Pi by using the Monte Carlo method. It generates x and y points on a coordinate plane that models a circle enclosed by a unit square. The input argument (`1000`) determines the number of x-y pairs to generate; the more pairs generated, the greater the accuracy of the estimation. The estimation uses Dataproc worker nodes to parallelize the computation.
   7. Click **Submit**.

Your job is displayed on the **Job details** page. The job status is **Running** or **Starting**, and then it changes to **Succeeded** when the job finishes.

To avoid scrolling in the output, click **Line wrap: off**. The output is similar to the following:
Pi is roughly 3.1416759514167594
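The number above comes from a Monte Carlo estimate. A minimal local Python sketch of the same idea (not the SparkPi implementation itself, and with no parallelism): sample random points in the unit square and count how many fall inside the quarter circle.

```python
import random

def estimate_pi(num_samples: int, seed: int = 42) -> float:
    """Estimate Pi by sampling (x, y) points in the unit square and
    counting the fraction that land inside the quarter circle of radius 1."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle covers pi/4 of the square, so the hit ratio
    # approximates pi/4; multiply by 4 to recover pi.
    return 4.0 * inside / num_samples

print(estimate_pi(100_000))
```

More samples tighten the estimate, which is why the job's `1000` task argument trades run time for accuracy.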
To view the job details, click the **Configuration** tab.
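The same job can also be submitted with the Google Cloud CLI. A sketch, assuming the cluster created above; the region is an illustrative assumption:

```shell
# Submit the SparkPi example to example-cluster; the "1000" after "--"
# is passed to the job as its task-count argument.
gcloud dataproc jobs submit spark \
    --cluster=example-cluster \
    --region=us-east1 \
    --class=org.apache.spark.examples.SparkPi \
    --jars=file:///usr/lib/spark/examples/jars/spark-examples.jar \
    -- 1000
```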
Update a cluster

Update your cluster by changing the number of worker instances:

1. In the Dataproc navigation menu, click **Clusters**.
2. In the list of clusters, click **example-cluster**.
3. On the **Cluster details** page, click the **Configuration** tab.

   Your cluster settings are displayed.
4. Click **Edit**.
5. In the **Worker nodes** field, enter `5`.
6. Click **Save**.

Your cluster is now updated. To decrease the number of worker nodes to the original value, follow the same procedure.
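The resize can also be done from the command line. A sketch with the gcloud CLI; the region is an illustrative assumption:

```shell
# Scale example-cluster to 5 primary worker nodes.
gcloud dataproc clusters update example-cluster \
    --region=us-east1 \
    --num-workers=5
```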
Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps:
1. To delete the cluster, on the **Cluster details** page for **example-cluster**, click **Delete**.
2. To confirm that you want to delete the cluster, click **Delete**.

What's next

- Try this quickstart by using other tools:
  - [Use the API Explorer](/dataproc/docs/quickstarts/create-cluster-template).
  - [Use the Google Cloud CLI](/dataproc/docs/quickstarts/create-cluster-gcloud).
- Learn how to [create robust firewall rules when you create a project](/dataproc/docs/concepts/configuring-clusters/network).
- Learn how to [write and run a Spark Scala job](/dataproc/docs/tutorials/spark-scala).

Last updated: 2025-09-04 (UTC)