When creating a Dataproc cluster, you can put the cluster into Hadoop High Availability (HA) mode by specifying the number of master instances in the cluster. The number of masters can only be specified at cluster creation time.

Currently, Dataproc supports two master configurations:
- 1 master (default, non-HA)
- 3 masters (Hadoop HA)
Comparison of default and Hadoop High Availability mode

Due to the complexity and higher cost of HA mode, use the default mode unless your use case requires HA mode.
**Compute Engine failure:** In the rare case of an unexpected Compute Engine failure, Dataproc instances will experience a machine reboot. The default single-master configuration for Dataproc is designed to recover and continue processing new work in such cases, but in-flight jobs will necessarily fail and need to be retried, and HDFS will be inaccessible until the single NameNode fully recovers on reboot. In HA mode, [HDFS High Availability](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html) and [YARN High Availability](https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html) are configured to allow uninterrupted YARN and HDFS operations despite any single-node failures or reboots.
**Job driver termination:** The driver/main program of any job you run still represents a potential single point of failure if the correctness of your job depends on the driver program running successfully. Jobs submitted through the Dataproc Jobs API are not considered "high availability" and will be terminated on failure of the master node that runs the corresponding job driver program. For individual jobs to be resilient against single-node failures on an HA Dataproc cluster, a job must either 1) run without a synchronous driver program, or 2) run its driver program inside a YARN container and be written to handle driver-program restarts. See [Launching Spark on YARN](http://spark.apache.org/docs/latest/running-on-yarn.html#launching-spark-on-yarn) for an example of how restartable driver programs can run inside YARN containers for fault tolerance.
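As a hedged sketch of option 2, a Spark job can be submitted in YARN cluster mode so that the driver runs inside a YARN container and is retried if its container fails. The `spark.yarn.*` properties below are standard Spark-on-YARN settings; the class name, jar, and retry values are placeholders:

```shell
# Run the driver inside a YARN container (cluster deploy mode) so YARN can
# restart it on failure. Class name, jar, and retry values are placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.maxAppAttempts=4 \
  --conf spark.yarn.am.attemptFailuresValidityInterval=1h \
  --class com.example.MyRestartableApp \
  my-restartable-app.jar
```

Note that retrying the driver is not sufficient by itself: the application must also be written (for example, with checkpointing) to resume correctly after a restart.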
**Zonal failure:** As with all Dataproc clusters, all nodes in a High Availability cluster reside in the same zone. A failure that impacts all nodes in a zone is not mitigated.
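To observe the HA daemons described above, you can query the active/standby state of each NameNode and ResourceManager from a master node. This is a hedged sketch: `-getAllServiceState` is a standard Hadoop HA admin subcommand, but its availability depends on the Hadoop version installed on the cluster:

```shell
# On a master node of an HA cluster: print the active/standby state of
# every configured NameNode and ResourceManager.
hdfs haadmin -getAllServiceState
yarn rmadmin -getAllServiceState
```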
Instance Names

The default master is named `cluster-name-m`; HA masters are named `cluster-name-m-0`, `cluster-name-m-1`, and `cluster-name-m-2`.
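For example, assuming a cluster named `my-cluster` in zone `us-central1-a` (both placeholders), you could SSH into the first HA master using its generated name:

```shell
# SSH to the first master of an HA cluster; the instance name follows the
# cluster-name-m-0 convention described above.
gcloud compute ssh my-cluster-m-0 --zone=us-central1-a
```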
Apache ZooKeeper

In an HA Dataproc cluster, the [ZooKeeper component](/dataproc/docs/concepts/components/zookeeper) is automatically installed on the cluster master nodes. All masters participate in a ZooKeeper cluster, which enables automatic failover for other Hadoop services.
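As a hedged sketch, you can check which master currently leads the ZooKeeper quorum with ZooKeeper's four-letter `srvr` command; the cluster name is a placeholder, 2181 is ZooKeeper's default client port, and on some ZooKeeper versions four-letter commands must be enabled via `4lw.commands.whitelist`:

```shell
# Query each master's ZooKeeper server and report whether it is the
# quorum leader or a follower.
for i in 0 1 2; do
  echo "srvr" | nc "my-cluster-m-$i" 2181 | grep Mode
done
```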
HDFS

In a standard Dataproc cluster, `cluster-name-m` runs:

- NameNode
- Secondary NameNode

In a High Availability Dataproc cluster:

- `cluster-name-m-0` and `cluster-name-m-1` run:
  - NameNode
  - ZKFailoverController
- All masters run JournalNode
- There is no Secondary NameNode

See the [HDFS High Availability](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html) documentation for additional details on components.

YARN

In a standard Dataproc cluster, `cluster-name-m` runs ResourceManager.

In a High Availability Dataproc cluster, all masters run ResourceManager.

See the [YARN High Availability](https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html) documentation for additional details on components.

Create a High Availability cluster

gcloud command

gcloud CLI setup: You must [set up and configure](/sdk/docs/quickstarts) the gcloud CLI to use the Google Cloud CLI. To create an HA cluster with [gcloud dataproc clusters create](/sdk/gcloud/reference/dataproc/clusters/create), run the following command:

```
gcloud dataproc clusters create cluster-name \
    --region=region \
    --num-masters=3 \
    ... other args
```

REST API

To create an HA cluster, use the [clusters.create](/dataproc/docs/reference/rest/v1/projects.regions.clusters/create) API, setting [masterConfig.numInstances](/dataproc/docs/reference/rest/v1/ClusterConfig#InstanceGroupConfig) to `3`.

An easy way to construct the JSON body of an HA cluster create request is to start from the Dataproc [Create a cluster](https://console.cloud.google.com/dataproc/clustersAdd) page of the Google Cloud console: select High Availability (3 masters, N workers) in the Cluster type section of the Set up cluster panel, then click the Equivalent REST button at the bottom of the left panel. Here is a snippet of sample JSON output produced by the console for an HA cluster create request:

```
...
"masterConfig": {
  "numInstances": 3,
  "machineTypeUri": "n1-standard-4",
  "diskConfig": {
    "bootDiskSizeGb": 500,
    "numLocalSsds": 0
  }
}
...
```

Console

To create an HA cluster, select High Availability (3 masters, N workers) in the Cluster type section of the Set up cluster panel on the Dataproc [Create a cluster](https://console.cloud.google.com/dataproc/clustersAdd) page.

Last updated: 2025-08-26 (UTC)