In a Ranger with Kerberos cluster, Dataproc maps a Kerberos user to a system user by stripping the Kerberos user's realm and instance. For example, Kerberos principal user1/cluster-m@MY.REALM is mapped to system user user1, and Ranger policies are defined to allow or deny permissions for user1.
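The realm-and-instance stripping described above amounts to keeping only the primary component of the principal, that is, everything before the first "/" or "@". A minimal Python sketch of that rule (illustrative only; the actual short-name mapping in Hadoop is governed by hadoop.security.auth_to_local rules):

```python
import re

def principal_to_system_user(principal: str) -> str:
    """Map a Kerberos principal to the short system user name that
    Ranger policies are evaluated against, by dropping the optional
    instance ("/host") and realm ("@REALM") parts.

    Illustrative sketch only; Hadoop's real mapping is controlled by
    hadoop.security.auth_to_local rules.
    """
    # Split at the first "/" or "@" and keep the leading component.
    return re.split(r"[/@]", principal, maxsplit=1)[0]

print(principal_to_system_user("user1/cluster-m@MY.REALM"))  # user1
print(principal_to_system_user("user1@MY.REALM"))            # user1
```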
After the cluster is running, navigate to the Dataproc Clusters page in the Google Cloud console, then select the cluster's name to open the Cluster details page. Click the Web Interfaces tab to display a list of Component Gateway links to the web interfaces of the default and optional components installed on the cluster. Click the Ranger link.
Sign in to Ranger by entering the "admin" username and the Ranger admin password.
The Ranger admin UI opens in a local browser.
The following examples create Ranger policies to allow or deny access for two OS users and Kerberos principals: userone and usertwo.
YARN access policy
This example creates a Ranger policy to allow and deny user access to the YARN root.default queue.
Select yarn-dataproc from the Ranger admin UI.
On the yarn-dataproc Policies page, click Add New Policy.
On the Create Policy page, enter or select the following fields:
Policy Name: "yarn-policy-1"
Queue: "root.default"
Audit Logging: "Yes"
Allow Conditions:
Select User: "userone"
Permissions: "Select All" to grant all permissions
Deny Conditions:
Select User: "usertwo"
Permissions: "Select All" to deny all permissions
Click Add to save the policy. The policy is listed on the yarn-dataproc Policies page.
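Ranger also exposes an admin REST API, so the same policy can be created by POSTing a JSON definition instead of using the UI. The sketch below builds a payload equivalent to the fields above; the field names are assumed from Ranger's public v2 policy model and the YARN access types ("submit-app", "admin-queue") from the YARN plugin, so verify both against your Ranger version's REST documentation before using it:

```python
import json

def ranger_policy(service, name, resources, allow_users, deny_users, access_types):
    """Build a Ranger policy payload mirroring the UI fields above.

    Sketch only: field names follow Ranger's public v2 policy model;
    verify them against the REST docs for your Ranger version before
    POSTing to http://<ranger-host>:6080/service/public/v2/api/policy.
    """
    accesses = [{"type": t, "isAllowed": True} for t in access_types]
    return {
        "service": service,
        "name": name,
        "isAuditEnabled": True,  # Audit Logging: "Yes"
        "resources": resources,
        "policyItems": [{"users": allow_users, "accesses": accesses}],     # Allow Conditions
        "denyPolicyItems": [{"users": deny_users, "accesses": accesses}],  # Deny Conditions
    }

policy = ranger_policy(
    service="yarn-dataproc",
    name="yarn-policy-1",
    resources={"queue": {"values": ["root.default"]}},
    allow_users=["userone"],
    deny_users=["usertwo"],
    access_types=["submit-app", "admin-queue"],  # assumed YARN plugin access types
)
print(json.dumps(policy, indent=2))
```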
Run a Hadoop mapreduce job in the master SSH session window as userone:
userone@example-cluster-m:~$ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 5 10
The Ranger UI shows that userone was allowed to submit the job.
Run the Hadoop mapreduce job from the VM master SSH session window as usertwo:
usertwo@example-cluster-m:~$ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 5 10
The Ranger UI shows that usertwo was denied access to submit the job.
HDFS access policy
This example creates a Ranger policy to allow and deny user access to the HDFS /tmp directory.
Select hadoop-dataproc from the Ranger admin UI.
On the hadoop-dataproc Policies page, click Add New Policy.
On the Create Policy page, enter or select the following fields:
Policy Name: "hadoop-policy-1"
Resource Path: "/tmp"
Audit Logging: "Yes"
Allow Conditions:
Select User: "userone"
Permissions: "Select All" to grant all permissions
Deny Conditions:
Select User: "usertwo"
Permissions: "Select All" to deny all permissions
Click Add to save the policy. The policy is listed on the hadoop-dataproc Policies page.
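As with the YARN policy, this HDFS policy maps to a JSON payload that could be sent to Ranger's admin REST API. The standalone sketch below is an assumed rendering of the fields above using Ranger's public v2 policy model and the HDFS plugin's read/write/execute access types; verify the field names against your Ranger version before POSTing it:

```python
import json

# Sketch of the HDFS policy above as a Ranger v2 REST payload.
# Field names are assumed from Ranger's public API; the service name
# matches the hadoop-dataproc repository selected in the UI.
hdfs_policy = {
    "service": "hadoop-dataproc",
    "name": "hadoop-policy-1",
    "isAuditEnabled": True,  # Audit Logging: "Yes"
    "resources": {"path": {"values": ["/tmp"], "isRecursive": True}},
    "policyItems": [{  # Allow Conditions
        "users": ["userone"],
        "accesses": [{"type": t, "isAllowed": True}
                     for t in ("read", "write", "execute")],
    }],
    "denyPolicyItems": [{  # Deny Conditions
        "users": ["usertwo"],
        "accesses": [{"type": t, "isAllowed": True}
                     for t in ("read", "write", "execute")],
    }],
}
print(json.dumps(hdfs_policy, indent=2))
```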
Access the HDFS /tmp directory as userone:
userone@example-cluster-m:~$ hadoop fs -ls /tmp
The Ranger UI shows that userone was allowed access to the HDFS /tmp directory.
Access the HDFS /tmp directory as usertwo:
usertwo@example-cluster-m:~$ hadoop fs -ls /tmp
The Ranger UI shows that usertwo was denied access to the HDFS /tmp directory.
Hive access policy
This example creates a Ranger policy to allow and deny user access to a Hive table.
Create a small employee table using the hive CLI on the master instance:
hive> CREATE TABLE IF NOT EXISTS employee (eid int, name String); INSERT INTO employee VALUES (1 , 'bob') , (2 , 'alice'), (3 , 'john');
Select hive-dataproc from the Ranger admin UI.
On the hive-dataproc Policies page, click Add New Policy.
On the Create Policy page, enter or select the following fields:
Policy Name: "hive-policy-1"
database: "default"
table: "employee"
Hive Column: "*"
Audit Logging: "Yes"
Allow Conditions:
Select User: "userone"
Permissions: "Select All" to grant all permissions
Deny Conditions:
Select User: "usertwo"
Permissions: "Select All" to deny all permissions
Click Add to save the policy. The policy is listed on the hive-dataproc Policies page.
Run a query from the VM master SSH session against the Hive employee table as userone:
userone@example-cluster-m:~$ beeline -u "jdbc:hive2://$(hostname -f):10000/default;principal=hive/$(hostname -f)@REALM" -e "select * from employee;"
The userone query succeeds:
Connected to: Apache Hive (version 2.3.6)
Driver: Hive JDBC (version 2.3.6)
Transaction isolation: TRANSACTION_REPEATABLE_READ
+---------------+----------------+
| employee.eid  | employee.name  |
+---------------+----------------+
| 1             | bob            |
| 2             | alice          |
| 3             | john           |
+---------------+----------------+
3 rows selected (2.033 seconds)
Run a query from the VM master SSH session against the Hive employee table as usertwo:
usertwo@example-cluster-m:~$ beeline -u "jdbc:hive2://$(hostname -f):10000/default;principal=hive/$(hostname -f)@REALM" -e "select * from employee;"
usertwo is denied access to the table:
Error: Could not open client transport with JDBC Uri:
...
Permission denied: user=usertwo, access=EXECUTE, inode="/tmp/hive"
Fine-grained Hive access
Ranger supports masking and row-level filters on Hive. This example builds on the previous hive-policy-1 by adding masking and filter policies.
Select hive-dataproc from the Ranger admin UI, then select the Masking tab and click Add New Policy.
On the Create Policy page, enter or select the following fields to create a policy that masks (nullifies) the employee name column:
Policy Name: "hive-masking policy"
database: "default"
table: "employee"
Hive Column: "name"
Audit Logging: "Yes"
Mask Conditions:
Select User: "userone"
Access Types: "select" add/edit permissions
Select Masking Option: "nullify"
Click Add to save the policy.
Select hive-dataproc from the Ranger admin UI, then select the Row Level Filter tab and click Add New Policy.
On the Create Policy page, enter or select the following fields to create a policy that filters (returns) rows where eid is not equal to 1:
Policy Name: "hive-filter policy"
Hive Database: "default"
Hive Table: "employee"
Audit Logging: "Yes"
Row Filter Conditions:
Select User: "userone"
Access Types: "select" add/edit permissions
Row Level Filter: "eid != 1" filter expression
Click Add to save the policy.
Repeat the previous query from the VM master SSH session against the Hive employee table as userone:
userone@example-cluster-m:~$ beeline -u "jdbc:hive2://$(hostname -f):10000/default;principal=hive/$(hostname -f)@REALM" -e "select * from employee;"
The query returns with the name column masked out and bob (eid=1) filtered from the results:
Transaction isolation: TRANSACTION_REPEATABLE_READ
+---------------+----------------+
| employee.eid  | employee.name  |
+---------------+----------------+
| 2             | NULL           |
| 3             | NULL           |
+---------------+----------------+
2 rows selected (0.47 seconds)
Last updated: 2025-08-27 (UTC)