[[["容易理解","easyToUnderstand","thumb-up"],["確實解決了我的問題","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["難以理解","hardToUnderstand","thumb-down"],["資訊或程式碼範例有誤","incorrectInformationOrSampleCode","thumb-down"],["缺少我需要的資訊/範例","missingTheInformationSamplesINeed","thumb-down"],["翻譯問題","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["上次更新時間:2025-09-04 (世界標準時間)。"],[[["\u003cp\u003eThe Dataproc Ranger Cloud Storage plugin, available in Dataproc image versions 1.5 and 2.0, provides an authorization service on Dataproc cluster VMs that validates Cloud Storage connector requests against Ranger policies using Kerberos for authentication.\u003c/p\u003e\n"],["\u003cp\u003eInstallation of the Ranger Cloud Storage plugin on a Dataproc cluster requires specific environment variables for Kerberos and Ranger passwords, along with a command to create the cluster with the necessary components and properties enabled.\u003c/p\u003e\n"],["\u003cp\u003eThe plugin, by default, includes policies for reading from and writing to Dataproc's staging and temporary buckets, as well as an "all" policy that grants all users metadata access for all objects, essential for Hadoop Compatible Filesystem (HCFS) operations.\u003c/p\u003e\n"],["\u003cp\u003eFor enhanced security, it is crucial to restrict direct access to Cloud Storage buckets by users, grant the VM service account access to bucket resources, disable \u003ccode\u003esudo\u003c/code\u003e on cluster VMs, and potentially use \u003ccode\u003eiptable\u003c/code\u003e rules to block access to the VM metadata server.\u003c/p\u003e\n"],["\u003cp\u003eSpark, Hive-on-MapReduce, and Hive-on-Tez jobs utilize delegation tokens to access Cloud Storage through the Ranger Cloud Storage plugin, which means configurations like \u003ccode\u003e--conf spark.yarn.access.hadoopFileSystems=gs://bucket-name\u003c/code\u003e are required for Spark jobs in Kerberos enabled environments.\u003c/p\u003e\n"]]],[],null,["The Dataproc Ranger Cloud Storage plugin, available with\nDataproc image versions 1.5 and 2.0, activates an authorization service\non each Dataproc cluster VM. The authorization service evaluates\nrequests from the [Cloud Storage connector](/dataproc/docs/concepts/connectors/cloud-storage)\nagainst Ranger policies and, if the request is allowed, returns an access token\nfor the cluster\n[VM service account](/dataproc/docs/concepts/configuring-clusters/service-accounts#VM_service_account).\n\nThe Ranger Cloud Storage plugin relies on\n[Kerberos](https://web.mit.edu/kerberos/) for authentication,\nand integrates with Cloud Storage connector support for delegation tokens.\nDelegation tokens are stored in a\n[MySQL](https://dev.mysql.com/doc/refman/5.7/en/)\ndatabase on the cluster master node. The root password for the database is\nspecified through cluster properties when you\n[create the Dataproc cluster](#create_a_dataproc_cluster).\n| Use the default **KMS symmetric encryption**, which includes message authentication. 
Before you begin

Grant the [Service Account Token Creator](/iam/docs/understanding-roles#iam.serviceAccountTokenCreator) role and the [IAM Role Admin](/iam/docs/understanding-roles#iam.roleAdmin) role on the [Dataproc VM service account](/dataproc/docs/concepts/configuring-clusters/service-accounts#VM_service_account) in your project.

Install the Ranger Cloud Storage plugin

Run the following commands in a local terminal window or in [Cloud Shell](https://console.cloud.google.com/?cloudshell=true) to install the Ranger Cloud Storage plugin when you create a Dataproc cluster.

Set environment variables

```
export CLUSTER_NAME=new-cluster-name
export REGION=region
export KERBEROS_KMS_KEY_URI=Kerberos-KMS-key-URI
export KERBEROS_PASSWORD_URI=Kerberos-password-URI
export RANGER_ADMIN_PASSWORD_KMS_KEY_URI=Ranger-admin-password-KMS-key-URI
export RANGER_ADMIN_PASSWORD_GCS_URI=Ranger-admin-password-GCS-URI
export RANGER_GCS_PLUGIN_MYSQL_KMS_KEY_URI=MySQL-root-password-KMS-key-URI
export RANGER_GCS_PLUGIN_MYSQL_PASSWORD_URI=MySQL-root-password-GCS-URI
```

Notes:

- CLUSTER_NAME: The name of the new cluster.
- REGION: The [region](/compute/docs/regions-zones#available) where the cluster will be created, for example, `us-west1`.
- KERBEROS_KMS_KEY_URI and KERBEROS_PASSWORD_URI: See [Set up your Kerberos root principal password](/dataproc/docs/concepts/configuring-clusters/security#set_up_your_kerberos_root_principal_password).
- RANGER_ADMIN_PASSWORD_KMS_KEY_URI and RANGER_ADMIN_PASSWORD_GCS_URI: See [Set up your Ranger admin password](/dataproc/docs/concepts/components/ranger#installation_steps).
- RANGER_GCS_PLUGIN_MYSQL_KMS_KEY_URI and RANGER_GCS_PLUGIN_MYSQL_PASSWORD_URI: Set up a MySQL password following the same procedure that you used to [Set up a Ranger admin password](/dataproc/docs/concepts/components/ranger#installation_steps).
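As an illustration only, a filled-in set of exports might look like the following. The project, key ring, key, and bucket names are hypothetical and must be replaced with your own resources.

```
export CLUSTER_NAME=ranger-gcs-cluster
export REGION=us-central1
export KERBEROS_KMS_KEY_URI=projects/my-project/locations/global/keyRings/ranger-keyring/cryptoKeys/kerberos-key
export KERBEROS_PASSWORD_URI=gs://my-secrets-bucket/kerberos-root-password.encrypted
export RANGER_ADMIN_PASSWORD_KMS_KEY_URI=projects/my-project/locations/global/keyRings/ranger-keyring/cryptoKeys/ranger-key
export RANGER_ADMIN_PASSWORD_GCS_URI=gs://my-secrets-bucket/ranger-admin-password.encrypted
export RANGER_GCS_PLUGIN_MYSQL_KMS_KEY_URI=projects/my-project/locations/global/keyRings/ranger-keyring/cryptoKeys/mysql-key
export RANGER_GCS_PLUGIN_MYSQL_PASSWORD_URI=gs://my-secrets-bucket/mysql-root-password.encrypted
```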
Create a Dataproc cluster

Run the following command to create a Dataproc cluster and install the Ranger Cloud Storage plugin on the cluster.

```
gcloud dataproc clusters create ${CLUSTER_NAME} \
    --region=${REGION} \
    --scopes cloud-platform \
    --enable-component-gateway \
    --optional-components=SOLR,RANGER \
    --kerberos-kms-key=${KERBEROS_KMS_KEY_URI} \
    --kerberos-root-principal-password-uri=${KERBEROS_PASSWORD_URI} \
    --properties="dataproc:ranger.gcs.plugin.enable=true,\
dataproc:ranger.kms.key.uri=${RANGER_ADMIN_PASSWORD_KMS_KEY_URI},\
dataproc:ranger.admin.password.uri=${RANGER_ADMIN_PASSWORD_GCS_URI},\
dataproc:ranger.gcs.plugin.mysql.kms.key.uri=${RANGER_GCS_PLUGIN_MYSQL_KMS_KEY_URI},\
dataproc:ranger.gcs.plugin.mysql.password.uri=${RANGER_GCS_PLUGIN_MYSQL_PASSWORD_URI}"
```

Notes:

- **1.5 image version:** If you are creating a 1.5 image version cluster (see [Selecting versions](/dataproc/docs/concepts/versioning/overview#selecting_versions)), add the `--metadata=GCS_CONNECTOR_VERSION="2.2.6"` flag (version 2.2.6 or higher) to install the required connector version.

Verify Ranger Cloud Storage plugin installation

After the cluster creation completes, a `GCS` service type, named `gcs-dataproc`, appears in the [Ranger admin web interface](/dataproc/docs/concepts/accessing/dataproc-gateways#viewing_and_accessing_component_gateway_urls).

Ranger Cloud Storage plugin default policies

The default `gcs-dataproc` service has the following policies:

- Policies to read from and write to the Dataproc cluster [staging and temp buckets](/dataproc/docs/concepts/configuring-clusters/staging-bucket)

- An `all - bucket, object-path` policy, which allows all users to access metadata for all objects. This access is required to allow the Cloud Storage connector to perform HCFS ([Hadoop Compatible Filesystem](https://cwiki.apache.org/confluence/display/HADOOP2/HCFS)) operations.

Usage tips

App access to bucket folders

To accommodate apps that create intermediate files in a Cloud Storage bucket, you can grant `Modify Objects`, `List Objects`, and `Delete Objects` permissions on the Cloud Storage bucket path, then select `recursive` mode to extend the permissions to sub-paths of the specified path (a sketch of such a policy appears after the following note).

| **Note:** By default, gcloud CLI jobs have access to all Cloud Storage resources.
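As an illustration, the following is a rough sketch of how such a bucket-path policy might be created with the Ranger admin REST API instead of the web interface. The service name `gcs-dataproc` and the `bucket`/`object-path` resource names follow the default service described above; the admin URL, user name, bucket, path, and access-type identifiers shown here are assumptions and should be checked against the access types displayed in your Ranger admin web interface.

```
# Hypothetical example: create a recursive policy on gs://my-app-bucket/intermediate/
# for the user "app-user". RANGER_ADMIN_URL, the credentials, and the access-type
# names are placeholders/assumptions.
curl -u admin:${RANGER_ADMIN_PASSWORD} \
  -H "Content-Type: application/json" \
  -X POST "${RANGER_ADMIN_URL}/service/public/v2/api/policy" \
  -d '{
    "service": "gcs-dataproc",
    "name": "app-intermediate-files",
    "resources": {
      "bucket": {"values": ["my-app-bucket"]},
      "object-path": {"values": ["intermediate/"], "isRecursive": true}
    },
    "policyItems": [{
      "users": ["app-user"],
      "accesses": [
        {"type": "list_objects", "isAllowed": true},
        {"type": "modify_objects", "isAllowed": true},
        {"type": "delete_objects", "isAllowed": true}
      ]
    }]
  }'
```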
Protective measures

To help prevent circumvention of the plugin:

- Grant the [VM service account](/dataproc/docs/concepts/configuring-clusters/service-accounts#VM_service_account) access to the resources in your Cloud Storage buckets to allow it to grant access to those resources with down-scoped access tokens (see [IAM permissions for Cloud Storage](/storage/docs/access-control/iam-permissions#object_permissions)). Also, remove user access to bucket resources to avoid direct bucket access by users.

- Disable `sudo` and other means of root access on cluster VMs, including updating the `sudoer` file, to prevent impersonation or changes to authentication and authorization settings. For more information, see the Linux instructions for adding and removing `sudo` user privileges.

- Use `iptables` to block direct access requests to Cloud Storage from cluster VMs. For example, you can block access to the VM metadata server to prevent access to the VM service account credential or access token used to authenticate and authorize access to Cloud Storage (see [`block_vm_metadata_server.sh`](https://github.com/GoogleCloudDataproc/initialization-actions/blob/master/ranger/block_vm_metadata_server.sh), an [initialization script](/dataproc/docs/concepts/configuring-clusters/init-actions) that uses `iptables` rules to block access to the VM metadata server).

Spark, Hive-on-MapReduce, and Hive-on-Tez jobs

To protect sensitive user authentication details and to reduce load on the Key Distribution Center (KDC), the Spark driver does not distribute Kerberos credentials to executors. Instead, the Spark driver obtains a delegation token from the Ranger Cloud Storage plugin, and then distributes the delegation token to executors. Executors use the delegation token to authenticate to the Ranger Cloud Storage plugin, trading it for a Google access token that allows access to Cloud Storage.

Hive-on-MapReduce and Hive-on-Tez jobs also use tokens to access Cloud Storage. Use the following properties to obtain tokens for the specified Cloud Storage buckets when you submit the following job types (a hypothetical submission example follows the list):

- **Spark jobs:**

  ```
  --conf spark.yarn.access.hadoopFileSystems=gs://bucket-name,gs://bucket-name,...
  ```

- **Hive-on-MapReduce jobs:**

  ```
  --hiveconf "mapreduce.job.hdfs-servers=gs://bucket-name,gs://bucket-name,..."
  ```

- **Hive-on-Tez jobs:**

  ```
  --hiveconf "tez.job.fs-servers=gs://bucket-name,gs://bucket-name,..."
  ```
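For illustration, a Hive-on-Tez invocation might look like the following sketch. It assumes a Kerberized HiveServer2 on the cluster master, a Kerberos principal for the submitting user, and an external table named `logs` backed by objects in `gs://my-data-bucket`; all of these names are placeholders, not values defined by the plugin.

```
# Obtain Kerberos credentials for the submitting user (placeholder principal).
kinit app-user@EXAMPLE.REALM

# Run a Hive-on-Tez query through beeline; tez.job.fs-servers requests a
# delegation token covering the listed Cloud Storage bucket.
beeline -u "jdbc:hive2://$(hostname -f):10000/default;principal=hive/_HOST@EXAMPLE.REALM" \
    --hiveconf "tez.job.fs-servers=gs://my-data-bucket" \
    -e "SELECT COUNT(*) FROM logs;"
```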
Spark job scenario

A Spark wordcount job fails when run from a terminal window on a Dataproc cluster VM that has the Ranger Cloud Storage plugin installed.

```
spark-submit \
    --conf spark.yarn.access.hadoopFileSystems=gs://${FILE_BUCKET} \
    --class org.apache.spark.examples.JavaWordCount \
    /usr/lib/spark/examples/jars/spark-examples.jar \
    gs://bucket-name/wordcount.txt
```

Notes:

- FILE_BUCKET: Cloud Storage bucket for Spark access.

Error output:

```
Caused by: com.google.gcs.ranger.client.shaded.io.grpc.StatusRuntimeException: PERMISSION_DENIED:
Access denied by Ranger policy: User: '<USER>', Bucket: '<dataproc_temp_bucket>',
Object Path: 'a97127cf-f543-40c3-9851-32f172acc53b/spark-job-history/', Action: 'LIST_OBJECTS'
```

Notes:

- `spark.yarn.access.hadoopFileSystems=gs://${FILE_BUCKET}` is required in a Kerberos-enabled environment.

If the user has not obtained Kerberos credentials with `kinit`, the job instead fails with the following error.

Error output:

```
Caused by: java.lang.RuntimeException: Failed creating a SPNEGO token.
Make sure that you have run `kinit` and that your Kerberos configuration is correct.
See the full Kerberos error message: No valid credentials provided
(Mechanism level: No valid credentials provided)
```

A policy is edited using the **Access Manager** in the **Ranger admin web interface** to add `username` to the list of users who have `List Objects` and other `temp` bucket permissions.

Running the job generates a new error.

Error output:

```
com.google.gcs.ranger.client.shaded.io.grpc.StatusRuntimeException: PERMISSION_DENIED:
Access denied by Ranger policy: User: <USER>, Bucket: '<file-bucket>',
Object Path: 'wordcount.txt', Action: 'READ_OBJECTS'
```

A policy is added to grant the user read access to the `wordcount.txt` Cloud Storage path.

The job then runs and completes successfully.

```
INFO com.google.cloud.hadoop.fs.gcs.auth.GcsDelegationTokens:
Using delegation token RangerGCSAuthorizationServerSessionToken
owner=<USER>, renewer=yarn, realUser=, issueDate=1654116824281,
maxDate=0, sequenceNumber=0, masterKeyId=0
this: 1
is: 1
a: 1
text: 1
file: 1
22/06/01 20:54:13 INFO org.sparkproject.jetty.server.AbstractConnector: Stopped
```