Overview
Identity and Access Management (IAM) allows you to control user and group access to your project's resources. This document focuses on the IAM permissions relevant to Dataproc and the IAM roles that grant those permissions.
Dataproc Permissions
Dataproc permissions allow users to perform specific actions on Dataproc clusters, jobs, operations, and workflow templates. For example, the dataproc.clusters.create permission allows a user to create Dataproc clusters in your project. You don't grant permissions to users directly; instead, you grant them roles, which bundle one or more permissions.
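As a sketch of how role granting works, you can bind a role to a user at the project level with the gcloud command-line tool. The project ID, user email, and role below are placeholder values:

```shell
# Grant the Dataproc Editor role to a user at the project level.
# "my-project" and "alice@example.com" are placeholders.
gcloud projects add-iam-policy-binding my-project \
    --member="user:alice@example.com" \
    --role="roles/dataproc.editor"
```

The user then holds every permission bundled in roles/dataproc.editor.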
The following tables list the permissions necessary to call Dataproc APIs (methods). The tables are organized according to the APIs associated with each Dataproc resource (clusters, jobs, operations, and workflow templates).
Clusters Permissions
Method | Required Permission(s) |
---|---|
projects.regions.clusters.create 1, 2 | dataproc.clusters.create |
projects.regions.clusters.get | dataproc.clusters.get |
projects.regions.clusters.list | dataproc.clusters.list |
projects.regions.clusters.patch 1, 2, 3 | dataproc.clusters.update |
projects.regions.clusters.delete 1 | dataproc.clusters.delete |
projects.regions.clusters.start | dataproc.clusters.start |
projects.regions.clusters.stop | dataproc.clusters.stop |
projects.regions.clusters.getIamPolicy | dataproc.clusters.getIamPolicy |
projects.regions.clusters.setIamPolicy | dataproc.clusters.setIamPolicy |
Notes:
1. The dataproc.operations.get permission is also required to get status updates from the gcloud command-line tool.
2. The dataproc.clusters.get permission is also required to get the result of the operation from the gcloud command-line tool.
3. The dataproc.autoscalingPolicies.use permission is also required to enable an autoscaling policy on a cluster.
Jobs Permissions
Method | Required Permission(s) |
---|---|
projects.regions.jobs.submit 1, 2 | dataproc.jobs.create dataproc.clusters.use |
projects.regions.jobs.get | dataproc.jobs.get |
projects.regions.jobs.list | dataproc.jobs.list |
projects.regions.jobs.cancel 1 | dataproc.jobs.cancel |
projects.regions.jobs.patch 1 | dataproc.jobs.update |
projects.regions.jobs.delete 1 | dataproc.jobs.delete |
projects.regions.jobs.getIamPolicy | dataproc.jobs.getIamPolicy |
projects.regions.jobs.setIamPolicy | dataproc.jobs.setIamPolicy |
Notes:
1. The gcloud command-line tool additionally requires dataproc.jobs.get in order for the jobs submit, jobs wait, jobs update, jobs delete, and jobs kill commands to function properly.
2. The gcloud command-line tool additionally requires the dataproc.clusters.get permission to submit jobs. For an example of setting the permissions necessary for a user to run gcloud dataproc jobs submit on a specific cluster using Dataproc Granular IAM, see Submitting Jobs with Granular IAM.
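As a sketch of the granular approach mentioned above, a role can be bound on a single cluster rather than on the whole project. The cluster name, region, and member below are placeholder values:

```shell
# Allow one user to submit jobs to a specific cluster only.
# Cluster-level bindings are independent of project-level bindings.
gcloud dataproc clusters add-iam-policy-binding my-cluster \
    --region="us-central1" \
    --member="user:alice@example.com" \
    --role="roles/dataproc.editor"
```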
Operations Permissions
Method | Required Permission(s) |
---|---|
projects.regions.operations.get | dataproc.operations.get |
projects.regions.operations.list | dataproc.operations.list |
projects.regions.operations.cancel | dataproc.operations.cancel |
projects.regions.operations.delete | dataproc.operations.delete |
projects.regions.operations.getIamPolicy | dataproc.operations.getIamPolicy |
projects.regions.operations.setIamPolicy | dataproc.operations.setIamPolicy |
Workflow Template Permissions
Method | Required Permission(s) |
---|---|
projects.regions.workflowTemplates.instantiate | dataproc.workflowTemplates.instantiate |
projects.regions.workflowTemplates.instantiateInline | dataproc.workflowTemplates.instantiateInline |
projects.regions.workflowTemplates.create | dataproc.workflowTemplates.create |
projects.regions.workflowTemplates.get | dataproc.workflowTemplates.get |
projects.regions.workflowTemplates.list | dataproc.workflowTemplates.list |
projects.regions.workflowTemplates.update | dataproc.workflowTemplates.update |
projects.regions.workflowTemplates.delete | dataproc.workflowTemplates.delete |
projects.regions.workflowTemplates.getIamPolicy | dataproc.workflowTemplates.getIamPolicy |
projects.regions.workflowTemplates.setIamPolicy | dataproc.workflowTemplates.setIamPolicy |
Notes:
- Workflow Template permissions are independent of Cluster and Job permissions. A user without create cluster or submit job permissions may create and instantiate a Workflow Template.
- The gcloud command-line tool additionally requires the dataproc.operations.get permission to poll for workflow completion.
- The dataproc.operations.cancel permission is required to cancel a running workflow.
Autoscaling Policies Permissions
Method | Required Permission(s) |
---|---|
projects.regions.autoscalingPolicies.create | dataproc.autoscalingPolicies.create |
projects.regions.autoscalingPolicies.get | dataproc.autoscalingPolicies.get |
projects.regions.autoscalingPolicies.list | dataproc.autoscalingPolicies.list |
projects.regions.autoscalingPolicies.update | dataproc.autoscalingPolicies.update |
projects.regions.autoscalingPolicies.delete | dataproc.autoscalingPolicies.delete |
projects.regions.autoscalingPolicies.getIamPolicy | dataproc.autoscalingPolicies.getIamPolicy |
projects.regions.autoscalingPolicies.setIamPolicy | dataproc.autoscalingPolicies.setIamPolicy |
Notes:
- The dataproc.autoscalingPolicies.use permission is required to enable an autoscaling policy on a cluster with a clusters.patch method request.
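For example, attaching a policy to an existing cluster issues a clusters.patch request under the hood, so the caller needs dataproc.clusters.update in addition to dataproc.autoscalingPolicies.use. The cluster, region, and policy names below are placeholders:

```shell
# Enable an autoscaling policy on an existing cluster.
# This is a clusters.patch call: the caller needs
# dataproc.clusters.update and dataproc.autoscalingPolicies.use.
gcloud dataproc clusters update my-cluster \
    --region="us-central1" \
    --autoscaling-policy="my-policy"
```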
Dataproc Roles
Dataproc IAM roles are bundles of one or more permissions. You grant roles to users or groups to allow them to perform actions on the Dataproc resources in your project. For example, the Dataproc Viewer role contains the dataproc.*.get and dataproc.*.list permissions, which allow a user to get and list Dataproc clusters, jobs, and operations in a project.
The following table lists the Dataproc IAM roles and the permissions associated with each role:
Role ID | Permissions |
---|---|
roles/dataproc.admin | dataproc.*.getIamPolicy dataproc.*.setIamPolicy dataproc.*.create dataproc.*.get dataproc.*.list dataproc.*.delete dataproc.*.update dataproc.clusters.use dataproc.clusters.start dataproc.clusters.stop dataproc.jobs.cancel dataproc.workflowTemplates.instantiate dataproc.workflowTemplates.instantiateInline compute.machineTypes.get compute.machineTypes.list compute.networks.get compute.networks.list compute.projects.get compute.regions.get compute.regions.list compute.zones.get compute.zones.list resourcemanager.projects.get resourcemanager.projects.list |
roles/dataproc.editor | dataproc.*.create dataproc.*.get dataproc.*.list dataproc.*.delete dataproc.*.update dataproc.clusters.use dataproc.clusters.start dataproc.clusters.stop dataproc.jobs.cancel dataproc.workflowTemplates.instantiate dataproc.workflowTemplates.instantiateInline compute.machineTypes.get compute.machineTypes.list compute.networks.get compute.networks.list compute.projects.get compute.regions.get compute.regions.list compute.zones.get compute.zones.list resourcemanager.projects.get resourcemanager.projects.list |
roles/dataproc.viewer | dataproc.*.get dataproc.*.list compute.machineTypes.get compute.regions.get compute.regions.list compute.zones.get resourcemanager.projects.get resourcemanager.projects.list |
roles/dataproc.worker (for service accounts only) | dataproc.agents.* dataproc.tasks.* logging.logEntries.create monitoring.metricDescriptors.create monitoring.metricDescriptors.get monitoring.metricDescriptors.list monitoring.monitoredResourceDescriptors.get monitoring.monitoredResourceDescriptors.list monitoring.timeSeries.create storage.buckets.get storage.objects.create storage.objects.get storage.objects.list storage.objects.update storage.objects.delete storage.objects.getIamPolicy storage.objects.setIamPolicy |
Notes:
- "*" signifies "clusters," "jobs," or "operations," except that the only permissions associated with dataproc.operations are get, list, and delete.
- The compute permissions listed above are needed or recommended to create and view Dataproc clusters when using the Google Cloud Console or the Cloud SDK gcloud command-line tool.
- To allow a user to upload files, grant the Storage Object Creator role. To allow a user to view job output, grant the Storage Object Viewer role. Note that granting either of these Storage roles gives the user the ability to access any bucket in the project.
- A user must have the monitoring.timeSeries.list permission to view graphs on the Google Cloud Console→Dataproc→Cluster details Overview tab.
- A user must have the compute.instances.list permission to view instance status and the master-instance SSH menu on the Google Cloud Console→Dataproc→Cluster details VM Instances tab. For information on Compute Engine roles, see Compute Engine→Available IAM roles.
- To create a cluster with a user-specified service account, the specified service account must have all permissions granted by the Dataproc Worker role. Additional roles may be required depending on configured features; see Service Accounts for a list of additional roles.
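As a sketch of creating a cluster that runs as a user-specified service account (the cluster, region, and service account names are placeholders; the service account must already hold the Dataproc Worker role):

```shell
# Create a cluster whose VMs run as a user-specified service account.
# "my-sa@my-project.iam.gserviceaccount.com" is a placeholder and
# must have the permissions granted by the Dataproc Worker role.
gcloud dataproc clusters create my-cluster \
    --region="us-central1" \
    --service-account="my-sa@my-project.iam.gserviceaccount.com"
```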
Project Roles
You can also set permissions at the project level by using the IAM Project roles. Here is a summary of the permissions associated with IAM Project roles:
Project Role | Permissions |
---|---|
Project Viewer | All project permissions for read-only actions that preserve state (get, list) |
Project Editor | All Project Viewer permissions plus all project permissions for actions that modify state (create, delete, update, use, cancel, stop, start) |
Project Owner | All Project Editor permissions plus permissions to manage access control for the project (get/set IamPolicy) and to set up project billing |
IAM Roles and Dataproc Operations Summary
The following table summarizes the Dataproc operations available based on the role granted to the user, with caveats noted.
Operation | Project Editor | Project Viewer | Dataproc Admin | Dataproc Editor | Dataproc Viewer |
---|---|---|---|---|---|
Get/Set Dataproc IAM permissions | No | No | Yes | No | No |
Create cluster | Yes | No | Yes | Yes | No |
List clusters | Yes | Yes | Yes | Yes | Yes |
Get cluster details | Yes | Yes | Yes 1, 2 | Yes 1, 2 | Yes 1, 2 |
Update cluster | Yes | No | Yes | Yes | No |
Delete cluster | Yes | No | Yes | Yes | No |
Start/Stop cluster | Yes | No | Yes | Yes | No |
Submit job | Yes | No | Yes 3 | Yes 3 | No |
List jobs | Yes | Yes | Yes | Yes | Yes |
Get job details | Yes | Yes | Yes 4 | Yes 4 | Yes 4 |
Cancel job | Yes | No | Yes | Yes | No |
Delete job | Yes | No | Yes | Yes | No |
List operations | Yes | Yes | Yes | Yes | Yes |
Get operation details | Yes | Yes | Yes | Yes | Yes |
Delete operation | Yes | No | Yes | Yes | No |
Notes:
1. The performance graph is not available unless the user also has a role with the monitoring.timeSeries.list permission.
2. The list of VMs in the cluster will not include status information or an SSH link for the master instance unless the user also has a role with the compute.instances.list permission.
3. Jobs that include files to be uploaded cannot be submitted unless the user also has the Storage Object Creator role or has been granted write access to the staging bucket for the project.
4. Job output is not available unless the user also has the Storage Object Viewer role or has been granted read access to the staging bucket for the project.
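The Storage roles mentioned above are granted the same way as any other project-level role. The project ID and user email below are placeholders:

```shell
# Let a user read job output stored in the project's buckets.
gcloud projects add-iam-policy-binding my-project \
    --member="user:alice@example.com" \
    --role="roles/storage.objectViewer"
```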
Service accounts
When you call Dataproc APIs to perform actions in a project where your cluster is located, such as creating VM instances within the project, Dataproc performs these actions on your behalf by using a service account that has the permissions required to perform the actions. The service accounts listed below have the permissions required to perform Dataproc actions in the project where your cluster is located:
- Dataproc first attempts to use service-[project-number]@dataproc-accounts.iam.gserviceaccount.com.
- If that service account doesn't exist, Dataproc falls back to using the Google APIs service account, [project-number]@cloudservices.gserviceaccount.com.
If the cluster uses a Shared VPC network, a Shared VPC Admin must grant both of the above service accounts the role of Network User for the Shared VPC host project. For more information, refer to:
- Creating a cluster that uses a VPC network in another project
- Shared VPC documentation: configuring service accounts as Service Project Admins
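Granting the Network User role on the host project to both service accounts can be sketched as follows. The host project ID and the project number 123456789012 are placeholder values:

```shell
# Grant Network User on the Shared VPC host project to both
# Dataproc service accounts (project number is a placeholder).
gcloud projects add-iam-policy-binding host-project \
    --member="serviceAccount:service-123456789012@dataproc-accounts.iam.gserviceaccount.com" \
    --role="roles/compute.networkUser"
gcloud projects add-iam-policy-binding host-project \
    --member="serviceAccount:123456789012@cloudservices.gserviceaccount.com" \
    --role="roles/compute.networkUser"
```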
IAM management
You can get and set IAM policies using the Google Cloud Console, the IAM API, or the gcloud command-line tool.
- For the Google Cloud Console, see Access control via the Google Cloud Console.
- For the API, see Access control via the API.
- For the gcloud command-line tool, see Access control via the gcloud command-line tool.
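A sketch of reading and writing a cluster-level IAM policy with gcloud (cluster name and region are placeholders):

```shell
# Read the IAM policy for a cluster, edit it locally, then write it back.
gcloud dataproc clusters get-iam-policy my-cluster \
    --region="us-central1" > policy.yaml
# ...edit policy.yaml to add or remove bindings...
gcloud dataproc clusters set-iam-policy my-cluster policy.yaml \
    --region="us-central1"
```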