Cloud Dataflow Access Control Guide

Overview

You can use Dataflow IAM roles to limit a user's access within a project or organization to Dataflow-related resources only, rather than granting viewer, editor, or owner access to the entire Cloud Platform project.

This page focuses on how to use Dataflow's IAM roles. For a detailed description of IAM and its features, see the Google Cloud Identity and Access Management developer's guide.

Every Dataflow method requires the caller to have the necessary permissions. For a list of the permissions and roles Dataflow supports, see the following section.

Permissions and Roles

This section summarizes the permissions and roles Dataflow IAM supports.

Required Permissions

The following table lists the permissions that the caller must have to call each method:

Method                          Required Permission(s)
dataflow.jobs.create            dataflow.jobs.create
dataflow.jobs.cancel            dataflow.jobs.cancel
dataflow.jobs.updateContents    dataflow.jobs.updateContents
dataflow.jobs.list              dataflow.jobs.list
dataflow.jobs.get               dataflow.jobs.get
dataflow.jobs.drain             dataflow.jobs.drain
dataflow.messages.list          dataflow.messages.list
dataflow.metrics.get            dataflow.metrics.get

Note: The Dataflow Worker role (roles/dataflow.worker) provides the permissions (dataflow.workItems.lease, dataflow.workItems.update, and dataflow.workItems.sendMessage) that a Compute Engine service account needs to execute work units for a Dataflow pipeline. Assign it only to such an account as a rule; beyond the job read permissions listed in the Roles table, it grants only the ability to request and update work from the Dataflow service.

Roles

The following table lists the Dataflow IAM roles with a corresponding list of all the permissions each role includes. Note that every permission is applicable to a particular resource type.

Role                        Includes permission(s)              For resource types

roles/dataflow.viewer       dataflow.<resource-type>.list       jobs, messages, metrics
                            dataflow.<resource-type>.get

roles/dataflow.developer    All of the above, as well as:       jobs
                            dataflow.jobs.create
                            dataflow.jobs.drain
                            dataflow.jobs.cancel

roles/dataflow.admin        All of the above, as well as:       NA
                            compute.machineTypes.get
                            storage.buckets.get
                            storage.objects.create
                            storage.objects.get
                            storage.objects.list

roles/dataflow.worker       dataflow.jobs.get                   NA
(for service accounts       dataflow.jobs.list
only)                       dataflow.workItems.lease
                            dataflow.workItems.update
                            dataflow.workItems.sendMessage
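
The cumulative structure of the roles table can be sketched as a small lookup in Python. This is an illustration only, not an IAM client: the names KNOWN_DATAFLOW_PERMISSIONS, ROLE_PERMISSIONS, and role_allows are hypothetical, and the sketch assumes that the viewer pattern dataflow.<resource-type>.list / .get expands to exactly the list and get permissions named in the Required Permissions table.

```python
# Permissions named in the "Required Permissions" table.
KNOWN_DATAFLOW_PERMISSIONS = frozenset({
    "dataflow.jobs.create",
    "dataflow.jobs.cancel",
    "dataflow.jobs.updateContents",
    "dataflow.jobs.list",
    "dataflow.jobs.get",
    "dataflow.jobs.drain",
    "dataflow.messages.list",
    "dataflow.metrics.get",
})

# Assumption: the viewer role covers the list/get permissions from the set above.
VIEWER = frozenset(
    p for p in KNOWN_DATAFLOW_PERMISSIONS if p.endswith((".list", ".get"))
)

# Developer: everything in viewer, plus job create/drain/cancel.
DEVELOPER = VIEWER | {
    "dataflow.jobs.create",
    "dataflow.jobs.drain",
    "dataflow.jobs.cancel",
}

# Admin: everything in developer, plus the Compute Engine and
# Cloud Storage permissions listed in the roles table.
ADMIN = DEVELOPER | {
    "compute.machineTypes.get",
    "storage.buckets.get",
    "storage.objects.create",
    "storage.objects.get",
    "storage.objects.list",
}

# Worker: listed explicitly in the table; it does not build on the other roles.
WORKER = frozenset({
    "dataflow.jobs.get",
    "dataflow.jobs.list",
    "dataflow.workItems.lease",
    "dataflow.workItems.update",
    "dataflow.workItems.sendMessage",
})

ROLE_PERMISSIONS = {
    "roles/dataflow.viewer": VIEWER,
    "roles/dataflow.developer": DEVELOPER,
    "roles/dataflow.admin": ADMIN,
    "roles/dataflow.worker": WORKER,
}


def role_allows(role: str, permission: str) -> bool:
    """Return True if the role, as modeled above, includes the permission."""
    return permission in ROLE_PERMISSIONS.get(role, frozenset())
```

For example, role_allows("roles/dataflow.viewer", "dataflow.jobs.create") is False under this model, while the same check against roles/dataflow.developer is True.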

Creating Jobs

To create a job, the following are required, at minimum:

  • The dataflow.developer role, to instantiate the job itself.
  • The Viewer role on the project, to access machine type information and view other settings.
  • The storage.objectAdmin role, to provide permission to stage files on Cloud Storage.

Alternatively, granting dataflow.admin alone serves the same purpose: it includes the minimal set of permissions needed to run and examine jobs.
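
The two alternatives described in this section can be expressed as a simple check. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes the short role names used in this section correspond to the full identifiers roles/dataflow.developer, roles/viewer, roles/storage.objectAdmin, and roles/dataflow.admin.

```python
# Each entry is one sufficient combination of roles for creating a job,
# following the two alternatives described in this section.
JOB_CREATION_ROLE_SETS = [
    frozenset({
        "roles/dataflow.developer",    # instantiate the job itself
        "roles/viewer",                # machine type information, other settings
        "roles/storage.objectAdmin",   # stage files on Cloud Storage
    }),
    frozenset({"roles/dataflow.admin"}),  # minimal set to run and examine jobs
]


def can_create_jobs(granted_roles) -> bool:
    """Return True if the granted roles cover at least one sufficient set."""
    granted = set(granted_roles)
    return any(required <= granted for required in JOB_CREATION_ROLE_SETS)


def missing_roles(granted_roles) -> list:
    """For each sufficient set, list the roles still missing from it."""
    granted = set(granted_roles)
    return [sorted(required - granted) for required in JOB_CREATION_ROLE_SETS]
```

A user holding only roles/dataflow.developer would fail the check; missing_roles would then report roles/storage.objectAdmin and roles/viewer for the first alternative, and roles/dataflow.admin for the second.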

Example Role Assignment

To illustrate the utility of the different Dataflow roles, consider the following breakdown:

  • The developer interacting with the Dataflow job will need the dataflow.developer role.
    • They will need the storage.objectAdmin role in order to stage the required files.
    • For debugging and quota checking, they will need the project Viewer role.
    • Absent other role assignments, this will allow the developer to create and cancel Dataflow jobs, but not interact with the individual VMs or access other Cloud services.
  • A developer who only needs to create and examine jobs needs just the dataflow.admin role.
  • The cloudservices account needs the iam.serviceAccountActor and compute.instanceAdmin.v1 roles on the project, in order to start VMs with the Compute Engine service account. Additionally, if native sources such as BigQuery are being used, it will need appropriate access to those systems.
  • The Compute Engine service account needs the dataflow.worker role to process data for the Dataflow service.
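
As a rough sketch, the breakdown above can be written down as a plan of members and roles, then grouped into the role-to-members bindings that an IAM policy uses. The member addresses below are hypothetical placeholders, the full role identifiers (with the roles/ prefix) are assumed from the short names in the list, and build_bindings is an illustrative helper rather than part of any Google Cloud library.

```python
from collections import defaultdict

# Hypothetical members; substitute the real user and service account addresses.
EXAMPLE_PLAN = {
    "user:developer@example.com": [
        "roles/dataflow.developer",
        "roles/storage.objectAdmin",
        "roles/viewer",
    ],
    "serviceAccount:cloudservices-account@example.com": [
        "roles/iam.serviceAccountActor",
        "roles/compute.instanceAdmin.v1",
    ],
    "serviceAccount:compute-engine-account@example.com": [
        "roles/dataflow.worker",
    ],
}


def build_bindings(plan: dict) -> list:
    """Group a member -> roles plan into a list of role -> members bindings."""
    members_by_role = defaultdict(set)
    for member, roles in plan.items():
        for role in roles:
            members_by_role[role].add(member)
    # Sort for a stable, reviewable result.
    return [
        {"role": role, "members": sorted(members)}
        for role, members in sorted(members_by_role.items())
    ]


if __name__ == "__main__":
    for binding in build_bindings(EXAMPLE_PLAN):
        print(binding["role"], "->", ", ".join(binding["members"]))
```

Grouping by role keeps the plan easy to review before it is applied at the project level through the console or other tooling.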

Assigning Dataflow Roles

Dataflow roles can currently be set on organizations and projects only.

To manage roles at the organizational level, see Access Control for Organizations Using IAM.

To set project-level roles, see Access control via the Google Cloud Platform Console.
