Role-based access control (RBAC) overview

This page describes fine-grained authorization with role-based access control (RBAC), which is available in Cloud Data Fusion versions 6.5 and later.

RBAC restricts access within the environments where you develop pipelines in Cloud Data Fusion. RBAC helps you manage who has access to Cloud Data Fusion resources, what they can do with those resources, and what areas (such as instances or namespaces) they can access. Cloud Data Fusion RBAC is an authorization system that provides fine-grained access management powered by Identity and Access Management (IAM).

When to use RBAC

Role-based access control provides namespace-level isolation within a single Cloud Data Fusion instance. It's recommended for the following use cases:

  • Helping minimize the number of instances used by your organization.
  • Having multiple developers, teams, or business units use a single Cloud Data Fusion instance.

With Cloud Data Fusion RBAC, organizations can:

  • Allow a user to only run a pipeline within a namespace, but not modify artifacts or runtime compute profiles.
  • Allow a user to view a pipeline, but not modify or run it.
  • Allow a user to create, deploy, and run a pipeline.

Recommended: Even when you use RBAC, to maintain isolation, security, and performance stability, use separate projects and instances for development and production environments.

Limitations

  • A user can be granted one or more roles at either the instance or the namespace level.
  • RBAC is only available in the Cloud Data Fusion Enterprise edition.
  • Number of namespaces: No hard limit on the number of namespaces per instance.
  • Users: A maximum of 50 users per instance are supported.
  • Custom roles: Creating custom RBAC roles isn't supported.
  • Cloud Data Fusion RBAC doesn't support authorization on Connection Management.
  • When you use service account OAuth access tokens to access RBAC-enabled instances running version 6.5, you must specify the following scopes, in particular the userinfo.email scope. Without them, requests fail with permission denied errors.
    • https://www.googleapis.com/auth/userinfo.email
    • https://www.googleapis.com/auth/cloud-platform or https://www.googleapis.com/auth/servicecontrol
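
The scope requirement in the last limitation can be checked before a token is requested. The following is a minimal Python sketch using only the standard library. The helper name `missing_scopes` and the constant names are hypothetical; only the scope URLs come from this page.

```python
# Scope URLs named in the limitation above.
USERINFO_EMAIL = "https://www.googleapis.com/auth/userinfo.email"
CLOUD_PLATFORM = "https://www.googleapis.com/auth/cloud-platform"
SERVICE_CONTROL = "https://www.googleapis.com/auth/servicecontrol"


def missing_scopes(requested):
    """Return the required scopes that are absent from `requested`.

    Hypothetical helper: userinfo.email is always required, and either
    cloud-platform or servicecontrol satisfies the second requirement.
    """
    requested = set(requested)
    missing = []
    if USERINFO_EMAIL not in requested:
        missing.append(USERINFO_EMAIL)
    if not requested & {CLOUD_PLATFORM, SERVICE_CONTROL}:
        missing.append(f"{CLOUD_PLATFORM} or {SERVICE_CONTROL}")
    return missing


# A scope list that satisfies both requirements.
SCOPES = [USERINFO_EMAIL, CLOUD_PLATFORM]
```

A list such as `SCOPES` is what you would pass as the scopes when creating service account credentials, for example through the `scopes` argument of `google.oauth2.service_account.Credentials.from_service_account_file` in the google-auth library.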

Role assignments

A role assignment consists of three elements: principal, role definition, and scope.

Principal

A principal (formerly known as a member) can be a Google Account (for end users), a service account (for apps and virtual machines), or a Google group that is requesting access to Cloud Data Fusion resources. You can assign a role to any of these principals.

Role definition

A role contains a set of permissions that allows you to perform specific actions on Google Cloud resources.

Cloud Data Fusion provides several predefined roles that you can use.

Examples:

  • The Instance Admin role (datafusion.admin) lets principals create and delete namespaces, and grant permissions.
  • The Developer role (datafusion.developer) lets principals create and delete pipelines, deploy pipelines, and run previews.

Scope

Scope is the set of resources that the access applies to. When you assign a role, you can further limit the actions allowed by defining a scope (for example, an instance or namespace). This is helpful if you want to assign somebody the Developer role, but only for one namespace.
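
The three elements of a role assignment can be modeled as a small record. The following is a minimal Python sketch, not the Cloud Data Fusion API: the `RoleAssignment` class, its field names, and the example principal, instance, and namespace are hypothetical. The split between instance-level and namespace-level roles is an assumption read from the role descriptions on this page.

```python
from dataclasses import dataclass
from typing import Optional

# Role IDs from this page, grouped by the level their descriptions mention.
# This grouping is an assumption, not an official classification.
NAMESPACE_ROLES = frozenset({
    "datafusion.viewer",
    "datafusion.operator",
    "datafusion.developer",
    "datafusion.editor",
})
INSTANCE_ROLES = frozenset({"datafusion.accessor", "datafusion.admin"})


@dataclass(frozen=True)
class RoleAssignment:
    principal: str                   # Google Account, service account, or Google group
    role: str                        # role definition, such as "datafusion.developer"
    instance: str                    # scope: the instance
    namespace: Optional[str] = None  # scope: None means the whole instance

    def __post_init__(self):
        # Reject combinations that the role descriptions don't allow.
        if self.role in NAMESPACE_ROLES:
            if self.namespace is None:
                raise ValueError(f"{self.role} needs a namespace scope")
        elif self.role in INSTANCE_ROLES:
            if self.namespace is not None:
                raise ValueError(f"{self.role} is scoped to the instance")
        else:
            raise ValueError(f"unknown role {self.role!r}; custom roles aren't supported")


# The Developer role, limited to one namespace.
assignment = RoleAssignment(
    principal="user:dev@example.com",
    role="datafusion.developer",
    instance="my-instance",
    namespace="team-a",
)
```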

Predefined Cloud Data Fusion roles

Cloud Data Fusion RBAC includes several predefined roles that you can use:

Instance Accessor role (datafusion.accessor)
Grants the principal access to a Cloud Data Fusion instance, but not to any resources within the instance. Use this role in combination with other namespace-specific roles to provide fine-grained access to namespaces.
Viewer role (datafusion.viewer)
Grants access to a principal on a namespace to view pipelines, but not to author or run pipelines.
Operator role (datafusion.operator)
Grants access to a principal on a namespace to access and run pipelines, change the compute profile, create compute profiles, or upload artifacts. Can perform the same actions as a developer, with the exception of previewing pipelines.
Developer role (datafusion.developer)
Grants access to a principal on a namespace to create and modify limited resources, such as pipelines, within the namespace.
Editor role (datafusion.editor)
Grants the principal full access to all Cloud Data Fusion resources under a namespace within a Cloud Data Fusion instance. This role must be granted in addition to the Instance Accessor role to the principal. With this role, the principal can create, delete, and modify resources in the namespace.
Instance Admin role (datafusion.admin)
Grants access to all resources within a Cloud Data Fusion instance. Assigned through IAM. Not assigned at the namespace level through RBAC.

Operations covered by the roles datafusion.accessor, datafusion.viewer, datafusion.operator, datafusion.developer, datafusion.editor, and datafusion.admin:

Instances
  • Access instance
Namespaces
  • Create namespace *
  • Access namespace with explicit access granted
  • Access namespace without explicit access granted *
  • Edit namespace
  • Delete namespace
Namespace service account
  • Add service account
  • Edit service account
  • Remove service account
  • Use service account
RBAC
  • Grant or revoke permissions for other principals in the namespace *
Schedules
  • Create schedule
  • View schedule
  • Change schedule
Compute profiles
  • Create compute profiles
  • View compute profiles
  • Edit compute profiles
  • Delete compute profiles
Connections
  • Create connections
  • View connections
  • Edit connections
  • Delete connections
  • Use connections
Pipelines
  • Create pipelines
  • View pipelines
  • Edit pipelines
  • Delete pipelines
  • Preview pipelines
  • Deploy pipelines
  • Run pipelines
Secure keys
  • Create secure keys
  • View secure keys
  • Delete secure keys
Tags
  • Create tags
  • View tags
  • Delete tags
Cloud Data Fusion Hub
  • Deploy plugins
Source Control Management
  • Configure source control repository
  • Sync pipelines from a namespace
Lineage
  • View lineage
Logs
  • View logs

* The principal must have the Data Fusion Admin IAM role, not the Instance Admin RBAC role.

Security recommendations

Adopting a security model and tailoring it to your organization's needs and requirements can be challenging. The following recommendations are intended to simplify adopting the Cloud Data Fusion RBAC model:

  • Grant the Instance Admin role cautiously. This role enables full access to an instance and all its underlying Cloud Data Fusion resources. A principal with this role can grant permissions to others by using the REST API.
  • Don't grant the Instance Admin role when principals only need access to individual namespaces within a Cloud Data Fusion instance. Instead, grant the Instance Accessor role together with one of the Viewer, Developer, Operator, or Editor roles on a subset of the namespaces.
  • The Instance Accessor role is safe to assign first: it lets principals access the instance, but doesn't grant access to any resources within it. This role is typically used along with one of the Viewer, Developer, Operator, or Editor roles to give access to one or a subset of the namespaces within an instance.
  • The Viewer role is recommended for users or Google groups who want to check the status of running jobs, or view pipelines or logs in Cloud Data Fusion instances, on a self-serve basis. For example, consumers of daily reports who want to know whether processing has completed.
  • The Developer role is recommended for ETL developers who are responsible for creating, testing, and managing pipelines.
  • The Operator role for a namespace is recommended for users who provide operations administration or DevOps services. They can perform all actions that developers can perform (except for previewing pipelines), and can also deploy artifacts and manage compute profiles.
  • The Editor role for a namespace is a privileged role that gives the user or Google group full access to all resources in the namespace. The Editor role can be considered the union of the Developer and Operator roles.
  • Operators and Admins should be wary of installing untrusted plugins or artifacts as this can present a security risk.

Troubleshooting

This section shows you how to resolve issues related to RBAC in Cloud Data Fusion.

A principal who has the Cloud Data Fusion Viewer role for a namespace in RBAC can edit pipelines

Access is based on a combination of IAM and RBAC roles, and IAM roles take precedence over RBAC roles. Check whether the principal has the Project Editor or Cloud Data Fusion Admin IAM role.
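
The check can be scripted against an exported IAM policy. The following is a minimal Python sketch that assumes the policy has the standard IAM shape (a `bindings` list of `role` and `members`), such as the JSON printed by `gcloud projects get-iam-policy PROJECT_ID --format=json`. The function name `overriding_roles` and the sample policy are hypothetical. The sketch only finds direct bindings: it doesn't expand Google group membership or policies inherited from a folder or organization.

```python
# IAM roles that this section names as overriding namespace-level RBAC roles:
# roles/editor is Project Editor, roles/datafusion.admin is Cloud Data Fusion Admin.
OVERRIDING_IAM_ROLES = {"roles/editor", "roles/datafusion.admin"}


def overriding_roles(policy, member):
    """Return the overriding IAM roles bound directly to `member`, sorted by name."""
    found = set()
    for binding in policy.get("bindings", []):
        role = binding.get("role")
        if role in OVERRIDING_IAM_ROLES and member in binding.get("members", []):
            found.add(role)
    return sorted(found)


# Hypothetical policy for illustration only.
SAMPLE_POLICY = {
    "bindings": [
        {"role": "roles/editor", "members": ["user:alice@example.com"]},
        {"role": "roles/viewer", "members": ["user:bob@example.com"]},
    ]
}
```

With this sample, `overriding_roles(SAMPLE_POLICY, "user:alice@example.com")` reports `roles/editor`, which would explain why a principal with only the Viewer RBAC role can still edit pipelines.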

A principal who has the Instance Admin role in RBAC can't view Cloud Data Fusion instances in the Google Cloud console

There is a known issue in Cloud Data Fusion where principals with the Instance Admin role cannot view instances in the Google Cloud console. To fix the issue, grant the principal either the Project Viewer role or one of the Cloud Data Fusion IAM roles, in addition to the Instance Admin role on the instance. This grants the principal Viewer access to all instances in the project.

Prevent a principal from viewing namespaces where they have no role

To prevent a principal from viewing namespaces where they have no role, make sure that they don't have the Project Viewer role or any of the Cloud Data Fusion IAM roles. Instead, grant the principal RBAC roles only in the namespaces where they need to operate.

The principal with this kind of access won't see the list of Cloud Data Fusion instances in the Google Cloud console. Instead, give them a direct link to the instance, similar to the following: https://INSTANCE_NAME-PROJECT_ID.REGION_NAME.datafusion.googleusercontent.com/
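
The link pattern above can be filled in programmatically. The following is a minimal Python sketch; the function name `instance_url` and the example values are hypothetical, and the pattern is copied from this page, so confirm the actual endpoint of your instance before sharing a link.

```python
def instance_url(instance_name, project_id, region_name):
    """Build a direct instance link from the pattern shown on this page."""
    return (
        f"https://{instance_name}-{project_id}."
        f"{region_name}.datafusion.googleusercontent.com/"
    )


# Hypothetical values for illustration only.
link = instance_url("my-instance", "my-project", "us-central1")
```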

When the principal opens the instance, Cloud Data Fusion displays a list of the namespaces where the principal has been granted an RBAC role.

Grant the Cloud Data Fusion Accessor role to a principal

The Accessor role is implicitly assigned to a principal when any other RBAC role is assigned to them for any Cloud Data Fusion instance. To verify whether a principal has that role on a particular instance, use the IAM Policy Analyzer.

What's next