This document describes the new fine-grained authorization feature, available in Cloud Data Fusion version 6.5 and later.
You can use role-based access control (RBAC) to restrict access within the environments where you develop your pipelines in Cloud Data Fusion. RBAC helps you manage who has access to which Cloud Data Fusion resources, what they can do with those resources, and what areas (for example, instances or namespaces) they have access to. Cloud Data Fusion RBAC is an authorization system that provides fine-grained access management powered by Identity and Access Management (IAM).
When to use RBAC
Role-based access control provides namespace-level isolation within a single Cloud Data Fusion instance. It is recommended for the following use cases:
- Helping minimize the number of instances used by your organization.
- Having multiple developers, teams, or business units use a single Cloud Data Fusion instance.
With Cloud Data Fusion RBAC, organizations can:
- Allow a user to only execute a pipeline within a namespace, but not modify artifacts or runtime compute profiles.
- Allow a user to only view the pipeline, but not modify or execute a pipeline.
- Allow a user to create, deploy, and execute a pipeline.
Recommended: Even when you use RBAC, keep separate projects and instances for development and production environments to maintain isolation, security, and performance stability.
- A Cloud Data Fusion IAM role contains a number of permissions.
- A user can be granted one or more roles at either the instance level or the namespace level.
- RBAC is only available in the Cloud Data Fusion Enterprise edition.
- Number of namespaces: No hard limit on the number of namespaces per instance.
- Namespace deletion is currently unsupported.
- Users: A maximum of 50 users per instance is supported.
- Custom roles: Creating custom roles is strongly discouraged. Future versions might be incompatible with custom roles you create in the current version.
- Cloud Data Fusion RBAC does not currently support authorization on Connection Management.
A role assignment consists of three elements: principal, role definition, and scope.
A principal (formerly known as a member) can be a Google Account (for end users), a service account (for apps and virtual machines), or a Google group that is requesting access to Cloud Data Fusion resources. You can assign a role to any of these principals.
A role contains a set of permissions that allows you to perform specific actions on Google Cloud resources.
Cloud Data Fusion provides several predefined roles that you can use.
- The Instance Admin role (datafusion.admin) lets principals create and delete namespaces, and grant permissions.
- The Developer role (datafusion.developer) lets principals create and delete pipelines, deploy pipelines, and run previews.
Scope is the set of resources that the access applies to. When you assign a role, you can further limit the actions allowed by defining a scope (for example, an instance or namespace). This is helpful if you want to assign someone the Developer role, but only for one namespace.
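The three elements of a role assignment can be sketched as a small data model. This is an illustrative sketch only, not the Cloud Data Fusion or IAM API: the class names `Scope` and `RoleAssignment`, the helper `roles_for`, and the instance, namespace, and principal values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Scope:
    """Where a role applies: a whole instance, or one namespace in it."""
    instance: str
    namespace: Optional[str] = None  # None means instance-level scope

    def covers(self, instance: str, namespace: str) -> bool:
        # A scope never applies to a different instance.
        if self.instance != instance:
            return False
        # Instance-level scope applies to every namespace in the instance.
        return self.namespace is None or self.namespace == namespace


@dataclass(frozen=True)
class RoleAssignment:
    principal: str  # Google Account, service account, or Google group
    role: str       # role definition, for example "datafusion.developer"
    scope: Scope


def roles_for(assignments: Iterable[RoleAssignment], principal: str,
              instance: str, namespace: str) -> Set[str]:
    """Return the roles a principal holds that apply to a namespace."""
    return {
        a.role
        for a in assignments
        if a.principal == principal and a.scope.covers(instance, namespace)
    }


# Hypothetical example: the Developer role, limited to one namespace.
assignments = [
    RoleAssignment("user:dev@example.com", "datafusion.developer",
                   Scope("my-instance", "sales")),
]
```

With these assignments, `roles_for(assignments, "user:dev@example.com", "my-instance", "sales")` returns the Developer role, while the same call for a different namespace returns an empty set.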
Predefined Cloud Data Fusion roles
Cloud Data Fusion RBAC includes several predefined roles that you can use:
|Predefined role||Description||Operations a principal can perform when this role is assigned to them in a namespace|
|Instance Admin role (datafusion.admin)||Grants access to all resources within a Cloud Data Fusion instance.||N/A. Not assigned at the namespace level.|
|Instance Accessor role||Grants the principal access to a Cloud Data Fusion instance, but not to any resources within the instance. Use this role in combination with other namespace-specific roles to provide fine-grained access to a namespace.|||
|Editor role||Grants the principal full access to all Cloud Data Fusion resources under a namespace within a Cloud Data Fusion instance. This role must be granted to the principal in addition to the Instance Accessor role. With this role, the principal can create, delete, and modify resources in the namespace.|||
|Developer role (datafusion.developer)||Grants access to a principal on a namespace to create and modify limited resources within the namespace (for example, pipelines).|||
|Operator role||Grants access to a principal on a namespace to access and run pipelines, change the compute profile, create compute profiles, or upload artifacts. Can perform the same actions as a developer, with the exception of previewing pipelines.|||
|Viewer role||Grants access to a principal on a namespace to view pipelines, but not to author or run pipelines.|||
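How the namespace-level roles relate to each other can be sketched as sets of operations. This is a simplified reading of the role descriptions above, not the actual IAM permission lists: the operation names are hypothetical, and the sketch assumes that every namespace-level role can also view pipelines.

```python
# Hypothetical operation names; real IAM permissions are named differently.
VIEWER = frozenset({"view_pipeline"})

# Assumption: a developer can do everything a viewer can, plus authoring.
DEVELOPER = VIEWER | {
    "create_pipeline",
    "delete_pipeline",
    "deploy_pipeline",
    "preview_pipeline",
}

# Same actions as a developer except previewing, plus running pipelines,
# managing compute profiles, and uploading artifacts.
OPERATOR = (DEVELOPER - {"preview_pipeline"}) | {
    "run_pipeline",
    "manage_compute_profile",
    "upload_artifact",
}

# Full access in the namespace, modeled as the union of the two roles.
EDITOR = DEVELOPER | OPERATOR

NAMESPACE_ROLE_OPERATIONS = {
    "Viewer": VIEWER,
    "Developer": DEVELOPER,
    "Operator": OPERATOR,
    "Editor": EDITOR,
}


def is_allowed(role: str, operation: str) -> bool:
    """Return True if the modeled role includes the operation."""
    return operation in NAMESPACE_ROLE_OPERATIONS.get(role, frozenset())
```

In this model, `is_allowed("Operator", "preview_pipeline")` is false and `is_allowed("Developer", "preview_pipeline")` is true, which mirrors the one documented difference between the two roles.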
Adopting a security model and tailoring it to your organization's needs and requirements can be challenging. The following recommendations are intended to simplify adopting the Cloud Data Fusion RBAC model:
- Grant the Instance Admin role cautiously. It gives full access to an instance and all of its underlying Cloud Data Fusion resources. A principal with this role can grant permissions to others through the REST API.
- Do not grant the Instance Admin role to principals who only need access to individual namespaces within a Cloud Data Fusion instance. Instead, grant the Instance Accessor role together with the Viewer, Developer, Operator, or Editor role on a subset of the namespaces.
- The Instance Accessor role is safe to assign first: it gives principals access to the instance, but not to any resources within it. This role is typically combined with the Viewer, Developer, Operator, or Editor role to give access to one or more namespaces within an instance.
- Assign the Viewer role to users or Google groups who want to check for themselves the status of running jobs, or view pipelines or logs, in Cloud Data Fusion instances. For example, consumers of daily reports who want to know whether processing has completed.
- Assign the Developer role to ETL developers who are responsible for creating, testing, and managing pipelines.
- Assign the Operator role for a namespace to users who provide operations administration or DevOps services. They can perform all the actions that developers can (except previewing pipelines), and can also deploy artifacts and manage compute profiles.
- The Editor role for a namespace is a privileged role that gives the user or Google group full access to all resources in the namespace. It can be considered the union of the Developer and Operator roles.
- Operators and admins should be wary of installing untrusted plugins or artifacts, because these can present a security risk.
- Learn how to get started with RBAC in Cloud Data Fusion.
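The role recommendations above can be sketched as a planning helper that pairs the Instance Accessor role with one namespace-level role. This is a minimal sketch for reasoning about grants, not a call into any Cloud Data Fusion or IAM API: the persona labels, the function `recommended_grants`, and the tuple layout are hypothetical.

```python
from typing import List, Optional, Tuple

# (principal, role, instance, namespace); namespace is None for
# instance-level grants.
Grant = Tuple[str, str, str, Optional[str]]

# Hypothetical persona labels mapped to the recommended namespace role.
NAMESPACE_ROLE_BY_PERSONA = {
    "report_consumer": "Viewer",
    "etl_developer": "Developer",
    "operations": "Operator",
    "namespace_owner": "Editor",
}


def recommended_grants(principal: str, persona: str, instance: str,
                       namespaces: List[str]) -> List[Grant]:
    """Plan grants: Instance Accessor first, then one role per namespace."""
    role = NAMESPACE_ROLE_BY_PERSONA[persona]  # KeyError for unknown personas
    # Instance Accessor lets the principal reach the instance, but gives no
    # access to resources inside it.
    grants: List[Grant] = [(principal, "Instance Accessor", instance, None)]
    # The namespace-level role is granted only on the listed namespaces.
    grants += [(principal, role, instance, ns) for ns in namespaces]
    return grants


plan = recommended_grants(
    "group:etl-team@example.com", "etl_developer", "my-instance", ["sales"]
)
```

Here `plan` holds two grants: Instance Accessor on the instance, followed by Developer on the `sales` namespace. The Instance Admin role is deliberately left out of the mapping, since it is not meant for principals who only need access to individual namespaces.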