Role-based access control (RBAC) overview

This document describes the new fine-grained authorization feature, available in Cloud Data Fusion version 6.5 and later.

You can use role-based access control (RBAC) to restrict access within the environments where you develop your pipelines in Cloud Data Fusion. RBAC helps you manage who has access to which Cloud Data Fusion resources, what they can do with those resources, and what areas (for example, instances or namespaces) they have access to. Cloud Data Fusion RBAC is an authorization system that provides fine-grained access management powered by Identity and Access Management (IAM).

When to use RBAC

Role-based access control provides namespace-level isolation within a single Cloud Data Fusion instance. It is recommended for the following use cases:

  • Helping minimize the number of instances used by your organization.
  • Having multiple developers, teams, or business units use a single Cloud Data Fusion instance.

With Cloud Data Fusion RBAC, organizations can:

  • Allow a user to execute pipelines within a namespace, but not to modify artifacts or runtime compute profiles.
  • Allow a user to view a pipeline, but not to modify or execute it.
  • Allow a user to create, deploy, and execute a pipeline.

Recommended: Even when you use RBAC, use separate projects and instances for development and production environments to maintain isolation, security, and performance stability.

Limitations

  • A Cloud Data Fusion IAM role contains a number of permissions.
  • A user can be granted one or more roles at either the instance level or the namespace level.
  • RBAC is only available in the Cloud Data Fusion Enterprise edition.
  • Number of namespaces: No hard limit on the number of namespaces per instance.
  • Namespace deletion is currently unsupported.
  • Users: A maximum of 50 users per instance is supported.
  • Custom roles: Creating custom roles is strongly discouraged. Future versions might be incompatible with custom roles you create in the current version.
  • Cloud Data Fusion RBAC does not currently support authorization on Connection Management.

Role assignments

A role assignment consists of three elements: principal, role definition, and scope.

Principal

A principal (formerly known as a member) can be a Google Account (for end users), a service account (for apps and virtual machines), or a Google group that is requesting access to Cloud Data Fusion resources. You can assign a role to any of these principals.

Role definition

A role contains a set of permissions that allows you to perform specific actions on Google Cloud resources.

Cloud Data Fusion provides several predefined roles that you can use.

Examples:

  • The Instance Admin role (datafusion.admin) lets principals create and delete namespaces, and grant permissions.
  • The Developer role (datafusion.developer) lets principals create and delete pipelines, deploy pipelines, and run previews.

Scope

Scope is the set of resources that the access applies to. When you assign a role, you can further limit the actions allowed by defining a scope (for example, an instance or namespace). This is helpful if you want to assign somebody a developer role, but only for one namespace.
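The three elements of a role assignment can be pictured as a small record. The following is a minimal sketch, not the Cloud Data Fusion API: the class name `RoleAssignment`, its fields, the instance name `my-instance`, the namespace `sales`, and the principal string are all hypothetical and exist only to illustrate how principal, role, and scope fit together.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoleAssignment:
    """Hypothetical model of a role assignment: principal + role + scope."""
    principal: str                   # for example, a user, service account, or group
    role: str                        # a predefined role ID such as "datafusion.developer"
    instance: str                    # scope: the instance the role is granted on
    namespace: Optional[str] = None  # scope: None means the whole instance

    def covers(self, instance: str, namespace: str) -> bool:
        """Return True if this assignment's scope includes the given namespace."""
        if self.instance != instance:
            return False
        return self.namespace is None or self.namespace == namespace


# A developer role limited to a single namespace (all values are made up).
assignment = RoleAssignment(
    principal="user:alice@example.com",
    role="datafusion.developer",
    instance="my-instance",
    namespace="sales",
)
print(assignment.covers("my-instance", "sales"))
print(assignment.covers("my-instance", "finance"))
```

The `covers` method only answers whether a namespace falls inside the scope. Which actions are allowed inside that scope depends on the role itself.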

Predefined Cloud Data Fusion roles

Cloud Data Fusion RBAC includes several predefined roles that you can use:

Each entry lists the predefined role, its description, and the operations a principal can perform when the role is assigned to them in a namespace.
Instance Admin role (datafusion.admin)
Description: Grants access to all resources within a Cloud Data Fusion instance.
Operations in a namespace: N/A. This role is not assigned at the namespace level.
Instance Accessor role (datafusion.accessor)
Description: Grants the principal access to a Cloud Data Fusion instance, but not to any resources within the instance. Use this role in combination with other namespace-specific roles to provide fine-grained access to a namespace.
Operations in a namespace:
  • Can access instance
  • Cannot access namespace
  • Cannot create namespace
  • Cannot delete namespace
  • Cannot create pipeline
  • Cannot deploy pipeline
  • Cannot deploy Hub plugin
  • Cannot create schedule
  • Cannot change schedule
  • Cannot run pipeline
  • Cannot preview pipeline
  • Cannot delete pipeline
  • Cannot view tag
  • Cannot delete tag
  • Cannot view logs
  • Cannot view compute profile
  • Cannot modify compute profile
  • Cannot delete compute profile
Editor role (datafusion.editor)
Description: Grants the principal full access to all Cloud Data Fusion resources under a namespace within a Cloud Data Fusion instance. This role must be granted to the principal in addition to the Instance Accessor role. With this role, the principal can create, delete, and modify resources in the namespace.
Operations in a namespace:
  • Can view pipeline
  • Can create pipeline
  • Can deploy pipeline
  • Can delete pipeline
  • Can deploy Hub plugin
  • Can run pipeline
  • Can create schedule
  • Can change schedule
  • Can view logs
  • Can view lineage
  • Can create tag
  • Can delete tag
  • Can preview pipeline
  • Can create compute profile
  • Can modify compute profile
  • Can delete compute profile
  • Cannot grant or revoke permissions to other principals for this namespace
  • Cannot create namespace
  • Cannot modify namespace
  • Cannot delete namespace
  • Cannot access namespaces where they have not been granted a role
Developer role (datafusion.developer)
Description: Grants a principal access on a namespace to create and modify limited resources within the namespace (for example, pipelines).
Operations in a namespace:
  • Can view pipeline
  • Can create pipeline
  • Can deploy pipeline
  • Can modify pipeline
  • Can delete pipeline
  • Can run pipeline
  • Can view logs
  • Can view lineage
  • Can create tag
  • Can delete tag
  • Can preview pipeline
  • Can create schedule
  • Can change schedule
  • Can view secure keys
  • Can create secure keys
  • Can delete secure keys
  • Cannot deploy Hub plugin
  • Cannot create compute profile
  • Cannot modify compute profile
  • Cannot delete compute profile
  • Cannot grant or revoke permissions to other principals for this namespace
  • Cannot create namespace
  • Cannot delete namespace
  • Cannot access namespaces where they have not been granted a role
Operator role (datafusion.operator)
Description: Grants a principal access on a namespace to access and run pipelines, change the compute profile, create compute profiles, or upload artifacts. Can perform the same actions as a developer, with the exception of previewing pipelines.
Operations in a namespace:
  • Can execute pipeline
  • Can deploy pipeline
  • Can deploy Hub plugin
  • Can create schedule
  • Can change schedule
  • Can view pipeline
  • Can run pipeline
  • Can view logs
  • Can view tags
  • Can delete tags
  • Can create tags
  • Can view lineage
  • Can view secure keys
  • Can create secure keys
  • Can delete secure keys
  • Can create compute profile
  • Can modify compute profile
  • Can delete compute profile
  • Cannot create pipeline
  • Cannot modify pipeline
  • Cannot preview pipeline
  • Cannot create namespace
  • Cannot delete namespace
Viewer role (datafusion.viewer)
Description: Grants a principal access on a namespace to view pipelines, but not to author or run pipelines.
Operations in a namespace:
  • Can view a pipeline
  • Can view logs
  • Can view lineage
  • Can view tags
  • Cannot view secure keys
  • Cannot create secure keys
  • Cannot delete secure keys
  • Cannot deploy pipeline
  • Cannot deploy Hub plugin
  • Cannot delete pipeline
  • Cannot run pipeline
  • Cannot create namespace
  • Cannot modify namespace
  • Cannot delete namespace
  • Cannot create compute profile
  • Cannot modify compute profile
  • Cannot delete compute profile
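
The role listing above can be read as a lookup from role to allowed operations. The following sketch encodes only five operations that the listing states explicitly for every namespace-level role. The names `ROLE_OPERATIONS`, `TRACKED_OPERATIONS`, and `can_perform` are hypothetical helpers for illustration. They are not part of Cloud Data Fusion or IAM, and the real authorization checks are performed by the service.

```python
# Operations covered by this sketch. Each one is listed explicitly
# (as "Can" or "Cannot") for all five roles in the listing above.
TRACKED_OPERATIONS = {
    "deploy pipeline",
    "run pipeline",
    "deploy Hub plugin",
    "view logs",
    "modify compute profile",
}

# Allowed subset of TRACKED_OPERATIONS per role, transcribed from the listing.
ROLE_OPERATIONS = {
    "datafusion.accessor": set(),
    "datafusion.editor": set(TRACKED_OPERATIONS),
    "datafusion.developer": {"deploy pipeline", "run pipeline", "view logs"},
    "datafusion.operator": set(TRACKED_OPERATIONS),
    "datafusion.viewer": {"view logs"},
}


def can_perform(role: str, operation: str) -> bool:
    """Return True if the listing allows `operation` for `role` in a namespace."""
    if role not in ROLE_OPERATIONS:
        raise ValueError(f"role not covered by this sketch: {role}")
    if operation not in TRACKED_OPERATIONS:
        raise ValueError(f"operation not covered by this sketch: {operation}")
    return operation in ROLE_OPERATIONS[role]


print(can_perform("datafusion.developer", "run pipeline"))
print(can_perform("datafusion.developer", "deploy Hub plugin"))
print(can_perform("datafusion.viewer", "view logs"))
```

The sketch raises an error for operations it does not track, so that a missing entry is not mistaken for a denial.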

Security recommendations

Adopting a security model and tailoring it to your organization's needs and requirements can be challenging. The following recommendations are intended to simplify adopting Cloud Data Fusion's RBAC model:

  • Grant the Instance Admin role cautiously. This role enables full access to an instance and all its underlying Cloud Data Fusion resources. A principal with this role can grant permissions to others by using the REST API.
  • Do not grant the Instance Admin role when principals only need access to individual namespaces within a Cloud Data Fusion instance. Instead, grant the Instance Accessor role along with one of the Viewer, Developer, Operator, or Editor roles on a subset of the namespaces.
  • The Instance Accessor role is safe to assign first. It lets principals access the instance, but does not grant access to any resources within the instance. This role is typically used along with one of the Viewer, Developer, Operator, or Editor roles to give access to one or a subset of the namespaces within an instance.
  • The Viewer role is recommended for users or Google groups who want to self-serve when checking the status of running jobs, or when viewing pipelines or logs in Cloud Data Fusion instances. For example, consumers of daily reports who want to know whether processing has completed.
  • The Developer role is recommended for ETL developers who are responsible for creating, testing, and managing pipelines.
  • The Operator role for a namespace is recommended for users who provide operations admin or DevOps services. They can perform all actions that developers can perform (except for previewing pipelines), and can also deploy artifacts and manage compute profiles.
  • The Editor role for a namespace is a privileged role that gives the user or Google group full access to all resources in the namespace. The Editor role can be considered the union of the Developer and Operator roles.
  • Operators and admins should be wary of installing untrusted plugins or artifacts, because they can present a security risk.
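
The recommendation to pair the Instance Accessor role with a namespace-level role can be checked over a planned set of grants before you apply them. The following is a minimal sketch under stated assumptions: the function `missing_instance_access`, the tuple layout `(principal, role, namespace)`, and the example principals are hypothetical. The sketch also assumes that a principal holding the Instance Admin role already has instance-level access. It does not call any Cloud Data Fusion or IAM API.

```python
# Roles that the document describes as granted on a namespace.
NAMESPACE_ROLES = {
    "datafusion.viewer",
    "datafusion.developer",
    "datafusion.operator",
    "datafusion.editor",
}

# Roles that give access to the instance itself (Admin is an assumption here).
INSTANCE_ROLES = {"datafusion.accessor", "datafusion.admin"}


def missing_instance_access(grants):
    """Return principals with a namespace role but no instance-level role.

    `grants` is an iterable of (principal, role, namespace) tuples, where
    namespace is None for an instance-level grant.
    """
    grants = list(grants)
    has_instance_access = {
        principal
        for principal, role, namespace in grants
        if role in INSTANCE_ROLES and namespace is None
    }
    has_namespace_role = {
        principal
        for principal, role, namespace in grants
        if role in NAMESPACE_ROLES and namespace is not None
    }
    return sorted(has_namespace_role - has_instance_access)


planned_grants = [
    ("user:alice@example.com", "datafusion.accessor", None),
    ("user:alice@example.com", "datafusion.developer", "sales"),
    ("group:reports@example.com", "datafusion.viewer", "sales"),
]
print(missing_instance_access(planned_grants))
```

In this made-up plan, the reports group has a Viewer grant on a namespace but no Instance Accessor grant, so it is the only principal reported.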
