Data Catalog IAM

This document describes Identity and Access Management (IAM) roles that allow users to use Data Catalog to search and tag Google Cloud resources.

IAM terminology

Permissions
Checked at runtime to allow users to perform an operation or access a Google Cloud resource. Users are not granted permissions directly, but, instead, are granted roles that contain permissions.
Roles
A role is a predefined collection of permissions. Custom roles consisting of a custom collection of permissions are also allowed.
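
The relationship between roles and permissions can be sketched as a simple lookup. This is an illustration only, not how IAM is implemented: the ROLE_PERMISSIONS mapping and both helper functions are hypothetical, and the mapping lists just a small subset of each role's permissions, taken from the tables later in this document.

```python
# Illustrative subset only: real roles contain more permissions than shown here.
ROLE_PERMISSIONS = {
    "roles/datacatalog.dataSteward": {
        "datacatalog.entries.updateOverview",
        "datacatalog.entries.updateContacts",
    },
    "roles/bigquery.metadataViewer": {
        "bigquery.datasets.get",
        "bigquery.tables.get",
    },
}


def effective_permissions(granted_roles):
    """Return the union of the permissions contained in the granted roles."""
    permissions = set()
    for role in granted_roles:
        permissions |= ROLE_PERMISSIONS.get(role, set())
    return permissions


def is_allowed(granted_roles, permission):
    """Check a single permission against the roles a user holds."""
    return permission in effective_permissions(granted_roles)
```

A user is never handed "bigquery.tables.get" on its own; the check succeeds only because a granted role contains it.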

View Data Catalog roles

Within the Google Cloud console, perform the following steps:

  1. Go to the IAM & Admin > Roles page.

  2. In the Filter field, select Used in, type Data Catalog or Data Lineage, and press Enter.

  3. Click a role to view the permissions of the role in the right pane.

    For example, the Data Catalog Admin role has full access to all Data Catalog resources.

Predefined Data Catalog roles

Some predefined Data Catalog roles include the Data Catalog Admin, Data Catalog Viewer, and Data Catalog TagTemplate Creator. Some of these roles are described in the subsequent sections.

For a list and description of Data Catalog predefined roles and the permissions associated with each role, see Data Catalog roles.

Data Catalog Admin role

The roles/datacatalog.admin role has access to all the Data Catalog resources. A Data Catalog admin can add different types of users to a Data Catalog project.

DataCatalog Data Steward role

The roles/datacatalog.dataSteward role lets you add, edit, or delete the data stewards and the rich text overview for a data entry such as a BigQuery table.

Data Catalog Viewer role

To simplify gaining access to Google Cloud resources, Data Catalog provides the roles/datacatalog.viewer role with metadata read permission for all cataloged Google Cloud resources.

This role also grants the permissions to view Data Catalog tag templates and tags.

Grant the Data Catalog Viewer role on your project to allow users to view Google Cloud resources in Data Catalog.
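
As a rough sketch of what granting the role on a project amounts to, the following Python function adds a member to a role binding in an IAM policy represented as a dictionary with a bindings list. The member names and the sample policy are hypothetical, the function ignores conditional bindings, and it only edits a local copy; applying the policy to a project would still go through the console, the gcloud CLI, or the API.

```python
import copy


def add_binding(policy, role, member):
    """Return a copy of an IAM policy dict with `member` added to `role`."""
    updated = copy.deepcopy(policy)
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding.get("role") == role:
            members = binding.setdefault("members", [])
            if member not in members:
                members.append(member)
            return updated
    # No binding exists for this role yet, so create one.
    bindings.append({"role": role, "members": [member]})
    return updated


# Hypothetical starting policy and principal, for illustration only.
policy = {"bindings": [{"role": "roles/owner", "members": ["user:admin@example.com"]}]}
policy = add_binding(policy, "roles/datacatalog.viewer", "user:analyst@example.com")
```

Returning a copy rather than mutating the input keeps the original policy available for comparison before anything is written back.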

Data Catalog TagTemplate Creator role

The roles/datacatalog.tagTemplateCreator role lets users create tag templates.

DataCatalog Search Admin role

The roles/datacatalog.searchAdmin role lets users retrieve, through search, all cataloged Google Cloud resources within a project or organization.

Predefined data lineage roles

To access the lineage graph for any Data Catalog entry, the user needs access to the entry in Data Catalog. To access the Data Catalog entry, the user needs a viewer role on the corresponding system resource or Data Catalog Viewer (roles/datacatalog.viewer) on the project that stores the Data Catalog entry. This section describes roles that are required to view and manipulate the lineage graph.

Lineage viewer role

The Data Lineage Viewer (roles/datalineage.viewer) role allows users to view Dataplex lineage graphs in the Google Cloud console and read lineage information using the Data Lineage API. The runs and events for a given process are all stored in the same project as the process. In the case of automated lineage, the process, runs, and events are stored in the project in which the job that generated the lineage was running, for example, the project in which a BigQuery job ran.

You need different roles to view the lineage between assets in the graph and to view metadata of the assets on the graph. For the former, you need Data Lineage Viewer (roles/datalineage.viewer). For the latter, you need the same roles as used for accessing metadata entries in Data Catalog. The following two subsections provide more detail.

Roles to view lineage between two assets

To view lineage between assets on the lineage graph, the user needs Data Lineage Viewer (roles/datalineage.viewer) on the following projects:

  • The project from which the user is viewing lineage (known as the active project), that is, the project selected in the drop-down at the top of the Google Cloud console or the project from which API calls are made. This is normally the Data Catalog resource project.
  • The projects in which lineage is recorded (known as compute projects). Lineage is stored in the project in which the corresponding process was executed, as described above. This project can be different from the project that stores the asset that the user is viewing lineage for.

For more information about granting roles, see Manage access. You might also be able to get the required permissions through custom roles or other predefined roles.

Depending on the use case, you might need to grant Data Lineage Viewer (roles/datalineage.viewer) at the folder or organization level to ensure that a user can access the full lineage graph (see Grant or revoke a single role). Roles required for data lineage can be granted only through the Google Cloud CLI.
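
Because the role is needed on both the active project and every compute project, it can help to generate one grant command per project. The sketch below builds gcloud projects add-iam-policy-binding invocations as argument lists. The function name, member, and project IDs are hypothetical, and the commands are only constructed, not executed.

```python
def lineage_viewer_grant_commands(member, active_project, compute_projects):
    """Build one gcloud command (as an argv list) per project that needs the role."""
    projects = [active_project]
    for project in compute_projects:
        if project not in projects:  # skip duplicates, keep a stable order
            projects.append(project)
    return [
        [
            "gcloud", "projects", "add-iam-policy-binding", project,
            f"--member={member}",
            "--role=roles/datalineage.viewer",
        ]
        for project in projects
    ]


# Hypothetical member and project IDs.
for command in lineage_viewer_grant_commands(
    "user:analyst@example.com", "catalog-project", ["etl-project"]
):
    print(" ".join(command))
```

For a folder-level or organization-level grant, the equivalent commands are gcloud resource-manager folders add-iam-policy-binding and gcloud organizations add-iam-policy-binding.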

Roles to view metadata of assets on the lineage graph

When metadata about an asset on the graph is stored in Data Catalog, the user can view that metadata only if they have a viewer role on the corresponding system resource or Data Catalog Viewer (roles/datacatalog.viewer) on the project in which the Data Catalog entry is stored.

Access to metadata of assets on the graph is independent of access to lineage. A user might have access to assets on the graph through appropriate viewer roles but be unable to access the lineage between them. This is the case when the user does not have Data Lineage Viewer (roles/datalineage.viewer) on the project in which the lineage was recorded. In this case, the Data Lineage API and UI do not show the lineage and do not return an error, to prevent leaking information about the existence of lineage. Therefore, the absence of lineage for an asset does not mean that no lineage exists for that asset; the user might simply lack access to it.
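
The silent-omission behavior described above can be modeled with a small filter. This is a simplified sketch, not the Data Lineage API: the LineageLink type and visible_links function are hypothetical, and the model considers only the project in which each link was recorded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageLink:
    source: str
    target: str
    recorded_in_project: str  # project in which the process ran


def visible_links(links, lineage_viewer_projects):
    """Return only the links recorded in projects where the user has lineage access.

    Links recorded elsewhere are omitted silently; no error is raised.
    """
    return [
        link for link in links
        if link.recorded_in_project in lineage_viewer_projects
    ]
```

An empty result is therefore ambiguous to the caller: it can mean that no lineage exists or that the caller cannot see it.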

Data Lineage Events Producer role

The roles/datalineage.producer role lets users manually record lineage information using the Data Lineage API.

Data Lineage Editor role

The roles/datalineage.editor role lets users manually modify lineage information using the Data Lineage API.

Data Lineage Administrator role

The roles/datalineage.admin role lets users perform all lineage operations listed in this section.

Roles to view public and private tags

You can search for public tags using simple search. You can view a data entry, including its public tags, as long as you have the required permissions to view the data entry. No additional permissions on the tag template are required. For permissions required to view the data entry, see the table in this section.

However, we recommend also granting the datacatalog.tagTemplates.get permission to users who are expected to search for these public tags. This permission lets them also use the tag: search predicate or the tag template search facet on the Data Catalog search page.

For private tags, you need view permissions on both the tag template and the data entry to search for the tag and to see the tag in the entry detail page. Users must use the tag: search predicate or the tag template search facet to find the tags; simple search for private tags is not supported.

Notes:

  • The view permission needed on the private tag template is datacatalog.tagTemplates.getTag.

  • The view permissions on the data entry for both public and private tags are listed in the following table.

Resource: BigQuery datasets, tables, models, routines, and connections
Permissions: bigquery.datasets.get, bigquery.tables.get, bigquery.models.getMetadata, bigquery.routines.get, bigquery.connections.get
Roles: roles/datacatalog.tagTemplateViewer, roles/bigquery.metadataViewer, roles/bigquery.connectionUser

Resource: Pub/Sub topics
Permissions: pubsub.topics.get
Roles: roles/datacatalog.tagTemplateViewer, roles/pubsub.viewer

Resource: Spanner instances, databases, tables, and views
Permissions: Instance: spanner.instances.get; Database: spanner.databases.get; Table: spanner.databases.get; Views: spanner.databases.get; datacatalog.tagTemplates.getTag
Roles: No predefined roles are available.

Resource: Bigtable instances and tables
Permissions: bigtable.instances.get, bigtable.tables.get, datacatalog.tagTemplates.getTag
Roles: roles/datacatalog.tagTemplateViewer, roles/bigtable.viewer

Resource: Dataproc Metastore services, databases, and tables
Permissions: metastore.tables.get, metastore.databases.get, metastore.services.get
Roles: roles/datacatalog.tagTemplateViewer, roles/metastore.metadataViewer

Resource: Custom entries
Permissions: datacatalog.entries.get
Roles: No predefined roles are available.
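
The difference between public and private tags comes down to one extra check on the tag template. The following sketch expresses that rule. The function and its parameters are hypothetical, and the user's permissions are simplified to a flat set rather than being evaluated per resource.

```python
TEMPLATE_VIEW_PERMISSION = "datacatalog.tagTemplates.getTag"


def can_view_tag(is_public, user_permissions, entry_view_permission):
    """Decide whether a tag attached to a data entry is visible to the user.

    entry_view_permission is the read permission for the entry's resource
    type, for example "bigquery.tables.get" for a BigQuery table.
    """
    if entry_view_permission not in user_permissions:
        return False  # the entry itself is not visible
    if is_public:
        return True  # public tags need no permission on the tag template
    return TEMPLATE_VIEW_PERMISSION in user_permissions
```

For example, a user holding only "bigquery.tables.get" sees public tags on a BigQuery table but not private ones.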

Roles to search Google Cloud resources

Before searching, discovering, or displaying Google Cloud resources, Data Catalog checks that the user has been granted an IAM role with the metadata read permissions required by BigQuery, Pub/Sub, Dataproc Metastore, or other source system to access the resource.

Example: Data Catalog checks that the user has been granted a role with bigquery.tables.get permission before displaying BigQuery table metadata.
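
The check can be pictured as a filter applied to candidate results. The sketch below is an illustration under simplifying assumptions, not Data Catalog's implementation: the resource type keys and the function are hypothetical, and permissions are modeled as a flat set instead of being evaluated against each individual resource.

```python
# Read permission per resource type; the permission names come from this document.
READ_PERMISSION = {
    "bigquery_dataset": "bigquery.datasets.get",
    "bigquery_table": "bigquery.tables.get",
    "pubsub_topic": "pubsub.topics.get",
}


def filter_search_results(results, user_permissions):
    """Keep only the results whose source-system read permission the user holds.

    results is a list of (resource_type, name) pairs. Unknown resource types
    are dropped because no read permission can be confirmed for them.
    """
    return [
        (resource_type, name)
        for resource_type, name in results
        if READ_PERMISSION.get(resource_type) in user_permissions
    ]
```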

The following table lists the permissions and the associated roles needed for a user to use Data Catalog to search the listed Google Cloud resources.

Resource: BigQuery datasets, tables, models, routines, and connections
Permissions: bigquery.datasets.get, bigquery.tables.get, bigquery.models.getMetadata, bigquery.routines.get, bigquery.connections.get
Roles: roles/bigquery.metadataViewer, roles/bigquery.connectionUser. Also see Data Catalog Viewer role.

Resource: Pub/Sub topics
Permissions: pubsub.topics.get
Roles: roles/pubsub.viewer. Also see Data Catalog Viewer role.

Resource: Spanner databases and tables
Permissions: Instance: spanner.instances.get; Database: spanner.databases.get; Views: spanner.databases.get
Roles: No predefined roles are available.

Resource: Bigtable instances and tables
Permissions: bigtable.instances.get, bigtable.tables.get
Roles: roles/bigtable.viewer. Also see Data Catalog Viewer role.

Resource: Dataplex lakes, zones, tables, and filesets
Permissions: dataplex.lakes.get, dataplex.zones.get, dataplex.entities.get (tables and filesets)
Roles: No predefined roles are available.

Resource: Dataproc Metastore services, databases, and tables
Permissions: metastore.tables.get, metastore.databases.get, metastore.services.get
Roles: roles/metastore.metadataViewer

Roles to attach tags to Google Cloud resources

Attaching public and private tags to Google Cloud resources requires the same permissions.

Data Catalog lets users extend metadata on Google Cloud resources by attaching tags. One or more tags that can be attached to a resource are defined in a tag template.

When a user attempts to use the tag template to attach a tag to a Google Cloud resource, Data Catalog checks that the user has the required permissions to use the tag template and to update the resource metadata. Permissions are granted through IAM roles.

The following table lists the permissions and associated roles that a user needs to attach both public and private tags to the listed Google Cloud resources using Data Catalog. The listed roles might grant additional permissions beyond those needed to tag resources.

Notes:

  • The owner of a data entry has the datacatalog.entries.updateTag permission by default. All other users must be granted the roles/datacatalog.tagEditor role.

  • The datacatalog.tagTemplates.use permission is also required for all resources listed in the table.

Resource: BigQuery datasets, tables, models, routines, and connections
Permissions: bigquery.datasets.updateTag, bigquery.tables.updateTag, bigquery.models.updateTag, bigquery.routines.updateTag, bigquery.connections.updateTag
Roles: roles/datacatalog.tagTemplateUser, roles/datacatalog.tagEditor, roles/bigquery.dataEditor

Resource: Pub/Sub topics
Permissions: pubsub.topics.updateTag
Roles: roles/datacatalog.tagTemplateUser, roles/datacatalog.tagEditor, roles/pubsub.editor

Resource: Spanner databases and tables
Permissions: Instance: spanner.instances.UpdateTag; Database: spanner.databases.UpdateTag; Table: spanner.databases.UpdateTag; Views: spanner.databases.UpdateTag
Roles: No predefined roles are available.

Resource: Bigtable instances and tables
Permissions: bigtable.instances.update, bigtable.tables.update
Roles: roles/datacatalog.tagTemplateUser, roles/datacatalog.tagEditor, roles/bigtable.admin

Resource: Dataplex lakes, zones, tables, and filesets
Permissions: dataplex.lakes.update, dataplex.zones.update, dataplex.entities.update (tables and filesets)
Roles: No predefined roles are available.

Resource: Dataproc Metastore services, databases, and tables
Permissions: metastore.tables.update, metastore.databases.update, metastore.services.update
Roles: roles/datacatalog.tagTemplateUser, roles/datacatalog.tagEditor, roles/metastore.editor, roles/metastore.metadataEditor
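
The tagging requirement combines the tag template permission with the update permission for the specific resource type. The following sketch reports which of the two a user still lacks. The function name is hypothetical, and permissions are again simplified to a flat set.

```python
TEMPLATE_USE_PERMISSION = "datacatalog.tagTemplates.use"


def missing_permissions_to_tag(user_permissions, resource_update_permission):
    """Return, in sorted order, the permissions a user still lacks to attach a tag.

    resource_update_permission is the entry for the resource type in the
    table above, for example "bigquery.tables.updateTag".
    """
    required = {TEMPLATE_USE_PERMISSION, resource_update_permission}
    return sorted(required - set(user_permissions))
```

An empty list means that both checks pass; a non-empty list can be used to decide which role to request.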

Custom roles for Google Cloud resources

Predefined editor roles for data entries from other Google Cloud systems might provide broader write access than required. Use custom roles to specify *.updateTag permissions only on a Google Cloud resource.
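
As one possible way to prepare such a custom role, the sketch below builds a role definition with the title, description, stage, and includedPermissions fields used by IAM custom role definition files. The function name, title, and description are hypothetical, and whether a given permission is supported in custom roles should be verified separately.

```python
import json


def tag_only_role_definition(update_tag_permissions):
    """Build a custom role definition that contains only *.updateTag permissions."""
    # Compare case-insensitively because some permissions are written ".UpdateTag".
    rejected = [
        p for p in update_tag_permissions
        if not p.lower().endswith(".updatetag")
    ]
    if rejected:
        raise ValueError(f"not an updateTag permission: {rejected}")
    return {
        "title": "Tag Attacher (example)",
        "description": "Attach Data Catalog tags without broader write access.",
        "stage": "GA",
        "includedPermissions": sorted(update_tag_permissions),
    }


definition = tag_only_role_definition(
    ["bigquery.tables.updateTag", "bigquery.datasets.updateTag"]
)
print(json.dumps(definition, indent=2))
```

Rejecting anything other than an updateTag permission keeps the role from quietly growing into a general editor role.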

Roles to modify rich text overview and data stewards in Data Catalog

Users need the following role to attach a rich text overview and assign data stewards to entries in Data Catalog:

Resource: Google Cloud projects
Permissions: datacatalog.entries.updateOverview, datacatalog.entries.updateContacts
Roles: roles/datacatalog.dataSteward

Identity federation in Data Catalog

Identity federation lets you use an external identity provider (IdP) to authenticate and authorize users to Google Cloud services with IAM.

Data Catalog supports identity federation with the following limitations:

For more information