Use data lineage with Google Cloud systems

Enable data lineage in a Google Cloud project to begin automatically tracking lineage information for supported systems.

Roles and permissions

Data Catalog tracks lineage information automatically when you enable the Data Lineage API. You don't need any admin or editor roles to capture lineage for your data assets and access the lineage in the Google Cloud console. Standard viewer roles, as described in the Identity and Access Management section, are sufficient. For more information about granting roles, see Manage access. You can also assign a role at a higher level, such as the folder or organization (see Grant or revoke a single role).

Enable data lineage

  1. In the Google Cloud console, on the project selector page, select the project that contains the resources for which you want to track lineage.

    Go to project selector

  2. Enable the Data Lineage API and the Data Catalog API.

    Enable the APIs
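If you prefer to script step 2 instead of using the console button, the following is a minimal sketch that assembles a `gcloud services enable` command for both APIs. It assumes the Google Cloud CLI is installed and authenticated, and that the service names are `datalineage.googleapis.com` and `datacatalog.googleapis.com`. The helper name `build_enable_command` and the project ID `my-project` are hypothetical, chosen only for illustration. The sketch builds and prints the command and does not run it.

```python
import shlex

# Assumed service names for the two APIs named in step 2.
LINEAGE_SERVICES = ["datalineage.googleapis.com", "datacatalog.googleapis.com"]


def build_enable_command(project_id: str) -> list[str]:
    """Return the gcloud argument list that enables both APIs for one project."""
    if not project_id:
        raise ValueError("project_id must be non-empty")
    return ["gcloud", "services", "enable", *LINEAGE_SERVICES, "--project", project_id]


if __name__ == "__main__":
    # "my-project" is a placeholder; substitute your own project ID.
    print(shlex.join(build_enable_command("my-project")))
```

To execute the command rather than print it, you could pass the returned list to `subprocess.run(..., check=True)`.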

View lineage graphs in Dataplex UI

A lineage visualization graph displays the relationships between your project resources and the processes that created them. You can view data lineage information as a graph visualization in the Google Cloud console, or retrieve it as JSON data from the Data Lineage API.

  1. Open the Dataplex search page and find the asset for which you want to view lineage information.

    Open the Dataplex search page

    For more information, see How to search for data assets.

  2. On the entry details page, select the Lineage tab.

A sample graph shows data from two tables being transformed and then merged.
Figure 1. Example of a lineage visualization graph in Dataplex UI.

Select the process or data source buttons to display the details panel.
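As an alternative to the graph, lineage can be retrieved from the Data Lineage API as JSON. The following is a minimal sketch that prepares a request for the REST method `projects.locations.searchLinks`, which returns the links that lead into a given asset. The function name `build_search_links_request`, the project ID, the location, and the table name are hypothetical, and the `bigquery:PROJECT.DATASET.TABLE` form of the fully qualified name is an assumption to check against the API reference. The sketch only builds the URL and body; it does not send them.

```python
import json

# REST root of the Data Lineage API (v1).
API_ROOT = "https://datalineage.googleapis.com/v1"


def build_search_links_request(project_id, location, target_fqn, page_size=10):
    """Build the URL and JSON body for a searchLinks call.

    target_fqn is the fully qualified name of the asset whose
    upstream links you want to list.
    """
    url = f"{API_ROOT}/projects/{project_id}/locations/{location}:searchLinks"
    body = {
        "target": {"fullyQualifiedName": target_fqn},
        "pageSize": page_size,
    }
    return url, json.dumps(body)


if __name__ == "__main__":
    # All identifiers below are placeholders.
    url, body = build_search_links_request(
        "my-project", "us", "bigquery:my-project.my_dataset.my_table"
    )
    print(url)
    print(body)
```

Sending the request is a POST with an OAuth 2.0 bearer token in the `Authorization` header. To list downstream links instead, the body would carry a `source` reference in place of `target`.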

View lineage graphs in BigQuery UI

You can view a lineage graph directly in the BigQuery UI.

  1. In the Google Cloud console, go to the BigQuery page.

    Open the BigQuery page

  2. Open the table for which you want to see the data lineage.
  3. Click the Lineage tab.
  4. Select the process or data source buttons to display the details panel.

View lineage graphs in Vertex AI UI

Systems like Vertex AI Pipelines generate lineage data for Vertex AI models and datasets. You can view the lineage graph directly in the Vertex AI UI.

View lineage graphs for a managed dataset in Vertex AI

To view the lineage graph for a dataset, follow these instructions:

  1. In the Google Cloud console, go to the Datasets page.

    Open the Datasets page

  2. Click the dataset for which you want to see the data lineage.
  3. Click the Lineage tab.
  4. Select the process or data source buttons to display the details panel.

View lineage graphs for a model in Vertex AI

To view the lineage graph for a model, follow these instructions:

  1. In the Google Cloud console, go to the Model Registry page.

    Open the Model Registry page

  2. Click the model for which you want to see the data lineage.
  3. Click the Lineage tab.
  4. Select the process or data source buttons to display the details panel.

What's next