View lineage in Dataplex

This page describes how to view the data lineage generated by your Cloud Data Fusion pipelines alongside other data movement on Google Cloud, for discovery and governance purposes. You can view lineage graphs for supported data sources on the Dataplex page in the console, or use the Data Lineage API to retrieve complete data lineage records.

Plugins that support data lineage in Dataplex

Cloud Data Fusion and Dataplex support asset-level lineage for the following plugins:

  • Amazon S3
  • BigQuery
  • Cloud Spanner
  • Cloud Storage
  • Cloud SQL for MySQL
  • Cloud SQL for PostgreSQL
  • Dataplex
  • FTP
  • Generic Database
  • MSSQL/SQL Server
  • MySQL
  • Oracle
  • PostgreSQL
  • SAP OData
  • SAP ODP
  • SAP Table

For more information, see Cloud Data Fusion plugins.
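
Because lineage is only supported for the plugins listed above, it can help to check a pipeline's plugins against that list before you run it. The following is a minimal sketch, assuming you already have the display names of the plugins in your pipeline. The function name `unsupported_plugins` and the sample pipeline are hypothetical and are not part of any Cloud Data Fusion API.

```python
# Display names copied from the supported-plugin list on this page.
SUPPORTED_PLUGINS = frozenset({
    "Amazon S3", "BigQuery", "Cloud Spanner", "Cloud Storage",
    "Cloud SQL for MySQL", "Cloud SQL for PostgreSQL", "Dataplex", "FTP",
    "Generic Database", "MSSQL/SQL Server", "MySQL", "Oracle", "PostgreSQL",
    "SAP OData", "SAP ODP", "SAP Table",
})


def unsupported_plugins(pipeline_plugins):
    """Return, in sorted order, the plugin names that are not on the list."""
    return sorted(set(pipeline_plugins) - SUPPORTED_PLUGINS)


if __name__ == "__main__":
    # Hypothetical pipeline: two listed plugins and one that is not listed.
    print(unsupported_plugins(["BigQuery", "Cloud Storage", "Kafka"]))
```

An empty result means every plugin name you passed in appears in the list above. The comparison is an exact string match, so the names must be spelled as they are on this page.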

Before you begin

To enable viewing Cloud Data Fusion lineage graphs on the Dataplex page in the console, do the following:

  1. Create a data pipeline that uses only the supported plugins.

  2. Enable the Data Lineage API in the project that contains your Cloud Data Fusion instance.

  3. Grant the Data Lineage Events Producer (roles/datalineage.producer) role to the Cloud Data Fusion-managed service account. For more information, see Data Catalog's predefined lineage roles.
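
Steps 2 and 3 can also be done from the command line. The following sketch only builds the corresponding `gcloud` commands and prints them; it does not run anything. The project ID `my-project` and the `SERVICE_ACCOUNT_EMAIL` placeholder are assumptions for illustration, so substitute your own project and the email of the service account that Cloud Data Fusion manages for your instance. The helper name `setup_commands` is hypothetical.

```python
import shlex


def setup_commands(project_id, service_account):
    """Build the gcloud commands for steps 2 and 3 without running them."""
    # Step 2: enable the Data Lineage API in the project.
    enable_api = [
        "gcloud", "services", "enable", "datalineage.googleapis.com",
        f"--project={project_id}",
    ]
    # Step 3: grant the Data Lineage Events Producer role to the service account.
    grant_role = [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member=serviceAccount:{service_account}",
        "--role=roles/datalineage.producer",
    ]
    return [enable_api, grant_role]


if __name__ == "__main__":
    # Placeholder values; replace them before running the printed commands.
    for command in setup_commands("my-project", "SERVICE_ACCOUNT_EMAIL"):
        print(shlex.join(command))
```

Printing the commands rather than executing them lets you review the project and member values before you apply any IAM change.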

When lineage is available

Viewing lineage in Dataplex has the following limitations:

View data lineage graphs

To view lineage graphs for entities across all Google Cloud services, do the following:

  1. Go to your instance in Cloud Data Fusion and run a data pipeline that uses supported plugins.

  2. On the Dataplex page in the console, find the asset for which you want to view lineage information, and then view its lineage graph.
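
As an alternative to the console, lineage records can be retrieved through the Data Lineage API. The following sketch builds, but does not send, a request for the API's `searchLinks` REST method, which returns the lineage links that have a given asset as their source or target. The project ID, location, and BigQuery table name are placeholder assumptions, and the helper name `build_search_links_request` is hypothetical. A real call also needs an OAuth 2.0 access token in the `Authorization` header, which this sketch leaves out.

```python
import json

LINEAGE_ENDPOINT = "https://datalineage.googleapis.com/v1"


def build_search_links_request(project_id, location, fully_qualified_name,
                               direction="target"):
    """Return (url, json_body) for a searchLinks call.

    direction="target" asks for links that lead into the asset (upstream);
    direction="source" asks for links that start from it (downstream).
    """
    if direction not in ("source", "target"):
        raise ValueError("direction must be 'source' or 'target'")
    url = (f"{LINEAGE_ENDPOINT}/projects/{project_id}"
           f"/locations/{location}:searchLinks")
    body = {direction: {"fullyQualifiedName": fully_qualified_name}}
    return url, json.dumps(body)


if __name__ == "__main__":
    # Placeholder asset: a BigQuery table written by a pipeline sink.
    url, body = build_search_links_request(
        "my-project", "us", "bigquery:my-project.my_dataset.my_table")
    print("POST", url)
    print(body)
```

The request is a POST to the printed URL with the printed JSON as its body. Keeping request construction separate from sending makes it easy to inspect the payload before you attach credentials.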

What's next