Troubleshoot data lineage issues

This page shows you how to resolve issues with Data Catalog data lineage.

Project types

Because data assets can reside in different projects, this section summarizes the project types involved in data lineage and where to find the name of each one.

BigQuery storage project

This project stores your BigQuery data assets. You can find its name in the asset details: it is the part of the Table ID before the first dot.

In the BigQuery UI, the storage project name is shown in the Table ID field, before the first dot in the fully qualified table name.
Figure 1. The name of a BigQuery storage project.
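As a minimal sketch of the rule above, the storage project can be read from a fully qualified table ID by taking everything before the first dot. The helper name storage_project is hypothetical, and the sketch assumes the project ID itself contains no dots (a domain-scoped project ID would need extra handling).

```python
def storage_project(table_id: str) -> str:
    """Return the storage project: the text before the first dot of a Table ID."""
    project, sep, rest = table_id.partition(".")
    # A fully qualified table ID needs a project part and something after it.
    if not sep or not project or not rest:
        raise ValueError(f"not a fully qualified table ID: {table_id!r}")
    return project


print(storage_project("docs-source.dataset.source-001"))
```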

Compute project

This project stores the data lineage metadata. For BigQuery, this is where you run a job. If you run a job from the UI, you can find the compute project name in the project selector:

The BigQuery UI shows a compute project called docs-compute on the page where you run SQL queries.
Figure 2. The name of a compute project that runs BigQuery jobs.

When sending requests to the BigQuery API, specify the compute project in the URL, for example:

POST /bigquery/v2/projects/docs-compute/jobs HTTP/1.1
Host: bigquery.googleapis.com
User-Agent: Go-http-client/1.1
Authorization: <REDACTED 1031 BYTES>
Accept-Encoding: gzip

{
  "configuration": {
    "query": {
      "useLegacySql": false,
      "query": "CREATE OR REPLACE TABLE `docs-target.dataset.target-002` AS SELECT * FROM `docs-source.dataset.source-002`;"
    }
  },
  "jobReference": {
    "projectId": "docs-compute",
    "jobId": "docs-compute-job-id",
    "location": "us"
  }
}
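The request above can be assembled programmatically. The following sketch only builds the URL path and JSON body and does not send anything. The function name build_query_job_request is hypothetical, and the field layout simply mirrors the example request. Note that the compute project appears both in the path and in jobReference.projectId.

```python
import json


def build_query_job_request(compute_project: str, job_id: str,
                            location: str, sql: str):
    """Build the URL path and JSON body for a query job in the compute project."""
    path = f"/bigquery/v2/projects/{compute_project}/jobs"
    body = {
        "configuration": {
            "query": {"useLegacySql": False, "query": sql},
        },
        "jobReference": {
            "projectId": compute_project,  # same project as in the path
            "jobId": job_id,
            "location": location,
        },
    }
    # json.dumps always emits valid JSON (no trailing commas).
    return path, json.dumps(body, indent=2)


path, body = build_query_job_request(
    "docs-compute",
    "docs-compute-job-id",
    "us",
    "CREATE OR REPLACE TABLE `docs-target.dataset.target-002` AS "
    "SELECT * FROM `docs-source.dataset.source-002`;",
)
print(path)
print(body)
```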

Active project

This is the project from which you are viewing the data lineage. The Google Cloud console shows the active project in the project selector. If you're using the API, the active project is the project from which you're making API calls.

The BigQuery UI shows the data lineage for a dataset called source-001, which is in a project called docs-source.
Figure 3. The active project in the Google Cloud console.

BigQuery data lineage not showing

This issue occurs when data lineage doesn't appear after you run a BigQuery job. There are three possible causes:

  • The Data Lineage API is disabled in the active project or the compute project.
  • You don't have the Data Lineage Viewer role (roles/datalineage.viewer) in the active project or the compute project.
  • The data lineage has not arrived yet. Depending on the volume and complexity of the data being processed, lineage can take anywhere from the standard 30 minutes up to 24 hours to display.

If you see the message "Fetching lineage failed due to missing permissions." at the bottom of the page, you are missing permissions on the active project. Otherwise, you are missing permissions on the compute project.

A screenshot that shows an empty lineage graph.
Figure 4. Example of lineage not showing in the BigQuery UI.

To resolve this issue, first check whether the Data Lineage API is enabled for the compute project. After you enable the API, you need to run a job to see the data lineage. Depending on the volume and complexity of the data being processed, lineage can take anywhere from the standard 30 minutes up to 24 hours to display.

Next, check whether the Data Lineage API is enabled for the active project. After you enable the API, you see lineage if you have the required permissions (see below).

Once the Data Lineage API is enabled, grant the Data Lineage Viewer role (roles/datalineage.viewer) in both the active project and the compute project.

BigQuery process metadata not showing

Problem description

The following issue occurs when you open the table details pane, which doesn't show all the details like the SQL statement or the Process type property. This happens even though the data lineage displays properly.

This can happen when you don't have permissions to see metadata in the compute project.

Example:

  • BigQuery source table: docs-source.dataset.source-001
  • BigQuery target table: docs-target.dataset.target-001
  • Data lineage between docs-source.dataset.source-001 and docs-target.dataset.target-001 in compute project docs-compute
  • You have the Data Lineage Viewer role for the active project and for the compute project (docs-compute).

Clicking the BigQuery process details displays the following message:

You don't have permission to view BigQuery process metadata in project X.

In the Google Cloud console:

In the BigQuery UI, on the Lineage tab, the Details pane shows an error message.
Figure 5. Example of BigQuery process details not showing in the BigQuery UI.

To resolve this issue, grant the user the bigquery.jobs.get permission in the compute project. This permission is included, for example, in the BigQuery Resource Viewer role.
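One way to grant such a role is the gcloud projects add-iam-policy-binding command. The sketch below only assembles the command line as a string and does not run it. The helper name grant_role_command and the member user:alice@example.com are hypothetical, and roles/bigquery.resourceViewer is assumed to be the role ID for BigQuery Resource Viewer, so verify it in your IAM role list before use.

```python
import shlex


def grant_role_command(project_id: str, member: str, role: str) -> str:
    """Build a gcloud command that binds a role to a member on a project."""
    args = [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member={member}",
        f"--role={role}",
    ]
    # shlex.join quotes any argument that the shell would otherwise split.
    return shlex.join(args)


print(grant_role_command(
    "docs-compute",                    # compute project from the example
    "user:alice@example.com",          # hypothetical member
    "roles/bigquery.resourceViewer",   # assumed role ID, verify before use
))
```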

BigQuery table details not showing

This issue occurs when you open the table details pane and it shows only the "Fully qualified name" property, even though the data lineage displays properly. This can happen when you don't have all the required permissions in the table's storage project.

Example:

  • BigQuery source table: docs-source.dataset.source-001
  • BigQuery target table: docs-target.dataset.target-001
  • Data lineage between docs-source.dataset.source-001 and docs-target.dataset.target-001 in compute project docs-compute
  • You have the Data Lineage Viewer role for the active project and for the compute project (docs-compute).

In this case, clicking the BigQuery node details can display the following message: "Entry with this fully qualified name is not available in the Data Catalog."

A screenshot that shows an empty table panel.
Figure 6. Example of BigQuery table details not showing in the BigQuery UI.

To resolve this issue, grant the user the bigquery.tables.get permission in the storage project. This permission is included, for example, in the BigQuery Data Viewer role.
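Because the tables in a lineage graph can live in different storage projects, it helps to list every project that needs the grant. The following sketch derives the storage projects from the table IDs and pairs each with an IAM-style binding. The function name storage_project_bindings and the member are hypothetical, and roles/bigquery.dataViewer is assumed to be the role ID for BigQuery Data Viewer. A project-level grant is broad, so a narrower dataset-level or table-level grant may be preferable.

```python
from typing import Dict, Iterable


def storage_project_bindings(table_ids: Iterable[str], member: str,
                             role: str = "roles/bigquery.dataViewer") -> Dict[str, dict]:
    """Map each distinct storage project to the binding it would need."""
    # The storage project is the part of the table ID before the first dot.
    projects = sorted({table_id.split(".", 1)[0] for table_id in table_ids})
    return {project: {"role": role, "members": [member]} for project in projects}


bindings = storage_project_bindings(
    ["docs-source.dataset.source-001", "docs-target.dataset.target-001"],
    "user:alice@example.com",  # hypothetical member
)
for project, binding in bindings.items():
    print(project, binding)
```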