VPC Service Controls perimeters and Data Catalog

VPC Service Controls can help your organization mitigate data exfiltration risks from Google-managed services like Cloud Storage and BigQuery. This page shows how Data Catalog interacts with resources inside a VPC Service Controls service perimeter.

The examples in this document use BigQuery to demonstrate how Data Catalog interacts with perimeters. However, Data Catalog respects perimeters around all Google storage systems in the same way, including Cloud Storage and Pub/Sub.

Example

To understand how Data Catalog interacts with perimeters, consider the following diagram.

In the diagram, there are two Google Cloud projects: Project A and Project B. A service perimeter is established around Project A, and the BigQuery service is protected by the perimeter. The user hasn't been granted access to the perimeter, either through an allowlisted IP address or through a user identity. Project B isn't inside the perimeter.

[Diagram: Due to the VPC perimeter around Project A, the user accesses only Project B metadata through Data Catalog.]
Figure 1. User has Data Catalog access to BigQuery Project B but not Project A.

The following is the result of this configuration:

  • Data Catalog continues to sync BigQuery metadata from both projects.
  • The user can access data and metadata for Project B from BigQuery, and search or tag its metadata with Data Catalog.
  • The user can't access Project A data in BigQuery because they're blocked by the perimeter. They also can't search or tag Project A metadata with Data Catalog.

Custom integrated assets

Data Catalog can integrate assets from other clouds and on-premises data sources. These are called custom integrated assets. If Data Catalog isn't added to the VPC Service Controls perimeter, users can still access custom integrated assets, even in projects inside perimeters that they haven't been granted access to.

In the following example, custom integrated assets have been added to both Project A and Project B from the first example. The user in this example still doesn't have perimeter access.

[Diagram: Due to the VPC perimeter around Project A, the user accesses only Project B and custom integrated data in Projects A and B.]
Figure 2. The user has Data Catalog access to BigQuery Project B and custom integrated metadata in Projects A and B.

The following is the result of this configuration:

  • The user can access data and metadata for Project B from BigQuery, and search or tag its metadata with Data Catalog.
  • The user can't access Project A data or metadata from BigQuery because they're blocked by the perimeter. They also can't search or tag its metadata with Data Catalog.
  • The user can use Data Catalog to search or tag metadata for the custom integrated assets in both Project A and Project B.

Limiting access to custom integrated assets

You can limit access to custom integrated assets by using a service perimeter to protect the Data Catalog API. The following example expands on the second example by adding a perimeter around the Data Catalog service for Project B:

[Diagram: Due to the VPC perimeter around Project A and custom integrated data in Project B, the user accesses only Project B and custom data in Project A.]
Figure 3. User has Data Catalog access to Project B, and custom integrated metadata in Project A.

The following is the result of this configuration:

  • Data Catalog isn't added to the perimeter for Project A, so the user can search or tag metadata for the custom integrated assets in Project A.
  • Data Catalog is added to the perimeter for Project B, so the user can't search or tag metadata for the custom integrated assets in Project B.
  • As in the first example, the user can't access Project A data or metadata from BigQuery because they're blocked by the perimeter. They also can't search or tag Project A BigQuery metadata with Data Catalog.
  • Even though a service perimeter is established for Project B, the BigQuery service isn't added to it. This means that the user can access Project B data or metadata from BigQuery, and search or tag Project B BigQuery metadata with Data Catalog.
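
The outcomes of the three examples above can be summarized with a small Python sketch. This is a toy model for reasoning about the examples, not the VPC Service Controls or Data Catalog API: the `Project` class and the `can_search_or_tag` function are hypothetical names introduced here. The model assumes that custom integrated assets are governed by the Data Catalog service itself, which matches the behavior described in the examples.

```python
from dataclasses import dataclass

# Service names as they would appear in a perimeter's restricted services list.
BIGQUERY = "bigquery.googleapis.com"
DATA_CATALOG = "datacatalog.googleapis.com"


@dataclass(frozen=True)
class Project:
    """Hypothetical stand-in for a project and its perimeter configuration."""
    name: str
    # Services protected by the project's perimeter.
    # An empty set models a project with no perimeter.
    restricted_services: frozenset = frozenset()


def can_search_or_tag(project, asset_service, user_has_perimeter_access=False):
    """Return True if the user can search or tag the asset's metadata.

    asset_service is the service that governs the asset: BIGQUERY for
    BigQuery metadata, DATA_CATALOG for custom integrated assets
    (an assumption of this sketch).
    """
    if user_has_perimeter_access:
        return True
    # A user outside the perimeter is blocked only when the governing
    # service is protected by the project's perimeter.
    return asset_service not in project.restricted_services


# Configuration from the third example.
project_a = Project("Project A", frozenset({BIGQUERY}))
project_b = Project("Project B", frozenset({DATA_CATALOG}))

for project in (project_a, project_b):
    for service in (BIGQUERY, DATA_CATALOG):
        allowed = can_search_or_tag(project, service)
        print(f"{project.name}: {service}: {'allowed' if allowed else 'blocked'}")
```

Passing an empty `restricted_services` set for Project B reproduces the first and second examples, where Project B has no perimeter at all.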

Data lineage support

Data lineage is supported by the restricted virtual IP (VIP). For more information, see Services supported by the restricted VIP.