VPC Service Controls perimeters and Data Catalog

VPC Service Controls can help your organization mitigate data exfiltration risks from Google-managed services like Cloud Storage and BigQuery. This page shows how Data Catalog interacts with resources inside a VPC Service Controls service perimeter.

The examples below use BigQuery to demonstrate how Data Catalog interacts with perimeters. However, Data Catalog respects perimeters around all Google storage systems in the same way, including Cloud Storage and Pub/Sub.

Example

To understand how Data Catalog interacts with perimeters, consider the diagram below. In the diagram, there are two Google Cloud projects: Project A and Project B. A service perimeter has been established around Project A, and the BigQuery service is protected by the perimeter. The user has not been granted access to the perimeter through an allowlisted IP or a user identity. Project B is not inside the perimeter.

Due to the VPC perimeter around Project A, the user accesses only Project B metadata through Data Catalog.
Figure 1. The user has Data Catalog access to BigQuery metadata in Project B but not in Project A.

The result of this configuration is that:

  • Data Catalog continues to sync BigQuery metadata from both projects.
  • The user can access data and metadata for Project B from BigQuery, and search or tag its metadata with Data Catalog.
  • The user cannot access Project A data or metadata from BigQuery, because the perimeter blocks them. The user also cannot search or tag Project A metadata with Data Catalog.

Note that the Data Catalog service has not been added to the perimeter. Rather, Data Catalog respects the existing perimeter around Project A and BigQuery.
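
The outcome of this first example can be expressed as a small decision rule. The sketch below is only an illustrative model of the behavior described above, not Data Catalog or VPC Service Controls code: the `Project` class and the `can_search_bigquery_metadata` function are hypothetical names, and the model deliberately ignores IAM permissions, which still apply in practice.

```python
from dataclasses import dataclass

BIGQUERY = "bigquery.googleapis.com"


@dataclass
class Project:
    """Hypothetical stand-in for a Google Cloud project in the example."""
    name: str
    # Services restricted by the perimeter around this project (empty if none).
    restricted_services: frozenset = frozenset()


def can_search_bigquery_metadata(project, user_has_perimeter_access):
    """Model of whether a user can search or tag BigQuery metadata in Data Catalog.

    Assumption: Data Catalog honors the perimeter around BigQuery even though
    Data Catalog itself is not listed as a restricted service.
    """
    if BIGQUERY not in project.restricted_services:
        return True  # no perimeter protects BigQuery for this project
    return user_has_perimeter_access


project_a = Project("Project A", frozenset({BIGQUERY}))
project_b = Project("Project B")

for project in (project_a, project_b):
    allowed = can_search_bigquery_metadata(project, user_has_perimeter_access=False)
    print(f"{project.name}: search or tag allowed = {allowed}")
```

In this model, granting the user perimeter access (for example through an allowlisted IP or user identity) is the only change that would make Project A metadata visible.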

Custom integrated assets

Data Catalog is capable of integrating assets from other clouds and on-premises data sources. These are called custom integrated assets. If Data Catalog has not been added to the VPC Service Controls perimeter, users can still access custom integrated assets, even for projects in perimeters where they have not been allowlisted.

In the example below, custom integrated assets have been added to both Project A and Project B from the first example. The user in this example still does not have perimeter access.

Due to the VPC perimeter around Project A, the user accesses only Project B and custom integrated data in Projects A and B.
Figure 2. The user has Data Catalog access to BigQuery Project B and custom integrated metadata in Projects A and B.

The result of this configuration is that:

  • The user can access data and metadata for Project B from BigQuery, and search or tag its metadata with Data Catalog.
  • The user cannot access Project A data or metadata from BigQuery, as they are blocked by the perimeter. They also cannot search or tag its metadata with Data Catalog.
  • The user can use Data Catalog to search or tag metadata for the custom integrated assets in both Project A and Project B.
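
One way to read these results is that each kind of entry is governed by a different restricted service: BigQuery entries by the perimeter around BigQuery, and custom integrated assets by a perimeter around Data Catalog itself. The sketch below encodes that reading as a hypothetical model. The function names and the `"bigquery"` and `"custom"` labels are illustrative assumptions, and IAM checks are left out.

```python
BIGQUERY = "bigquery.googleapis.com"
DATA_CATALOG = "datacatalog.googleapis.com"


def guarding_service(entry_kind):
    """Service whose perimeter governs Data Catalog access to this kind of entry.

    Assumption drawn from the examples: BigQuery entries follow the BigQuery
    perimeter; custom integrated assets follow the Data Catalog perimeter.
    """
    return BIGQUERY if entry_kind == "bigquery" else DATA_CATALOG


def can_search_or_tag(entry_kind, restricted_services, user_has_perimeter_access=False):
    """Model of whether the user can search or tag an entry through Data Catalog."""
    if guarding_service(entry_kind) in restricted_services:
        return user_has_perimeter_access
    return True


# Restricted services per project in this example: only BigQuery in Project A.
restricted = {"Project A": {BIGQUERY}, "Project B": set()}

results = {
    (project, kind): can_search_or_tag(kind, services)
    for project, services in restricted.items()
    for kind in ("bigquery", "custom")
}

for (project, kind), allowed in results.items():
    print(f"{project} / {kind}: {allowed}")
```

Under this model, only the BigQuery metadata in Project A is blocked. Adding `DATA_CATALOG` to a project's restricted set would also block that project's custom integrated assets for users without perimeter access.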

Limiting access to custom integrated assets

You can limit access to custom integrated assets by using a service perimeter to protect the Data Catalog API. The example below expands on the second example by adding a perimeter around the Data Catalog service for Project B:

Due to the VPC perimeter around Project A and custom integrated data in Project B, the user accesses only Project B and custom data in Project A.
Figure 3. The user has Data Catalog access to BigQuery metadata in Project B and custom integrated metadata in Project A.

The result of this configuration is that:

  • Data Catalog has not been added to the perimeter for Project A, so the user can search or tag metadata for the custom integrated assets in Project A.
  • Data Catalog has been added to the perimeter for Project B, so the user cannot search or tag metadata for the custom integrated assets in Project B.
  • As in the first example, the user cannot access Project A data or metadata from BigQuery, because the perimeter blocks them. They also cannot search or tag Project A's BigQuery metadata with Data Catalog.
  • Even though a service perimeter has been established for Project B, the BigQuery service hasn't been added to it. This means the user can access Project B data and metadata from BigQuery, and search or tag Project B's BigQuery metadata with Data Catalog.
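
For readers who define perimeters programmatically, the two perimeters in this example might be described along the following lines. This is a sketch only: it builds dictionaries shaped like the Access Context Manager `ServicePerimeter` resource (`title`, `perimeterType`, `status.resources`, `status.restrictedServices`) and sends nothing to any API. The `perimeter_body` helper, the perimeter titles, and the project numbers are hypothetical placeholders.

```python
import json


def perimeter_body(title, project_numbers, restricted_services):
    """Build a dict shaped like a ServicePerimeter resource (illustrative helper)."""
    return {
        "title": title,
        "perimeterType": "PERIMETER_TYPE_REGULAR",
        "status": {
            # Projects are referenced by project number, not project ID.
            "resources": [f"projects/{number}" for number in project_numbers],
            "restrictedServices": sorted(restricted_services),
        },
    }


# Placeholder project numbers; substitute real ones for an actual configuration.
project_a = perimeter_body(
    "project_a_perimeter", ["111111111111"], {"bigquery.googleapis.com"}
)
project_b = perimeter_body(
    "project_b_perimeter", ["222222222222"], {"datacatalog.googleapis.com"}
)

print(json.dumps(project_a, indent=2))
print(json.dumps(project_b, indent=2))
```

The difference between the two bodies is the restricted service: Project A restricts BigQuery, which blocks its BigQuery data and metadata, while Project B restricts Data Catalog, which blocks its custom integrated assets.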

Data lineage support

Data lineage is supported by the restricted virtual IP (VIP). For more information, see Services supported by the restricted VIP.