VPC Service Controls perimeters and Data Catalog

VPC Service Controls can help your organization mitigate data exfiltration risks from Google-managed services like Cloud Storage and BigQuery. This page shows how Data Catalog interacts with resources inside a VPC Service Controls service perimeter.

The examples below use BigQuery to demonstrate how Data Catalog interacts with perimeters. However, Data Catalog respects perimeters around all Google storage systems in the same way, including Cloud Storage and Pub/Sub.

Example

To understand how Data Catalog interacts with perimeters, consider the diagram below. It shows two Google Cloud projects: Project A and Project B. A service perimeter has been established around Project A, and the BigQuery service is protected by the perimeter. Project B is not inside the perimeter. The user has not been granted access to the perimeter through a whitelisted IP address or a user identity.

Data Catalog with a perimeter around BigQuery

The result of this configuration is that:

  • Data Catalog continues to sync BigQuery metadata from both projects.
  • The user can access data and metadata for Project B from BigQuery, and search or tag its metadata with Data Catalog.
  • The user cannot access Project A data in BigQuery, as they are blocked by the perimeter. The user also cannot search or tag its metadata with Data Catalog.

Note that the Data Catalog service has not been added to the perimeter. Rather, Data Catalog respects the existing perimeter around Project A and BigQuery.
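
The outcome above can be read as a simple decision rule. The following Python sketch is only an illustrative model of that rule and does not call any Google Cloud API. The `Perimeter` class, the `can_use_bigquery_metadata` function, and the project and user labels are all hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Perimeter:
    """Hypothetical model of a service perimeter (not a real API object)."""
    projects: frozenset
    restricted_services: frozenset
    allowed_users: frozenset = frozenset()


def can_use_bigquery_metadata(user, project, perimeters):
    """Return True if the user may search or tag the project's BigQuery
    metadata through Data Catalog.

    Models the rule described above: Data Catalog honors any perimeter
    that protects BigQuery for the project, even though Data Catalog
    itself is not a restricted service in that perimeter.
    """
    for p in perimeters:
        protects_bq = project in p.projects and "bigquery" in p.restricted_services
        if protects_bq and user not in p.allowed_users:
            return False
    return True


# Perimeter around Project A that protects BigQuery; the user has no access.
perimeter_a = Perimeter(
    projects=frozenset({"project-a"}),
    restricted_services=frozenset({"bigquery"}),
)

print(can_use_bigquery_metadata("user", "project-a", [perimeter_a]))  # False
print(can_use_bigquery_metadata("user", "project-b", [perimeter_a]))  # True
```

In this model the user is blocked for Project A and allowed for Project B, matching the list above.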

Custom integrated assets

Data Catalog can integrate assets from other clouds and on-premises data sources. These are called custom integrated assets. If Data Catalog has not been added to the VPC Service Controls perimeter, users can still access custom integrated assets, even in projects that are inside perimeters where the users have not been whitelisted.

In the example below, custom integrated assets have been added to both Project A and Project B from the first example. The user in this example still does not have perimeter access.

Data Catalog with custom integrated assets

The result of this configuration is that:

  • The user can access data and metadata for Project B from BigQuery, and search or tag its metadata with Data Catalog.
  • The user cannot access Project A data or metadata from BigQuery, as they are blocked by the perimeter. They also cannot search or tag its metadata with Data Catalog.
  • The user can use Data Catalog to search or tag metadata for the custom integrated assets in both Project A and Project B.

Limiting access to custom integrated assets

You can limit access to custom integrated assets by using a service perimeter to protect the Data Catalog API. The example below expands on the second example by adding a perimeter around the Data Catalog service for Project B:

Data Catalog with custom integrated assets and a perimeter

The result of this configuration is that:

  • Data Catalog has not been added to the perimeter for Project A, so the user can search or tag metadata for the custom integrated assets in Project A.
  • Data Catalog has been added to the perimeter for Project B, so the user cannot search or tag metadata for the custom integrated assets in Project B.
  • As in the first example, the user cannot access Project A data or metadata from BigQuery, as they are blocked by the perimeter. They also cannot search or tag Project A's BigQuery metadata with Data Catalog.
  • Even though a service perimeter has been established for Project B, the BigQuery service hasn't been added to it. This means the user can access Project B data and metadata from BigQuery, and search or tag Project B's BigQuery metadata with Data Catalog.
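
Taken together, the second and third examples suggest that each kind of Data Catalog entry follows the perimeter of one governing service. The Python sketch below encodes that reading as a lookup table. It is an illustrative model inferred from the examples on this page, not an API call, and it only covers a user who has no perimeter access. The names `GOVERNING_SERVICE`, `can_search_or_tag`, and the project labels are hypothetical.

```python
# Which service's perimeter governs each kind of Data Catalog entry,
# as inferred from the examples on this page (an assumption of this sketch).
GOVERNING_SERVICE = {
    "bigquery": "bigquery",   # synced BigQuery metadata follows BigQuery
    "custom": "datacatalog",  # custom integrated assets follow Data Catalog
}


def can_search_or_tag(project, asset_kind, restricted):
    """Return True if a user WITHOUT perimeter access can search or tag
    the given kind of entry in the given project.

    restricted maps a project name to the set of services that a
    perimeter protects for that project; unlisted projects are unprotected.
    """
    return GOVERNING_SERVICE[asset_kind] not in restricted.get(project, set())


# Second example: only BigQuery in Project A is protected.
second_example = {"project-a": {"bigquery"}}
# Third example: Data Catalog in Project B is protected as well.
third_example = {"project-a": {"bigquery"}, "project-b": {"datacatalog"}}

for name, config in [("second", second_example), ("third", third_example)]:
    for project in ("project-a", "project-b"):
        for kind in ("bigquery", "custom"):
            allowed = can_search_or_tag(project, kind, config)
            print(f"{name} example, {project}, {kind}: {allowed}")
```

Under this model, the third configuration leaves only the custom integrated assets in Project A and the BigQuery metadata in Project B reachable, which matches the list above.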