Configure VPC Service Controls for Dataform

VPC Service Controls is a Google Cloud feature that lets you set up a perimeter that helps guard against data exfiltration. This guide shows how to use VPC Service Controls with Dataform to help make your services more secure.

VPC Service Controls provides an extra layer of defense for Google Cloud services that is independent of the protection provided by Identity and Access Management (IAM).

To learn more about VPC Service Controls, see Overview of VPC Service Controls.

Limitations

Dataform supports VPC Service Controls with the following limitations:

Security considerations

When you set up a VPC Service Controls perimeter for Dataform, you should review permissions granted to your Dataform service accounts and ensure that they match your security architecture.

Depending on the permissions that you grant to a Dataform service account, that service account might have access to BigQuery or Secret Manager data in the project that service account belongs to, regardless of VPC Service Controls. In such a case, restricting Dataform with a VPC Service Controls perimeter does not block communication with BigQuery or Secret Manager.

You should block communication with BigQuery if you don't need to execute any workflow invocations originating from your Dataform repositories. For more information about blocking communication with BigQuery, see Block communication with BigQuery.

You should block communication with Secret Manager if none of your Dataform repositories connect to a third-party Git repository. For more information about blocking communication with Secret Manager, see Block communication with Secret Manager.

Before you begin

Before you configure a VPC Service Controls service perimeter for Dataform, follow the Restrict remote repositories guide to set the dataform.restrictGitRemotes organization policy.

The dataform.restrictGitRemotes organization policy is required to ensure that VPC Service Controls checks are enforced when using Dataform and that third-party access to Dataform Git repositories is restricted.
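
As an illustration of what setting this policy can look like, the following minimal sketch builds a policy body in the Organization Policy v2 format. The project ID, the example remote URL, and the helper name restrict_git_remotes_policy are hypothetical, and the exact values accepted by the constraint are defined in the Restrict remote repositories guide, so treat this as a sketch under those assumptions rather than the documented procedure.

```python
import json


def restrict_git_remotes_policy(project_id, allowed_remotes=None):
    """Build a policy body for the dataform.restrictGitRemotes constraint.

    Hypothetical helper: with no allowed remotes it denies every remote,
    otherwise it allows only the listed remote URLs.
    """
    if allowed_remotes:
        rule = {"values": {"allowedValues": list(allowed_remotes)}}
    else:
        rule = {"denyAll": True}
    return {
        "name": f"projects/{project_id}/policies/dataform.restrictGitRemotes",
        "spec": {"rules": [rule]},
    }


# "my-project" and the remote URL below are placeholder values.
deny_all_policy = restrict_git_remotes_policy("my-project")
allow_list_policy = restrict_git_remotes_policy(
    "my-project", ["https://github.com/example-org/example-repo.git"]
)
print(json.dumps(deny_all_policy, indent=2))
```

You could save the printed JSON to a file and apply it with a command such as gcloud org-policies set-policy, after checking the result against the guide.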

Required roles

To get the permissions that you need to configure a VPC Service Controls service perimeter, ask your administrator to grant you the Access Context Manager Editor (roles/accesscontextmanager.policyEditor) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

For more information about VPC Service Controls permissions, see Access control with IAM.

Configure VPC Service Controls

You can restrict Dataform with a VPC Service Controls service perimeter in the following ways:

  • Add Dataform to an existing service perimeter that restricts BigQuery.
  • Create a service perimeter that restricts both Dataform and BigQuery.

To add Dataform to a service perimeter that restricts BigQuery, follow the Update a service perimeter guide in the VPC Service Controls documentation.

To create a new service perimeter that restricts both Dataform and BigQuery, follow the Create a service perimeter guide in the VPC Service Controls documentation.
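
Both options can also be expressed as gcloud access-context-manager commands. The following sketch only assembles the argument lists so that you can review them before running anything. The perimeter name, title, project number, and access policy ID are placeholders, and the helper names are hypothetical.

```python
import shlex

DATAFORM_API = "dataform.googleapis.com"
BIGQUERY_API = "bigquery.googleapis.com"


def update_perimeter_command(perimeter, policy_id):
    """Arguments that add Dataform to an existing service perimeter."""
    return [
        "gcloud", "access-context-manager", "perimeters", "update", perimeter,
        f"--add-restricted-services={DATAFORM_API}",
        f"--policy={policy_id}",
    ]


def create_perimeter_command(perimeter, title, project_number, policy_id):
    """Arguments that create a perimeter restricting Dataform and BigQuery."""
    return [
        "gcloud", "access-context-manager", "perimeters", "create", perimeter,
        f"--title={title}",
        f"--resources=projects/{project_number}",
        f"--restricted-services={DATAFORM_API},{BIGQUERY_API}",
        f"--policy={policy_id}",
    ]


# All values below are placeholders for illustration.
update_cmd = update_perimeter_command("example_perimeter", "1234567890")
create_cmd = create_perimeter_command(
    "example_perimeter", "Example perimeter", "111111111111", "1234567890"
)
print(shlex.join(update_cmd))
print(shlex.join(create_cmd))
```

Printing the commands instead of executing them keeps the sketch free of side effects. To run one, you could pass the list to subprocess.run in an environment where gcloud is installed and authenticated.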

Optional: Block communication with BigQuery

The way Dataform communicates with BigQuery depends on the type of service account used in Dataform.

The default Dataform service account uses the bigquery.jobs.create permission to communicate with BigQuery. You grant the default Dataform service account roles that contain this permission when you grant the roles that are required for Dataform to run SQL workflows in BigQuery.

To block communication between the default Dataform service account and BigQuery, revoke every predefined and custom role granted to the default Dataform service account that contains the bigquery.jobs.create permission. To revoke roles, follow the Manage access to projects, folders, and organizations guide.
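
The revocation described above amounts to removing one member from specific role bindings in the project's IAM policy. The following sketch shows that transformation on a policy in the JSON shape returned by gcloud projects get-iam-policy. The service account email, the sample policy, and the helper name revoke_roles are assumptions for illustration, and the role set is a non-exhaustive example: add any custom roles in your project that contain bigquery.jobs.create.

```python
# Example roles that contain bigquery.jobs.create (not an exhaustive list).
ROLES_TO_REVOKE = {
    "roles/bigquery.admin",
    "roles/bigquery.jobUser",
    "roles/bigquery.user",
}


def revoke_roles(policy, member, roles):
    """Return a copy of an IAM policy with member removed from the given roles."""
    new_bindings = []
    for binding in policy.get("bindings", []):
        members = list(binding.get("members", []))
        if binding.get("role") in roles:
            members = [m for m in members if m != member]
        if members:  # drop bindings that no longer have any members
            new_bindings.append({**binding, "members": members})
    return {**policy, "bindings": new_bindings}


# Placeholder member and policy, assumed for this example.
dataform_sa = "serviceAccount:service-111111111111@gcp-sa-dataform.iam.gserviceaccount.com"
sample_policy = {
    "bindings": [
        {"role": "roles/bigquery.jobUser", "members": [dataform_sa, "user:analyst@example.com"]},
        {"role": "roles/bigquery.user", "members": [dataform_sa]},
        {"role": "roles/bigquery.dataViewer", "members": [dataform_sa]},
    ],
    "etag": "BwXexample",
}
updated_policy = revoke_roles(sample_policy, dataform_sa, ROLES_TO_REVOKE)
```

The function returns a new policy and leaves the input unchanged, so you can compare the two before writing the result back, for example with gcloud projects set-iam-policy.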

Custom Dataform service accounts use the following permissions and roles to communicate with BigQuery:

  • The bigquery.jobs.create permission, granted to the custom service account.
  • The Service Account Token Creator (roles/iam.serviceAccountTokenCreator) role, granted to the default Dataform service account on the custom service account.

You can block communication between a custom Dataform service account and BigQuery in either of the following ways:

  • Revoke the Service Account Token Creator (roles/iam.serviceAccountTokenCreator) role granted to the default Dataform service account on the selected custom Dataform service account. To revoke the role, follow the Manage access to service accounts guide.

  • Revoke all predefined and custom roles granted at the project level to the custom service account that contain the bigquery.jobs.create permission. To revoke roles, follow the Manage access to projects, folders, and organizations guide.
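
The first option above maps to a single gcloud iam service-accounts remove-iam-policy-binding command. The following sketch assembles that command without running it. Both service account emails are placeholders, and the helper name is hypothetical.

```python
import shlex


def revoke_token_creator_command(custom_sa_email, default_sa_email):
    """Arguments that remove the Token Creator binding from a custom service account."""
    return [
        "gcloud", "iam", "service-accounts", "remove-iam-policy-binding",
        custom_sa_email,
        f"--member=serviceAccount:{default_sa_email}",
        "--role=roles/iam.serviceAccountTokenCreator",
    ]


# Placeholder emails, assumed for this example.
token_cmd = revoke_token_creator_command(
    "custom-dataform@my-project.iam.gserviceaccount.com",
    "service-111111111111@gcp-sa-dataform.iam.gserviceaccount.com",
)
print(shlex.join(token_cmd))
```

This option leaves the custom service account's own BigQuery roles in place, so other workloads that use that account are unaffected.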

The bigquery.jobs.create permission is included in the following predefined BigQuery IAM roles that must be revoked:

Optional: Block communication with Secret Manager

Dataform uses the secretmanager.versions.access permission to access individual Secret Manager secrets. You give this permission to the default Dataform service account on a selected Secret Manager secret when you connect a Dataform repository to a third-party repository.

To block communication between Dataform and Secret Manager, you need to revoke access to all secrets from the default Dataform service account.

To revoke access to a Secret Manager secret from the default Dataform service account, follow the Manage access to secrets guide in the Secret Manager documentation. On the selected secret, revoke every predefined and custom role granted to the default Dataform service account that contains the secretmanager.versions.access permission.
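
Because access is granted per secret, the revocation has to be repeated for each secret that Dataform can read. The following sketch generates one gcloud secrets remove-iam-policy-binding command per secret. The secret names, project ID, and service account email are placeholders, the helper name is hypothetical, and the default role is only one example of a role that contains secretmanager.versions.access.

```python
import shlex


def revoke_secret_access_command(secret_id, project_id, default_sa_email,
                                 role="roles/secretmanager.secretAccessor"):
    """Arguments that remove one role binding for the service account on one secret."""
    return [
        "gcloud", "secrets", "remove-iam-policy-binding", secret_id,
        f"--project={project_id}",
        f"--member=serviceAccount:{default_sa_email}",
        f"--role={role}",
    ]


# Placeholder values, assumed for this example.
secret_ids = ["git-token-repo-a", "git-token-repo-b"]
secret_cmds = [
    revoke_secret_access_command(
        secret_id,
        "my-project",
        "service-111111111111@gcp-sa-dataform.iam.gserviceaccount.com",
    )
    for secret_id in secret_ids
]
for cmd in secret_cmds:
    print(shlex.join(cmd))
```

If the service account holds a different or custom role on a secret, pass that role name through the role parameter instead of relying on the default.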

The secretmanager.versions.access permission is included in the following predefined Secret Manager IAM roles:

What's next