VPC Service Controls is a Google Cloud feature that lets you set up a perimeter that helps guard against data exfiltration. This guide shows how to use VPC Service Controls with Dataform to help make your services more secure.
VPC Service Controls provides an extra layer of defense for Google Cloud services that is independent of the protection provided by Identity and Access Management (IAM).
To learn more about VPC Service Controls, see Overview of VPC Service Controls.
Limitations
Dataform supports VPC Service Controls with the following limitations:
- You must set the dataform.restrictGitRemotes organization policy.
- Dataform and BigQuery must be restricted by the same VPC Service Controls service perimeter.
To allow specific users to authenticate with their Google Account user credentials when scheduling runs, manually triggering runs, or running pipelines with VPC Service Controls configured, you need to add their user identities to your ingress rules. For more information, see Updating ingress and egress policies for a service perimeter and Ingress rules reference.
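As a rough illustration of such an ingress rule, the following Python sketch builds a rule that admits named users to Dataform and prints it as JSON. The field names (ingressFrom, identities, sources, ingressTo, operations, methodSelectors, resources) follow the ingress rule structure as understood here and should be checked against the Ingress rules reference. The helper build_ingress_rule, the example email address, and the wildcard values are assumptions made for illustration only.

```python
import json


def build_ingress_rule(user_emails, service_name="dataform.googleapis.com"):
    """Return one ingress rule (as a dict) that admits the given users."""
    if not user_emails:
        raise ValueError("at least one user email is required")
    return {
        "ingressFrom": {
            # Identities are written as IAM principals, for example user:EMAIL.
            "identities": [f"user:{email}" for email in user_emails],
            # Assumption: requests from any network origin are accepted.
            "sources": [{"accessLevel": "*"}],
        },
        "ingressTo": {
            "operations": [
                {
                    "serviceName": service_name,
                    # Assumption: every method of the service is allowed.
                    "methodSelectors": [{"method": "*"}],
                }
            ],
            # Assumption: all projects inside the perimeter are reachable.
            "resources": ["*"],
        },
    }


# Hypothetical user; replace with the identities you want to admit.
rule = build_ingress_rule(["alice@example.com"])
print(json.dumps([rule], indent=2))
```

In practice you would narrow the wildcards to the sources, methods, and projects that your users actually need.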
Security considerations
When you set up a VPC Service Controls perimeter for Dataform, you should review permissions granted to your Dataform service agents and accounts to verify that they match your security architecture.
Depending on the permissions that you grant to the Dataform service agent or custom service account, that service agent or service account might have access to BigQuery or Secret Manager data in the project that the service agent or service account belongs to, regardless of VPC Service Controls. In such a case, restricting Dataform with a VPC Service Controls perimeter doesn't block communication with BigQuery or Secret Manager.
You should block communication with BigQuery if you don't need to run any workflow invocations originating from your Dataform repositories. For more information about blocking communication with BigQuery, see Block communication with BigQuery.
You should block communication with Secret Manager if none of your Dataform repositories connect to a third-party Git repository. For more information about blocking communication with Secret Manager, see Block communication with Secret Manager.
Before you begin
Before you configure a VPC Service Controls service perimeter for Dataform, follow the Restrict remote repositories guide to set the dataform.restrictGitRemotes organization policy.
The dataform.restrictGitRemotes organization policy is required to ensure that VPC Service Controls checks are enforced when using Dataform and that third-party access to Dataform Git repositories is restricted.
Required roles
To get the permissions that you need to configure a VPC Service Controls service perimeter, ask your administrator to grant you the Access Context Manager Editor (roles/accesscontextmanager.policyEditor) IAM role on the project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
For more information about VPC Service Controls permissions, see Access control with IAM.
Configure VPC Service Controls
You can restrict Dataform with a VPC Service Controls service perimeter in the following ways:
- Add Dataform to an existing service perimeter that restricts BigQuery.
- Create a service perimeter that restricts both Dataform and BigQuery.
To add Dataform to a service perimeter that restricts BigQuery, follow the Update a service perimeter guide in the VPC Service Controls documentation.
To create a new service perimeter that restricts both Dataform and BigQuery, follow the Create a service perimeter guide in the VPC Service Controls documentation.
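Whichever option you choose, the resulting perimeter must restrict both services. The following Python sketch checks a perimeter description for that condition. It assumes the perimeter is available as a dictionary shaped like the Access Context Manager service perimeter resource, with the restricted services listed under status.restrictedServices. The function name, the perimeter name, and the project number are hypothetical placeholders.

```python
# Services that the same perimeter must restrict, per the limitations above.
REQUIRED_SERVICES = {"dataform.googleapis.com", "bigquery.googleapis.com"}


def missing_restricted_services(perimeter):
    """Return the required services that the perimeter does not yet restrict."""
    restricted = set(perimeter.get("status", {}).get("restrictedServices", []))
    return sorted(REQUIRED_SERVICES - restricted)


# Hypothetical perimeter that restricts BigQuery but not yet Dataform.
perimeter = {
    "name": "accessPolicies/POLICY_ID/servicePerimeters/example_perimeter",
    "status": {
        "resources": ["projects/123456789012"],
        "restrictedServices": ["bigquery.googleapis.com"],
    },
}

print(missing_restricted_services(perimeter))  # ['dataform.googleapis.com']
```

An empty result means that the perimeter already restricts both Dataform and BigQuery.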
Optional: Block communication with BigQuery
The way Dataform communicates with BigQuery depends on the type of service account used in Dataform.
The default Dataform service agent uses the bigquery.jobs.create permission to communicate with BigQuery. You grant the default Dataform service agent roles that contain this permission when you grant the roles that are required for Dataform to run workflows in BigQuery.
To block communication between the default Dataform service agent and BigQuery, you need to revoke all predefined and custom roles that contain the bigquery.jobs.create permission, which have been granted to the default Dataform service agent. To revoke roles, follow the Manage access to projects, folders, and organizations guide.
Custom service accounts use the following permissions and roles to communicate with BigQuery:
- The bigquery.jobs.create permission, given to the custom service account.
- The Service Account Token Creator (roles/iam.serviceAccountTokenCreator) role, granted to the default Dataform service agent on the custom service account.
You can block communication between a custom service account and BigQuery in either of the following ways:
- Revoke the Service Account Token Creator (roles/iam.serviceAccountTokenCreator) role, granted to the default Dataform service agent on the selected custom service account. To revoke this role, follow the Manage access to service accounts guide.
- Revoke all predefined and custom roles granted at the project level to the custom service account that contain the bigquery.jobs.create permission. To revoke roles, follow the Manage access to projects, folders, and organizations guide.
The bigquery.jobs.create permission is included in the following predefined BigQuery IAM roles that must be revoked:
- BigQuery Admin (roles/bigquery.admin)
- BigQuery Job User (roles/bigquery.jobUser)
- BigQuery User (roles/bigquery.user)
- BigQuery Studio Admin (roles/bigquery.studioAdmin)
- BigQuery Studio User (roles/bigquery.studioUser)
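To find out which of these roles a principal currently holds, you can inspect the bindings of the project's IAM policy. The following Python sketch does this for a policy that has already been exported as a dictionary in the standard IAM policy shape (a bindings list of role and members entries). The function name and the sample policy are hypothetical, and the service agent address uses a placeholder project number. The sketch only recognizes the predefined roles listed above, so custom roles that contain bigquery.jobs.create need a separate review.

```python
# Predefined roles from the list above that contain bigquery.jobs.create.
JOB_CREATING_ROLES = {
    "roles/bigquery.admin",
    "roles/bigquery.jobUser",
    "roles/bigquery.user",
    "roles/bigquery.studioAdmin",
    "roles/bigquery.studioUser",
}


def roles_to_revoke(policy, member):
    """List the predefined job-creating roles that `member` holds in `policy`."""
    return sorted(
        binding["role"]
        for binding in policy.get("bindings", [])
        if binding["role"] in JOB_CREATING_ROLES
        and member in binding.get("members", [])
    )


# Placeholder principal; substitute your own service agent or service account.
agent = "serviceAccount:service-123456789012@gcp-sa-dataform.iam.gserviceaccount.com"

# Hypothetical project-level policy used only for this illustration.
policy = {
    "bindings": [
        {"role": "roles/bigquery.jobUser", "members": [agent]},
        {"role": "roles/bigquery.dataViewer", "members": [agent]},
        {"role": "roles/bigquery.admin", "members": ["user:alice@example.com"]},
    ]
}

print(roles_to_revoke(policy, agent))  # ['roles/bigquery.jobUser']
```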
Optional: Block communication with Secret Manager
Dataform uses the secretmanager.versions.access permission to access individual Secret Manager secrets. You give this permission to the default Dataform service agent on a selected Secret Manager secret when you connect a Dataform repository to a third-party repository.
To block communication between Dataform and Secret Manager, you need to revoke access to all secrets from the default Dataform service agent.
To revoke access to a Secret Manager secret from the default Dataform service agent, follow the Manage access to secrets guide in the Secret Manager documentation. You must revoke all predefined and custom roles that contain the secretmanager.versions.access permission, granted to the default Dataform service agent on the selected secret.
The secretmanager.versions.access permission is included in the following predefined Secret Manager IAM roles:
- Secret Manager Admin (roles/secretmanager.admin)
- Secret Manager Secret Accessor (roles/secretmanager.secretAccessor)
- Secret Manager Secret Version Manager (roles/secretmanager.secretVersionManager)
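The effect of the revocation on a secret's IAM policy can be sketched in Python. The following example takes a policy dictionary in the standard IAM policy shape and returns a copy in which a given principal no longer appears in any binding for the roles listed above. The function name, the sample policy, and the etag value are hypothetical, and the service agent address uses a placeholder project number. Custom roles that contain secretmanager.versions.access are not covered and need a separate review.

```python
import copy

# Predefined roles from the list above that contain secretmanager.versions.access.
SECRET_ACCESS_ROLES = {
    "roles/secretmanager.admin",
    "roles/secretmanager.secretAccessor",
    "roles/secretmanager.secretVersionManager",
}


def without_secret_access(policy, member):
    """Return a copy of `policy` where `member` holds no secret-access role."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    kept = []
    for binding in updated.get("bindings", []):
        if binding["role"] in SECRET_ACCESS_ROLES:
            binding["members"] = [
                m for m in binding.get("members", []) if m != member
            ]
        # Drop bindings that have no members left.
        if binding.get("members"):
            kept.append(binding)
    updated["bindings"] = kept
    return updated


# Placeholder principal; substitute your own default Dataform service agent.
agent = "serviceAccount:service-123456789012@gcp-sa-dataform.iam.gserviceaccount.com"

# Hypothetical policy of a single secret, used only for this illustration.
secret_policy = {
    "etag": "EXAMPLE_ETAG",
    "bindings": [
        {
            "role": "roles/secretmanager.secretAccessor",
            "members": [agent, "user:alice@example.com"],
        },
        {"role": "roles/secretmanager.admin", "members": [agent]},
    ],
}

print(without_secret_access(secret_policy, agent))
```

Because the function returns a new dictionary, you can compare it with the original policy before applying any change by following the Manage access to secrets guide.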
What's next
- To learn more about VPC Service Controls, see Overview of VPC Service Controls.
- To learn more about organization policies, see Introduction to the Organization Policy Service.
- To learn more about service accounts in Dataform, see About Dataform service agents and custom service accounts.