This document shows you how to do the following in Dataform:
- Grant Dataform required access.
- Control access to Dataform with IAM.
- Control access to individual tables with IAM.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
-
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
-
Make sure that billing is enabled for your Google Cloud project.
-
Enable the BigQuery and Dataform APIs.
-
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
-
Make sure that billing is enabled for your Google Cloud project.
-
Enable the BigQuery and Dataform APIs.
Grant Dataform required access
This section shows you how to grant the Identity and Access Management (IAM) roles that Dataform service accounts require to execute workflows in BigQuery.
About service accounts in Dataform
When you create your first Dataform repository, Dataform automatically generates a default service account. Dataform uses the default service account to interact with BigQuery on your behalf. The default Dataform service account is not granted any BigQuery roles or permissions by default. You must grant the required access to the default Dataform service account.
Your default Dataform service account ID is in the following format:
service-PROJECT_NUMBER@gcp-sa-dataform.iam.gserviceaccount.com
Replace PROJECT_NUMBER with the numeral ID of your Google Cloud project. You can find your Google Cloud project ID in the Google Cloud console dashboard. For more information, see Identifying projects.
In addition to the default Dataform service account, you can use other service accounts to execute workflows on your behalf. You can configure custom service accounts in the following ways:
At the repository level, to run all workflows in a given repository.
Individually for each workflow configuration.
When you create a Dataform repository or workflow configuration, you can select any service account associated with your Google Cloud project that you have access to. You must configure the required permissions for all the service accounts associated with your Dataform resources.
A custom service account associated with a Dataform repository is only used to execute workflows from that repository. All other repository operations are still performed by the default Dataform service account.
Required roles for Dataform service accounts
Default and custom service accounts used in Dataform require the following BigQuery IAM roles to be able to execute workflows in BigQuery:
- BigQuery Data Editor on projects to which Dataform needs both read and write access. These usually include the project hosting your Dataform repository.
- BigQuery Data Viewer on projects to which Dataform needs read-only access.
- BigQuery Job User on the project hosting your Dataform repository.
- BigQuery Data Owner if you want to query BigQuery datasets.
- BigQuery roles for column-level access control if you want to use BigQuery policy tags.
Additionally, you need to grant the default Dataform service
account the
Service Account Token Creator role
(roles/iam.serviceAccountTokenCreator
) so that it can access any custom
service accounts that you want to use in Dataform.
Security considerations for Dataform service accounts
Granting the roles required by Dataform to a service account comes with the following security considerations:
Any service account granted the required roles might gain access to BigQuery or Secret Manager in the project that the service account belongs to, regardless of VPC Service Controls.
For more information, see Configure VPC Service Controls for Dataform.
Any user who has the
dataform.repositories.create
IAM permission can execute code using the default Dataform service account and all the permissions granted to that service account.For more information, see Security considerations for Dataform permissions.
To restrict the data that a user or service account can read or write in BigQuery, you can grant granular BigQuery IAM permissions to selected BigQuery datasets or tables. For more information, see Controlling access to datasets and Controlling access to tables and views.
Grant required BigQuery roles to a service account used in Dataform
To grant the required BigQuery IAM roles to your default Dataform service account, or a custom service account that you want to use in Dataform, follow these steps:
In the Google Cloud console, go to the Dataform page.
Select or create a repository.
In the Google Cloud console, go to the IAM page.
Click Grant Access.
In the New principals field, enter the service account ID.
In the Select a role list, select the BigQuery Job User role.
Click Add another role, and then in the Select a role list, select the BigQuery Data Editor role.
Click Add another role, and then in the Select a role list, select the BigQuery Data Viewer role.
Click Save.
Grant token creation access to a custom service account
To use a custom service account in Dataform, the default Dataform service account must be able to access the custom service account. To grant this access, you need to add the default Dataform service account as a principal to the custom service account with the Service Account Token Creator role.
To grant the default Dataform service account access to a custom service account, follow these steps:
In the Google Cloud console, go to IAM > Service accounts.
Select a project.
On the Service accounts for project "PROJECT_NAME" page, select your custom Dataform service account.
Go to Permissions, and then click Grant Access.
In the New principals field, enter your default Dataform service account ID.
Your default Dataform service account ID is in the following format:
service-PROJECT_NUMBER@gcp-sa-dataform.iam.gserviceaccount.com
In the Select a role list, select the Service Account Token Creator role.
Click Save.
Control access to Dataform with IAM
This document describes the access control options for Dataform and shows you how to view and grant Dataform roles. Dataform uses Identity and Access Management (IAM) for access control. For more information about roles and permissions in IAM, see Understanding roles and permissions.
Predefined Dataform roles
The following table lists the predefined roles that give you access to Dataform resources:
Role | Permissions |
---|---|
Dataform Admin( Full access to all Dataform resources. |
|
Code Commenter Beta( Permissions to comment, at the repository level. Grants CRUD access over commentThread and comment resources. |
|
Code Creator( Access only to private and shared code resources. The permissions in the Code Creator let you create and list code in Dataform, and access only the code that you created and code that was explicitly shared with you. |
|
Code Editor( Edit access code resources. |
|
Code Owner( Full access to code resources. |
|
Code Viewer( Read-only access to all code resources. |
|
Dataform Editor( Edit access to Workspaces and Read-only access to Repositories. |
|
Dataform Viewer( Read-only access to all Dataform resources. |
|
Custom Dataform roles
Custom roles can include any permissions that you specify. You can create custom roles that include permissions to perform specific administrative operations, like creating development workspaces or creating files and directories within a development workspace. To create custom roles, see Creating and managing custom roles.
Security considerations for Dataform permissions
Any user who has the dataform.repositories.create
permission can execute code
in BigQuery using the default Dataform service
account and all permissions granted to that service account. This includes
execution of Dataform workflows.
The dataform.repositories.create
permissions is included in the following
IAM roles:
- BigQuery Admin (
roles/bigquery.admin
) - BigQuery Job User (
roles/bigquery.jobUser
) - BigQuery Studio User (
roles/bigquery.studioUser
) - BigQuery User (
roles/bigquery.user
) - Code Creator (
roles/dataform.codeCreator
) - Code Editor (
roles/dataform.codeEditor
) - Code Owner (
roles/dataform.codeOwner
) - Colab Enterprise User (
roles/aiplatform.colabEnterpriseUser
) - Dataform Admin (
roles/dataform.admin
)
To restrict the data that a user or service account can read or write in BigQuery, you can grant granular BigQuery IAM permissions to selected BigQuery datasets or tables. For more information, see Controlling access to datasets and Controlling access to tables and views.
For more information about the default Dataform service account and the roles and permissions it requires, see Grant Dataform required access.
View Dataform roles
Within the Google Cloud console, perform the following steps:
Go to the IAM & Admin > Roles page.
In the Filter field, select Used in, type
Dataform
, and then press Enter.Click one of the listed roles to view the permissions of the role in the right pane.
For example, the Dataform Admin role has full access to all Dataform resources.
For more information about granting a role on a project, see Grant a role. You can grant predefined or custom roles in this way.
Control access to an individual repository
To control access to Dataform with granular permissions,
you can set Dataform IAM roles on individual
repositories by using the Dataform API
repositories.setIamPolicy
request.
To set Dataform IAM roles on an individual Dataform repository, follow these steps:
In the terminal, pass the Dataform API
repositories.setIamPolicy
request with an access policy.In the policy, bind a user, group, domain, or service account to a selected role in the following format:
{ "policy": { "bindings": [ { "role": "roles/ROLE", "members": [ "TYPE:IDENTIFIER", ] }, ], } }
Replace the following:
ROLE
: the Dataform IAM role that you want to grant on the repository.TYPE
:user
,group
,domain
, orserviceAccount
.IDENTIFIER
: the user, group, domain, or service account that you want to grant the role to.
In the IAM page, ensure that all users can view the full list of Dataform repositories through a Dataform role with the
dataform.repositories.list
permission.In IAM, ensure that only users who require full access to all Dataform repositories are granted the Dataform Admin role on all repositories.
The following command passes the repositories.setIamPolicy
Dataform API
request that grants the Dataform Editor role on the sales
repository
to a single user:
curl -H "Content-Type: application/json" -X POST -d '{ "policy": { "bindings": [{ "role": "roles/dataform.editor", "members": ["user:sasha@examplepetstore.com"]}] }}' "https://dataform.googleapis.com/v1beta1/projects/examplepetstore/locations/us-central1/repositories/sales:setIamPolicy"
Grant public access to a repository
You can grant public access to a Dataform repository by granting
IAM roles on the repository to the allAuthenticatedUsers
principal.
When you assign an IAM role to the allAuthenticatedUsers
principal,
service accounts and all users on the internet who have authenticated with a
Google Account are granted that role. This includes accounts that aren't
connected to a Google Workspace account or Cloud Identity domain,
such as personal Gmail accounts. Users who aren't authenticated,
such as anonymous visitors, aren't included. For more information, see
All authenticated users.
For example, when you grant the Dataform Viewer role to
allAuthenticatedUsers
on the sales
repository, all service accounts and
users on the internet who have authenticated with a Google Account have
read-only access to all sales
code resources.
To grant public access to a Dataform repository, follow these steps:
In the terminal, pass the Dataform API
repositories.setIamPolicy
request with an access policy.In the policy, bind the
allAuthenticatedUsers
principal to a selected role in the following format:{ "policy": { "bindings": [ { "role": "roles/ROLE", "members": [ "allAuthenticatedUsers", ] }, ], } }
Replace
ROLE
with a Dataform IAM role that you want to grant to all authenticated users.
The following command passes the repositories.setIamPolicy
Dataform API
request that grants the Dataform Viewer role on the sales
repository
to allAuthenticatedUsers
:
curl -H "Content-Type: application/json" -X POST -d '{ "policy": { "bindings": [{ "role": "roles/dataform.viewer", "members": ["allAuthenticatedUsers"]}] }}' "https://dataform.googleapis.com/v1beta1/projects/examplepetstore/locations/us-central1/repositories/sales:setIamPolicy"
Prevent public access to repositories
To ensure that no access is granted to the public on any Dataform
repository, you can restrict the allAuthenticatedUsers
principal in your
project.
To restrict allAuthenticatedUsers
in your project, you can
set the iam.allowedPolicyMemberDomains
policy,
and remove allAuthenticatedUsers
from the list of allowed_values
.
When you restrict allAuthenticatedUsers
in the iam.allowedPolicyMemberDomains
policy, the allAuthenticatedUsers
principal cannot be used in any
IAM policy in your project, which prevents granting public access
to all resources, including Dataform repositories.
For more information about the iam.allowedPolicyMemberDomains
policy
and also instructions to set it, see
Restricting identities by domain.
Workforce identity federation in Dataform
Workforce identity federation lets you use an external identity provider (IdP) to authenticate and authorize users to Google Cloud services with IAM.
Dataform supports workforce identity federation with no known limitations.
Control access to individual tables with IAM
This section shows you how to grant and revoke BigQuery IAM roles for individual Dataform tables and views.
When Dataform executes a table or view, it creates the resource in BigQuery. During development in Dataform, you can grant BigQuery roles to individual tables and views to control their access in BigQuery after execution.
For more information about granting and revoking access to resources, see Grant access to a resource.
Grant BigQuery roles to a table or view
You can grant BigQuery roles to a table or view in
Dataform by adding a post_operations
block with the
GRANT
DCL statement
to the .sqlx
definition file of the selected table or view.
To grant BigQuery roles to a selected table or view, follow these steps:
In the Google Cloud console, go to the Dataform page.
Select a repository, and then a select a workspace.
In the Files pane, expand the
definitions/
directory.Select the
.sqlx
definition file of the table or view that you want to grant access to.In the file, enter the following code snippet:
post_operations { GRANT "ROLE_LIST" ON "RESOURCE_TYPE" ${self()} TO "USER_LIST" }
Replace the following:
ROLE_LIST: the BigQuery role or list of comma-separated BigQuery roles that you want to grant.
RESOURCE_TYPE:
TABLE
orVIEW
.USER_LIST: the comma-separated list of users that the role is granted to.
For a list of valid formats, see user_list.
Optional: Click Format.
Execute the table or view.
If you granted access on an incremental table, remove the
GRANT
statement from the table definition file after the first execution.
The following code sample shows the BigQuery Viewer role granted on a table to a user:
config { type: "table" }
SELECT ...
post_operations {
GRANT `roles/bigquery.dataViewer`
ON TABLE ${self()}
TO "user:222larabrown@gmail.com"
}
Revoke BigQuery roles from a table or view
You can revoke BigQuery roles from a table or view by adding a
post_operations
block with the
REVOKE
DCL statement
to the .sqlx
definition file of the selected table or view.
To revoke BigQuery roles from a selected table or view, follow these steps:
In the Google Cloud console, go to the Dataform page.
Select a repository, and then a select a workspace.
In the Files pane, expand the
definitions/
directory.Select the
.sqlx
definition file of the table or view that you want to revoke access to.In the
post_operations
block, enter the followingREVOKE
statement:REVOKE "ROLE_LIST" ON "RESOURCE_TYPE" ${self()} FROM "USER_LIST"
Replace the following:
- ROLE_LIST: the BigQuery role or list of comma-separated BigQuery roles that you want to revoke.
- RESOURCE_TYPE:
TABLE
orVIEW
. - USER_LIST: the comma-separated list of users that the role is revoked from. For a list of valid formats, see user_list.
To revoke the access granted in a
GRANT
statement in the file, replace theGRANT
statement with aREVOKE
statement.Optional: Click Format.
Execute the table or view.
If you revoked access to an incremental table, remove the
REVOKE
statement from the table definition file after the first execution.
The following code sample shows the BigQuery Viewer role revoked from a user on a table:
config { type: "table" }
SELECT ...
post_operations {
REVOKE `roles/bigquery.dataViewer`
ON TABLE ${self()}
FROM "user:222larabrown@gmail.com"
}
Collectively manage BigQuery roles for tables and views
To control BigQuery access to individual tables and views in a
single location, you can create a dedicated type: "operations"
file with
GRANT
and
REVOKE
DCL statements.
To manage BigQuery table access in a single type: "operations"
file, follow these steps:
In the Google Cloud console, go to the Dataform page.
Select a repository, and then a select a workspace.
In the Files pane, next to
definitions/
, click theMore menu.
Click Create file.
In the Add a file path field, enter the name of the file followed by
.sqlx
afterdefinitions/
. For example,definitions/table-access.sqlx
.Filenames can only include numbers, letters, hyphens, and underscores.
Click Create file.
In the Files pane, expand the
definitions/
directory, and select the newly created file.In the file, enter the following code snippet:
config { type: "operations" } GRANT "ROLE_LIST" ON RESOURCE_TYPE RESOURCE_NAME TO "USER_LIST" REVOKE "ROLE_LIST" ON { "<var>" }}RESOURCE_TYPE RESOURCE_NAME TO "USER_LIST"
Replace the following:
- ROLE_LIST: the BigQuery role or list of comma-separated BigQuery roles that you want to grant or revoke.
- RESOURCE_TYPE:
TABLE
orVIEW
. - RESOURCE_NAME: the name of the table or view.
- USER_LIST: the comma-separated list of users that the role is granted to or revoked from. For a list of valid formats, see user_list.
Add
GRANT
andREVOKE
statements as needed.To revoke access granted in a
GRANT
statement in the file, replace theGRANT
statement with aREVOKE
statement.Removing the
GRANT
statement without adding theREVOKE
statement does not revoke access.
Optional: Click Format.
Execute the file after each update.
- If you granted or revoked access on an incremental table, remove the
GRANT
orREVOKE
statement from the file after the first execution of the statement.
- If you granted or revoked access on an incremental table, remove the
What's next
- To learn more about IAM, see IAM overview.
- To learn more about managing access to resources, see Manage access to projects, folders, and organizations
- To learn more about the key concepts of workforce identity federation, see Workforce identity federation.
- To learn more about BigQuery IAM roles and permissions, see Access control with IAM.
- To learn more about granting granular permissions to BigQuery datasets, see Controlling access to datasets.