This document shows you how to create a repository, set and edit the repository service account, and delete a repository in Dataform.
When you create a Dataform repository, you need to set the following repository settings:
- Repository ID
A unique ID of the repository. IDs can only include numbers, letters, hyphens, and underscores.
- Region
Dataform region for storing the repository and its contents.
This storage region can be different than the processing region where Dataform processes your code and stores the output of executions. By default, the processing region is set to your default BigQuery dataset region. You can edit the processing region in the dataform.json file after creating the repository. For more information, see Configure Dataform settings.
- Service account
Service account associated with the repository. You can select the default Dataform service account, a service account associated with your Google Cloud project, or manually enter a different service account. By default, Dataform uses a service account derived from your project number in the following format:
service-YOUR_PROJECT_NUMBER@gcp-sa-dataform.iam.gserviceaccount.com
Dataform uses the default service account for all repository operations. You can use a different service account to execute workflows in your repository, but the default service account is still used for all other repository operations.
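The repository ID rule and the default service account format described in the settings above can be checked in a script before you open the console. The following is a minimal sketch, not an official Dataform utility: the function names are hypothetical, the pattern assumes ASCII letters, and it covers only the character rule stated above (any length limits are not checked).

```python
import re

# Characters allowed by the rule stated above: letters, numbers, hyphens, underscores.
# Assumption: ASCII letters only; length limits, if any, are not checked here.
_REPOSITORY_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_repository_id(repository_id: str) -> bool:
    """Return True if the ID uses only the permitted characters (hypothetical helper)."""
    return _REPOSITORY_ID_PATTERN.fullmatch(repository_id) is not None


def default_dataform_service_account(project_number: int) -> str:
    """Build the default service account email from a project number (hypothetical helper)."""
    return f"service-{project_number}@gcp-sa-dataform.iam.gserviceaccount.com"


if __name__ == "__main__":
    print(is_valid_repository_id("sales_pipeline-01"))  # True
    print(is_valid_repository_id("sales pipeline"))     # False: contains a space
    print(default_dataform_service_account(123456789012))
```

The project number `123456789012` is a placeholder; substitute the number of your own Google Cloud project.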
After you create a repository, you can connect it to GitHub or GitLab.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project.
- Enable the BigQuery and Dataform APIs.
Required roles
To get the permissions that you need to create and delete a repository, ask your administrator to grant you the Dataform Admin (roles/dataform.admin) IAM role on repositories.
For more information about granting roles, see Manage access.
You might also be able to get the required permissions through custom roles or other predefined roles.
To use a service account other than the default Dataform service account, grant access to the non-default service account.
Create a Dataform repository
To create a Dataform repository, follow these steps:
In the Google Cloud console, go to the Dataform page.
Click Create repository.
On the Create repository page, in the Repository ID field, enter a unique ID.
IDs can only include numbers, letters, hyphens, and underscores.
In the Region drop-down list, select a Dataform region for storing the repository and its contents. Select the Dataform region nearest to your location.
For a list of available Dataform regions, see Locations. The repository region does not have to match the location of your BigQuery datasets.
In the dataform.json file, you can set the processing region where Dataform processes your code and stores the output of executions. The processing region has to match the location of your BigQuery datasets, but does not need to match the repository region. For more information, see Configure Dataform settings.
In the Service account drop-down, select a service account for the repository.
In the drop-down, you can select the default Dataform service account or any service account associated with your Google Cloud project that you have access to. Keep in mind that non-default service accounts are used only for workflow execution. All other repository operations are still performed by the default Dataform service account.
- Optional: To select a service account that is not displayed in the drop-down, click Enter manually and enter a service account ID.
Click Create, and then click Done.
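The same three settings (repository ID, region, and service account) can also be supplied programmatically. The sketch below only builds the request and does not send it. It assumes the Dataform REST API exposes a `v1beta1` `repositories` collection that accepts a `repositoryId` query parameter and a `serviceAccount` body field. Verify those names against the current API reference before relying on them. The function name and the sample values are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumption: API root and version; check the Dataform API reference.
API_ROOT = "https://dataform.googleapis.com/v1beta1"


def build_create_repository_request(
    project_id: str,
    region: str,
    repository_id: str,
    service_account: Optional[str] = None,
) -> dict:
    """Describe (but do not send) a create-repository request."""
    parent = f"projects/{project_id}/locations/{region}"
    query = urlencode({"repositoryId": repository_id})
    body = {}
    if service_account is not None:
        # Non-default service accounts are used only for workflow execution.
        body["serviceAccount"] = service_account  # assumed field name
    return {
        "method": "POST",
        "url": f"{API_ROOT}/{parent}/repositories?{query}",
        "body": body,
    }


if __name__ == "__main__":
    request = build_create_repository_request("my-project", "us-central1", "my_repo")
    print(request["method"], request["url"])
```

Omitting `service_account` leaves the body empty, which mirrors selecting the default Dataform service account in the console. An authenticated HTTP client would still be needed to send the request.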
Edit the service account
You can associate a non-default service account with a Dataform repository for workflow execution. All other repository operations are still performed by the default Dataform service account.
To edit the service account for a Dataform repository, follow these steps:
In the Google Cloud console, go to the Dataform page.
Select a repository, and then click Settings.
By the Service account field, click Edit Service account.
In the Service account drop-down, select a service account for the repository.
In the drop-down, you can select the default Dataform service account or any service account associated with your Google Cloud project that you have access to.
- Optional: To select a service account that is not displayed in the drop-down, click Enter manually and enter a service account ID.
Click Save.
Delete a Dataform repository
To delete a repository and all its contents, follow these steps:
In the Google Cloud console, go to the Dataform page.
By the repository that you want to delete, click the More menu, and then select Delete.
In the Delete repository window, enter the name of the repository to confirm deletion.
Click Delete.
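A script that deletes repositories can mirror the console's type-the-name confirmation so that a repository is never removed by accident. The sketch below builds the request without sending it. It assumes a `v1beta1` REST resource path and a `force` query parameter that also removes the repository's contents. Both are assumptions to verify against the API reference, and the function name is hypothetical.

```python
# Assumption: API root and version; check the Dataform API reference.
API_ROOT = "https://dataform.googleapis.com/v1beta1"


def build_delete_repository_request(
    project_id: str, region: str, repository_id: str, confirmation: str
) -> dict:
    """Describe (but do not send) a delete request, guarded by a typed confirmation."""
    # Mirror the console dialog: the caller must retype the repository name.
    if confirmation != repository_id:
        raise ValueError("Confirmation text does not match the repository name.")
    name = f"projects/{project_id}/locations/{region}/repositories/{repository_id}"
    # Assumed parameter: force=true deletes the repository together with its contents.
    return {"method": "DELETE", "url": f"{API_ROOT}/{name}?force=true"}


if __name__ == "__main__":
    request = build_delete_repository_request(
        "my-project", "us-central1", "my_repo", confirmation="my_repo"
    )
    print(request["method"], request["url"])
```

Raising an error on a mismatched confirmation keeps the destructive call from being constructed at all, which is the same safeguard the Delete repository window provides.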
What's next
To learn how to configure Dataform processing settings, see Configure Dataform settings.
To learn how to link a Dataform repository to a third-party Git provider, see Connect to a third-party Git repository.
To learn how to create a development workspace, see Create a workspace.