This document helps you understand the concept of repositories in Dataform.
Dataform displays your repositories on the Dataform page in alphabetical order by repository ID. You can sort and filter them.
By default, Dataform uses a service account derived from your project number.
Dataform uses Git to record changes and manage file versions. Each Dataform repository corresponds with a Git repository. After you create a Dataform repository, you can connect it to a remote GitHub, GitLab, or Bitbucket repository.
In a Dataform repository, Dataform stores the repository code. In a connected repository, the third-party repository stores the repository code. Dataform interacts with the third-party repository to allow you to edit and execute its contents in a Dataform development workspace.
A Dataform repository page consists of the following components:
- Development workspaces tab: Displays development workspaces created in the repository.
- Release configurations tab: Lets you inspect, create, edit, and delete release configurations.
- Workflow execution logs tab: Displays Dataform workflow execution logs.
- Workflow configurations tab: Lets you inspect, create, edit, and delete workflow configurations.
- Settings tab: Displays the name and location of the repository. For a repository connected to a third-party Git repository, displays the third-party repository source, default branch name, and secret token. Displays the buttons to connect the repository to a third-party Git repository and to edit the Git connection.
- Create development workspace button: Lets you create a development workspace.
After you create and initialize a development workspace, you can edit the dataform.json file to configure the following Dataform settings of your repository:
- The default database (Google Cloud project ID)
- The default schema (BigQuery dataset ID)
- The default BigQuery location
- The default schema (BigQuery dataset ID) for assertions
- The warehouse, which must be set to
- User-defined variables that are made available to project code during compilation
For more information about Dataform repository settings, see IProjectConfig in the Dataform core reference.
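The settings listed above can be illustrated with a short sketch that builds the contents of a dataform.json file. The key names (warehouse, defaultDatabase, defaultSchema, defaultLocation, assertionSchema, vars) are assumed from the IProjectConfig reference and are not stated in this document. The helper build_project_config, the warehouse value, and all example values are hypothetical placeholders.

```python
import json


def build_project_config(project_id, dataset_id, location,
                         assertion_dataset_id, variables=None):
    """Return a dict mirroring the dataform.json settings described above.

    Key names are assumed from the IProjectConfig reference; verify them
    against the Dataform core reference before relying on this sketch.
    """
    return {
        # Warehouse setting; "bigquery" is an assumed value.
        "warehouse": "bigquery",
        # Default database (Google Cloud project ID).
        "defaultDatabase": project_id,
        # Default schema (BigQuery dataset ID).
        "defaultSchema": dataset_id,
        # Default BigQuery location.
        "defaultLocation": location,
        # Default schema (BigQuery dataset ID) for assertions.
        "assertionSchema": assertion_dataset_id,
        # User-defined variables available to project code during compilation.
        "vars": dict(variables or {}),
    }


if __name__ == "__main__":
    # All values below are placeholders, not real resources.
    config = build_project_config(
        project_id="my-project-id",
        dataset_id="my_dataset",
        location="US",
        assertion_dataset_id="my_dataset_assertions",
        variables={"environment": "dev"},
    )
    # Print the JSON that would be saved as dataform.json.
    print(json.dumps(config, indent=2))
```

Running the script prints a JSON object that you could paste into dataform.json in a development workspace and then adjust to match your project.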
- To learn how to create and initialize a workspace, see Create a workspace.
- To learn how to configure Dataform repository settings, see Configure Dataform settings.
- To learn how to connect a Dataform repository to a third-party Git repository, see Connect to a third-party Git repository.
- To learn how to view workflow execution logs, see Monitor execution logs.
- To learn how to create Dataform compilation releases, see Create a compilation release.
- To learn more about how repository size impacts development in Dataform, see Overview of repository size.
- To learn how to schedule Dataform executions with workflow configurations, see Schedule executions with workflow configurations.
- To learn more about splitting a repository in Dataform, see Introduction to splitting repositories.