This page shows you how to edit the Dataform processing settings for a specific repository.
You might want to edit the settings file to rename the schemas or add custom compilation variables to your repository.
About repository settings
Each Dataform repository contains a unique workflow_settings.yaml settings file. The file contains the Google Cloud project ID and the schema in which Dataform publishes assets in BigQuery.
Dataform uses default settings that you can override to best suit your needs by editing the workflow_settings.yaml file.
The following code sample shows a sample workflow_settings.yaml file:
defaultProject: my-gcp-project-id
defaultDataset: dataform
defaultLocation: australia-southeast2
defaultAssertionDataset: dataform_assertions
The key-value pairs in the sample code, and all other workflow settings options, are described in the configs reference for workflow settings.
You can access the properties defined in workflow_settings.yaml in your project code as properties of the dataform.projectConfig object. The following mappings from workflow_settings.yaml options to the dataform.projectConfig options accessible in code are applied:
- defaultProject => defaultDatabase
- defaultDataset => defaultSchema
- defaultAssertionDataset => assertionSchema
- projectSuffix => databaseSuffix
- datasetSuffix => schemaSuffix
- namePrefix => tablePrefix
For more information about the dataform.projectConfig object, see the IProjectConfig Dataform core reference.
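To make the renaming concrete, the following JavaScript sketch applies the mappings listed above to a parsed settings object. The KEY_MAP table and the toProjectConfig helper are hypothetical names used only for illustration. Dataform performs this translation internally, and this sketch is not its implementation. It also assumes that options without a mapping, such as defaultLocation, keep their names.

```javascript
// Hypothetical illustration only: mirrors the key mappings listed above.
const KEY_MAP = new Map([
  ["defaultProject", "defaultDatabase"],
  ["defaultDataset", "defaultSchema"],
  ["defaultAssertionDataset", "assertionSchema"],
  ["projectSuffix", "databaseSuffix"],
  ["datasetSuffix", "schemaSuffix"],
  ["namePrefix", "tablePrefix"],
]);

// Renames mapped keys and copies every other key unchanged (an assumption).
function toProjectConfig(settings) {
  const config = {};
  for (const [key, value] of Object.entries(settings)) {
    config[KEY_MAP.get(key) ?? key] = value;
  }
  return config;
}

// Values taken from the sample workflow_settings.yaml file.
const projectConfig = toProjectConfig({
  defaultProject: "my-gcp-project-id",
  defaultDataset: "dataform",
  defaultLocation: "australia-southeast2",
  defaultAssertionDataset: "dataform_assertions",
});

console.log(projectConfig.defaultSchema); // prints "dataform"
```

In project code you read only the right-hand names, for example dataform.projectConfig.defaultSchema rather than defaultDataset.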
The following code sample shows the dataform.projectConfig object called in a SELECT statement in a view:
config { type: "view" }
SELECT ${when(
  dataform.projectConfig.tablePrefix,
  "'table prefix is set!'",
  "'table prefix is not set!'"
)}
Before you begin
Required roles
To get the permissions that you need to configure Dataform settings, ask your administrator to grant you the Dataform Admin (roles/dataform.admin) IAM role on repositories.
For more information about granting roles, see Manage access.
You might also be able to get the required permissions through custom roles or other predefined roles.
Configure schema names
This task shows how to configure the defaultDataset and defaultAssertionDataset properties in the workflow_settings.yaml file.
To change the name of a schema, follow these steps:
- In your development workspace, in the Files pane, click the workflow_settings.yaml file.
- Edit the value of defaultDataset, defaultAssertionDataset, or both.
The following code sample shows a custom defaultDataset name in the workflow_settings.yaml file:
...
defaultDataset: mytables
...
Create custom compilation variables
Compilation variables contain values that you can modify with compilation overrides in a Dataform API request.
After you define a compilation variable in workflow_settings.yaml and add it to selected tables, you can modify its value in Dataform API compilation overrides to execute tables conditionally.
To create a compilation variable that you can use across a repository, follow these steps:
- Go to your Dataform development workspace.
- In the Files pane, select the workflow_settings.yaml file.
- Enter the following code snippet:
vars:
  YOUR_VARIABLE: VALUE
Replace the following:
- YOUR_VARIABLE with a name for the variable.
- VALUE with the default value of the compilation variable.
The following code sample shows the myVariableName compilation variable set to myVariableValue in the workflow_settings.yaml file:
...
vars:
  myVariableName: myVariableValue
...
The following code sample shows the workflow_settings.yaml file with the executionSetting compilation variable set to dev:
defaultProject: default_bigquery_database
defaultLocation: us-west1
defaultDataset: dataform_data
vars:
  executionSetting: dev
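As a rough mental model of what a compilation override does to these values, the following JavaScript sketch merges override variables over the defaults from workflow_settings.yaml. The applyOverrides helper and the per-variable merge behavior are assumptions made for illustration, not Dataform's implementation. In the Dataform API, overrides are supplied in the compilation configuration of the request (the codeCompilationConfig.vars field, to the best of our knowledge; confirm the exact field in the API reference).

```javascript
// Defaults as defined under vars in the sample workflow_settings.yaml file.
const settingsVars = { executionSetting: "dev" };

// Hypothetical override, as it might be supplied in a Dataform API request.
const overrideVars = { executionSetting: "staging" };

// Assumption: an override replaces the value of a variable with the same name
// and leaves every other variable untouched.
function applyOverrides(defaults, overrides) {
  return { ...defaults, ...overrides };
}

const effectiveVars = applyOverrides(settingsVars, overrideVars);
console.log(effectiveVars.executionSetting); // prints "staging"
```

Without an override, the value from the settings file (dev in this sample) is the one that your SQLX code sees.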
Add a compilation variable to a table
To add a compilation variable to a SQLX table definition file, follow these steps:
- Go to your Dataform development workspace.
- In the Files pane, select a SQLX table definition file.
- In the file, enter a when clause in the following format:
${when(dataform.projectConfig.vars.YOUR_VARIABLE === "SET_VALUE", "CONDITION")}
Replace the following:
- YOUR_VARIABLE with the name of your variable, for example, executionSetting.
- SET_VALUE with a value for the variable, for example, staging.
- CONDITION with a condition for execution of the table.
The following code sample shows a table definition SQLX file with a when clause and the executionSetting variable that selects 10% of the data in the staging execution setting:
select
  *
from ${ref("data")}
${when(
  dataform.projectConfig.vars.executionSetting === "staging",
  "where mod(farm_fingerprint(id), 10) = 0",
)}
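To see what the when clause contributes to the compiled query, the following JavaScript sketch renders the filter for two values of executionSetting. The when function here is a local stand-in for the SQLX built-in, written under the assumption that it returns its second argument when the condition is truthy and its third argument (an empty string by default) otherwise. The renderFilter helper is hypothetical.

```javascript
// Stand-in for the SQLX built-in when(); its exact behavior is an assumption.
function when(condition, trueCase, falseCase = "") {
  return condition ? trueCase : falseCase;
}

// Hypothetical helper: renders the filter for a given executionSetting value.
function renderFilter(executionSetting) {
  const vars = { executionSetting };
  return when(
    vars.executionSetting === "staging",
    "where mod(farm_fingerprint(id), 10) = 0"
  );
}

// In staging, the WHERE clause is appended and the query reads a sample.
console.log(JSON.stringify(renderFilter("staging")));
// In any other setting, nothing is appended and the query reads all rows.
console.log(JSON.stringify(renderFilter("dev"))); // prints ""
```

Because the condition is evaluated at compilation time, the filter is either present in the compiled SQL or absent from it; BigQuery never sees the variable itself.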
The following code sample shows a view definition SQLX file with a when clause and the myVariableName variable:
config { type: "view" }
SELECT ${when(
  dataform.projectConfig.vars.myVariableName === "myVariableValue",
  "'myVariableName is set to myVariableValue!'",
  "'myVariableName is not set to myVariableValue!'"
)}
What's next
To learn more about Dataform project settings, see the IProjectConfig reference.
To learn how to version control code in Dataform, see Version control your code.
To learn how to define a table, see Create a table.