This document shows you how to create and execute a compilation result with compilation overrides by using the Dataform API.
About Dataform API compilation overrides
To execute your SQL workflow, Dataform compiles your code to SQL to create a compilation result. Then, during a workflow invocation, Dataform executes the compilation result in BigQuery.
By default, Dataform uses settings in the workflow settings file to create the compilation result. To isolate data executed at different stages of your development lifecycle, you can override the default settings with compilation overrides.
By sending Dataform API requests from the terminal, you can create and execute a single compilation result with compilation overrides. You can create a compilation result of a workspace or of a selected Git commitish.
To create a compilation result with compilation overrides, send a Dataform API compilationResults.create request. In the request, specify a source, either a workspace or a Git commitish, for Dataform to compile into the compilation result. You configure compilation overrides in the CodeCompilationConfig object of the compilationResults.create request.
You can then execute the created compilation result in a Dataform API workflowInvocations.create request.
You can configure the following compilation overrides by using the Dataform API:
- Google Cloud project: The Google Cloud project in which Dataform executes the compilation result, set as defaultProject in workflow_settings.yaml or as defaultDatabase in dataform.json.
- Table prefix: A custom prefix added to all table names in the compilation result.
- Schema suffix: A custom suffix appended to the schema of tables, whether the schema is defined as defaultDataset in workflow_settings.yaml, as defaultSchema in dataform.json, or in the schema parameter in the config block of a table.
- Value of a compilation variable: The value of a compilation variable to be used in the compilation result. You can use compilation variables to execute tables conditionally.
Dataform API compilation overrides apply to a single compilation result only. As an alternative, you can configure workspace compilation overrides in the Google Cloud console.
To learn about alternative ways to configure compilation overrides in Dataform, see Introduction to code lifecycle.
Before you begin
In the Google Cloud console, go to the Dataform page.
Select or create a repository.
Select or create a development workspace.
Set a compilation result source
To send a Dataform API compilationResults.create request, you need to specify a source for the compilation result. You can set a Dataform workspace or a Git branch, Git tag, or Git commit SHA as the source in the request.
Set a workspace as a compilation result source
- In the compilationResults.create request, populate the workspace property with the path of a selected Dataform workspace in the following format:
{
"workspace": "projects/PROJECT_NAME/locations/LOCATION/repositories/REPOSITORY_NAME/workspaces/WORKSPACE_NAME"
}
Replace the following:
- PROJECT_NAME with the name of your Google Cloud project.
- LOCATION with the location of your Dataform repository, set in workflow settings.
- REPOSITORY_NAME with the name of your Dataform repository.
- WORKSPACE_NAME with the name of your Dataform workspace.
The following code sample shows the workspace property in the compilationResults.create request set to a workspace called "sales-test":
{
"workspace": "projects/analytics/locations/europe-west4/repositories/sales/workspaces/sales-test"
}
Set a Git commitish as a compilation result source
- In the compilationResults.create request, populate the gitCommitish property with the selected Git branch, tag, or commit SHA in the following format:
{
"gitCommitish": "GIT_COMMITISH"
}
Replace GIT_COMMITISH with the selected Git branch, Git tag, or Git commit SHA for the compilation result.
The following code sample shows the gitCommitish property in the compilationResults.create request set to "staging":
{
"gitCommitish": "staging"
}
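As a rough sketch of the two source options above, the following Python helpers build the source part of a compilationResults.create request body. The function names are hypothetical, and the check that exactly one source is set is an assumption based on the description of the source as either a workspace or a Git commitish.

```python
import json
from typing import Optional


def workspace_path(project: str, location: str, repository: str, workspace: str) -> str:
    """Build the workspace path in the format shown in this section."""
    return (
        f"projects/{project}/locations/{location}"
        f"/repositories/{repository}/workspaces/{workspace}"
    )


def compilation_source(workspace: Optional[str] = None,
                       git_commitish: Optional[str] = None) -> dict:
    """Return the source field of a compilationResults.create request body.

    Assumption: a request carries exactly one source.
    """
    if (workspace is None) == (git_commitish is None):
        raise ValueError("set exactly one of workspace or git_commitish")
    if workspace is not None:
        return {"workspace": workspace}
    return {"gitCommitish": git_commitish}


workspace_source = compilation_source(
    workspace=workspace_path("analytics", "europe-west4", "sales", "sales-test")
)
git_source = compilation_source(git_commitish="staging")

print(json.dumps(workspace_source, indent=2))
print(json.dumps(git_source, indent=2))
```

The two printed bodies correspond to the "sales-test" and "staging" samples in this section.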
Override the default Google Cloud project
To create staging or production tables in a Google Cloud project separate from the project used for development, you can pass a different Google Cloud project ID in the CodeCompilationConfig object in the Dataform API compilationResults.create request.
Passing a separate default project ID in the compilationResults.create request overrides the default Google Cloud project ID configured in the workflow settings file, but does not override Google Cloud project IDs configured in individual tables.
- To override the default Google Cloud project ID, set the defaultDatabase property to the selected Google Cloud project ID in the CodeCompilationConfig object in the following format:
{
"codeCompilationConfig": {
"defaultDatabase": "PROJECT_NAME"
}
}
Replace PROJECT_NAME with the Google Cloud project ID that you want to set for the compilation result.
Add a table prefix
To quickly identify tables from the compilation result, you can add a prefix to all table names in the compilation result by passing the table prefix in the CodeCompilationConfig object in the Dataform API compilationResults.create request.
- To add a table prefix, set the tablePrefix property in the CodeCompilationConfig object in the following format:
{
"codeCompilationConfig": {
"tablePrefix": "PREFIX"
}
}
Replace PREFIX with the prefix you want to add, for example, staging. With this prefix, the name of every table in the compilation result starts with staging. The table prefix changes table names only; it does not change the schema in which Dataform creates the tables.
Append a schema suffix
To separate development, staging, and production data, you can append a suffix to schemas in a compilation result by passing the schema suffix in the CodeCompilationConfig object in the Dataform API compilationResults.create request.
- To append a schema suffix, set the schemaSuffix property in the CodeCompilationConfig object in the following format:
{
"codeCompilationConfig": {
"schemaSuffix": "SUFFIX"
}
}
Replace SUFFIX with the suffix you want to append, for example, _staging. For example, if your defaultDataset in workflow_settings.yaml is set to dataform, Dataform will create tables in the dataform_staging schema.
Note: The schemaSuffix parameter of CodeCompilationConfig also applies to schemas configured in the config block of individual files.
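The project, table prefix, and schema suffix overrides can be combined in one CodeCompilationConfig object. The following Python sketch shows one way to assemble it while leaving out overrides that are not set. The helper name and the project ID analytics-staging are hypothetical; the property names are the ones used in this document.

```python
import json
from typing import Optional


def code_compilation_config(default_database: Optional[str] = None,
                            table_prefix: Optional[str] = None,
                            schema_suffix: Optional[str] = None) -> dict:
    """Return a codeCompilationConfig dict that contains only the overrides that are set."""
    overrides = {
        "defaultDatabase": default_database,
        "tablePrefix": table_prefix,
        "schemaSuffix": schema_suffix,
    }
    # Drop unset overrides so that the defaults from the workflow settings file stay in effect.
    return {key: value for key, value in overrides.items() if value is not None}


request_body = {
    "gitCommitish": "staging",
    "codeCompilationConfig": code_compilation_config(
        default_database="analytics-staging",  # hypothetical project ID
        schema_suffix="_staging",
    ),
}

print(json.dumps(request_body, indent=2))
```

Because json.dumps produces the output, the body never contains a trailing comma, which strict JSON parsers reject.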
Execute selected files conditionally with compilation variables
To execute a selected table only in a specific execution setting, you can create a compilation variable for the execution setting and then pass its value in the CodeCompilationConfig object in the Dataform API compilationResults.create request.
To execute a table conditionally in a specific execution setting by using the Dataform API, follow these steps:
- Create a compilation variable and add it to selected tables.
- Set the YOUR_VARIABLE and VALUE key-value pair in the codeCompilationConfig block of a Dataform API compilation request in the following format:
{
"codeCompilationConfig": {
"vars": {
"YOUR_VARIABLE": "VALUE"
}
}
}
Replace the following:
- YOUR_VARIABLE with the name of your variable, for example executionSetting.
- VALUE with the value of the variable for this compilation result that fulfills the when condition set in selected tables.
The following code sample shows the executionSetting variable passed to a Dataform API compilation request:
{
"gitCommitish": "staging",
"codeCompilationConfig": {
"vars": {
"executionSetting": "staging"
}
}
}
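A request body like the sample above can also be produced programmatically. The following Python sketch merges compilation variables into an existing request body without modifying the original. The helper name is hypothetical, and converting values to strings is an assumption that mirrors the string value in the sample.

```python
import json


def with_compilation_vars(request_body: dict, variables: dict) -> dict:
    """Return a copy of request_body with variables merged into codeCompilationConfig.vars."""
    config = dict(request_body.get("codeCompilationConfig", {}))
    merged_vars = dict(config.get("vars", {}))
    # Assumption: variable names and values are sent as strings.
    merged_vars.update({str(name): str(value) for name, value in variables.items()})
    config["vars"] = merged_vars
    result = dict(request_body)
    result["codeCompilationConfig"] = config
    return result


base_body = {"gitCommitish": "staging"}
body = with_compilation_vars(base_body, {"executionSetting": "staging"})

print(json.dumps(body, indent=2))
```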
Execute a compilation result with compilation overrides
- To execute the compilation result created by a compilationResults.create request, pass the compilation result ID returned by the compilationResults.create request in a workflowInvocations.create request.
The following code sample shows a compilation result ID passed in a workflowInvocations.create request:
{
"compilationResult": "projects/my-project-name/locations/europe-west4/repositories/my-repository-name/compilationResults/7646b4ed-ac8e-447f-93cf-63c43249ff11"
}
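To show how the two requests fit together, the following Python sketch prepares both HTTP requests with the standard library but does not send them. Several details are assumptions rather than facts from this document: the REST root https://dataform.googleapis.com/v1beta1, the collection paths under the repository, the bearer-token header, and the ACCESS_TOKEN placeholder. Check them against the Dataform API reference before use.

```python
import json
import urllib.request

API_ROOT = "https://dataform.googleapis.com/v1beta1"  # assumed API root and version


def post_request(url: str, body: dict, access_token: str) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


repository = (
    "projects/my-project-name/locations/europe-west4"
    "/repositories/my-repository-name"
)
token = "ACCESS_TOKEN"  # placeholder, not a real credential

# Step 1: create a compilation result with a schema suffix override.
compile_request = post_request(
    f"{API_ROOT}/{repository}/compilationResults",
    {"gitCommitish": "staging", "codeCompilationConfig": {"schemaSuffix": "_staging"}},
    token,
)

# Step 2: execute it. In practice the ID comes from the response to step 1;
# the value from the sample above is reused here for illustration.
compilation_result = (
    f"{repository}/compilationResults/7646b4ed-ac8e-447f-93cf-63c43249ff11"
)
invoke_request = post_request(
    f"{API_ROOT}/{repository}/workflowInvocations",
    {"compilationResult": compilation_result},
    token,
)

print(compile_request.get_method(), compile_request.full_url)
print(invoke_request.get_method(), invoke_request.full_url)
```

To send a prepared request, pass it to urllib.request.urlopen with a valid access token and read the compilation result ID from the JSON response before building the second request.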
What's next
- To learn more about ways to configure compilation overrides in Dataform, see Introduction to code lifecycle.
- To learn more about the Dataform API, see Dataform API.
- To learn how to use the Google Cloud console to configure compilation overrides
for all workspaces in a repository, see Configure workspace compilation
overrides.