This document shows you how to configure additional table settings in a table definition file. With Dataform core, you can define pre_operations
and post_operations
to execute a SQL statement before or after table creation. You can also override table settings, such as database
or schema
, and disable table creation.
Before you begin
In the Google Cloud console, go to the Dataform page.
Select or create a repository.
Select or create a development workspace.
Required roles
To get the permissions that you need to configure additional table settings,
ask your administrator to grant you the
Dataform Editor (roles/dataform.editor
) IAM role on workspaces.
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
Override table settings
You can override the schema, database, and name of a selected table.
By default, a table follows the schema and database configuration you set in
workflow_settings.yaml
. The name of a table is the same as the name of the table
definition SQLX file.
To override the schema and name of a selected table, follow these steps:
Go to your development workspace.
In the Files pane, expand
definitions/
.Open a SQLX table definition file.
In the
config
block, enter the following code snippet:{ schema: "OVERRIDDEN_SCHEMA", database: "OVERRIDDEN_DATABASE", name: "OVERRIDDEN_NAME" }
Replace the following:
OVERRIDDEN_SCHEMA
: the BigQuery dataset in which you want to create the tableOVERRIDDEN_DATABASE
: the ID of the BigQuery project in which you want to create the tableOVERRIDDEN_NAME
: the name for the table, different from the SQLX table definition filename
Optional: Click Format.
Reference a table with an overridden table name
- To reference a table with an overridden table name, in the
ref
function, enter the overridden table name set inname: ""
.
The following code sample references a table with name overridden to
overridden_name
:
SELECT * FROM ${ref("overridden_name")}
Define a SQL statement to be executed before table creation
You can configure Dataform to execute one or more SQL statements
before creating a selected table in BigQuery. To execute a SQL
statement before Dataform creates a selected table, add your
statement to the pre_operations
block in the table definition SQLX file.
To create a custom SQL statement executed before Dataform creates a specific table, follow these steps:
Go to your development workspace.
In the Files pane, expand
definitions/
.Open a SQLX table definition file.
Outside the
config
block, enterpre_operations { ... }
.Inside
pre_operations { ... }
, add your SQL statement.Optional: To add multiple statements, separate them with
---
.Optional: Click Format.
The following code sample shows a pre_operations
statements that creates a
temporary function that can be used in the select statement:
pre_operations {
CREATE TEMP FUNCTION AddFourAndDivide(x INT64, y INT64)
RETURNS FLOAT64
AS ((x + 4) / y);
}
Define a SQL statement to be executed after table creation
You can configure Dataform to execute one or more SQL statements
after creating a selected table in BigQuery. To execute a SQL statement
after Dataform creates a selected table, add your statement to
the post_operations
block in the table definition SQLX file. You can add
multiple SQL statements to the post_operations
block.
To create a custom SQL statement executed after Dataform creates a specific table, follow these steps:
Go to your development workspace.
In the Files pane, expand
definitions/
.Open a SQLX table definition file.
Outside the
config
block, enterpost_operations { ... }
.Inside
post_operations { ... }
, add your SQL statement.Optional: Click Format.
The following code sample shows post_operations
statements that grant groups
access to the created table:
post_operations {
GRANT `roles/bigquery.dataViewer`
ON
TABLE ${self()}
TO "group:allusers@example.com", "user:otheruser@example.com"
}
Disable table creation
To stop Dataform from creating a selected table in BigQuery, you can disable the table in its SQLX table definition file. Dataform keeps a disabled table in the dependency graph, but does not compile and create it. This can be useful, for example, if a table fails and you don't want your whole workflow to fail while you fix the issue.
To disable a table, follow these steps:
Go to your development workspace.
In the Files pane, expand
definitions/
.Select a SQLX table definition file.
In the
config
block of the file, enterdisabled: true
.Optional: Click Format.
The following code sample shows a disabled table:
config {
type: "table",
disabled: true
}
select * from ${ref("source_data")}
What's next
To learn how to configure Dataform settings in
workflow_settings.yaml
, see Configure Dataform settings.To learn how to test table data with assertions, see Test tables with assertions.
To learn how to reuse code with includes, see Reuse variables and functions with includes.
To learn to manually trigger executions, see Trigger execution.