About assertions
An assertion is a data quality test query that finds rows that violate one or more rules specified in the query. If the query returns any rows, the assertion fails. Dataform runs assertions every time it updates your SQL workflow and it alerts you if any assertions fail.
Dataform automatically creates views in BigQuery that contain the results of compiled assertion queries. As configured in your dataform.json file, Dataform creates these views in an assertions schema where you can inspect assertion results.
For example, for the default dataform_assertions
schema, Dataform
creates a view in BigQuery in the following format:
dataform_assertions.assertion_name
.
You can create assertions for all Dataform table types: tables, incremental tables, views, and materialized views.
You can create assertions in the following ways:
Add built-in assertions to the config block of a table.
You can add built-in assertions to the
config
block of a table and specify their conditions.Add manual assertions in a separate SQLX file.
You manually write custom assertions in a separate SQLX file for advanced use cases or for datasets not created by Dataform.
Before you begin
In the Google Cloud console, go to the Dataform page.
Select or create a repository.
Select or create a development workspace.
Required roles
To get the permissions that you need to create assertions,
ask your administrator to grant you the
Dataform Editor (roles/dataform.editor
) IAM role on workspaces.
For more information about granting roles, see Manage access.
You might also be able to get the required permissions through custom roles or other predefined roles.
Create built-in assertions
You can add built-in Dataform assertions to the config
block of a
table. Dataform runs these assertions after table creation. After
Dataform publishes the table, you can inspect the assertion.
You can create the following assertions in the config
block of a table:
nonNull
This condition asserts that the specified columns are not null across all table rows. This condition is used for columns that can never be null.
The following code sample shows a
nonNull
assertion in theconfig
block of a table:
config {
type: "table",
assertions: {
nonNull: ["user_id", "customer_id", "email"]
}
}
SELECT ...
rowConditions
This condition asserts that all table rows follow the custom logic you define. Each row condition is a custom SQL expression, and each table row is evaluated against each row condition. The assertion fails if any table row violates any row condition.
The following code sample shows a custom
rowConditions
assertion in theconfig
block of an incremental table:
config {
type: "incremental",
assertions: {
rowConditions: [
'signup_date is null or signup_date > "2022-08-01"',
'email like "%@%.%"'
]
}
}
SELECT ...
uniqueKey
This condition asserts that, in a specified column, no table rows have the same value.
The following code sample shows a
uniqueKey
assertion in theconfig
block of a view:
config {
type: "view",
assertions: {
uniqueKey: ["user_id"]
}
}
SELECT ...
uniqueKeys
This condition asserts that, in the specified columns, no table rows have the same value. The assertion fails if there is more than one row in the table with the same values for all the specified columns.
The following code sample shows a
uniqueKeys
assertion in theconfig
block of a table:
config {
type: "table",
assertions: {
uniqueKeys: [["user_id"], ["signup_date", "customer_id"]]
}
}
SELECT ...
Add assertions to the config
block
To add assertions to the config block of a table, follow these steps:
- In your development workspace, in the Files pane, select a table definition SQLX file.
- In the
config
block of the table file, enterassertions: {}
. - Inside
assertions: {}
, add your assertions.
The following code sample shows the conditions added in the config
block:
config {
type: "table",
assertions: {
uniqueKey: ["user_id"],
nonNull: ["user_id", "customer_id"],
rowConditions: [
'signup_date is null or signup_date > "2019-01-01"',
'email like "%@%.%"'
]
}
}
SELECT ...
Create manual assertions with SQLX
Manual assertions are SQL queries that you write in a dedicated SQLX file. A manual assertion SQL query must return zero rows. If the query returns rows when executed, the assertion fails.
To add manual assertions in a new SQLX file, follow these steps:
- In the Files pane, next to
definitions/
, click theMore menu.
- Click Create file.
In the Add a file path field, enter the name of the file followed by
.sqlx
. For example,definitions/custom_assertion.sqlx
.Filenames can only include numbers, letters, hyphens, and underscores.
Click Create file.
In the Files pane, click the new file.
In the file, enter:
config { type: "assertion" }
Below the
config
block, write your SQL query or multiple queries.
The following code sample shows a manual assertion in a SQLX file that asserts
that fields a
, b
, and c
are never NULL
in sometable
:
config { type: "assertion" }
SELECT
*
FROM
${ref("sometable")}
WHERE
a IS NULL
OR b IS NULL
OR c IS NULL
Add assertions as dependencies
When table B
depends on table A
that has assertions, failure of table A
assertions does not block Dataform from creating table B
. To execute
table B
only if table A
assertions pass, add these assertions as
dependencies to dependencies: [ "" ]
in the config
block of table B
.
The following code sample shows assertions added to table A
:
config {
type: "table",
assertions: {
uniqueKey: ["user_id"],
nonNull: ["user_id", "customer_id"],
}
}
The following code sample shows table A
assertions added as dependencies
to table B
:
config {
type: "table",
dependencies: [ "A_uniqueKey", "A_nonNull"]
}
To execute a table only when certain manual assertions pass, you can add the
manual assertion SQLX file to dependencies: [ "" ]
in the config
block.
The following code sample show a manual assertion added as a dependency to a view:
config {
type: "view",
dependencies: [ "manual_assertion"]
}
To add an assertion file or a table with assertions in its config
block as a
table dependency, follow these steps:
- In your development workspace, in the Files pane, expand
definitions/
. - Select a table SQLX file.
- In the
config
block of the table file, enterdependencies: [ "" ]
. - Inside
dependencies: [ "" ]
, enter the name of the table assertion or filename of the manual assertion that you want to set as a table dependency. You can find assertion names in the METADATA panel. - To add another assertion as a dependency to the edited table, repeat Step 4.
The following code sample shows the manual_assertion assertion
file and the
assertions of the sometable
table added as dependencies to a table:
config {
type: "table",
dependencies: [ "manual_assertion", "sometable_nonNull" , "sometable_rowConditions"]
}
SELECT * FROM ${ref("referenced_table")} LEFT JOIN ...
What's next
- To learn more about assertion types, see Dataform API.
- To learn how to define assertions with JavaScript, see Create SQL workflows with JavaScript.
- To learn how to manually execute workflows, see Trigger execution.