Declare dependencies

This document shows you how to define the relationship between objects in your SQL workflow in Dataform by declaring dependencies.

You can define a dependency relationship between objects of a SQL workflow. In a dependency relationship, the execution of the dependent object depends on the execution of the dependency object. This means that Dataform executes the dependent after the dependency. You define the relationship by declaring dependencies inside the SQLX definition file of the dependent object.

The dependency declarations make up a dependency tree of your SQL workflow that determines the order in which Dataform executes your SQL workflow objects.

You can define the dependency relationship between the following SQL workflow objects:

Data source declarations
Declarations of BigQuery data sources that let you reference these data sources in Dataform table definitions and SQL operations. You can set a data source declaration as a dependency, but not as a dependent.
Tables
Tables that you create in Dataform based on the declared data sources or other tables in your SQL workflow. Dataform supports the following table types: table, incremental table, view, and materialized view. You can set a table as a dependency and as a dependent.
Custom SQL operations
SQL statements that Dataform runs in BigQuery as they are, without modification. You can set a custom SQL operation defined in a type: operations file as a dependency and as a dependent. To declare a custom SQL operation as a dependency in the ref function, you need to set the hasOutput property to true in the custom SQL operation SQLX definition file.
Assertions
Data quality test queries that you can use to test table data. Dataform runs assertions every time it updates your SQL workflow and it alerts you if any assertions fail. You can set an assertion defined in a type: assertion file as a dependency and as a dependent by declaring dependencies in the config block.

You can define the dependency relationship in the following ways:

Declare a dependency as an argument of the ref function.
Declare dependencies in the config block.

Before you begin

  1. Create and initialize a development workspace in your repository.
  2. Optional: Declare a data source.
  3. Create at least two SQL workflow objects: tables, assertions, data source declarations, or operations.

Required roles

To get the permissions that you need to declare dependencies for tables, assertions, data source declarations, and custom SQL operations, ask your administrator to grant you the Dataform Editor (roles/dataform.editor) IAM role on workspaces. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Declare a dependency as an argument of the ref function

To reference and automatically declare a dependency in a SELECT statement, add the dependency as an argument of the ref function.

The ref function is a Dataform core built-in function that lets you reference and automatically depend on any table, data source declaration, or custom SQL operation with the hasOutput property set to true in your SQL workflow.

For more information about the ref function, see Dataform core context methods reference.

For more information about using the ref function in a table definition, see About table definitions.

The following code sample shows the source_data data source declaration added as an argument of the ref function in the incremental_table.sqlx SQLX definition file of an incremental table:

// filename is incremental_table.sqlx

config { type: "incremental" }

SELECT * FROM ${ref("source_data")}

In the preceding code sample, source_data is automatically declared as a dependency of incremental_table.
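
For context, source_data in the preceding sample has to be defined somewhere in the SQL workflow. The following is a minimal sketch of what such a data source declaration might look like. The filename and the PROJECT_ID and DATASET_NAME values are placeholders assumed for illustration and do not come from this document:

```sqlx
// filename is source_data.sqlx (hypothetical filename)

config {
  type: "declaration",
  database: "PROJECT_ID",  // placeholder: Google Cloud project that contains the source table
  schema: "DATASET_NAME",  // placeholder: BigQuery dataset that contains the source table
  name: "source_data"      // BigQuery table name, which is also the name passed to ref
}
```

Because a data source declaration can only be a dependency, this file contains no dependencies of its own.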

The following code sample shows the some_table table added as an argument of the ref function in the custom_assertion.sqlx SQLX definition file of an assertion:

// filename is custom_assertion.sqlx

config { type: "assertion" }

SELECT
  *
FROM
  ${ref("some_table")}
WHERE
  a is null
  or b is null
  or c is null

In the preceding code sample, some_table is automatically declared as a dependency of custom_assertion. During execution, Dataform executes some_table first, and then executes custom_assertion after some_table is created.
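
A custom SQL operation can be referenced in the same way, provided that its hasOutput property is set to true. The following is a minimal sketch under that assumption. The custom_view and dependent_table filenames and the test_column column are hypothetical names chosen for illustration:

```sqlx
// filename is custom_view.sqlx (hypothetical filename)

config {
  type: "operations",
  hasOutput: true  // required so that other files can pass custom_view to ref
}

-- The self function resolves to the name of the output that this operation creates.
CREATE OR REPLACE VIEW ${self()} AS (
  SELECT 1 AS test_column
)
```

A dependent table could then reference the operation as an argument of the ref function, which would make custom_view a dependency of dependent_table:

```sqlx
// filename is dependent_table.sqlx (hypothetical filename)

config { type: "table" }

SELECT * FROM ${ref("custom_view")}
```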

Declare dependencies in the config block

To declare dependencies that are not referenced in the SQL statement definition of the dependent, but need to be executed before the table, assertion, or custom SQL operation, follow these steps:

  1. In your development workspace, in the Files pane, expand the definitions/ directory.
  2. Select the table, assertion, or custom SQL operation SQLX file that you want to edit.
  3. In the config block of the file, enter the following code snippet:

    dependencies: [ "DEPENDENCY", ]
    

    Replace DEPENDENCY with the filename of the table, assertion, data source declaration, or custom SQL operation that you want to add as a dependency, written without the .sqlx extension (for example, "some_table"). You can enter multiple filenames, separated by commas.

  4. Optional: Click Format.

The following code sample shows the some_table table and some_assertion assertion added as dependencies to the config block of a table definition file:

config { dependencies: [ "some_table", "some_assertion" ] }
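
To show how this fits into a complete file, the following sketch places the same dependencies array in the config block of a table whose query does not call the ref function. The final_report filename, the table type, and the placeholder query are assumptions made for illustration:

```sqlx
// filename is final_report.sqlx (hypothetical filename)

config {
  type: "table",
  // Neither name is referenced in the query below, so both are listed explicitly.
  dependencies: [ "some_table", "some_assertion" ]
}

-- Placeholder query: it does not call ref, so only the config block sets the execution order.
SELECT CURRENT_DATE() AS run_date
```

In this sketch, Dataform would execute some_table and some_assertion before final_report, even though the SELECT statement reads from neither.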

What's next