This document shows you how to create a workflow configuration in Dataform to schedule and configure SQL workflow executions. You can use workflow configurations to execute Dataform SQL workflows on a schedule.
About workflow configurations
To schedule Dataform executions of all or selected SQL workflow actions in BigQuery, you can create workflow configurations. In a workflow configuration, you select a compilation release configuration, select SQL workflow actions for execution, and set the execution schedule.
Then, during a scheduled execution of your workflow configuration, Dataform deploys your selection of actions from the latest compilation result in your release configuration to BigQuery. You can also manually trigger execution of a workflow configuration with the Dataform API workflowConfigs.
A Dataform workflow configuration contains the following execution settings:
- ID of the workflow configuration
- Release configuration
- Service account: the service account associated with the workflow configuration. You can select the default Dataform service account, a service account associated with your Google Cloud project, or manually enter a different service account. By default, workflow configurations use the same service accounts as their repositories.
- SQL workflow actions to be executed:
  - All actions
  - Selection of actions
  - Selection of tags
- Execution schedule and time zone
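As a rough illustration of how these settings fit together, the following Python sketch models them as a plain data structure. The class name, field names, and example values are hypothetical and are not part of Dataform; the sketch only mirrors the list above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WorkflowConfigSettings:
    """Hypothetical container mirroring the settings listed above (not a Dataform API)."""

    config_id: str
    release_config: str
    # None means: fall back to the service account of the repository.
    service_account: Optional[str] = None
    # Leave both lists empty to execute all actions.
    selected_actions: List[str] = field(default_factory=list)
    selected_tags: List[str] = field(default_factory=list)
    # unix-cron expression; None means no schedule is set.
    cron_schedule: Optional[str] = None
    time_zone: str = "UTC"

    def selection_mode(self) -> str:
        """Report which of the three selection options these settings describe."""
        if self.selected_actions:
            return "selection of actions"
        if self.selected_tags:
            return "selection of tags"
        return "all actions"


# Example values are made up for illustration.
settings = WorkflowConfigSettings(
    config_id="nightly-build",
    release_config="my-release-config",
    selected_tags=["nightly"],
    cron_schedule="0 2 * * *",
)
print(settings.selection_mode())
```

In this sketch, a selection of actions takes precedence if both lists are filled in; that precedence is a simplification chosen for the example, not documented Dataform behavior.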
Before you begin
In the Google Cloud console, go to the Dataform page.
Select or create a repository.
Create a release configuration.
Required roles
To get the permissions that you need to create a workflow configuration, ask your administrator to grant you the Dataform Editor (roles/dataform.editor) IAM role on repositories.
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
To use a service account other than the default Dataform service account, grant access to the custom service account.
Create a workflow configuration
To create a Dataform workflow configuration, follow these steps:
- In your repository, go to Releases & Scheduling.
- In the Workflow configurations section, click Create.
- In the Create workflow configuration pane, in the Configuration ID field, enter a unique ID for the workflow configuration.
  IDs can only include numbers, letters, hyphens, and underscores.
- In the Release configuration drop-down, select a compilation release configuration.
- Optional: In the Frequency field, enter the frequency of executions in the unix-cron format.
  To ensure that Dataform executes the latest compilation result in the corresponding release configuration, leave at least one hour between the time the compilation result is created and the time of the scheduled execution.
- In the Service account drop-down, select a service account for the workflow configuration.
  In the drop-down, you can select the default Dataform service account or any service account associated with your Google Cloud project that you have access to. If you don't select a service account, the workflow configuration uses the service account of the repository.
- Optional: In the Timezone drop-down, select the time zone for executions.
  The default time zone is UTC.
- Select SQL workflow actions to be executed:
  - To execute the entire SQL workflow, click All actions.
  - To execute selected actions in the SQL workflow, click Selection of actions, and then select actions.
  - To execute actions with selected tags, click Selection of tags, and then select tags.
  - Optional: To execute selected actions or tags and their dependencies, select the Include dependencies option.
  - Optional: To execute selected actions or tags and their dependents, select the Include dependents option.
  - Optional: To rebuild all tables from scratch, select the Run with full refresh option.
    Without this option, Dataform updates incremental tables without rebuilding them from scratch.
- Click Create.
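Before you fill in the form, it can help to check your values against the constraints mentioned in the steps above. The following Python sketch is one way to do that locally. The function names are hypothetical, the ID pattern assumes that "letters" means ASCII letters, and the cron check only counts the five fields of a unix-cron expression without validating their contents.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: "numbers, letters, hyphens, and underscores" means ASCII characters.
ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")

# Recommended minimum break between compilation and scheduled execution.
MIN_GAP = timedelta(hours=1)


def is_valid_config_id(config_id: str) -> bool:
    """Return True if the ID is non-empty and uses only allowed characters."""
    return ID_PATTERN.fullmatch(config_id) is not None


def has_five_cron_fields(expression: str) -> bool:
    """Return True if the expression has the five fields of unix-cron.

    The fields are minute, hour, day of month, month, and day of week.
    This does not check that each field holds a valid value.
    """
    return len(expression.split()) == 5


def gap_is_sufficient(compiled_at: datetime, scheduled_at: datetime) -> bool:
    """Return True if the execution is at least one hour after compilation."""
    return scheduled_at - compiled_at >= MIN_GAP


compiled = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
scheduled = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)

print(is_valid_config_id("production-hourly"))
print(has_five_cron_fields("0 * * * *"))
print(gap_is_sufficient(compiled, scheduled))
```

These checks are a convenience only; Dataform performs its own validation when you click Create.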
For example, the following workflow configuration executes actions with the hourly tag every hour in the CEST timezone:
- Configuration ID: production-hourly
- Release configuration: -
- Frequency: 0 * * * *
- Timezone: Central European Summer Time (CEST)
- Selection of SQL workflow actions: selection of tags, hourly tag
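If you create workflow configurations programmatically instead of in the console, the same example might be expressed as a request body for the Dataform API workflowConfigs resource. The following Python sketch only builds the body and sends nothing. The field names (releaseConfig, cronSchedule, timeZone, invocationConfig.includedTags) are assumptions based on the REST resource and should be checked against the API reference for your API version. The release configuration resource name and the Europe/Berlin time zone ID (one zone that observes CEST in summer) are hypothetical placeholders.

```python
import json
from typing import Dict, List


def build_workflow_config_body(
    release_config_name: str,
    cron_schedule: str,
    time_zone: str,
    tags: List[str],
) -> Dict[str, object]:
    """Build a request body for a workflow configuration.

    Field names are assumed from the workflowConfigs REST resource;
    verify them against the Dataform API reference before use.
    """
    return {
        "releaseConfig": release_config_name,
        "cronSchedule": cron_schedule,
        "timeZone": time_zone,
        "invocationConfig": {"includedTags": list(tags)},
    }


body = build_workflow_config_body(
    # Hypothetical resource name; replace with your own release configuration.
    release_config_name=(
        "projects/my-project/locations/europe-west1/"
        "repositories/my-repo/releaseConfigs/my-release-config"
    ),
    cron_schedule="0 * * * *",  # every hour, on the hour
    time_zone="Europe/Berlin",  # assumed stand-in for CEST
    tags=["hourly"],
)
print(json.dumps(body, indent=2))
```

In this sketch, the configuration ID production-hourly is not part of the body, on the assumption that the ID is passed separately when the resource is created.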
Edit a workflow configuration
To edit a workflow configuration, follow these steps:
- In your repository, go to Releases & Scheduling.
- Next to the workflow configuration that you want to edit, click the More menu, and then click Edit.
- In the Edit workflow configuration pane, edit the workflow configuration settings, and then click Save.
Delete a workflow configuration
To delete a workflow configuration, follow these steps:
- In your repository, go to Releases & Scheduling.
- Next to the workflow configuration that you want to delete, click the More menu, and then click Delete.
- In the Delete workflow configuration dialog, click Delete.
What's next
- To learn how to configure Dataform compilation release configurations, see Create a release configuration.
- To learn more about the code lifecycle in Dataform, see Introduction to code lifecycle in Dataform.