Create Dataform SQL workflows with JavaScript

This document shows you how to use JavaScript to create SQL workflow elements in Dataform. You might want to use JavaScript over Dataform core to create similar elements repeatedly in your workflow.

About developing workflows in Dataform with JavaScript

Dataform core lets you create SQL workflow actions with SQLX and JavaScript. With JavaScript, you can create similar actions repeatedly in your workflow. For example, you can create a view of each table in your workflow with certain user IDs removed.

The following JavaScript code sample lets you create a view of each table where the value of the user_id field does not correspond to one of the values in the blocked_user_ids list:

  const tableNames = ["user_events", "user_settings", "user_logs"];

  tableNames.forEach(tableName => {
    publish(tableName + "_blocked_removed").query(
      ctx => `
        SELECT * FROM ${ctx.ref(tableName)}
        WHERE user_id NOT IN (
          SELECT user_id
          FROM ${ctx.ref("blocked_user_ids")}
        )`
    );
  });

This code sample creates three views named user_events_blocked_removed, user_settings_blocked_removed, and user_logs_blocked_removed that don't contain any of the blocked user IDs.

You can create multiple actions within one JavaScript file with Dataform global methods and arbitrary JavaScript ES5 code, such as loops and constants.

You can define the following actions with JavaScript in Dataform:

Before you begin

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Select or create a repository.

  3. Select or create a development workspace.

Required roles

To get the permissions that you need to develop a SQL workflow with JavaScript, ask your administrator to grant you the Dataform Editor (roles/dataform.editor) IAM role on workspaces. For more information about granting roles, see Manage access.

You might also be able to get the required permissions through custom roles or other predefined roles.

Create a JavaScript file in Dataform

Store JavaScript files for definitions and data source declarations in the definitions/ directory. To create a new JavaScript file in the definitions/ directory, follow these steps:

  1. In the Files pane, next to definitions/, click the More menu.

  2. Click Create file.

  3. In the Create new file pane, do the following:

    1. In the Add a file path field, after definitions/, enter the name of the file followed by .js. For example, definitions/definitions.js.

      Filenames can only include numbers, letters, hyphens, and underscores.

    2. Click Create file.

Set Dataform SQL workflow object properties with JavaScript

You can use the following Dataform global methods to create SQL workflow objects with Dataform:

  • declare. Used to declare a data source.
  • publish. Used to define a table.
  • assert. Used to create an assertion.
  • operate. Used to define a custom SQL operation.

Each of the global methods contain properties you can use to configure the created object. For more information on global methods and their properties, see the Dataform core reference.

In the publish() method that creates a table, you can set table properties by passing them as the second method argument.

To pass table properties as the second argument of publish(), follow these steps:

  1. In your development workspace, in the Files pane, expand definitions/.

  2. Select a JavaScript file.

  3. In the file, add table properties to the publish() method in the following format:

     method("first_method_argument", {
       property1: "property1_value",
       property2: "property2_value",
       property3: "property3_value",
     });
    
  4. Optional: Click Format.

The following code sample shows how to set properties to the publish() method by passing the properties as a second argument of the method:

  publish("table1", {
    type: "table",
    dependencies: ["other_table"],
    description: {
      "Value is 1"
    }
  }).query(ctx => "SELECT 1 AS test");

Reference includes in a JavaScript file

You can reference any includes function, macro, or constant inside a JavaScript file. For more information on includes in Dataform, see Reuse variables and functions with JavaScript includes.

The syntax for referencing includes inside a JavaScript file depends on the location of the include file. Dataform stores such files in the includes directory.

Reference top-level includes

  • To reference a top-level includes file, reference the filename when declaring your variables.

The following code sample references the serviceName and serviceId variables from the includes/service.js file:

  const {serviceName, serviceId} = service;

Reference nested includes

To reference a nested includes file, enter the filename in the JavaScript require function.

The following code sample references the serviceName and serviceId variables from the includes/allServices/service.js file:

  const {serviceName, serviceId} = require("includes/allServices/service.js");

Use Dataform query functions in JavaScript methods

Dataform provides a number of built-in functions that you can use inside queries, such as ref and self. For more information on Dataform built-in functions, see the Dataform API reference.

To use a built-in query function in a JavaScript method, follow these steps:

  1. In your development workspace, in the Files pane, expand definitions/.

  2. Select a JavaScript file.

  3. In the file, enter a global Dataform JavaScript method.

  4. In the method, enter a contextable ctx argument.

  5. Optional: If you are using JavaScript template strings, wrap the contextable argument in backticks ``.

  6. In the contextable argument, enter the query function with its parameter as a context object.

  7. Optional: Click Format.

The following code sample shows the ref query function wrapped inside a contextable argument of the publish method:

  publish("example").query(ctx => `SELECT * FROM ${ctx.ref("other_table")}`);

Declare a SQL workflow data source with JavaScript

You can declare multiple data sources in one JavaScript declaration file with the Dataform declare JavaScript method. For more information on the declare method, see the Dataform core reference. For more information on data sources in Dataform, see Declare a data source.

To declare a data source in a JavaScript file, follow these steps:

  1. In your development workspace, in the Files pane, expand definitions/.

  2. Select a JavaScript file.

  3. In the file, enter the following code snippet:

     declare({
       database: "DATABASE_PROJECT_ID",
       schema: "BIGQUERY_SCHEMA",
       name: "RELATION_NAME",
     });
    

    Replace the following:

    • DATABASE_PROJECT_ID: the project ID of the project which contains the data source
    • BIGQUERY_SCHEMA: the BigQuery dataset in which the external relation exists
    • RELATION_NAME: the name of the relation that you can later use to reference the data source in Dataform
  4. To declare another data source in the same file, add an extra declare block to the file.

  5. Optional: Click Format.

Define a table with JavaScript

You can create a table with the Dataform JavaScript publish method. For more information on the publish method, see the Dataform core reference.

You can define the following table types: table, incremental table, and view. For more information on defining tables in Dataform, see Create a table.

To define a table in a JavaScript file, follow these steps:

  1. In your development workspace, in the Files pane, expand definitions/.

  2. Select a JavaScript file.

  3. In the file, enter the following code snippet:

     publish("TABLE_NAME").query(ctx => "SELECT_QUERY");
    
    1. Replace TABLE_NAME with the name of the table.

    2. Replace SELECT_QUERY with a SQL SELECT statement defining the table.

  4. To set the table type, add table dependencies, and add table description, set object properties of the publish method.

  5. To define another table in the same file, repeat step 3 and step 4.

  6. Optional: Click Format.

Define manual Dataform assertions with JavaScript

You can create manual SQL assertions in a JavaScript file with the Dataform assert JavaScript method. For more information on the assert method, see the Dataform core reference.

A manual assertion SQL query must return zero rows. If the query returns rows when executed, the assertion fails. You can create multiple assertions in one JavaScript file.

For more information on assertions in Dataform, see Test tables with assertions.

To create a manual assertion in a JavaScript file, follow these steps:

  1. In your development workspace, in the Files pane, expand definitions/.

  2. Select a JavaScript file.

  3. In the file, enter the following code snippet:

     assert("ASSERTION_NAME").query(ctx => "CUSTOM_ASSERTION_QUERY");
    
    1. Replace ASSERTION_NAME with the name for your custom assertion.

    2. Replace CUSTOM_ASSERTION_QUERY with your SQL assertion query.

  4. To create another assertion in the same file, repeat step 3.

  5. Optional: Click Format.

The following code sample shows a JavaScript assertion that asserts that no value in source_table is NULL:

  assert("assertion1").query(ctx => "SELECT * FROM source_table WHERE value IS NULL");

Define custom SQL operations with JavaScript

You can define custom SQL operations in a JavaScript file with the Dataform operate JavaScript method. For more information on custom SQL operations in Dataform, see Add custom SQL operations.

To define a custom SQL operation with JavaScript, follow these steps:

  1. In your development workspace, in the Files pane, expand definitions/.

  2. Select a JavaScript file.

  3. In the file, enter the following code snippet:

     operate("OPERATION_NAME").queries(ctx => "CUSTOM_SQL_QUERY");
    
    1. Replace OPERATION_NAME with the name for the custom operation.

    2. Replace CUSTOM_SQL_QUERY with your custom SQL query.

  4. To define another custom SQL operation in the same file, repeat step 3.

  5. Optional: Click Format.

The following code sample shows a custom SQL operation in a JavaScript file that inserts a single new row into some_table and sets test_column of the new row to 2:

  operate("operation1").queries("INSERT INTO some_table (test_column) VALUES (2)");

What's next