Migrate legacy environments and schedules

Legacy Dataform will be deprecated on February 26, 2024, after which you will not be able to access legacy projects. This document describes ways to migrate environments and schedules from legacy Dataform to Dataform in Google Cloud with release and workflow configurations.

In legacy Dataform, you defined environments and schedules together in the environments.json file.

The following code sample shows definitions of production and staging environments and corresponding schedules in an environments.json file from legacy Dataform:

// example of an environments.json file

{
  "environments": [
    {
      "name": "production",
      "configOverride": {},
      "schedules": [
        {
          "name": "daily",
          "cron": "30 14 * * *",
          "tags": [
            "daily"
          ]
        },
        {
          "name": "hourly",
          "cron": "0 * * * *",
          "disabled": false
        }
      ],
      "gitRef": "master"
    },
    {
      "name": "staging",
      "configOverride": {
        "schemaSuffix": "staging"
      },
      "schedules": [
        {
          "name": "daily (all)",
          "cron": "42 16 * * mon,tue,wed,thu,fri,sat,sun"
        }
      ],
      "gitRef": "master"
    }
  ]
}

In Dataform in Google Cloud, configuring environments and schedules is split into two experiences:

Release configurations
Similar to legacy Dataform environments, used to configure compilation settings for different environments, for example, staging and production.

Release configurations let you configure a Git commitish and compilation overrides to customize how compilation results are created. Dataform creates compilation results from release configurations independently of workflow configuration schedules. This means that scheduled executions of compilation results from release configurations can run even if your remote Git provider is unavailable.

When you create a release configuration, you can set the frequency of creating compilation results. You can also create a compilation result from a release configuration manually or in an automated continuous deployment process.

Workflow configurations
Equivalent to legacy Dataform schedules, used to schedule executions of compilation results from release configurations.

To schedule executions, first create a release configuration and define compilation settings for a selected environment, for example, staging. Then, create a workflow configuration and define the schedule for executing staging compilation results.
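The split described above can be sketched in Python as a mapping from one legacy environment entry to one release-configuration-shaped dictionary plus one workflow-configuration-shaped dictionary per schedule. This is a minimal sketch, not a confirmed API schema: the function names and the output keys (`id`, `gitCommitish`, `codeCompilationConfig`, `releaseConfig`, `cronSchedule`, `includedTags`, `disabled`) are illustrative assumptions and should be checked against the Dataform API reference before use.

```python
from typing import Any


def to_release_config(env: dict[str, Any]) -> dict[str, Any]:
    """Map one legacy environment to a release-configuration-shaped dict."""
    return {
        "id": env["name"],
        # The legacy gitRef plays the role of the Git commitish.
        "gitCommitish": env["gitRef"],
        # Legacy configOverride keys are copied unchanged (assumption: each key
        # has a matching compilation override; verify before applying).
        "codeCompilationConfig": dict(env.get("configOverride", {})),
    }


def to_workflow_configs(env: dict[str, Any]) -> list[dict[str, Any]]:
    """Map each legacy schedule to a workflow-configuration-shaped dict."""
    configs = []
    for schedule in env.get("schedules", []):
        configs.append({
            "name": schedule["name"],
            # Every schedule points back at the release configuration
            # created from its parent environment.
            "releaseConfig": env["name"],
            "cronSchedule": schedule["cron"],
            "includedTags": list(schedule.get("tags", [])),
            "disabled": schedule.get("disabled", False),
        })
    return configs


# The staging environment from the legacy environments.json example.
staging = {
    "name": "staging",
    "configOverride": {"schemaSuffix": "staging"},
    "schedules": [
        {"name": "daily (all)", "cron": "42 16 * * mon,tue,wed,thu,fri,sat,sun"}
    ],
    "gitRef": "master",
}

release = to_release_config(staging)
workflows = to_workflow_configs(staging)
```

Here one environment yields exactly one release configuration, while its schedules fan out into separate workflow configurations that all reference that release configuration.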

Ways to migrate environments and schedules

You can migrate legacy environments and schedules to release configurations and workflow configurations in the following ways:

Configure release and workflow configurations inside Dataform

Recreate environments as release configurations and schedules as workflow configurations inside Dataform in Google Cloud.

Configure release and workflow configurations with the Dataform API

Use the Dataform API to configure release configurations and workflow configurations.

Alternatively, you can migrate legacy environments and schedules in the following ways:

Apply custom configuration to environments.json via the Dataform API
You can keep the environments.json file in your repository and configure a continuous deployment process with the tool of your choice. In the continuous deployment process, update release configurations and workflow configurations from the environments.json file when changes are merged to the default branch.
Bypass release and workflow configurations
You can bypass release and workflow configurations and use the open-source Dataform CLI, the Dataform API, or the Dataform API together with Cloud Composer or Workflows to compile your repository and execute workflows.
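For the continuous deployment option, one possible planning step is to derive a stable ID for each configuration from the legacy names and then decide which configurations to create, which to update, and which no longer appear in environments.json. The Python sketch below shows only that planning logic and makes no API calls. The helper names `to_config_id` and `plan_sync` are hypothetical, and the ID format (lowercase letters, digits, and hyphens) is an assumption rather than a documented Dataform constraint.

```python
import re


def to_config_id(environment: str, schedule: str) -> str:
    """Build a lowercase, hyphen-separated ID from legacy names.

    Assumption: IDs may only contain lowercase letters, digits, and hyphens.
    """
    raw = f"{environment}-{schedule}".lower()
    # Collapse every run of other characters (spaces, parentheses) to a hyphen.
    return re.sub(r"[^a-z0-9]+", "-", raw).strip("-")


def plan_sync(desired_ids: set[str], existing_ids: set[str]) -> dict[str, list[str]]:
    """Split configuration IDs into create, update, and stale groups."""
    return {
        "create": sorted(desired_ids - existing_ids),
        "update": sorted(desired_ids & existing_ids),
        # Stale IDs exist remotely but are absent from environments.json;
        # whether to delete them is a policy decision for your pipeline.
        "stale": sorted(existing_ids - desired_ids),
    }


desired = {
    to_config_id("production", "daily"),
    to_config_id("staging", "daily (all)"),
}
plan = plan_sync(desired, existing_ids={"production-daily", "old-schedule"})
```

Keeping the plan separate from the calls that apply it makes the deployment step easy to dry-run in a merge check before it changes any configuration.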

Migrate workflow alerts

Dataform writes Cloud Logging entries for workflow invocations. These log entries contain the following fields, which can be useful for monitoring and debugging your workflows:

  • receiveTimestamp
  • release_config_id
  • repository_id
  • resource_container
  • workflow_invocation_id
  • workflow_config_id
  • severity: can be INFO, WARNING, or ERROR
  • terminalState: can be SUCCEEDED, CANCELED, or FAILED
  • timestamp
  • @type

You can use Cloud Logging together with Cloud Monitoring to configure alerts similar to your legacy alerts.

With Cloud Monitoring, you can configure the following metrics and alerts:

  • Log-based metrics, which you can use as follows:
    • To create alerting policies that notify you of changes over time.
    • To create charts that display changes over time.
  • Log-based alerts, which notify you anytime a specific event appears in a log.
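As an illustration of the kind of log filter that a log-based metric or alert could use, the following Python sketch assembles a Cloud Logging query string for failed workflow invocations. The `field="value"` comparisons joined with `AND` follow the Cloud Logging query language, but the field paths are assumptions: the sketch guesses that terminalState sits under `jsonPayload` and that repository_id is a resource label, so confirm both against an actual log entry. The function name `build_failure_filter` is hypothetical.

```python
from typing import Optional


def build_failure_filter(repository_id: Optional[str] = None) -> str:
    """Return a Cloud Logging filter matching failed workflow invocations."""
    # Assumed path: terminalState is part of the structured log payload.
    clauses = ['jsonPayload.terminalState="FAILED"']
    if repository_id is not None:
        # Assumed path: repository_id is a label on the monitored resource.
        clauses.append(f'resource.labels.repository_id="{repository_id}"')
    return " AND ".join(clauses)


all_failures = build_failure_filter()
one_repository = build_failure_filter("my-repository")
```

The resulting string can be pasted into the Logs Explorer to check that it matches the expected entries before you attach a metric or an alert to it.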

For more information, see View Cloud Logging for Dataform.

What's next