Schedule dependent jobs

This document describes how to create and run a job that isn't scheduled until specific jobs have succeeded or failed. To learn more about job states, see Job creation and execution overview.

If you have a workload with varying resource requirements, consider using dependent jobs to create an automated chain of jobs that each use separate VMs. For example, separate the types of VMs used for low-demand operations (like data preparation) and compute-intensive operations (like data processing). By using dependent jobs to help optimize resource consumption, you can reduce costs and quota usage.

Before you begin

  1. If you haven't used Batch before, review Get started with Batch and enable Batch by completing the prerequisites for projects and users.
  2. To get the permissions that you need to create a job, ask your administrator to grant you the following IAM roles:

    For more information about granting roles, see Manage access to projects, folders, and organizations.

    You might also be able to get the required permissions through custom roles or other predefined roles.

Restrictions

Dependent jobs have the following restrictions:

  • A dependent job can have up to four dependencies. Each dependency must contain a unique job name and one of the following required states:

    • SUCCEEDED: the dependency job must succeed.
    • FAILED: the dependency job must fail.
    • FINISHED: the dependency job must finish, whether it succeeds or fails.
  • When you create a dependent job, all of its dependency jobs must exist.

  • A dependent job can't enter the scheduled (SCHEDULED) state until each dependency job has entered its required state. If it becomes impossible for a dependency job to enter its required state, then the dependent job immediately fails without being scheduled.
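
The last restriction can be read as a three-way decision: wait, schedule, or fail. The following Python sketch models that decision for illustration only; it is not how Batch implements it. The function names and the "PENDING", "MET", "IMPOSSIBLE", "WAIT", "SCHEDULE", and "FAIL" labels are hypothetical, and the sketch assumes that SUCCEEDED and FAILED are the only final outcomes of a dependency job.

```python
from typing import Dict

# Required states that a dependency can specify, per the restrictions above.
REQUIRED_STATES = {"SUCCEEDED", "FAILED", "FINISHED"}

# Assumption: only these two outcomes are modeled as final. Any other
# current state (for example "RUNNING") is treated as still in progress.
FINAL_STATES = {"SUCCEEDED", "FAILED"}


def dependency_outcome(required_state: str, current_state: str) -> str:
    """Classify one dependency as "MET", "PENDING", or "IMPOSSIBLE"."""
    if required_state not in REQUIRED_STATES:
        raise ValueError(f"unsupported required state: {required_state}")
    if current_state not in FINAL_STATES:
        return "PENDING"
    # FINISHED accepts either final outcome; otherwise the states must match.
    if required_state == "FINISHED" or required_state == current_state:
        return "MET"
    return "IMPOSSIBLE"


def dependent_job_decision(
    dependencies: Dict[str, str], current_states: Dict[str, str]
) -> str:
    """Return "SCHEDULE", "WAIT", or "FAIL" for a dependent job."""
    outcomes = [
        dependency_outcome(required, current_states[name])
        for name, required in dependencies.items()
    ]
    # One unreachable required state is enough to fail the dependent job.
    if "IMPOSSIBLE" in outcomes:
        return "FAIL"
    if all(outcome == "MET" for outcome in outcomes):
        return "SCHEDULE"
    return "WAIT"
```

In this model, a dependent job that requires SUCCEEDED from a dependency job that has already failed returns "FAIL" even if its other dependencies are still running, which mirrors the "immediately fails without being scheduled" behavior described above.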

Create a dependent job

You can create a dependent job by defining it in a JSON configuration file.

To specify that a job is dependent, include the dependencies[].items field in the main body of the JSON file. This field supports one or more dependencies, each specified as a key-value pair:

"dependencies": [
  {
    "items": {
      "DEPENDENCY_JOB_NAME": "REQUIRED_STATE"
    }
  }
]

Replace the following:

  • DEPENDENCY_JOB_NAME: the name of a dependency job, which must reach its required state before this dependent job is allowed to be scheduled.

  • REQUIRED_STATE: the required state for the corresponding dependency job, which must be SUCCEEDED, FAILED, or FINISHED.
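
If you generate configuration files programmatically, it can help to check the documented restrictions before you write the file. The following Python sketch builds the dependencies field from a dictionary and rejects more than four dependencies or an unsupported required state. The build_dependencies helper and the prepare-data job name are hypothetical, and the check for at least one dependency is an assumption of this sketch.

```python
import json
from typing import Dict, List

MAX_DEPENDENCIES = 4  # documented limit for a dependent job
REQUIRED_STATES = ("SUCCEEDED", "FAILED", "FINISHED")


def build_dependencies(items: Dict[str, str]) -> List[dict]:
    """Return the value of the "dependencies" field for a job configuration.

    Dictionary keys are job names, so each dependency job name is unique.
    """
    if not items:
        raise ValueError("specify at least one dependency")
    if len(items) > MAX_DEPENDENCIES:
        raise ValueError(
            f"a dependent job can have up to {MAX_DEPENDENCIES} dependencies"
        )
    for job_name, state in items.items():
        if state not in REQUIRED_STATES:
            raise ValueError(f"{job_name}: unsupported required state {state!r}")
    return [{"items": dict(items)}]


# Hypothetical dependency job name, used only for illustration.
fragment = build_dependencies({"prepare-data": "SUCCEEDED"})
print(json.dumps({"dependencies": fragment}, indent=2))
```

The printed fragment has the same shape as the dependencies field shown above, so you can merge it into the main body of a job's JSON configuration.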

For example, a dependent job with three dependencies can have a JSON configuration file that is similar to the following:

{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "script": {
              "text": "echo Hello World! This is task $BATCH_TASK_INDEX."
            }
          }
        ]
      },
      "taskCount": 3
    }
  ],
  "dependencies": [
    {
      "items": {
        "DEPENDENCY_JOB_NAME_1": "REQUIRED_STATE_1",
        "DEPENDENCY_JOB_NAME_2": "REQUIRED_STATE_2",
        "DEPENDENCY_JOB_NAME_3": "REQUIRED_STATE_3"
      }
    }
  ]
}
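
To illustrate the chain of jobs described in the introduction, the following Python sketch generates one configuration per job, where each job after the first depends on the previous job reaching SUCCEEDED. The job_config and chain_configs helpers, the job names, and the echo scripts are hypothetical placeholders; the sketch only produces configurations and does not submit anything to Batch.

```python
import json
from typing import Dict, List, Optional


def job_config(message: str, depends_on: Optional[str] = None) -> dict:
    """Build a minimal job configuration, optionally with one dependency."""
    config = {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [{"script": {"text": f"echo {message}"}}]
                },
                "taskCount": 1,
            }
        ]
    }
    if depends_on is not None:
        # This job is not scheduled until the previous job has succeeded.
        config["dependencies"] = [{"items": {depends_on: "SUCCEEDED"}}]
    return config


def chain_configs(job_names: List[str]) -> Dict[str, dict]:
    """Return configurations in creation order, each depending on the previous job."""
    configs: Dict[str, dict] = {}
    previous: Optional[str] = None
    for name in job_names:
        configs[name] = job_config(f"Running {name}", depends_on=previous)
        previous = name
    return configs


# Hypothetical job names, used only for illustration.
configs = chain_configs(["prepare-data", "process-data", "publish-results"])
print(json.dumps(configs["process-data"], indent=2))
```

Because all dependency jobs must exist when you create a dependent job, the returned dictionary keeps the jobs in the order in which you would create them.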

What's next