Schedule pipelines

This page describes how to create a schedule for your pipeline runs. For example, you can schedule a pipeline to run daily at 1:00 AM UTC.

Before you begin

To create a schedule, you need a deployed pipeline in Cloud Data Fusion. If you don't have one, create a pipeline by following the Quickstart.

To create, edit, or suspend a schedule, open your pipeline in Cloud Data Fusion:

  1. Go to your instance:

    1. In the Google Cloud console, go to the Cloud Data Fusion page.

    2. To open the instance in the Cloud Data Fusion web interface, click Instances, and then click View instance.

  2. Go to the Cloud Data Fusion List page.

  3. In the Deployed tab, select the desired pipeline.

    The Pipeline page opens, where you can create, edit, or suspend a schedule for your pipeline.

Create the schedule

From the Pipeline page in the Cloud Data Fusion UI, click Schedule.

You can use either the Basic or Advanced tab to define your schedule. The Advanced tab lets you define complex schedules using the unix-cron format.

Basic

  1. On the Basic tab, enter the following information about your schedule:

    • Frequency
    • Start time, specified in UTC.
    • Optional: Date
    • Maximum concurrent runs (up to ten runs). If ten runs are already in progress when a scheduled run is due, that scheduled run doesn't start.
    • Optional: Compute Engine profile. If you leave this blank, the default Dataproc profile is used.
  2. Click Save and start schedule (or Save schedule, if you want to start it later).
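
Because the start time is entered in UTC, it can help to convert a local wall-clock time before filling in the Basic tab. The following is a minimal sketch using only the Python standard library. The function name `local_to_utc` and the fixed-offset approach are illustrative assumptions, not part of Cloud Data Fusion.

```python
from datetime import datetime, timedelta, timezone


def local_to_utc(local_time: str, utc_offset_hours: float) -> str:
    """Convert an 'HH:MM' local wall-clock time to 'HH:MM' in UTC.

    utc_offset_hours is the local zone's offset from UTC (for example, -5 for
    UTC-05:00). A fixed offset ignores daylight saving time changes.
    """
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    # The date is arbitrary; only the time of day matters here.
    parsed = datetime.strptime(local_time, "%H:%M").replace(
        year=2024, month=1, day=15, tzinfo=local_zone
    )
    return parsed.astimezone(timezone.utc).strftime("%H:%M")


print(local_to_utc("20:00", -5))   # 8:00 PM at UTC-05:00
print(local_to_utc("06:30", 5.5))  # 6:30 AM at UTC+05:30
```

Both calls print `01:00`, the 1:00 AM UTC start time used as the example on this page. The UTC calendar date can differ from the local date, which matters if you also set the optional Date field.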

Advanced

  1. On the Advanced tab, define your schedule using unix-cron format.

    You can schedule your pipelines to run multiple times a day, or on specific days and months.

    A unix-cron schedule is a string of five time fields separated by spaces, in the order listed. The fields have the format and possible values shown in the following table:

    Field             Format of valid values
    ----------------  ------------------------------
    Minute            0-59
    Hour              0-23
    Day of the month  1-31
    Month             1-12
    Day of the week   0-6 (Sunday is 0, Monday is 1)
  2. Click Save and start schedule (or Save schedule, if you want to start it later).
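
As an illustration of the field ranges in the table, the following sketch checks a five-field unix-cron string before you paste it into the Advanced tab. For example, `0 1 * * *` runs daily at 1:00 AM UTC. The names `FIELDS` and `validate_cron` are hypothetical. The checker handles only `*`, single numbers, ranges, comma-separated lists, and `/step` values, which is an assumption about common cron syntax; the Cloud Data Fusion UI might accept or reject other forms.

```python
# Allowed ranges per field, taken from the table above.
FIELDS = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day of the month", 1, 31),
    ("month", 1, 12),
    ("day of the week", 0, 6),
]


def _check_part(part: str, low: int, high: int) -> bool:
    """Check one comma-separated part: '*', 'n', or 'a-b', optionally with '/step'."""
    base, _, step = part.partition("/")
    if "/" in part and not (step.isdigit() and int(step) > 0):
        return False
    if base == "*":
        return True
    start, dash, end = base.partition("-")
    if not start.isdigit():
        return False
    if dash and not end.isdigit():
        return False
    first = int(start)
    last = int(end) if dash else first
    return low <= first <= last <= high


def validate_cron(expression: str) -> list[str]:
    """Return a list of problems; an empty list means no problems were found."""
    parts = expression.split()
    if len(parts) != len(FIELDS):
        return [f"expected {len(FIELDS)} fields, got {len(parts)}"]
    problems = []
    for value, (name, low, high) in zip(parts, FIELDS):
        if not all(_check_part(p, low, high) for p in value.split(",")):
            problems.append(f"{name}: {value!r} is outside {low}-{high} or malformed")
    return problems


print(validate_cron("0 1 * * *"))          # daily at 1:00 AM UTC
print(validate_cron("*/15 9-17 * * 1-5"))  # every 15 minutes, 9-17 h, Monday to Friday
print(validate_cron("60 1 * * *"))         # minute value is out of range
```

The first two calls print an empty list, and the third reports a problem with the minute field. The check is purely syntactic: it does not confirm that a date such as February 31 exists.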

Change or suspend the schedule

You can change or suspend a pipeline schedule from the Pipeline page in the Cloud Data Fusion UI.

  • To change the schedule, click Configure, and then update the fields.

  • To suspend the schedule, click Unschedule.