This section describes Google Cloud options you can use to schedule workflows.
Dataproc Workflow Templates
Dataproc workflow templates provide a flexible and easy-to-use mechanism for managing and executing workflows. A workflow template is a reusable workflow configuration. It defines a graph of jobs, along with information on where to run those jobs.
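As a rough illustration of the "graph of jobs" idea, the following sketch models a template as a plain Python dictionary and derives a valid run order from the job dependencies. The field names (`placement`, `managed_cluster`, `step_id`, `prerequisite_step_ids`, `pyspark_job`) follow the Dataproc WorkflowTemplate resource. The template ID, cluster name, bucket, and file paths are hypothetical, and the ordering helper is an illustration only, not how Dataproc schedules jobs internally.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical template: two independent ingest jobs feed one join job.
TEMPLATE = {
    "id": "nightly-etl",
    # "Where to run": a managed cluster created for the workflow.
    "placement": {
        "managed_cluster": {
            "cluster_name": "nightly-etl-cluster",
            "config": {},  # cluster settings omitted in this sketch
        }
    },
    # "Graph of jobs": each job lists the steps it depends on.
    "jobs": [
        {
            "step_id": "ingest-orders",
            "pyspark_job": {"main_python_file_uri": "gs://example-bucket/ingest_orders.py"},
        },
        {
            "step_id": "ingest-users",
            "pyspark_job": {"main_python_file_uri": "gs://example-bucket/ingest_users.py"},
        },
        {
            "step_id": "join",
            "prerequisite_step_ids": ["ingest-orders", "ingest-users"],
            "pyspark_job": {"main_python_file_uri": "gs://example-bucket/join.py"},
        },
    ],
}


def execution_order(template):
    """Return one valid run order for the template's job graph."""
    graph = {
        job["step_id"]: set(job.get("prerequisite_step_ids", []))
        for job in template["jobs"]
    }
    # static_order() raises graphlib.CycleError if the dependencies form a cycle.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(execution_order(TEMPLATE))
```

Jobs without prerequisites can start immediately, and a job with prerequisites waits until all of them complete.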
Cloud Scheduler
Cloud Scheduler is a fully managed, enterprise-grade cron job scheduler. It allows you to schedule virtually any job, including batch jobs, big data jobs, and cloud infrastructure operations. It provides simple time-based scheduling, for example, daily or hourly, without requiring you to write code.
Advantages:
Enables time-based instantiation of workflow templates based on familiar cron expressions
No code to write
Tutorial: Workflow using Cloud Scheduler
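Cloud Scheduler itself needs no code, but it can help to see what such a job amounts to. The sketch below assembles a job definition in the shape of the Cloud Scheduler REST `Job` resource, with an HTTP target that calls the Dataproc `workflowTemplates.instantiate` REST method. The project, region, template ID, job name, and service account are hypothetical placeholders, and the helper function is an illustration rather than part of any Google library.

```python
import base64
import json


def scheduler_job_config(project, region, template_id, schedule, service_account):
    """Build a Cloud Scheduler job definition (REST resource shape).

    The job sends an authenticated POST to the Dataproc REST endpoint that
    instantiates the given workflow template.
    """
    uri = (
        f"https://dataproc.googleapis.com/v1/projects/{project}"
        f"/regions/{region}/workflowTemplates/{template_id}:instantiate"
    )
    return {
        "name": f"projects/{project}/locations/{region}/jobs/run-{template_id}",
        "schedule": schedule,  # standard unix-cron expression
        "timeZone": "Etc/UTC",
        "httpTarget": {
            "uri": uri,
            "httpMethod": "POST",
            "headers": {"Content-Type": "application/json"},
            # The REST resource carries the request body as base64-encoded bytes.
            "body": base64.b64encode(b"{}").decode("ascii"),
            # Scheduler obtains an OAuth token for this service account.
            "oauthToken": {"serviceAccountEmail": service_account},
        },
    }


if __name__ == "__main__":
    job = scheduler_job_config(
        project="example-project",
        region="us-central1",
        template_id="nightly-etl",
        schedule="0 2 * * *",  # every day at 02:00
        service_account="scheduler-sa@example-project.iam.gserviceaccount.com",
    )
    print(json.dumps(job, indent=2))
```

In practice the same settings are usually entered through the Google Cloud console or the gcloud CLI; the dictionary only makes the moving parts explicit.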
Cloud Run functions
Cloud Run functions is a lightweight compute solution you can use to create single-purpose, stand-alone functions that respond to Cloud events without the need to manage a server or runtime environment. You can use Cloud Run functions to launch workflows in response to Pub/Sub events or file changes in Cloud Storage. You can also combine Cloud Run functions with Cloud Scheduler for workflows that require the calculation of time-based parameters.
Advantages:
Enables workflow instantiation in response to data events, such as new files in Cloud Storage or Pub/Sub events
Requires minimal coding with the Dataproc Go, Node.js, or Python client libraries
Dynamically generate workflows and workflow parameters
Tutorial: Workflow using Cloud Run functions
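The following is a minimal sketch of the time-based parameter case: a function body that computes a date parameter at invocation time and passes it to a parameterized workflow template through the Dataproc Python client library (`google-cloud-dataproc`). The parameter names `INPUT_DATE` and `INPUT_PATH`, the bucket, and the project, region, and template values are assumptions for illustration; the template must already declare matching parameters. The trigger wiring (HTTP, Pub/Sub, or Cloud Storage) is omitted.

```python
from datetime import datetime, timedelta, timezone


def build_parameters(now):
    """Compute time-based template parameters for the previous day's data.

    INPUT_DATE and INPUT_PATH are hypothetical parameter names.
    """
    day = (now - timedelta(days=1)).strftime("%Y-%m-%d")
    return {
        "INPUT_DATE": day,
        "INPUT_PATH": f"gs://example-bucket/events/{day}/",
    }


def instantiate_template(project, region, template_id, parameters):
    """Instantiate a workflow template and wait for the workflow to finish."""
    # Imported here so the pure helper above has no third-party dependency.
    from google.cloud import dataproc_v1

    client = dataproc_v1.WorkflowTemplateServiceClient(
        client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
    )
    name = f"projects/{project}/regions/{region}/workflowTemplates/{template_id}"
    operation = client.instantiate_workflow_template(
        request={"name": name, "parameters": parameters}
    )
    operation.result()  # blocks until the workflow completes or fails


def handle_event(event=None):
    """Hypothetical entry point; the event payload is ignored in this sketch."""
    params = build_parameters(datetime.now(timezone.utc))
    instantiate_template("example-project", "us-central1", "nightly-etl", params)
```

Keeping the parameter calculation in its own function makes it easy to unit test without calling the Dataproc API. A real function would likely return without waiting on `operation.result()` if the workflow can outlast the function's timeout.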
Cloud Composer
Cloud Composer is a managed Apache Airflow service you can use to create, schedule, monitor, and manage workflows.
Advantages:
Supports time- and event-based scheduling
Simplified calls to Dataproc using Operators
Dynamically generate workflows and workflow parameters
Build data flows that span multiple Google Cloud products
Tutorial: Workflow using Cloud Composer