[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-09-04。"],[[["\u003cp\u003eWorkload scheduling helps organize and optimize chains of actions involving BigQuery, creating seamless connections across data resources and processes.\u003c/p\u003e\n"],["\u003cp\u003eScheduling methods are either event-driven, where actions are triggered by state changes, or time-driven, where actions are based on set time intervals, or can include both.\u003c/p\u003e\n"],["\u003cp\u003eGoogle Cloud provides several tools for scheduling complex data workloads, including Dataform, Workflows, Cloud Composer, and Vertex AI Pipelines, each with a focus such as data transformation, microservices, ETL/ELT, or machine learning, respectively.\u003c/p\u003e\n"],["\u003cp\u003eScheduled queries directly within BigQuery are the simplest form of workload scheduling, suitable for straightforward query chains without external dependencies.\u003c/p\u003e\n"],["\u003cp\u003eIn addition to the tools provided, Google cloud has messaging tools such as Pub/Sub and Eventarc that are designed to integrate with BigQuery for data integration pipelines and managing state changes.\u003c/p\u003e\n"]]],[],null,["# Schedule workloads\n==================\n\nBigQuery tasks are usually part of larger workloads, with external\ntasks triggering and then being triggered by BigQuery operations.\nWorkload scheduling helps data administrators, analysts, and developers\norganize and optimize this chain of actions, creating a seamless connection\nacross data resources and processes. Scheduling methods and tools assist\nin designing, building, implementing, and monitoring these complex data\nworkloads.\n\nChoose a scheduling method\n--------------------------\n\nTo select a scheduling method, you should identify whether your workloads\nare event-driven, time-driven, or both. An *event* is defined as a state change,\nsuch as a change to data in a database or a file added to a storage system. In\n*event-driven scheduling* , an action on a website might trigger a data\nactivity, or an object landing in a certain bucket might need to be processed\nimmediately on arrival. In *time-driven scheduling*, new data might need to\nbe loaded once per day or frequently enough to produce hourly reports. You can\nuse event-driven and time-driven scheduling in scenarios where you need to\nload objects into a data lake in real time, but activity reports on the data\nlake are only generated daily.\n\nChoose a scheduling tool\n------------------------\n\nScheduling tools assist with tasks that are involved in managing complex data\nworkloads, such as combining multiple Google Cloud or third-party services with\nBigQuery jobs, or running multiple BigQuery jobs\nin parallel. Each workload has unique requirements for dependency and parameter\nmanagement to ensure that tasks are executed in the correct order using the\ncorrect data. 
### Dataform

[Dataform](/dataform/docs/overview) is a free, SQL-based, opinionated
transformation framework that schedules complex data transformation tasks in
BigQuery. When raw data is loaded into BigQuery, Dataform helps you create an
organized, tested, version-controlled collection of datasets and tables. Use
Dataform to schedule runs for your
[data preparations](/bigquery/docs/orchestrate-data-preparations),
[notebooks](/bigquery/docs/orchestrate-notebooks), and
[BigQuery pipelines](/bigquery/docs/schedule-pipelines).

**Scheduling method**: time-driven

| **Note:** If you create an asset in a BigQuery repository---for example, a query, notebook (including a notebook with an Apache Spark job), BigQuery pipeline, or Dataform workflow---you cannot schedule it for execution in Dataform. Instead, you need to use BigQuery execution and scheduling capabilities. For more information, see [Scheduling queries](/bigquery/docs/scheduling-queries), [Schedule notebooks](/bigquery/docs/orchestrate-notebooks), and [Schedule pipelines](/bigquery/docs/schedule-pipelines).

### Workflows

[Workflows](/workflows/docs/overview) is a serverless tool that schedules
HTTP-based services with very low latency. It is best for chaining
microservices together, automating infrastructure tasks, integrating with
external systems, or creating a sequence of operations in Google Cloud. To
learn more about using Workflows with BigQuery, see
[Run multiple BigQuery jobs in parallel](/workflows/docs/tutorials/bigquery-parallel-jobs).

**Scheduling method**: event-driven and time-driven

### Cloud Composer

[Cloud Composer](/composer/docs/concepts/overview) is a fully managed tool
built on Apache Airflow. It is best for extract, transform, load (ETL) or
extract, load, transform (ELT) workloads as it supports several
[operator](https://airflow.apache.org/docs/apache-airflow/stable/concepts/operators.html)
types and patterns, as well as task execution across other Google Cloud
products and external targets. To learn more about using Cloud Composer with
BigQuery, see
[Run a data analytics DAG in Google Cloud](/composer/docs/data-analytics-googlecloud).

**Scheduling method**: time-driven
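In Cloud Composer, a BigQuery step is typically expressed as an Airflow task.
The following DAG is a minimal sketch that assumes Airflow 2 with the Google
provider package, which Cloud Composer environments include; the project,
dataset, table, and SQL are placeholders.

```python
# Minimal sketch of an Airflow DAG for Cloud Composer that runs a BigQuery
# query once per day. Project, dataset, and table names are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryInsertJobOperator,
)

with DAG(
    dag_id="daily_bigquery_rollup",
    schedule_interval="@daily",      # time-driven scheduling
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    rollup = BigQueryInsertJobOperator(
        task_id="run_rollup_query",
        configuration={
            "query": {
                # Placeholder GoogleSQL statement.
                "query": (
                    "CREATE OR REPLACE TABLE `your-project.reporting.daily_rollup` AS "
                    "SELECT DATE(event_time) AS day, COUNT(*) AS events "
                    "FROM `your-project.analytics.events` "
                    "GROUP BY day"
                ),
                "useLegacySql": False,
            }
        },
        location="US",               # placeholder dataset location
    )
```

Uploading a file like this to the environment's DAGs bucket is enough for
Cloud Composer to pick it up and run it on the defined schedule.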
### Vertex AI Pipelines

[Vertex AI Pipelines](/vertex-ai/docs/pipelines/introduction) is a serverless
tool based on Kubeflow Pipelines that is specially designed for scheduling
machine learning workloads. It automates and connects all tasks of your model
development and deployment, from training data to code, giving you a complete
view of how your models work. To learn more about using Vertex AI Pipelines
with BigQuery, see
[Export and deploy a BigQuery machine learning model for prediction](https://codelabs.developers.google.com/codelabs/bqml-vertex-prediction#0).

**Scheduling method**: event-driven

### Apigee Integration

[Apigee Integration](/apigee/docs/api-platform/integration/what-is-apigee-integration)
is an extension of the Apigee platform that includes connectors and data
transformation tools. It is best for integrating with external enterprise
applications, like Salesforce. To learn more about using Apigee Integration
with BigQuery, see
[Get started with Apigee Integration and a Salesforce trigger](/apigee/docs/api-platform/integration/getting-started-salesforce-updates).

**Scheduling method**: event-driven and time-driven

### Cloud Data Fusion

[Cloud Data Fusion](/data-fusion) is a data integration tool that offers
code-free ELT/ETL pipelines and over 150 preconfigured connectors and
transformations. To learn more about using Cloud Data Fusion with BigQuery, see
[Replicating data from MySQL to BigQuery](/data-fusion/docs/tutorials/replicating-data/mysql-to-bigquery).

**Scheduling method**: event-driven and time-driven

### Cloud Scheduler

[Cloud Scheduler](/scheduler/docs/overview) is a fully managed scheduler for
work that should occur at defined time intervals, such as batch and streaming
jobs or infrastructure operations. To learn more about using Cloud Scheduler
with BigQuery, see
[Scheduling workflows with Cloud Scheduler](/scheduler/docs/tut-workflows).

**Scheduling method**: time-driven

### Cloud Tasks

[Cloud Tasks](/tasks/docs/dual-overview) is a fully managed service for
asynchronous distribution of tasks that can execute independently, outside of
your main workload. It is best for delegating slow background operations or
managing API call rates. To learn more about using Cloud Tasks with BigQuery,
see [Add a task to a Cloud Tasks queue](/tasks/docs/add-task-queue).

**Scheduling method**: event-driven

### Third-party tools

You can also connect to BigQuery using a number of popular third-party tools,
such as CData and SnapLogic. The BigQuery Ready program offers a
[full list of validated partner solutions](/bigquery/docs/bigquery-ready-partners).

Messaging tools
---------------

Many data workloads require additional messaging connections between decoupled
microservices that only need to be activated when certain events occur.
Google Cloud provides two tools that are designed to integrate with BigQuery.

### Pub/Sub

[Pub/Sub](/pubsub/docs/overview) is an asynchronous messaging tool for data
integration pipelines. It is designed to ingest and distribute data such as
server events and user interactions. It can also be used for parallel
processing and data streaming from IoT devices. To learn more about using
Pub/Sub with BigQuery, see
[Stream from Pub/Sub to BigQuery](/dataflow/docs/tutorials/dataflow-stream-to-bigquery).
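The tutorial linked above streams messages into BigQuery through Dataflow. If
you don't need transformation in flight, another option is a BigQuery
subscription, which writes messages from a topic directly into a table. The
following Python snippet is a minimal sketch under that assumption: the
project, topic, subscription, and table IDs are placeholders, the topic and
table must already exist, and the Pub/Sub service account needs write access
to the table.

```python
# Minimal sketch: create a Pub/Sub BigQuery subscription that writes messages
# from an existing topic directly into an existing BigQuery table.
# Project, topic, subscription, and table IDs are placeholders.
from google.cloud import pubsub_v1

project_id = "your-project-id"
topic_id = "server-events"
subscription_id = "server-events-to-bq"
table_id = "your-project-id.analytics.server_events"  # project.dataset.table

publisher = pubsub_v1.PublisherClient()
subscriber = pubsub_v1.SubscriberClient()

topic_path = publisher.topic_path(project_id, topic_id)
subscription_path = subscriber.subscription_path(project_id, subscription_id)

bigquery_config = pubsub_v1.types.BigQueryConfig(
    table=table_id,
    write_metadata=True,  # also store message metadata columns in the table
)

with subscriber:
    subscription = subscriber.create_subscription(
        request={
            "name": subscription_path,
            "topic": topic_path,
            "bigquery_config": bigquery_config,
        }
    )
print(f"Created BigQuery subscription: {subscription.name}")
```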
### Eventarc

[Eventarc](/eventarc/docs/overview) is an event-driven tool that lets you
manage the flow of state changes throughout your data pipeline. This tool has a
wide range of use cases, including automated error remediation, resource
labeling, image retouching, and more. To learn more about using Eventarc with
BigQuery, see
[Build a BigQuery processing pipeline with Eventarc](/eventarc/docs/run/bigquery).

What's next
-----------

- Learn to [schedule recurring queries directly in BigQuery](/bigquery/docs/scheduling-queries).
- Get started with [Dataform](/dataform/docs/overview).
- Get started with [Workflows](/workflows/docs/overview).
- Get started with [Cloud Composer](/composer/docs/concepts/overview).
- Get started with [Vertex AI Pipelines](/vertex-ai/docs/pipelines/introduction).
- Get started with [Apigee Integration](/apigee/docs/api-platform/integration/what-is-apigee-integration).
- Get started with [Cloud Data Fusion](/data-fusion).
- Get started with [Cloud Scheduler](/scheduler/docs/overview).
- Get started with [Pub/Sub](/pubsub/docs/overview).
- Get started with [Eventarc](/eventarc/docs/overview).