Scheduling Queries

This page describes how to schedule recurring queries in BigQuery.

Overview

You can schedule queries to run on a recurring basis. Scheduled queries must be written in standard SQL, which can include Data Definition Language (DDL) and Data Manipulation Language (DML) statements. The query string and destination table can be parameterized, allowing you to organize query results by date and time.

Before you begin

Before you create a scheduled query:

  • Scheduled queries use features of the BigQuery Data Transfer Service. Verify that you have completed all actions required in Enabling the BigQuery Data Transfer Service.
  • Ensure that you have the following required permissions:
    • BigQuery: bigquery.transfers.update permissions to create the scheduled transfer. The bigquery.admin predefined, project-level IAM role includes bigquery.transfers.update permissions. For more information on IAM roles in BigQuery, see Access Control.
  • Allow pop-ups in your browser from bigquery.cloud.google.com so that you can view the permissions window. You must grant the BigQuery Data Transfer Service permission to manage your scheduled query.

Configuration options

Query string

The query string must be valid and written in standard SQL. Each run of a scheduled query can receive the following query parameters.

Available parameters

  • @run_time (TIMESTAMP): Represented in UTC time. For regularly scheduled queries, run_time represents the intended time of execution. For example, if the scheduled query is set to "every 24 hours", the run_time difference between two consecutive queries is exactly 24 hours, even though the actual execution time might vary slightly.
  • @run_date (DATE): Represents a logical calendar date.

Example

The @run_time parameter is part of the query string in this example, which queries a public dataset named hacker_news.stories.

SELECT
  @run_time AS time,
  title,
  author,
  text
FROM `bigquery-public-data.hacker_news.stories`
LIMIT 1000

Destination table

When you set up the scheduled query, BigQuery attempts to create the destination table for you if it doesn't already exist. If you are using a DDL/DML query, leave the destination table blank.

If the destination table does exist, its schema can be updated based on the query results if you add columns to the schema (ALLOW_FIELD_ADDITION) or relax a column's mode from REQUIRED to NULLABLE (ALLOW_FIELD_RELAXATION). Any other table schema change between runs causes the scheduled query to fail.
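These schema-evolution rules can be sketched as a small check. This is an illustration only (not the BigQuery API); schemas are assumed to be simple mappings from column name to mode:

```python
# Illustrative sketch, not the BigQuery API: decides whether a schema change
# between scheduled-query runs would be accepted under the rules above.
def schema_change_allowed(old_schema, new_schema):
    """old_schema/new_schema: dicts mapping column name -> mode ('REQUIRED' or 'NULLABLE')."""
    for name, old_mode in old_schema.items():
        if name not in new_schema:
            return False  # dropping a column is never allowed
        new_mode = new_schema[name]
        if old_mode != new_mode and not (old_mode == "REQUIRED" and new_mode == "NULLABLE"):
            return False  # only REQUIRED -> NULLABLE relaxation is allowed
    # Columns present only in new_schema are field additions, which are allowed.
    return True

print(schema_change_allowed({"id": "REQUIRED"}, {"id": "NULLABLE", "note": "NULLABLE"}))  # True
print(schema_change_allowed({"id": "NULLABLE"}, {"id": "REQUIRED"}))                      # False
```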

Queries can reference tables from different projects and different datasets. When configuring your scheduled query, you don't need to include the destination dataset in the table name. You specify the destination dataset separately.

Write preference

The write preference you select determines how your query results are written to an existing destination table. If you are using a DDL/DML query, leave write preference blank.

  • WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data.
  • WRITE_APPEND: If the table already exists, BigQuery appends the data to the table.

BigQuery creates, truncates, or appends to the destination table only if it successfully completes the query. The creation, truncation, or append occurs as one atomic update upon job completion.
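The two write preferences can be illustrated with a minimal sketch, using Python lists in place of table data (an illustration of the semantics above, not the actual BigQuery implementation):

```python
# Sketch: how WRITE_TRUNCATE and WRITE_APPEND treat an existing destination
# table. Lists of rows stand in for table data.
def write_results(table, results, preference):
    if preference == "WRITE_TRUNCATE":
        return list(results)           # overwrite the existing data
    elif preference == "WRITE_APPEND":
        return table + list(results)   # append to the existing data
    raise ValueError(preference)

existing = [1, 2]
print(write_results(existing, [3], "WRITE_TRUNCATE"))  # [3]
print(write_results(existing, [3], "WRITE_APPEND"))    # [1, 2, 3]
```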

Partitioning options

Scheduled queries can create partitioned or non-partitioned destination tables. If you are using a DDL/DML query, leave the partitioning field blank. There are two types of table partitioning in BigQuery:

  • Tables partitioned by ingestion time: Tables partitioned based on the scheduled query's run time.
  • Partitioned tables: Tables that are partitioned based on a TIMESTAMP or DATE column.

If the destination table will be partitioned on a column, you'll specify the column name when Setting up a scheduled query. Leave the Partitioning field blank for ingestion-time partitioned tables and non-partitioned tables.

Learn more about partitioning tables in Introduction to Partitioned Tables.

Partitioning examples

  • Table with no partitioning
    • Destination table - mytable
    • Partitioning field - leave blank
  • Ingestion-time partitioned table
    • Destination table - mytable$YYYYMMDD
    • Partitioning field - leave blank
  • Column-partitioned table
    • Destination table - mytable
    • Partitioning field - name of the TIMESTAMP or DATE column used to partition the table
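For the ingestion-time partitioned case above, the mytable$YYYYMMDD decorator can be derived from the run time with an ordinary strftime call; a quick sketch:

```python
from datetime import datetime, timezone

# Sketch: building the mytable$YYYYMMDD partition decorator for an
# ingestion-time partitioned destination from the scheduled run_time (UTC).
run_time = datetime(2018, 2, 15, tzinfo=timezone.utc)
destination = "mytable$" + run_time.strftime("%Y%m%d")
print(destination)  # mytable$20180215
```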

Available parameters

When setting up the scheduled query, you can specify how you want to partition the destination table with runtime parameters.

  • run_time (formatted timestamp): In UTC time, per the schedule. For regularly scheduled queries, run_time represents the intended time of execution. For example, if the scheduled query is set to "every 24 hours", the run_time difference between two consecutive queries is exactly 24 hours, even though the actual execution time may vary slightly. See TransferRun.runTime.
  • run_date (date string): The date of the run_time parameter in the format %Y%m%d; for example, 20180101. This format is compatible with ingestion-time partitioned tables.

Templating system

Scheduled queries support runtime parameters in the destination table name with a templating syntax.

Parameter templating syntax

The templating syntax supports basic string templating and time offsetting. Parameters are referenced in the following formats:

  • {run_date}
  • {run_time[+\-offset]|"time_format"}

  • run_date: This parameter is replaced by the date in format YYYYMMDD.
  • run_time: This parameter supports the following properties:
    • offset: Time offset expressed in hours (h), minutes (m), and seconds (s), in that order. Days (d) are not supported. Decimals are allowed, for example: 1.5h.
    • time_format: A formatting string. The most common formatting parameters are years (%Y), months (%m), and days (%d). For partitioned tables, YYYYMMDD is the required suffix; this is equivalent to "%Y%m%d".

Read more about formatting datetime elements.

Usage notes:

  • No whitespace is allowed between run_time, offset, and time format.
  • To include literal curly braces in the string, escape them as '\{' and '\}'.
  • To include literal quotes or a vertical bar in the time_format, such as "YYYY|MM|DD", escape them in the format string as '\"' or '\|'.

Parameter templating examples

These examples demonstrate specifying destination table names with different time formats, and offsetting the run time.
In each of these examples, run_time (UTC) is 2018-02-15 00:00:00.

  • mytable produces mytable
  • mytable_{run_time|"%Y%m%d"} produces mytable_20180215
  • mytable_{run_time+25h|"%Y%m%d"} produces mytable_20180216
  • mytable_{run_time-1h|"%Y%m%d"} produces mytable_20180214
  • mytable_{run_time+1.5h|"%Y%m%d%H"} or mytable_{run_time+90m|"%Y%m%d%H"} produces mytable_2018021501
  • {run_time+97s|"%Y%m%d"}_mytable_{run_time+97s|"%H%M%S"} produces 20180215_mytable_000137
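The templating rules can be mimicked with a small resolver for the {run_time[+\-offset]|"time_format"} form. This is an illustrative sketch only; the BigQuery Data Transfer Service implements its own parser, and {run_date} and escaping rules are not handled here:

```python
import re
from datetime import datetime, timedelta, timezone

# Sketch of the destination-table templating: an optional +/- offset in
# h/m/s is applied to run_time, then the result is formatted with strftime.
def resolve(template, run_time):
    def repl(match):
        offset, fmt = match.group("offset"), match.group("fmt")
        t = run_time
        if offset:
            unit = {"h": 3600, "m": 60, "s": 1}[offset[-1]]
            sign = 1 if offset[0] == "+" else -1
            t = t + timedelta(seconds=sign * float(offset[1:-1]) * unit)
        return t.strftime(fmt)

    pattern = r'\{run_time(?P<offset>[+-][\d.]+[hms])?\|"(?P<fmt>[^"]*)"\}'
    return re.sub(pattern, repl, template)

run_time = datetime(2018, 2, 15, tzinfo=timezone.utc)
print(resolve('mytable_{run_time|"%Y%m%d"}', run_time))      # mytable_20180215
print(resolve('mytable_{run_time+25h|"%Y%m%d"}', run_time))  # mytable_20180216
print(resolve('{run_time+97s|"%Y%m%d"}_mytable_{run_time+97s|"%H%M%S"}', run_time))
# 20180215_mytable_000137
```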

Setting up a scheduled query

  1. Go to the BigQuery web UI.

  2. Run the query that you're interested in. When you are satisfied with your results, click Schedule Query.

  3. The scheduled query options open underneath the query box.

  4. On the New Scheduled Query page:

    • For Destination Dataset, choose the appropriate dataset.
    • For Display Name, enter a name for the scheduled query such as My Scheduled Query. The scheduled query name can be any value that allows you to easily identify the scheduled query if you need to modify it later.
    • For Destination Table:
      • For a standard SQL query, enter the name of your destination table.
      • For a DDL/DML query, leave this field blank.
    • For Write Preference:
      • For a standard SQL query, choose either WRITE_TRUNCATE to overwrite the destination table or WRITE_APPEND to append data to the table.
      • For a DDL/DML query, choose Unspecified.
    • (Optional) For Partitioning Field:

      • For a standard SQL query, if the destination table is a column-partitioned table, enter the column name where the table should be partitioned. Leave this field blank for ingestion-time partitioned tables and non-partitioned tables.
      • For a DDL/DML query, leave this field blank.

    • (Optional) For Schedule, you can leave the default value of Daily (every 24 hours, based on creation time), or click Edit to change the time. You can also change the interval to Weekly, Monthly, or Custom. When selecting Custom, a Cron-like time specification is expected, for example every 3 hours. The shortest allowed period is three hours. See the schedule field under TransferConfig for additional valid API values.

    • (Optional) Expand the Advanced section and configure run notifications for your transfer. Transfer run notifications are currently in Alpha.

      • For Cloud Pub/Sub topic, enter your Cloud Pub/Sub topic name, for example, projects/myproject/topics/mytopic.
      • Check Send email notifications to allow email notifications of transfer run failures.

  5. Click Add.

Setting up a manual run on historical dates

In addition to scheduling a query to run in the future, you can also set up your query to run on historical data within a specified date range. Within the given date range, your query is executed at the same time you specified when you created your scheduled query.

After Setting up a scheduled query, you can run the query within a historical date range.

After clicking Add to save your scheduled query, you'll see the details of your scheduled query displayed. Below the details, click on the Start Manual Runs button to specify a historical date range.

You can further refine the date range to have a start and end time, or leave the time fields as 00:00:00.

Example 1

If your scheduled query is set to run every day at 14:00, and you apply the following historical date range:

Start Time = 2/21/2018 00:00:00
End Time = 2/24/2018 00:00:00

Your query runs at the following times:

  • 2/21/2018 14:00:00
  • 2/22/2018 14:00:00
  • 2/23/2018 14:00:00
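The run times above follow from walking the date range one day at a time, firing at the scheduled time of day; a quick sketch:

```python
from datetime import datetime, timedelta

# Sketch: enumerating the backfill run times for a daily 14:00 schedule
# over the historical range 2018-02-21 00:00 to 2018-02-24 00:00.
start = datetime(2018, 2, 21)
end = datetime(2018, 2, 24)
run = start.replace(hour=14)
runs = []
while run < end:
    runs.append(run)
    run += timedelta(days=1)
print([str(r) for r in runs])
# ['2018-02-21 14:00:00', '2018-02-22 14:00:00', '2018-02-23 14:00:00']
```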

Example 2

If your scheduled query is set to run every Friday at 01:05 and you apply the following historical date range:

Start Time = 2/1/2018 00:00:00 (a Thursday)
End Time = 2/15/2018 00:00:00 (also a Thursday)

Your query runs at the following times:

  • 2/2/2018 01:05:00
  • 2/9/2018 01:05:00

Quotas

A scheduled query is executed with the creator's credentials and project, as if you were executing the query yourself. The scheduled query is subject to all BigQuery Quotas & Limits on query jobs.

Known issues and limitations

Regions

Cross-region queries are not supported, and the destination table for your scheduled query must be in the same region as the data being queried. See Dataset Locations for more information about regions and multi-regions.

Google Drive

You can query Google Drive data in a scheduled query. If you're scheduling an existing query, you might need to click Update Credentials in the scheduled query details screen. Allow 10 to 20 minutes for the change to take effect. You might also need to clear your browser's cache. Credentials are automatically up to date for new scheduled queries.
