Introduction to continuous queries
This document describes BigQuery continuous queries.
BigQuery continuous queries are SQL statements that run continuously. Continuous queries let you analyze incoming data in BigQuery in real time. You can insert the output rows produced by a continuous query into a BigQuery table or export them to Pub/Sub or Bigtable. Continuous queries can process data that has been written to standard BigQuery tables by using one of the following methods:
You can use continuous queries to perform time-sensitive tasks, such as creating and immediately acting on insights, applying real-time machine learning (ML) inference, and replicating data into other platforms. This lets you use BigQuery as an event-driven data processing engine for your application's decision logic.
The following diagram shows common continuous query workflows:
Use cases
Common use cases where you might want to use continuous queries are as follows:
- Personalized customer interaction services: use generative AI to create tailored messages customized for each customer interaction.
- Anomaly detection: build solutions that let you perform anomaly and threat detection on complex data in real time, so that you can react to issues more quickly.
- Customizable event-driven pipelines: use continuous query integration with Pub/Sub to trigger downstream applications based on incoming data.
- Data enrichment and entity extraction: use continuous queries to perform real-time data enrichment and transformation by using SQL functions and ML models.
- Reverse extract-transform-load (ETL): perform real-time reverse ETL into other storage systems that are better suited for low-latency application serving. For example, you can analyze or enhance event data that is written to BigQuery, and then stream it to Bigtable for application serving.
Supported operations
The following operations are supported in continuous queries:
- Running INSERT statements to write data from a continuous query into a BigQuery table.
- Running EXPORT DATA statements to publish continuous query output to Pub/Sub topics. For more information, see Export data to Pub/Sub. From a Pub/Sub topic, you can use the data with other services, such as performing streaming analytics by using Dataflow, or using the data in an application integration workflow.
- Running EXPORT DATA statements to export data from BigQuery to Bigtable tables. For more information, see Export data to Bigtable.
- Calling the following generative AI functions:
  These functions require you to have a BigQuery ML remote model over a Vertex AI model.
- Calling the following AI functions:
  These functions require you to have a BigQuery ML remote model over a Cloud AI API.
- Normalizing numerical data by using the ML.NORMALIZER function.
- Using stateless GoogleSQL functions, for example, conversion functions. In stateless functions, each row is processed independently from other rows in the table.
- Using the APPENDS change history function to start continuous query processing from a specific point in time.
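As a rough illustration of the first two statement shapes in the list above, the following Python sketch only assembles GoogleSQL text; it doesn't submit anything to BigQuery. The project, dataset, table, topic, and column names are hypothetical placeholders. The `format = 'CLOUD_PUBSUB'` option and the topic URI layout are assumptions that you should confirm in Export data to Pub/Sub.

```python
"""Sketch: assemble the text of two continuous query statements.

All resource and column names are hypothetical placeholders.
Nothing here contacts BigQuery; the functions only return SQL strings.
"""


def insert_statement(source_table: str, target_table: str) -> str:
    """Return an INSERT statement that writes filtered rows to a table."""
    return (
        f"INSERT INTO `{target_table}`\n"
        "SELECT ride_id, event_time, meter_reading\n"
        f"FROM `{source_table}`\n"
        "WHERE ride_status = 'dropoff'"
    )


def pubsub_export_statement(source_table: str, project: str, topic: str) -> str:
    """Return an EXPORT DATA statement that publishes rows to a topic.

    Assumption: the export uses format 'CLOUD_PUBSUB' and a topic URI of
    this shape; confirm both in "Export data to Pub/Sub".
    """
    uri = f"https://pubsub.googleapis.com/projects/{project}/topics/{topic}"
    return (
        "EXPORT DATA\n"
        f"  OPTIONS (format = 'CLOUD_PUBSUB', uri = '{uri}')\n"
        "AS (\n"
        "  SELECT TO_JSON_STRING(STRUCT(ride_id, event_time)) AS message\n"
        f"  FROM `{source_table}`\n"
        "  WHERE ride_status = 'enroute'\n"
        ")"
    )


if __name__ == "__main__":
    src = "my-project.rides.taxirides"  # hypothetical table
    print(insert_statement(src, "my-project.rides.dropoffs"))
    print()
    print(pubsub_export_statement(src, "my-project", "ride-events"))
```

Both statements use only a row-level filter and a row-level conversion, which keeps them within the stateless operations described above.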
Authorization
To execute long-running continuous queries, use a service account rather than a user account.
The Google Cloud access tokens that are used when running continuous query jobs have a time to live (TTL) of two days when they are generated by a user account. Therefore, such jobs stop running after two days. The access tokens that are generated by service accounts aren't constrained by a TTL, so continuous query jobs executed by a service account run until explicitly canceled. For more information, see Run a continuous query by using a service account.
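The effect of the token TTL can be sketched as simple date arithmetic. The following Python example is only an illustration of the rule stated above; the `principal` labels `"user"` and `"service_account"` are hypothetical names chosen for this sketch, not values from any BigQuery API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# From the text above: tokens generated by a user account live for two days.
USER_TOKEN_TTL = timedelta(days=2)


def expected_stop_time(started_at: datetime, principal: str) -> Optional[datetime]:
    """Return when a continuous query job is expected to stop due to token expiry.

    `principal` is a hypothetical label: "user" or "service_account".
    Returns None for a service account, whose job runs until it is canceled.
    """
    if principal == "user":
        return started_at + USER_TOKEN_TTL
    if principal == "service_account":
        return None
    raise ValueError(f"unknown principal type: {principal!r}")


if __name__ == "__main__":
    start = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
    print(expected_stop_time(start, "user"))
    print(expected_stop_time(start, "service_account"))
```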
Locations
Continuous queries are supported in the following locations:
US
EU
asia-northeast1
asia-south1
europe-west1
europe-west2
europe-west4
us-central1
us-east1
us-east4
us-west1
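If you want to check a dataset location against this list before starting a job, a minimal Python sketch might look like the following. The location set is copied from the list above and will go stale if the list changes. Comparing names case-insensitively is an assumption made for this sketch.

```python
# Locations copied from the list above; update this set if the list changes.
SUPPORTED_LOCATIONS = frozenset({
    "US", "EU",
    "asia-northeast1", "asia-south1",
    "europe-west1", "europe-west2", "europe-west4",
    "us-central1", "us-east1", "us-east4", "us-west1",
})

# Assumption: location names are compared case-insensitively.
_NORMALIZED = frozenset(loc.lower() for loc in SUPPORTED_LOCATIONS)


def supports_continuous_queries(location: str) -> bool:
    """Return True if the location appears in the supported list above."""
    return location.strip().lower() in _NORMALIZED


if __name__ == "__main__":
    for loc in ("US", "europe-west4", "europe-west3"):
        print(loc, supports_continuous_queries(loc))
```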
Limitations
Continuous queries are subject to the following limitations:
- BigQuery continuous queries don't maintain the state of ingested data. Common operations that rely on state, such as JOINs, aggregation functions, and windowed analytic functions, aren't currently supported.
- You can't use the following SQL capabilities in a continuous query:
  - Non-deterministic scalar functions, for example, the CURRENT_DATE function
  - JOIN operations
  - Aggregate functions
  - Approximate aggregate functions
  - The following query clauses:
  - The following query operators:
  - Query set operators
  - BigQuery ML functions other than those listed in Supported operations
  - Data manipulation language (DML) statements except for INSERT
  - EXPORT DATA statements that don't target Bigtable or Pub/Sub
- Continuous queries don't support wildcard tables as a data source.
- Continuous queries don't support external tables as a data source.
- Continuous queries don't support the following BigQuery security features:
- When exporting data to Bigtable, you can only target Bigtable instances that fall within the same Google Cloud regional boundary as the BigQuery dataset that contains the table you are querying. For more information, see Location considerations. This restriction doesn't apply to exporting data to Pub/Sub because Pub/Sub is a global resource.
- You can't run a continuous query from a data canvas.
- You can't modify the SQL used in a continuous query while the continuous query job is running. For more information, see Modify the SQL of a continuous query.
- If the continuous query job falls behind by more than seven days, you must cancel the job and start a new continuous query job. You can run the query again and use the APPENDS change history function to resume processing from the point in time at which you stopped the previous continuous query job. For more information, see Start a continuous query from a particular point in time.
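The last limitation above can be sketched in two small helpers: one that checks the seven-day threshold, and one that assembles a replacement statement that reads from APPENDS starting at the time the previous job stopped. This is a minimal Python sketch that only builds SQL text. The table and column names are hypothetical, and passing a TIMESTAMP literal as the second APPENDS argument is an assumption to confirm in Start a continuous query from a particular point in time.

```python
from datetime import datetime, timedelta, timezone

# From the limitation above: a job that falls more than seven days behind
# must be canceled and replaced by a new job.
MAX_LAG = timedelta(days=7)


def must_restart(lag: timedelta) -> bool:
    """Return True if the job has fallen behind by more than seven days."""
    return lag > MAX_LAG


def resume_statement(source_table: str, target_table: str, stopped_at: datetime) -> str:
    """Return an INSERT statement that reads appended rows from `stopped_at` on.

    Assumption: APPENDS accepts the table followed by a start timestamp.
    Column names are hypothetical placeholders.
    """
    if stopped_at.tzinfo is None:
        raise ValueError("stopped_at must be timezone-aware")
    # Normalize to UTC so the literal is unambiguous.
    ts = stopped_at.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S+00")
    return (
        f"INSERT INTO `{target_table}`\n"
        "SELECT ride_id, event_time\n"
        f"FROM APPENDS(TABLE `{source_table}`, TIMESTAMP '{ts}')"
    )


if __name__ == "__main__":
    stopped = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
    print(must_restart(timedelta(days=8)))
    print(resume_statement("my-project.rides.taxirides",
                           "my-project.rides.dropoffs", stopped))
```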
Reservation limitations
- You must create Enterprise edition or Enterprise Plus edition reservations in order to run continuous queries. Continuous queries don't support the on-demand compute billing model.
- When you create a reservation assignment for a continuous query, the associated reservation is limited to 500 slots or fewer, and can't be configured to use autoscaling.
- A continuous query reservation assignment doesn't share idle slots, even if the reservation is configured to do so.
- You can't create a reservation assignment that uses a different job type in the same reservation as a continuous query reservation assignment.
- You can't configure continuous query concurrency. BigQuery automatically determines the number of continuous queries that can run concurrently, based on available reservation assignments that use the CONTINUOUS job type.
- When running multiple continuous queries using the same reservation, individual jobs might not split available resources fairly, as defined by BigQuery fairness.
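Several of the reservation rules above are mechanical enough to check before you create an assignment. The following Python sketch encodes them against a simplified, hypothetical `Reservation` model. It isn't a type from any BigQuery client library, and the edition strings `ENTERPRISE` and `ENTERPRISE_PLUS` are assumed spellings.

```python
from dataclasses import dataclass, field
from typing import List

ALLOWED_EDITIONS = {"ENTERPRISE", "ENTERPRISE_PLUS"}  # assumed spellings
MAX_SLOTS = 500  # from the limitation above


@dataclass
class Reservation:
    """Hypothetical, simplified model of a reservation for this sketch."""
    edition: str
    slots: int
    autoscaling: bool = False
    # Job types of the assignments that already exist in this reservation.
    assignment_job_types: List[str] = field(default_factory=list)


def continuous_query_problems(res: Reservation) -> List[str]:
    """Return the reasons this reservation can't take a CONTINUOUS assignment."""
    problems = []
    if res.edition not in ALLOWED_EDITIONS:
        problems.append("edition must be Enterprise or Enterprise Plus")
    if res.slots > MAX_SLOTS:
        problems.append("reservation is limited to 500 slots")
    if res.autoscaling:
        problems.append("autoscaling isn't supported")
    if any(jt != "CONTINUOUS" for jt in res.assignment_job_types):
        problems.append("reservation has assignments with other job types")
    return problems


if __name__ == "__main__":
    ok = Reservation(edition="ENTERPRISE", slots=100)
    bad = Reservation(edition="STANDARD", slots=600, autoscaling=True,
                      assignment_job_types=["QUERY"])
    print(continuous_query_problems(ok))
    print(continuous_query_problems(bad))
```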
Pricing
Continuous queries use BigQuery capacity compute pricing, which is measured in slots.
To run continuous queries, you must have a reservation that uses the Enterprise or Enterprise Plus edition, and a reservation assignment that uses the CONTINUOUS job type.
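Because capacity pricing is measured in slots, a back-of-the-envelope compute estimate reduces to slots multiplied by time. The following Python sketch assumes that cost scales linearly with reserved slot-hours. The rate is a caller-supplied placeholder rather than a published price, so take the actual rate from BigQuery pricing.

```python
def slot_hours(slots: int, hours: float) -> float:
    """Return the slot-hours consumed by holding `slots` for `hours`."""
    if slots < 0 or hours < 0:
        raise ValueError("slots and hours must be non-negative")
    return slots * hours


def estimated_compute_cost(slots: int, hours: float, rate_per_slot_hour: float) -> float:
    """Estimate compute cost, assuming a flat per-slot-hour rate.

    `rate_per_slot_hour` is a placeholder supplied by the caller.
    """
    return slot_hours(slots, hours) * rate_per_slot_hour


if __name__ == "__main__":
    # 100 slots held for one day at a made-up rate of 0.5 per slot-hour.
    print(slot_hours(100, 24))
    print(estimated_compute_cost(100, 24, 0.5))
```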
Usage of other BigQuery resources, such as data ingestion and storage, is charged at the rates shown in BigQuery pricing.
Usage of other services that receive continuous query results or that are called during continuous query processing is charged at the rates published for those services. For the pricing of other Google Cloud services used by continuous queries, see the following topics:
What's next
Try creating a continuous query.