GoogleSQL for BigQuery supports built-in table functions.
This topic includes functions that produce columns of a table.
You can only use these functions in the FROM
clause.
Function list
Name | Summary |
---|---|
APPENDS
|
Returns all rows appended to a table for a given time range. |
CHANGES
|
Returns all rows that have changed in a table for a given time range. |
EXTERNAL_OBJECT_TRANSFORM
|
Produces an object table with the original columns plus one or more additional columns. |
GAP_FILL
|
Finds and fills gaps in a time series. |
RANGE_SESSIONIZE
|
Produces a table of session ranges. |
APPENDS
APPENDS(
TABLE table,
start_timestamp DEFAULT NULL,
end_timestamp DEFAULT NULL)
Description
The APPENDS
function returns all rows appended to a table for a given
time range.
The following operations add rows to the APPENDS
change history:
CREATE TABLE
DDL statementINSERT
DML statement- Data appended as part of a
MERGE
DML statement - Loading data into BigQuery
- Streaming ingestion
Definitions
table
: the BigQuery table name. This must be a regular BigQuery table. This argument must be preceded by the wordTABLE
.start_timestamp
: aTIMESTAMP
value indicating the earliest time at which a change is included in the output. If the value isNULL
, all changes since the table creation are returned. If the table was created after thestart_timestamp
value, the actual table creation time is used instead. An error is returned if the time specified is earlier than allowed by time travel, or if the table was created earlier than allowed by time travel if thestart_timestamp
value isNULL
. For standard tables, this window is seven days, but you can configure the time travel window to be less than that.end_timestamp
: aTIMESTAMP
value indicating the latest time at which a change is included in the output.end_timestamp
is exclusive; for example, if you specify2023-12-31 08:00:00
forstart_timestamp
and2023-12-31 12:00:00
forend_timestamp
, all changes made from 8 AM December 31, 2023 through 11:59 AM December 31, 2023 are returned.If the
end_timestamp
value isNULL
, all changes made until the start of the query are included.
Details
Records of inserted rows persist even if that data is later deleted. Deletions
are not reflected in the APPENDS
function. If a table
is copied, calling the APPENDS
function on the copied table returns every row
as inserted at the time of table creation. If a row is modified due to an
UPDATE
operation, there's no effect.
Output
The APPENDS
function returns a table with the following columns:
- All columns of the input table at the time the query is run. If a column is
added after the
end_timestamp
value, it appears withNULL
values populated in any of the rows that were inserted before the addition of the column. _CHANGE_TYPE
: aSTRING
value indicating the type of change that produced the row. ForAPPENDS
, the only supported value isINSERT
._CHANGE_TIMESTAMP
: aTIMESTAMP
value indicating the commit time of the transaction that made the change.
Limitations
- The data returned by the
APPENDS
function is limited to the time travel window of the table. - You can't call the
APPENDS
function within a multi-statement transaction. - You can only use the
APPENDS
function with regular BigQuery tables. Clones, snapshots, views, materialized views, external tables, and wildcard tables aren't supported. - Partition pseudo-columns for ingestion-time partitioned tables, such as
_PARTITIONTIME
and_PARTITIONDATE
, aren't included in the function's output.
Example
This example shows the change history returned by the APPENDS
function as various
changes are made to a table called Produce
. First, create the table:
CREATE TABLE mydataset.Produce (product STRING, inventory INT64) AS (
SELECT 'apples' AS product, 10 AS inventory);
Next, insert two rows into the table:
INSERT INTO mydataset.Produce
VALUES
('bananas', 20),
('carrots', 30);
To view the full change history of appends, use NULL
values to get the full
history within the time travel window:
SELECT
product,
inventory,
_CHANGE_TYPE AS change_type,
_CHANGE_TIMESTAMP AS change_time
FROM
APPENDS(TABLE mydataset.Produce, NULL, NULL);
The output is similar to the following:
+---------+-----------+-------------+--------------------------------+ | product | inventory | change_type | change_time | +---------+-----------+-------------+--------------------------------+ | apples | 10 | INSERT | 2022-04-15 20:06:00.488000 UTC | | bananas | 20 | INSERT | 2022-04-15 20:06:08.490000 UTC | | carrots | 30 | INSERT | 2022-04-15 20:06:08.490000 UTC | +---------+-----------+-------------+--------------------------------+
Next, add a column, insert a new row of values, update the inventory, and delete
the bananas
row:
ALTER TABLE mydataset.Produce ADD COLUMN color STRING;
INSERT INTO mydataset.Produce VALUES ('grapes', 40, 'purple');
UPDATE mydataset.Produce SET inventory = inventory + 5 WHERE TRUE;
DELETE mydataset.Produce WHERE product = 'bananas';
View the new table:
SELECT * FROM mydataset.Produce;
The output is similar to the following:
+---------+-----------+--------+ | product | inventory | color | +---------+-----------+--------+ | apples | 15 | NULL | | carrots | 35 | NULL | | grapes | 45 | purple | +---------+-----------+--------+
View the full change history of appends:
SELECT
product,
inventory,
color,
_CHANGE_TYPE AS change_type,
_CHANGE_TIMESTAMP AS change_time
FROM
APPENDS(TABLE mydataset.Produce, NULL, NULL);
The output is similar to the following:
+---------+-----------+--------+-------------+--------------------------------+ | product | inventory | color | change_type | change_time | +---------+-----------+--------+-------------+--------------------------------+ | apples | 10 | NULL | INSERT | 2022-04-15 20:06:00.488000 UTC | | bananas | 20 | NULL | INSERT | 2022-04-15 20:06:08.490000 UTC | | carrots | 30 | NULL | INSERT | 2022-04-15 20:06:08.490000 UTC | | grapes | 40 | purple | INSERT | 2022-04-15 20:07:45.751000 UTC | +---------+-----------+--------+-------------+--------------------------------+
The inventory
column displays the values that were set when the rows were
originally inserted into to the table. It does not show the changes from the
UPDATE
statement. The row with information on bananas is still present because
the APPENDS
function only captures additions to tables, not deletions.
CHANGES
CHANGES(
TABLE table,
start_timestamp DEFAULT NULL,
end_timestamp)
Description
The CHANGES
function returns all rows that have changed in a table for a given
time range. To use the CHANGES
function on a table, you must set the table's
enable_change_history
option
to TRUE
.
The following operations add rows to the CHANGES
change history:
CREATE TABLE
DDL statementINSERT
DML statement- Data appended, changed or deleted as part of a
MERGE
DML statement UPDATE
DML statementDELETE
DML statement- Loading data into BigQuery
- Streaming ingestion
TRUNCATE TABLE
DML statement- Jobs configured with a
writeDisposition
ofWRITE_TRUNCATE
- Individual table partition deletions
Definitions
table
: the BigQuery table name. This must be a regular BigQuery table, and must have theenable_change_history
option set toTRUE
. Enabling this table option has an impact on costs; for more information see Pricing and costs. This argument must be preceded by the wordTABLE
.start_timestamp
: aTIMESTAMP
value indicating the earliest time at which a change is included in the output. If the value isNULL
, all changes since the table creation are returned. If you set theenable_change_history
option after setting thestart_timestamp
option, the history before the enablement time might be incomplete. If the table was created after thestart_timestamp
value, the actual table creation time is used instead. An error is returned if the time specified is earlier than allowed by time travel, or if the table was created earlier than allowed by time travel if thestart_timestamp
value isNULL
. For standard tables, this window is seven days, but you can configure the time travel window to be less than that.end_timestamp
: aTIMESTAMP
value indicating the latest time at which a change is included in the output.end_timestamp
is exclusive; for example, if you specify2023-12-31 08:00:00
forstart_timestamp
and2023-12-31 12:00:00
forend_timestamp
, all changes made from 8 AM December 31, 2023 through 11:59 AM December 31, 2023 are returned. Theend_timestamp
value must be at least ten minutes prior to the current time. The maximum time range allowed betweenstart_timestamp
andend_timestamp
is one day.
Details
If a row is inserted, a record of the new row with an INSERT
change type is
produced.
If a row is deleted, a record of the deleted row with a DELETE
change type is
produced.
If a row is updated, a record of the old row with a DELETE
change type and a
record of the new row with an UPDATE
change type are produced.
Output
The CHANGES
function returns a table with the following columns:
- All columns of the input table at the time that the query is run. If a
column is added after the
end_timestamp
value, it appears withNULL
values populated in of the any rows that were changed before the addition of the column. _CHANGE_TYPE
: aSTRING
value indicating the type of change that produced the row. ForCHANGES
, the supported values areINSERT
,UPDATE
, andDELETE
._CHANGE_TIMESTAMP
: aTIMESTAMP
value indicating the commit time of the transaction that made the change.
Limitations
- The data returned by the
CHANGES
function is limited to the time travel window of the table. - The maximum allowed time range between the
start_timestamp
andend_timestamp
arguments you specify for the function is one day. - The
CHANGES
function can't query the last ten minutes of table history. Therefore, theend_timestamp
argument value must be at least ten minutes prior to the current time. - You can't call the
CHANGES
function within a multi-statement transaction. - You can't use the
CHANGES
function with tables that have had multi-statement transactions committed to them within the requested time window. - You can only use the
CHANGES
function with regular BigQuery tables. Views, materialized views, external tables, and wildcard tables aren't supported. - For tables that have been cloned or snapshotted, and for tables that are restored from a clone or snapshot, change history from the source table isn't carried over to the new table, clone, or snapshot.
- You can't use the
CHANGES
function with a table that has change data capture enabled. - Partition pseudo-columns for ingestion-time partitioned tables, such as
_PARTITIONTIME
and_PARTITIONDATE
, aren't included in the function's output. - Change history isn't captured for table deletions made due to table partition expiration.
- Performing
data manipulation language (DML) statements over recently streamed data
fails on tables that have the
enable_change_history
option set toTRUE
.
Example
This example shows the change history returned by the CHANGES
function as
various changes are made to a table called Produce
. First, create the table:
CREATE TABLE mydataset.Produce (
product STRING,
inventory INT64)
OPTIONS(enable_change_history=true);
Insert two rows into the table:
INSERT INTO mydataset.Produce
VALUES
('bananas', 20),
('carrots', 30);
Delete one row from the table:
DELETE mydataset.Produce
WHERE product = 'bananas';
Update one row of the table:
UPDATE mydataset.Produce
SET inventory = inventory - 10
WHERE product = 'carrots';
Wait for 10 minutes and view the full change history of the changes:
SELECT
product,
inventory,
_CHANGE_TYPE AS change_type,
_CHANGE_TIMESTAMP AS change_time
FROM
CHANGES(TABLE mydataset.Produce, NULL, TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 601 SECOND))
ORDER BY change_time, product;
The output is similar to the following:
+---------+-----------+-------------+---------------------+ | product | inventory | change_type | change_time | +---------+-----------+-------------+---------------------+ | bananas | 20 | INSERT | 2024-01-09 17:13:58 | | carrots | 30 | INSERT | 2024-01-09 17:13:58 | | bananas | 20 | DELETE | 2024-01-09 17:14:30 | | carrots | 30 | DELETE | 2024-01-09 17:15:24 | | carrots | 20 | UPDATE | 2024-01-09 17:15:24 | +---------+-----------+-------------+---------------------+
EXTERNAL_OBJECT_TRANSFORM
EXTERNAL_OBJECT_TRANSFORM(TABLE object_table_name, transform_types_array)
Description
This function returns a transformed object table with the original columns plus
one or more additional columns, depending on the transform_types
values
specified.
This function only supports object tables as inputs. Subqueries or any other types of tables are not supported.
object_table_name
is the name of the object table to be transformed, in
the format dataset_name.object_table_name
.
transform_types_array
is an array of STRING
literals. Currently, the only
supported transform_types_array
value is SIGNED_URL
. Specifying SIGNED_URL
creates read-only signed URLs for the objects in the identified object table,
which are returned in a signed_url
column. Generated signed URLs are
valid for 6 hours.
Return Type
TABLE
Example
Run the following query to return URIs and signed URLs for the objects in the
mydataset.myobjecttable
object table.
SELECT uri, signed_url
FROM EXTERNAL_OBJECT_TRANSFORM(TABLE mydataset.myobjecttable, ['SIGNED_URL']);
--The preceding statement returns results similar to the following:
/*-----------------------------------------------------------------------------------------------------------------------------*
| uri | signed_url |
+-----------------------------------------------------------------------------------------------------------------------------+
| gs://myobjecttable/1234_Main_St.jpeg | https://storage.googleapis.com/mybucket/1234_Main_St.jpeg?X-Goog-Algorithm=1234abcd… |
+-----------------------------------------------------------------------------------------------------------------------------+
| gs://myobjecttable/345_River_Rd.jpeg | https://storage.googleapis.com/mybucket/345_River_Rd.jpeg?X-Goog-Algorithm=2345bcde… |
*-----------------------------------------------------------------------------------------------------------------------------*/
GAP_FILL
Finds and fills gaps in a time series.
For more information, see GAP_FILL
in
Time series functions.
RANGE_SESSIONIZE
For more information, see RANGE_SESSIONIZE
in
Range functions.