Data definition language (DDL) statements in standard SQL
Data definition language (DDL) statements let you create and modify BigQuery resources using standard SQL query syntax. You can use DDL commands to create, alter, and delete resources, such as tables, table clones, table snapshots, views, user-defined functions (UDFs), and row-level access policies.
Required permissions
To create a job that runs a DDL statement, you must have the
bigquery.jobs.create
permission for the project where you are running the job.
Each DDL statement also requires specific permissions on the affected resources,
which are documented under each statement.
IAM roles
The predefined IAM roles bigquery.user
,
bigquery.jobUser
, and bigquery.admin
include the required
bigquery.jobs.create
permission.
For more information about IAM roles in BigQuery, see Predefined roles and permissions or the IAM permissions reference.
Running DDL statements
You can run DDL statements by using the console, by using the
bq
command-line tool, by calling the
jobs.query
REST API, or
programmatically using the
BigQuery API client libraries.
Console
Go to the BigQuery page in the console.
Click Compose new query.
Enter the DDL statement into the Query editor text area. For example:
CREATE TABLE mydataset.newtable ( x INT64 )
Click Run.
bq
Enter the
bq query
command
and supply the DDL statement as the query parameter. Set the
use_legacy_sql
flag to false
.
bq query --use_legacy_sql=false \ 'CREATE TABLE mydataset.newtable ( x INT64 )'
API
Call the jobs.query
method
and supply the DDL statement in the request body's query
property.
DDL functionality extends the information returned by a
Jobs resource.
statistics.query.statementType
includes the following additional values for DDL
support:
CREATE_TABLE
CREATE_TABLE_AS_SELECT
DROP_TABLE
CREATE_VIEW
DROP_VIEW
statistics.query
has 2 additional fields:
ddlOperationPerformed
: The DDL operation performed, possibly dependent on the existence of the DDL target. Current values include:CREATE
: The query created the DDL target.SKIP
: No-op. Examples —CREATE TABLE IF NOT EXISTS
was submitted, and the table exists. OrDROP TABLE IF EXISTS
was submitted, and the table does not exist.REPLACE
: The query replaced the DDL target. Example —CREATE OR REPLACE TABLE
was submitted, and the table already exists.DROP
: The query deleted the DDL target.
ddlTargetTable
: When you submit aCREATE TABLE/VIEW
statement or aDROP TABLE/VIEW
statement, the target table is returned as an object with 3 fields:- "projectId": string
- "datasetId": string
- "tableId": string
Java
Call the
BigQuery.create()
method to start a query job. Call the
Job.waitFor()
method to wait for the DDL query to finish.
Node.js
Python
Call the
Client.query()
method to start a query job. Call the
QueryJob.result()
method to wait for the DDL query to finish.
CREATE SCHEMA
statement
Creates a new dataset.
Syntax
CREATE SCHEMA [ IF NOT EXISTS ] [project_name.]dataset_name [DEFAULT COLLATE collate_specification] [OPTIONS(schema_option_list)]
Arguments
IF NOT EXISTS
: If any dataset exists with the same name, theCREATE
statement has no effect. Cannot appear withOR REPLACE
.DEFAULT COLLATE collate_specification
: When a new table is created in the schema, the table inherits a default collation specification unless a collation specification is explicitly specified for a column.If you remove or change this collation specification later with the
ALTER SCHEMA
statement, this will not change existing collation specifications in this schema. If you want to update an existing collation specification in a schema, you must alter the column that contains the specification.project_name
: The name of the project where you are creating the dataset. Defaults to the project that runs this DDL statement.dataset_name
: The name of the dataset to create.schema_option_list
: A list of options for creating the dataset.
Details
The dataset is created in the location that you specify in the query settings. For more information, see Specifying your location.
For more information about creating a dataset, see Creating datasets. For information about quotas, see Dataset limits.
schema_option_list
The option list specifies options for the dataset. Specify the options in the
following format: NAME=VALUE, ...
The following options are supported:
NAME |
VALUE |
Details |
---|---|---|
default_kms_key_name |
STRING |
Specifies the default Cloud KMS key for encrypting table data in this dataset. You can override this value when you create a table. |
default_partition_expiration_days |
FLOAT64 |
Specifies the default expiration time, in days, for table partitions in this dataset. You can override this value when you create a table. |
default_table_expiration_days |
FLOAT64 |
Specifies the default expiration time, in days, for tables in this dataset. You can override this value when you create a table. |
description |
STRING |
The description of the dataset. |
friendly_name |
STRING |
A descriptive name for the dataset. |
labels |
<ARRAY<STRUCT<STRING, STRING>>> |
An array of labels for the dataset, expressed as key-value pairs. |
location |
STRING |
The location in which to create the dataset. If you don't specify this option, the dataset is created in the location where the query runs. If you specify this option and also explicitly set the location for the query job, the two values must match; otherwise the query fails. |
max_time_travel_hours |
SMALLINT |
In preview. Specifies the duration in hours of the time travel window for the
dataset. The For more information on the time travel window, see Configuring the time travel window. |
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.datasets.create |
The project where you create the dataset. |
Examples
Creating a new schema
The following example creates a dataset with a default table expiration and a set of labels.
CREATE SCHEMA mydataset OPTIONS( location="us", default_table_expiration_days=3.75, labels=[("label1","value1"),("label2","value2")] )
Creating a schema with collation support
The following example creates a dataset with a collation specification.
CREATE SCHEMA mydataset DEFAULT COLLATE 'und:ci'
CREATE TABLE
statement
Creates a new table.
Syntax
CREATE [ OR REPLACE ] [ TEMP | TEMPORARY ] TABLE [ IF NOT EXISTS ] table_name [( column[, ...] )] [DEFAULT COLLATE collate_specification] [PARTITION BY partition_expression] [CLUSTER BY clustering_column_list] [OPTIONS(table_option_list)] [AS query_statement]
Arguments
OR REPLACE
: Replaces any table with the same name if it exists. Cannot appear withIF NOT EXISTS
.TEMP | TEMPORARY
: Creates a temporary table.IF NOT EXISTS
: If any table exists with the same name, theCREATE
statement has no effect. Cannot appear withOR REPLACE
.table_name
: The name of the table to create. See Table path syntax. For temporary tables, do not include the project name or dataset name.column
: The table's schema information.collation_specification
: When a new column is created in the schema, and if the column does not have an explicit collation specification, the column inherits this colloation specification forSTRING
types.If you remove or change this collation specification later with the
ALTER TABLE
statement, this will not change existing collation specifications in this table. If you want to update an existing collation specification in a table, you must alter the column that contains the specification.If the table is part of a schema, the default collation specification for this table overrides the default collation specification for the schema.
partition_expression
: An expression that determines how to partition the table.clustering_column_list
: A comma-separated list of column references that determine how to cluster the table. You cannot have collation on columns in this list.table_option_list
: A list of options for creating the table.query_statement
: The query from which the table should be created. For the query syntax, see SQL syntax reference. {: #query_statement } If a collation specification is used on this table, collation passes through this query statement.
Details
CREATE TABLE
statements must comply with the following rules:
- Only one
CREATE
statement is allowed. - Either the column list, the
as query_statement
clause, or both must be present. - When both the column list and the
as query_statement
clause are present, BigQuery ignores the names in theas query_statement
clause and matches the columns with the column list by position. - When the
as query_statement
clause is present and the column list is absent, BigQuery determines the column names and types from theas query_statement
clause. - Column names must be specified either through the column list,
the
as query_statement
clause or schema of the table in theLIKE
clause. - Duplicate column names are not allowed.
- When both the
LIKE
and theas query_statement
clause are present, the column list in the query statement must match the columns of the table referenced by theLIKE
clause.
Limitations:
- It is not possible to create an
ingestion-time partitioned table
from the result of a query. Instead, use a
CREATE TABLE
DDL statement to create the table, and then use anINSERT
DML statement to insert data into it. - It is not possible to use the
OR REPLACE
modifier to replace a table with a different kind of partitioning. Instead,DROP
the table, and then use aCREATE TABLE ... AS SELECT ...
statement to recreate it.
This statement supports the following variants:
CREATE TABLE LIKE
: Create a table with the same schema as an existing table.CREATE TABLE COPY
: Create a table by copying schema and data from an existing table.
column
(column_name column_schema[, ...])
contains the table's
schema information in a comma-separated list.
column := column_name column_schema column_schema := { simple_type [NOT NULL] | STRUCT<field_list> [NOT NULL] | ARRAY<array_element_schema> } [OPTIONS(column_option_list)] field_list := field_name column_schema [, ...] array_element_schema := { simple_type | STRUCT<field_list> } [NOT NULL] simple_type := { data_type | STRING COLLATE collate_specification }
column_name
is the name of the column. A column name:- Must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_)
- Must start with a letter or underscore
- Can be up to 300 characters
column_schema
: Similar to a data type, but supports an optionalNOT NULL
constraint for types other thanARRAY
.column_schema
also supports options on top-level columns andSTRUCT
fields.column_schema
can be used only in the column definition list ofCREATE TABLE
statements. It cannot be used as a type in expressions. Forsimple_type
: Any supported data type aside fromSTRUCT
andARRAY
.If
simple_type
is aSTRING
, it supports an additional clause for collation, which defines how a resultingSTRING
can be compared and sorted. The syntax looks like this:STRING COLLATE collate_specification
If you have
DEFAULT COLLATE collate_specification
assigned to the table, the collation specification for a column overrides the specification for the table.field_list
: Represents the fields in a struct.field_name
: The name of the struct field. Struct field names have the same restrictions as column names.NOT NULL
: When theNOT NULL
constraint is present for a column or field, the column or field is created withREQUIRED
mode. Conversely, when theNOT NULL
constraint is absent, the column or field is created withNULLABLE
mode.Columns and fields of
ARRAY
type do not support theNOT NULL
modifier. For example, acolumn_schema
ofARRAY<INT64> NOT NULL
is invalid, sinceARRAY
columns haveREPEATED
mode and can be empty but cannot beNULL
. An array element in a table can never beNULL
, regardless of whether theNOT NULL
constraint is specified. For example,ARRAY<INT64>
is equivalent toARRAY<INT64 NOT NULL>
.The
NOT NULL
attribute of a table'scolumn_schema
does not propagate through queries over the table. If tableT
contains a column declared asx INT64 NOT NULL
, for example,CREATE TABLE dataset.newtable AS SELECT x FROM T
creates a table nameddataset.newtable
in whichx
isNULLABLE
.
partition_expression
PARTITION BY
is an optional clause that controls
table partitioning. partition_expression
is an expression that determines how to partition the table. The partition
expression can contain the following values:
_PARTITIONDATE
. Partition by ingestion time with daily partitions. This syntax cannot be used with theAS query_statement
clause.DATE(_PARTITIONTIME)
. Equivalent to_PARTITIONDATE
. This syntax cannot be used with theAS query_statement
clause.<date_column>
. Partition by aDATE
column with daily partitions.DATE({ <timestamp_column> | <datetime_column> })
. Partition by aTIMESTAMP
orDATETIME
column with daily partitions.DATETIME_TRUNC(<datetime_column>, { DAY | HOUR | MONTH | YEAR })
. Partition by aDATETIME
column with the specified partitioning type.TIMESTAMP_TRUNC(<timestamp_column>, { DAY | HOUR | MONTH | YEAR })
. Partition by aTIMESTAMP
column with the specified partitioning type.TIMESTAMP_TRUNC(_PARTITIONTIME, { DAY | HOUR | MONTH | YEAR })
. Partition by ingestion time with the specified partitioning type. This syntax cannot be used with theAS query_statement
clause.DATE_TRUNC(<date_column>, { MONTH | YEAR })
. Partition by aDATE
column with the specified partitioning type.RANGE_BUCKET(<int64_column>, GENERATE_ARRAY(<start>, <end>[, <interval>]))
. Partition by an integer column with the specified range, where:start
is the start of range partitioning, inclusive.end
is the end of range partitioning, exclusive.interval
is the width of each range within the partition. Defaults to 1.
clustering_column_list
CLUSTER BY
is an optional clause that controls table clustering.
clustering_column_list
is a comma-separated list that determines how to
cluster the table. The clustering column list can contain a list of up to four
clustering columns.
table_option_list
The option list allows you to set table options such as a label and an expiration time. You can include multiple options using a comma-separated list.
Specify a table option list in the following format:
NAME=VALUE, ...
NAME
and VALUE
must be one of the following combinations:
NAME |
VALUE |
Details |
---|---|---|
expiration_timestamp |
TIMESTAMP |
Example: This property is equivalent to the expirationTime table resource property. |
partition_expiration_days |
|
Example: Sets the partition expiration in days. For more information, see Set the partition expiration. By default, partitions do not expire. This property is equivalent to the timePartitioning.expirationMs table resource property but uses days instead of milliseconds. One day is equivalent to 86400000 milliseconds, or 24 hours. This property can only be set if the table is partitioned. |
require_partition_filter |
|
Example: Specifies whether queries on this table must include a a predicate
filter that filters on the partitioning column. For more information,
see
Set partition filter requirements. The default value is
This property is equivalent to the timePartitioning.requirePartitionFilter table resource property. This property can only be set if the table is partitioned. |
kms_key_name |
|
Example: This property is equivalent to the encryptionConfiguration.kmsKeyName table resource property. See more details about Protecting data with Cloud KMS keys. |
friendly_name |
|
Example: This property is equivalent to the friendlyName table resource property. |
description |
|
Example: This property is equivalent to the description table resource property. |
labels |
|
Example: This property is equivalent to the labels table resource property. |
VALUE
is a constant expression containing only literals, query parameters,
and scalar functions.
The constant expression cannot contain:
- A reference to a table
- Subqueries or SQL statements such as
SELECT
,CREATE
, orUPDATE
- User-defined functions, aggregate functions, or analytic functions
- The following scalar functions:
ARRAY_TO_STRING
REPLACE
REGEXP_REPLACE
RAND
FORMAT
LPAD
RPAD
REPEAT
SESSION_USER
GENERATE_ARRAY
GENERATE_DATE_ARRAY
If VALUE
evaluates to NULL
, the corresponding option NAME
in the
CREATE TABLE
statement is ignored.
column_option_list
The column_option_list
in column_schema
lets you specify optional
column or field options. Column options have the same syntax and requirements as
table options but with a different list of NAME
s and VALUE
s:
NAME |
VALUE |
Details |
---|---|---|
description |
|
Example: This property is equivalent to the schema.fields[].description table resource property. |
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.create |
The dataset where you create the table. |
In addition, the OR REPLACE
clause requires bigquery.tables.update
and
bigquery.tables.updateData
permissions.
If the OPTIONS
clause includes any expiration options, then
bigquery.tables.delete
permission is also required.
Examples
Creating a new table
The following example creates a partitioned table named newtable
in
mydataset
:
CREATE TABLE mydataset.newtable ( x INT64 OPTIONS(description="An optional INTEGER field"), y STRUCT< a ARRAY<STRING> OPTIONS(description="A repeated STRING field"), b BOOL > ) PARTITION BY _PARTITIONDATE OPTIONS( expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC", partition_expiration_days=1, description="a table that expires in 2025, with each partition living for 24 hours", labels=[("org_unit", "development")] )
If you haven't configured a default project, prepend a project ID to the dataset
name in the example SQL, and enclose the name in backticks if project_id
contains special characters:
`project_id.dataset.table`
. So, instead of
mydataset.newtable
, your table qualifier might be
`myproject.mydataset.newtable`
.
If the table name exists in the dataset, the following error is returned:
Already Exists: project_id:dataset.table
The table uses the following partition_expression
to partition the table:
PARTITION BY _PARTITIONDATE
. This expression partitions the table using
the date in the _PARTITIONDATE
pseudo column.
The table schema contains two columns:
- x: An integer, with description "An optional INTEGER field"
y: A STRUCT containing two columns:
- a: An array of strings, with description "A repeated STRING field"
- b: A boolean
The table option list specifies the:
- Table expiration time: January 1, 2025 at 00:00:00 UTC
- Partition expiration time: 1 day
- Description:
A table that expires in 2025
- Label:
org_unit = development
Creating a new table from an existing table
The following example creates a table named top_words
in mydataset
from a
query:
CREATE TABLE mydataset.top_words OPTIONS( description="Top ten words per Shakespeare corpus" ) AS SELECT corpus, ARRAY_AGG(STRUCT(word, word_count) ORDER BY word_count DESC LIMIT 10) AS top_words FROM bigquery-public-data.samples.shakespeare GROUP BY corpus;
If you haven't configured a default project, prepend a project ID to the dataset
name in the example SQL, and enclose the name in backticks if project_id
contains special characters:
`project_id.dataset.table`
. So, instead of
mydataset.top_words
, your table qualifier might be
`myproject.mydataset.top_words`
.
If the table name exists in the dataset, the following error is returned:
Already Exists: project_id:dataset.table
The table schema contains 2 columns:
- corpus: Name of a Shakespeare corpus
top_words: An
ARRAY
ofSTRUCT
s containing 2 fields:word
(aSTRING
) andword_count
(anINT64
with the word count)
The table option list specifies the:
- Description:
Top ten words per Shakespeare corpus
Creating a table only if the table doesn't exist
The following example creates a table named newtable
in mydataset
only if no
table named newtable
exists in mydataset
. If the table name exists in the
dataset, no error is returned, and no action is taken.
CREATE TABLE IF NOT EXISTS mydataset.newtable (x INT64, y STRUCT<a ARRAY<STRING>, b BOOL>) OPTIONS( expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC", description="a table that expires in 2025", labels=[("org_unit", "development")] )
If you haven't configured a default project, prepend a project ID to the dataset
name in the example SQL, and enclose the name in backticks if project_id
contains special characters:
`project_id.dataset.table`
. So, instead of
mydataset.newtable
, your table qualifier might be
`myproject.mydataset.newtable`
.
The table schema contains 2 columns:
- x: An integer
y: A STRUCT containing a (an array of strings) and b (a boolean)
The table option list specifies the:
- Expiration time: January 1, 2025 at 00:00:00 UTC
- Description:
A table that expires in 2025
- Label:
org_unit = development
Creating or replacing a table
The following example creates a table named newtable
in mydataset
, and if
newtable
exists in mydataset
, it is overwritten with an empty table.
CREATE OR REPLACE TABLE mydataset.newtable (x INT64, y STRUCT<a ARRAY<STRING>, b BOOL>) OPTIONS( expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC", description="a table that expires in 2025", labels=[("org_unit", "development")] )
If you haven't configured a default project, prepend a project ID to the dataset
name in the example SQL, and enclose the name in backticks if project_id
contains special characters:
`project_id.dataset.table`
. So, instead of
mydataset.newtable
, your table qualifier might be
`myproject.mydataset.newtable`
.
The table schema contains 2 columns:
- x: An integer
y: A STRUCT containing a (an array of strings) and b (a boolean)
The table option list specifies the:
- Expiration time: January 1, 2025 at 00:00:00 UTC
- Description:
A table that expires in 2025
- Label:
org_unit = development
Creating a table with REQUIRED
columns
The following example creates a table named newtable
in mydataset
. The NOT
NULL
modifier in the column definition list of a CREATE TABLE
statement
specifies that a column or field is created in REQUIRED
mode.
CREATE TABLE mydataset.newtable ( x INT64 NOT NULL, y STRUCT< a ARRAY<STRING>, b BOOL NOT NULL, c FLOAT64 > NOT NULL, z STRING )
If you haven't configured a default project, prepend a project ID to the dataset
name in the example SQL, and enclose the name in backticks if project_id
contains special characters:
`project_id.dataset.table`
. So, instead of
mydataset.newtable
, your table qualifier might be
`myproject.mydataset.newtable`
.
If the table name exists in the dataset, the following error is returned:
Already Exists: project_id:dataset.table
The table schema contains 3 columns:
- x: A
REQUIRED
integer - y: A
REQUIRED
STRUCT containing a (an array of strings), b (aREQUIRED
boolean), and c (aNULLABLE
float) z: A
NULLABLE
string
Creating a table with collation support
The following examples create a table named newtable
in mydataset
with
columns a
, b
, c
, and a struct with fields x
and y
.
All STRING
column schemas in this table are collated with 'und:ci'
:
CREATE TABLE mydataset.newtable ( a STRING, b STRING, c STRUCT< x FLOAT64 y ARRAY<STRING> > ) DEFAULT COLLATE 'und:ci';
Only b
and y
are collated with 'und:ci'
:
CREATE TABLE mydataset.newtable ( a STRING, b STRING COLLATE 'und:ci', c STRUCT< x FLOAT64 y ARRAY<STRING COLLATE 'und:ci'> > );
Creating a table with parameterized data types
The following example creates a table named newtable
in mydataset
. The
parameters in parentheses specify that the column contains a parameterized data
type. See Parameterized Data Types
for more information about parameterized types.
CREATE TABLE mydataset.newtable ( x STRING(10), y STRUCT< a ARRAY<BYTES(5)>, b NUMERIC(15, 2), c FLOAT64 >, z BIGNUMERIC(35) )
If you haven't configured a default project, prepend a project ID to the dataset
name in the example SQL, and enclose the name in backticks if project_id
contains special characters:
`project_id.dataset.table`
. Instead of
mydataset.newtable
, your table qualifier should be
`myproject.mydataset.newtable`
.
If the table name exists in the dataset, the following error is returned:
Already Exists: project_id:dataset.table
The table schema contains 3 columns:
- x: A parameterized string with a maximum length of 10
- y: A STRUCT containing a (an array of parameterized bytes with a maximum length of 5), b (a parameterized NUMERIC with a maximum precision of 15 and maximum scale of 2), and c (a float)
- z: A parameterized BIGNUMERIC with a maximum precision of 35 and maximum scale of 0
Creating a partitioned table
The following example creates a
partitioned table
named newtable
in mydataset
using a DATE
column:
CREATE TABLE mydataset.newtable (transaction_id INT64, transaction_date DATE) PARTITION BY transaction_date OPTIONS( partition_expiration_days=3, description="a table partitioned by transaction_date" )
If you haven't configured a default project, prepend a project ID to the dataset
name in the example SQL, and enclose the name in backticks if project_id
contains special characters:
`project_id.dataset.table`
. So, instead of
mydataset.newtable
, your table qualifier might be
`myproject.mydataset.newtable`
.
The table schema contains 2 columns:
- transaction_id: An integer
- transaction_date: A date
The table option list specifies the:
- Partition expiration: Three days
- Description:
A table partitioned by transaction_date
Creating a partitioned table from the result of a query
The following example creates a
partitioned table
named days_with_rain
in mydataset
using a DATE
column:
CREATE TABLE mydataset.days_with_rain PARTITION BY date OPTIONS ( partition_expiration_days=365, description="weather stations with precipitation, partitioned by day" ) AS SELECT DATE(CAST(year AS INT64), CAST(mo AS INT64), CAST(da AS INT64)) AS date, (SELECT ANY_VALUE(name) FROM `bigquery-public-data.noaa_gsod.stations` AS stations WHERE stations.usaf = stn) AS station_name, -- Stations can have multiple names prcp FROM `bigquery-public-data.noaa_gsod.gsod2017` AS weather WHERE prcp != 99.9 -- Filter unknown values AND prcp > 0 -- Filter stations/days with no precipitation
If you haven't configured a default project, prepend a project ID to the dataset
name in the example SQL, and enclose the name in backticks if project_id
contains special characters:
`project_id.dataset.table`
. So, instead of
mydataset.days_with_rain
, your table qualifier might be
`myproject.mydataset.days_with_rain`
.
The table schema contains 2 columns:
- date: The
DATE
of data collection - station_name: The name of the weather station as a
STRING
- prcp: The amount of precipitation in inches as a
FLOAT64
The table option list specifies the:
- Partition expiration: One year
- Description:
Weather stations with precipitation, partitioned by day
Creating a clustered table
Example 1
The following example creates a
clustered table
named myclusteredtable
in mydataset
. The table is a partitioned table,
partitioned by a TIMESTAMP
column and clustered by a STRING
column named
customer_id
.
CREATE TABLE mydataset.myclusteredtable ( timestamp TIMESTAMP, customer_id STRING, transaction_amount NUMERIC ) PARTITION BY DATE(timestamp) CLUSTER BY customer_id OPTIONS ( partition_expiration_days=3, description="a table clustered by customer_id" )
If you haven't configured a default project, prepend a project ID to the dataset
name in the example SQL, and enclose the name in backticks if project_id
contains special characters:
`project_id.dataset.table`
. So, instead of
mydataset.myclusteredtable
, your table qualifier might be
`myproject.mydataset.myclusteredtable`
.
The table schema contains 3 columns:
- timestamp: The time of data collection as a
TIMESTAMP
- customer_id: The customer ID as a
STRING
- transaction_amount: The transaction amount as
NUMERIC
The table option list specifies the:
- Partition expiration: 3 days
- Description:
A table clustered by customer_id
Example 2
The following example creates a
clustered table
named myclusteredtable
in mydataset
. The table is an
ingestion-time partitioned table.
CREATE TABLE mydataset.myclusteredtable ( customer_id STRING, transaction_amount NUMERIC ) PARTITION BY DATE(_PARTITIONTIME) CLUSTER BY customer_id OPTIONS ( partition_expiration_days=3, description="a table clustered by customer_id" )
If you haven't configured a default project, prepend a project ID to the dataset
name in the example SQL, and enclose the name in backticks if project_id
contains special characters:
`project_id.dataset.table`
. So, instead of
mydataset.myclusteredtable
, your table qualifier might be
`myproject.mydataset.myclusteredtable`
.
The table schema contains 2 columns:
- customer_id: The customer ID as a
STRING
- transaction_amount: The transaction amount as
NUMERIC
The table option list specifies the:
- Partition expiration: 3 days
- Description:
A table clustered by customer_id
Example 3
The following example creates a
clustered table
named myclusteredtable
in mydataset
. The table is not partitioned.
CREATE TABLE mydataset.myclusteredtable ( customer_id STRING, transaction_amount NUMERIC ) CLUSTER BY customer_id OPTIONS ( description="a table clustered by customer_id" )
If you haven't configured a default project, prepend a project ID to the dataset
name in the example SQL, and enclose the name in backticks if project_id
contains special characters:
`project_id.dataset.table`
. So, instead of
mydataset.myclusteredtable
, your table qualifier might be
`myproject.mydataset.myclusteredtable`
.
The table schema contains 2 columns:
- customer_id: The customer ID as a
STRING
- transaction_amount: The transaction amount as
NUMERIC
The table option list specifies the:
- Description:
A table clustered by customer_id
Creating a clustered table from the result of a query
Example 1
The following example creates a
clustered table
named myclusteredtable
in mydataset
using the result of a query. The table
is a partitioned table, partitioned by a
TIMESTAMP
column.
CREATE TABLE mydataset.myclusteredtable ( timestamp TIMESTAMP, customer_id STRING, transaction_amount NUMERIC ) PARTITION BY DATE(timestamp) CLUSTER BY customer_id OPTIONS ( partition_expiration_days=3, description="a table clustered by customer_id" ) AS SELECT * FROM mydataset.myothertable
If you haven't configured a default project, prepend a project ID to the dataset
name in the example SQL, and enclose the name in backticks if project_id
contains special characters:
`project_id.dataset.table`
. So, instead of
mydataset.myclusteredtable
, your table qualifier might be
`myproject.mydataset.myclusteredtable`
.
The table schema contains 3 columns:
- timestamp: The time of data collection as a
TIMESTAMP
- customer_id: The customer ID as a
STRING
- transaction_amount: The transaction amount as
NUMERIC
The table option list specifies the:
- Partition expiration: 3 days
- Description:
A table clustered by customer_id
Example 2
The following example creates a
clustered table
named myclusteredtable
in mydataset
using the result of a query. The table
is not partitioned.
CREATE TABLE mydataset.myclusteredtable ( customer_id STRING, transaction_amount NUMERIC ) CLUSTER BY customer_id OPTIONS ( description="a table clustered by customer_id" ) AS SELECT * FROM mydataset.myothertable
If you haven't configured a default project, prepend a project ID to the dataset
name in the example SQL, and enclose the name in backticks if project_id
contains special characters:
`project_id.dataset.table`
. So, instead of
mydataset.myclusteredtable
, your table qualifier might be
`myproject.mydataset.myclusteredtable`
.
The table schema contains 2 columns:
- customer_id: The customer ID as a
STRING
- transaction_amount: The transaction amount as
NUMERIC
The table option list specifies the:
- Description:
A table clustered by customer_id
Creating a temporary table
The following example creates a temporary table named Example
and inserts
values into it.
CREATE TEMP TABLE Example
(
x INT64,
y STRING
);
INSERT INTO Example
VALUES (5, 'foo');
INSERT INTO Example
VALUES (6, 'bar');
SELECT *
FROM Example;
This script returns the following output:
+-----+---+-----+
| Row | x | y |
+-----+---|-----+
| 1 | 5 | foo |
| 2 | 6 | bar |
+-----+---|-----+
CREATE TABLE LIKE
statement
Creates a new table with all of the same metadata of another table.
Syntax
CREATE [ OR REPLACE ] TABLE [ IF NOT EXISTS ] table_name LIKE [[project_name.]dataset_name.]source_table_name ... [OPTIONS(table_option_list)]
Details
Other than the use of the LIKE
clause in place of a column list, the syntax is
identical to the CREATE TABLE
syntax.
The CREATE TABLE LIKE
statement copies only the metadata of the source table.
You can use the as query_statement
clause to include data into the new table.
The new table has no relationship to the source table after creation; thus modifications to the source table will not propagate to the new table.
By default, the new table inherits partitioning, clustering, and options
metadata from the source table. You can customize metadata in the new table by
using the optional clauses in the SQL statement. For example, if you want to
specify a different set of options for the new table, then include the OPTIONS
clause with a list of options and values. This behavior match that of
ALTER TABLE SET OPTIONS
.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.create |
The dataset where you create the table. |
bigquery.tables.get |
The source table. |
In addition, the OR REPLACE
clause requires bigquery.tables.update
and
bigquery.tables.updateData
permissions.
If the OPTIONS
clause includes any expiration options, then
bigquery.tables.delete
permission is also required.
Examples
Example 1
The following example creates a new table named newtable
in
mydataset
with the same metadata as sourcetable
:
CREATE TABLE mydataset.newtable LIKE mydataset.sourcetable
Example 2
The following example creates a new table named newtable
in
mydataset
with the same metadata as sourcetable
and the data from the
SELECT
statement:
CREATE TABLE mydataset.newtable LIKE mydataset.sourcetable AS SELECT * FROM mydataset.myothertable
CREATE TABLE COPY
statement
Creates a table that has the same metadata and data as another table. The source table can be a table, a table clone, or a table snapshot.
Syntax
CREATE [ OR REPLACE ] TABLE [ IF NOT EXISTS ] table_name COPY source_table_name ... [OPTIONS(table_option_list)]
Details
Other than the use of the COPY
clause in place of a column list, the syntax is
identical to the CREATE TABLE
syntax.
The CREATE TABLE COPY
statement copies both the metadata and data from the
source table.
The new table inherits partitioning and clustering from the source table. By
default, the table options metadata from the source table are also inherited,
but you can override table options by using the OPTIONS
clause. The behavior
is equivalent to running ALTER TABLE SET OPTIONS
after the table is copied.
The new table has no relationship to the source table after creation; modifications to the source table are not propagated to the new table.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.create |
The dataset where you create the table snapshot. |
bigquery.tables.get |
The source table. |
bigquery.tables.getData |
The source table. |
In addition, the OR REPLACE
clause requires bigquery.tables.update
and
bigquery.tables.updateData
permissions.
If the OPTIONS
clause includes any expiration options, then
bigquery.tables.delete
permission is also required.
CREATE SNAPSHOT TABLE
statement
Creates a table snapshot based on a source table. The source table can be a table, a table clone, or a table snapshot.
Syntax
CREATE SNAPSHOT TABLE [ IF NOT EXISTS ] table_snapshot_name CLONE source_table_name [FOR SYSTEM_TIME AS OF time_expression] [OPTIONS(snapshot_option_list)]
Arguments
IF NOT EXISTS
: If a table snapshot or other table resource exists with the same name, theCREATE
statement has no effect.table_snapshot_name
: The name of the table snapshot that you want to create. The table snapshot name must be unique per dataset. See Table path syntax.source_table_name
: The name of the table that you want to snapshot or the table snapshot that you want to copy. See Table path syntax.If the source table is a standard table, then BigQuery creates a table snapshot of the source table. If the source table is a table snapshot, then BigQuery creates a copy of the table snapshot.
FOR SYSTEM_TIME AS OF
: Lets you select the version of the table that was current at the time specified bytimestamp_expression
. It can only be used when creating a snapshot of a table; it can't be used when making a copy of a table snapshot.snapshot_option_list
: Additional table snapshot creation options such as a label and an expiration time.
Details
CREATE SNAPSHOT TABLE
statements must comply with the following rules:
- Only one
CREATE
statement is allowed. - The source table must be one of the following:
- A table
- A table clone
- A table snapshot
- The
FOR SYSTEM_TIME AS OF
clause can only be used when creating a snapshot of a table or table clone; it can't be used when making a copy of a table snapshot.
snapshot_option_list
The option list lets you set table snapshot options such as a label and an expiration time. You can include multiple options using a comma-separated list.
Specify a table snapshot option list in the following format:
NAME=VALUE, ...
NAME
and VALUE
must be one of the following combinations:
NAME |
VALUE |
Details |
---|---|---|
expiration_timestamp |
TIMESTAMP |
Example: This property is equivalent to the
|
friendly_name |
|
Example: This property is equivalent to the
|
description |
|
Example: This property is equivalent to the
|
labels |
|
Example: This property is equivalent to the
|
VALUE
is a constant expression that contains only literals, query parameters,
and scalar functions.
The constant expression cannot contain:
- A reference to a table
- Subqueries or SQL statements such as
SELECT
,CREATE
, andUPDATE
- User-defined functions, aggregate functions, or analytic functions
- The following scalar functions:
ARRAY_TO_STRING
REPLACE
REGEXP_REPLACE
RAND
FORMAT
LPAD
RPAD
REPEAT
SESSION_USER
GENERATE_ARRAY
GENERATE_DATE_ARRAY
If VALUE
evaluates to NULL
, the corresponding option NAME
in the
CREATE SNAPSHOT TABLE
statement is ignored.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.create
|
The dataset where you create the table snapshot. |
bigquery.tables.createSnapshot |
The source table. |
bigquery.tables.get |
The source table. |
bigquery.tables.getData |
The source table. |
Examples
Create a table snapshot: fail if it already exists
The following example creates a table snapshot of the table
myproject.mydataset.mytable
. The table snapshot is created in the dataset
mydataset
and is named mytablesnapshot
:
CREATE SNAPSHOT TABLE `myproject.mydataset.mytablesnapshot` CLONE `myproject.mydataset.mytable` OPTIONS( expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR), friendly_name="my_table_snapshot", description="A table snapshot that expires in 2 days", labels=[("org_unit", "development")] )
If the table snapshot name already exists in the dataset, then the following error is returned:
Already Exists: myproject.mydataset.mytablesnapshot
The table snapshot option list specifies the following:
- Expiration time: 48 hours after the time the table snapshot is created
- Friendly name:
my_table_snapshot
- Description:
A table snapshot that expires in 2 days
- Label:
org_unit = development
Create a table snapshot: ignore if it already exists
The following example creates a table snapshot of the table
myproject.mydataset.mytable
. The table snapshot is created in the dataset
mydataset
and is named mytablesnapshot
:
CREATE SNAPSHOT TABLE IF NOT EXISTS `myproject.mydataset.mytablesnapshot` CLONE `myproject.mydataset.mytable` OPTIONS( expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR), friendly_name="my_table_snapshot", description="A table snapshot that expires in 2 days" labels=[("org_unit", "development")] )
The table snapshot option list specifies the following:
- Expiration time: 48 hours after the time the table snapshot is created
- Friendly name:
my_table_snapshot
- Description:
A table snapshot that expires in 2 days
- Label:
org_unit = development
If the table snapshot name already exists in the dataset, then no action is taken, and no error is returned.
For information about restoring table snapshots, see
CREATE TABLE CLONE
.
For information about removing table snapshots, see
DROP SNAPSHOT TABLE
.
CREATE TABLE CLONE
statement
Creates a table clone based on a source table. The source table can be a table, a table clone, or a table snapshot.
Syntax
CREATE [ OR REPLACE ] TABLE [ IF NOT EXISTS ] destination_table_name CLONE source_table_name [FOR SYSTEM_TIME AS OF time_expression] ... [OPTIONS(table_option_list)]
Details
Other than the use of the CLONE
clause in place of a column list, the syntax
is identical to the CREATE TABLE
syntax.
Arguments
OR REPLACE
: Replaces a table with the same name if it exists. Cannot appear withIF NOT EXISTS
.IF NOT EXISTS
: If the specified destination table name already exists, theCREATE
statement has no effect. Cannot appear withOR REPLACE
.
destination_table_name
is the name of the table that you want to create.
The table name must
be unique per dataset. The table name can contain the following:
- Up to 1,024 characters
- Letters (upper or lower case), numbers, and underscores
OPTIONS(table_option_list)
lets you specify
additional table creation options such as a
label and
an expiration time.
source_table_name
is the name of the source
table.
CREATE TABLE CLONE
statements must comply with the following rules:
- Only one
CREATE
statement is allowed. - The table that is being cloned must be a table, a table clone, or a table snapshot.
OPTIONS
CREATE TABLE CLONE
options are the same as
CREATE TABLE
options.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.create |
The dataset where you create the table clone. |
bigquery.tables.get |
The source table. |
bigquery.tables.getData |
The source table. |
bigquery.tables.restoreSnapshot |
The source table (required only if the source table is a table snapshot). |
In addition, the OR REPLACE
clause requires bigquery.tables.update
and
bigquery.tables.updateData
permissions.
If the OPTIONS
clause includes any expiration options, then
bigquery.tables.delete
permission is also required.
Examples
Restore a table snapshot: fail if destination table already exists
The following example creates the table
myproject.mydataset.mytable
from the table snapshot
myproject.mydataset.mytablesnapshot
:
CREATE TABLE `myproject.mydataset.mytable` CLONE `myproject.mydataset.mytablesnapshot` OPTIONS( expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 365 DAY), friendly_name="my_table", description="A table that expires in 1 year", labels=[("org_unit", "development")] )
If the table name exists in the dataset, then the following error is returned:
Already Exists: myproject.mydataset.mytable.
The table option list specifies the following:
- Expiration time: 365 days after the time that the table is created
- Friendly name:
my_table
- Description:
A table that expires in 1 year
- Label:
org_unit = development
Create a clone of a table: ignore if the destination table already exists
The following example creates the table clone
myproject.mydataset.mytableclone
based on the table
myproject.mydataset.mytable
:
CREATE TABLE IF NOT EXISTS `myproject.mydataset.mytableclone` CLONE `myproject.mydataset.mytable` OPTIONS( expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 365 DAY), friendly_name="my_table", description="A table that expires in 1 year", labels=[("org_unit", "development")] )
The table option list specifies the following:
- Expiration time: 365 days after the time the table is created
- Friendly name:
my_table
- Description:
A table that expires in 1 year
- Label:
org_unit = development
If the table name exists in the dataset, then no action is taken, and no error is returned.
For information about creating a copy of a table, see
CREATE TABLE COPY
.
For information about creating a snapshot of a table, see
CREATE SNAPSHOT TABLE
.
CREATE VIEW
statement
Creates a new view.
Syntax
CREATE [ OR REPLACE ] VIEW [ IF NOT EXISTS ] view_name [(view_column_name_list)] [OPTIONS(view_option_list)] AS query_expression
Arguments
OR REPLACE
: Replaces any view with the same name if it exists. Cannot appear withIF NOT EXISTS
.IF NOT EXISTS
: If a view or other table resource exists with the same name, theCREATE
statement has no effect. Cannot appear withOR REPLACE
.view_name
: The name of the view you're creating. See Table path syntax.view_column_name_list
: Lets you explicitly specify the column names of the view, which may be aliases to the column names in the underlying SQL query.view_option_list
: Additional view creation options such as a label and an expiration time.query_expression
: The Standard SQL query expression used to define the view.
Details
CREATE VIEW
statements must comply with the following rules:
- Only one
CREATE
statement is allowed.
view_column_name_list
The view's column name list is optional. The names must be unique but do not have to be the same as the column names of the underlying SQL query. For example, if your view is created with the following statement:
CREATE VIEW mydataset.age_groups(age, count) AS SELECT age, COUNT(*)
FROM mydataset.people
group by age;
Then you can query it with:
SELECT age, count from mydataset.age_groups;
The number of columns in the column name list must match the number of columns in the underlying SQL query. If the columns in the table of the underlying SQL query is added or dropped, the view becomes invalid and must be recreated. For example, if the age
column is dropped from the mydataset.people
table, then the view created in the previous example becomes invalid.
view_option_list
The option list allows you to set view options such as a label and an expiration time. You can include multiple options using a comma-separated list.
Specify a view option list in the following format:
NAME=VALUE, ...
NAME
and VALUE
must be one of the following combinations:
NAME |
VALUE |
Details |
---|---|---|
expiration_timestamp |
TIMESTAMP |
Example: This property is equivalent to the expirationTime table resource property. |
friendly_name |
|
Example: This property is equivalent to the friendlyName table resource property. |
description |
|
Example: This property is equivalent to the description table resource property. |
labels |
|
Example: This property is equivalent to the labels table resource property. |
VALUE
is a constant expression containing only literals, query parameters,
and scalar functions.
The constant expression cannot contain:
- A reference to a table
- Subqueries or SQL statements such as
SELECT
,CREATE
, orUPDATE
- User-defined functions, aggregate functions, or analytic functions
- The following scalar functions:
ARRAY_TO_STRING
REPLACE
REGEXP_REPLACE
RAND
FORMAT
LPAD
RPAD
REPEAT
SESSION_USER
GENERATE_ARRAY
GENERATE_DATE_ARRAY
If VALUE
evaluates to NULL
, the corresponding option NAME
in the
CREATE VIEW
statement is ignored.
Default project in view body
If the view is created in the same project used to run the CREATE VIEW
statement, the view body query_expression
can reference entities without
specifying the project; the default project is the project
which owns the view. Consider the sample query below.
CREATE VIEW myProject.myDataset.myView AS SELECT * FROM anotherDataset.myTable;
After running the above CREATE VIEW
query in the project myProject
, you can
run the query SELECT * FROM myProject.myDataset.myView
. Regardless of the project you
choose to run this SELECT
query, the referenced table anotherDataset.myTable
is always resolved against project myProject
.
If the view is not created in the same project used to run the CREATE VIEW
statement, then all references in the view body query_expression
must be
qualified with project IDs. For instance, the preceding sample CREATE VIEW
query
is invalid if it runs in a project different from myProject
.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.create |
The dataset where you create the view. |
In addition, the OR REPLACE
clause requires bigquery.tables.update
permission.
If the OPTIONS
clause includes an expiration time, then
bigquery.tables.delete
permission is also required.
Examples
Creating a new view
The following example creates a view named newview
in mydataset
:
CREATE VIEW `myproject.mydataset.newview` OPTIONS( expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR), friendly_name="newview", description="a view that expires in 2 days", labels=[("org_unit", "development")] ) AS SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`
If the view name exists in the dataset, the following error is returned:
Already Exists: project_id:dataset.table
The view is defined using the following standard SQL query:
SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`
The view option list specifies the:
- Expiration time: 48 hours from the time the view is created
- Friendly name:
newview
- Description:
A view that expires in 2 days
- Label:
org_unit = development
Creating a view only if the view doesn't exist
The following example creates a view named newview
in mydataset
only if no
view named newview
exists in mydataset
. If the view name exists in the
dataset, no error is returned, and no action is taken.
CREATE VIEW IF NOT EXISTS `myproject.mydataset.newview` OPTIONS( expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR), friendly_name="newview", description="a view that expires in 2 days", labels=[("org_unit", "development")] ) AS SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`
The view is defined using the following standard SQL query:
SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`
The view option list specifies the:
- Expiration time: 48 hours from the time the view is created
- Friendly name:
newview
- Description:
A view that expires in 2 days
- Label:
org_unit = development
Creating or replacing a view
The following example creates a view named newview
in mydataset
, and if
newview
exists in mydataset
, it is overwritten using the specified query
expression.
CREATE OR REPLACE VIEW `myproject.mydataset.newview` OPTIONS( expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR), friendly_name="newview", description="a view that expires in 2 days", labels=[("org_unit", "development")] ) AS SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`
The view is defined using the following standard SQL query:
SELECT column_1, column_2, column_3 FROM
myproject.mydataset.mytable
The view option list specifies the:
- Expiration time: 48 hours from the time the view is created
- Friendly name:
newview
- Description:
A view that expires in 2 days
- Label:
org_unit = development
CREATE MATERIALIZED VIEW
statement
Creates a new materialized view.
Syntax
CREATE [ OR REPLACE ] MATERIALIZED VIEW [ IF NOT EXISTS ] materialized_view_name [PARTITION BY partition_expression] [CLUSTER BY clustering_column_list] [OPTIONS(materialized_view_option_list)] AS query_expression
Arguments
OR REPLACE
: Replaces any materialized view with the same name if it exists. Cannot appear withIF NOT EXISTS
.IF NOT EXISTS
: If a materialized view or other table resource exists with the same name, theCREATE
statement has no effect. Cannot appear withOR REPLACE
.materialized_view_name
: The name of the materialized view you're creating. See Table path syntax.If the
project_name
is omitted from the materialized view name, or it is the same as the project that runs this DDL query, then the latter is also used as the default project for references to tables, functions, and other resources inquery_expression
. The default project of the references is fixed and does not depend on the future queries that invoke the new materialized view. Otherwise, all references inquery_expression
must be qualified with project names.The materialized view name must be unique per dataset.
partition_expression
: An expression that determines how to partition the table. A materialized view can only be partitioned in the same way as the table inquery expression
(the base table) is partitioned.clustering_column_list
: A comma-separated list of column references that determine how to cluster the materialized view.materialized_view_option_list
** allows you to specify additional materialized view options such as a whether refresh is enabled, the refresh interval, a label, and an expiration time.query_expression
: The Standard SQL query expression used to define the materialized view.
Details
CREATE MATERIALIZED VIEW
statements must comply with the following rules:
- Only one
CREATE
statement is allowed.
Default project in materialized view body
If the materialized view is created in the same project used to run the CREATE MATERIALIZED VIEW
statement, the materialized view body query_expression
can reference entities without
specifying the project; the default project is the project
which owns the materialized view. Consider the sample query below.
CREATE MATERIALIZED VIEW myProject.myDataset.myView AS SELECT * FROM anotherDataset.myTable;
After running the above CREATE MATERIALIZED VIEW
query in the project myProject
, you can
run the query SELECT * FROM myProject.myDataset.myView
. Regardless of the project you
choose to run this SELECT
query, the referenced table anotherDataset.myTable
is always resolved against project myProject
.
If the materialized view is not created in the same project used to run the CREATE VIEW
statement, then all references in the materialized view body query_expression
must be
qualified with project IDs. For instance, the preceding sample CREATE MATERIALIZED VIEW
query
is invalid if it runs in a project different from myProject
.
materialized_view_option_list
The option list allows you to set materialized view options such as a whether refresh is enabled. the refresh interval, a label and an expiration time. You can include multiple options using a comma-separated list.
Specify a materialized view option list in the following format:
NAME=VALUE, ...
NAME
and VALUE
must be one of the following combinations:
NAME |
VALUE |
Details |
---|---|---|
enable_refresh |
BOOLEAN |
Example: |
refresh_interval_minutes |
FLOAT64 |
Example: |
expiration_timestamp |
TIMESTAMP |
Example: This property is equivalent to the expirationTime table resource property. |
friendly_name |
|
Example: This property is equivalent to the friendlyName table resource property. |
description |
|
Example: This property is equivalent to the description table resource property. |
labels |
|
Example: This property is equivalent to the labels table resource property. |
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.create
|
The dataset where you create the materialized view. |
In addition, the OR REPLACE
clause requires bigquery.tables.update
permission.
If the OPTIONS
clause includes any expiration options, then
bigquery.tables.delete
permission is also required.
Examples
Creating a new materialized view
The following example creates a materialized view named new_mv
in mydataset
:
CREATE MATERIALIZED VIEW `myproject.mydataset.new_mv`
OPTIONS(
expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
friendly_name="new_mv",
description="a materialized view that expires in 2 days",
labels=[("org_unit", "development")],
enable_refresh=true,
refresh_interval_minutes=20
)
AS SELECT column_1, SUM(column_2) AS sum_2, AVG(column_3) AS avg_3
FROM `myproject.mydataset.mytable`
GROUP BY column_1
If the materialized view name exists in the dataset, the following error is returned:
Already Exists: project_id:dataset.materialized_view
When you use a DDL statement to create a materialized view, you must specify the
project, dataset, and materialized view in the following format:
`project_id.dataset.materialized_view`
(including the backticks if project_id
contains special characters); for example,
`myproject.mydataset.new_mv`
.
The materialized view is defined using the following standard SQL query:
SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`
The materialized view option list specifies the:
- Expiration time: 48 hours from the time the materialized view is created
- Friendly name:
new_mv
- Description:
A materialized view that expires in 2 days
- Label:
org_unit = development
- Refresh enabled: true
- Refresh interval: 20 minutes
Creating a materialized view only if the materialized view doesn't exist
The following example creates a materialized view named new_mv
in mydataset
only if no materialized view named new_mv
exists in mydataset
. If the
materialized view name exists in the dataset, no error is returned, and no
action is taken.
CREATE MATERIALIZED VIEW IF NOT EXISTS `myproject.mydataset.new_mv`
OPTIONS(
expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
friendly_name="new_mv",
description="a view that expires in 2 days",
labels=[("org_unit", "development")],
enable_refresh=false
)
AS SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`
The materialized view is defined using the following standard SQL query:
SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`
The materialized view option list specifies the:
- Expiration time: 48 hours from the time the view is created
- Friendly name:
new_mv
- Description:
A view that expires in 2 days
- Label:
org_unit = development
- Refresh enabled: false
Creating a materialized view with partitioning and clustering
The following example creates a materialized view named new_mv
in mydataset
,
partitioned by the col_datetime
column and clustered
by the col_int
column:
CREATE MATERIALIZED VIEW `myproject.mydataset.new_mv`
PARTITION BY DATE(col_datetime)
CLUSTER BY col_int
AS SELECT col_int, col_datetime, COUNT(1) as cnt
FROM `myproject.mydataset.mv_base_table`
GROUP BY col_int, col_datetime
The base table, mv_base_table
, must also be partitioned by the
col_datetime
column. For more information, see
Working with partitioned and clustered tables.
CREATE EXTERNAL TABLE
statement
Creates a new external table.
External tables let BigQuery query data that is stored outside of BigQuery storage. For more information about external tables, see Introduction to external data sources.
Syntax
CREATE [ OR REPLACE ] EXTERNAL TABLE [ IF NOT EXISTS ] table_name [( column_name column_schema, ... )] [WITH CONNECTION connection_name] [WITH PARTITION COLUMNS [( partition_column_name partition_column_type, ... )] ] OPTIONS ( external_table_option_list, ... );
Arguments
OR REPLACE
: Replaces any external table with the same name if it exists. Cannot appear withIF NOT EXISTS
.IF NOT EXISTS
: If an external table or other table resource exists with the same name, theCREATE
statement has no effect. Cannot appear withOR REPLACE
.table_name
: The name of the external table. See Table path syntax.column_name
: The name of a column in the table.column_schema
: Specifies the schema of the column. It uses the same syntax as thecolumn_schema
definition in theCREATE TABLE
statement. If you don't include this clause, BigQuery detects the schema automatically.connection_name
: Specifies a connection resource that has credentials for accessing the external data. Specify the connection name in the form PROJECT_ID.LOCATION.CONNECTION_ID. If the project ID or location contains a dash, enclose the connection name in backticks (`
).partition_column_name
: The name of a partition column. Include this field if your external data uses a hive-partitioned layout. For more information, see: Supported data layouts.partition_column_type
: The partition column type.external_table_option_list
: A list of options for creating the external table.
Details
The CREATE EXTERNAL TABLE
statement does not support creating temporary
external tables.
To create an externally partitioned table, use the WITH PARTITION COLUMNS
clause to specify the partition schema details. BigQuery
validates the column definitions against the external data location. The schema
declaration must strictly follow the ordering of the fields in the external
path. For more information about external partitioning, see
Querying externally partitioned data.
external_table_option_list
The option list specifies options for creating the external table. The format
and uris
options are required. Specify the option list in the following
format: NAME=VALUE, ...
Options | |
---|---|
allow_jagged_rows |
If Applies to CSV data. |
allow_quoted_newlines |
If Applies to CSV data. |
compression |
The compression type of the data source. Supported values include:
Applies to CSV and JSON data. |
description |
A description of this table. |
enable_logical_types |
If Applies to Avro data. |
enum_as_string |
If Applies to Parquet data. |
enable_list_inference |
If Applies to Parquet data. |
encoding |
The character encoding of the data. Supported values include:
Applies to CSV data. |
expiration_timestamp |
The time when this table expires. If not specified, the table does not expire. Example: |
field_delimiter |
The separator for fields in a CSV file. Applies to CSV data. |
format |
The format of the external data.
Supported values for
Supported values for
The value |
decimal_target_types |
Determines how to convert a Example: |
json_extension |
For JSON data, indicates a particular JSON interchange format. If not specified, BigQuery reads the data as generic JSON records. Supported values include: |
hive_partition_uri_prefix |
A common prefix for all source URIs before the partition key encoding begins. Applies only to hive-partitioned external tables. Applies to Avro, CSV, JSON, Parquet, and ORC data. Example: |
ignore_unknown_values |
If Applies to CSV and JSON data. |
max_bad_records |
The maximum number of bad records to ignore when reading the data. Applies to: CSV, JSON, and Sheets data. |
null_marker |
The string that represents Applies to CSV data. |
preserve_ascii_control_characters |
If Applies to CSV data. |
projection_fields |
A list of entity properties to load. Applies to Datastore data. |
quote |
The string used to quote data sections in a CSV file. If your data
contains quoted newline characters, also set the
Applies to CSV data. |
require_hive_partition_filter |
If Applies to Avro, CSV, JSON, Parquet, and ORC data. |
sheet_range |
Range of a Sheets spreadsheet to query from. Applies to Sheets data. Example: |
skip_leading_rows |
The number of rows at the top of a file to skip when reading the data. Applies to CSV and Sheets data. |
uris |
An array of fully qualified URIs for the external data locations. Example: |
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.create |
The dataset where you create the external table. |
In addition, the OR REPLACE
clause requires bigquery.tables.update
permission.
If the OPTIONS
clause includes an expiration time, then
bigquery.tables.delete
permission is also required.
Examples
The following example creates an external table from multiple URIs. The data format is CSV. This example uses schema auto-detection.
CREATE EXTERNAL TABLE dataset.CsvTable OPTIONS (
format = 'CSV',
uris = ['gs://bucket/path1.csv', 'gs://bucket/path2.csv']
);
The following example creates an external table from a CSV file and explicitly
specifies the schema. It also specifies the field delimeter ('|'
) and sets the
maximum number of bad records allowed.
CREATE OR REPLACE EXTERNAL TABLE dataset.CsvTable
(
x INT64,
y STRING
)
OPTIONS (
format = 'CSV',
uris = ['gs://bucket/path1.csv'],
field_delimiter = '|',
max_bad_records = 5
);
The following example creates an externally partitioned table. It uses schema auto-detection to detect both the file schema and the hive partitioning layout.
For example, if the external path is
gs://bucket/path/field_1=first/field_2=1/data.csv
, the partition columns
would be field_1
(STRING
) and field_2
(INT64
).
CREATE EXTERNAL TABLE dataset.AutoHivePartitionedTable
WITH PARTITION COLUMNS
OPTIONS (
uris=['gs://bucket/path/*'],
format=csv,
hive_partition_uri_prefix='gs://bucket/path'
);
The following example creates an externally partitioned table by explicitly
specifying the partition columns. This example assumes that the external file
path has the pattern gs://bucket/path/field_1=first/field_2=1/data.csv
.
CREATE EXTERNAL TABLE dataset.CustomHivePartitionedTable
WITH PARTITION COLUMNS (
field_1 STRING, -- column order must match the external path
field_2 INT64
)
OPTIONS (
uris=['gs://bucket/path/*'],
format=csv,
hive_partition_uri_prefix='gs://bucket/path'
);
CREATE FUNCTION
statement
Creates a new user-defined function (UDF). BigQuery supports UDFs written in either SQL or JavaScript.
Syntax
To create a SQL UDF, use the following syntax:
CREATE [ OR REPLACE ] [ TEMPORARY | TEMP ] FUNCTION [ IF NOT EXISTS ] [[project_name.]dataset_name.]function_name ([named_parameter[, ...]]) ([named_parameter[, ...]]) [RETURNS data_type] AS (sql_expression) [OPTIONS (function_option_list)] named_parameter: param_name param_type
To create a JavaScript UDF, use the following syntax:
CREATE [OR REPLACE] [TEMPORARY | TEMP] FUNCTION [IF NOT EXISTS] [[project_name.]dataset_name.]function_name ([named_parameter[, ...]]) RETURNS data_type [determinism_specifier] LANGUAGE js [OPTIONS (function_option_list)] AS javascript_code named_parameter: param_name param_type determinism_specifier: { DETERMINISTIC | NOT DETERMINISTIC }
To create a remote function, use the following syntax:
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS] [[project_name.]dataset_name.]function_name ([named_parameter[, ...]]) RETURNS data_type REMOTE WITH CONNECTION connection_path [OPTIONS (function_option_list)] named_parameter: param_name param_type
Routine names must contain only letters, numbers, and underscores, and be at most 256 characters long.
Arguments
OR REPLACE
: Replaces any function with the same name if it exists. Cannot appear withIF NOT EXISTS
.IF NOT EXISTS
: If any dataset exists with the same name, theCREATE
statement has no effect. Cannot appear withOR REPLACE
.TEMP
orTEMPORARY
: Creates a temporary function. If the clause is not present, the statement creates a persistent UDF. You can reuse persistent UDFs across multiple queries, whereas you can only use temporary UDFs in a single query, script, or procedure.project_name
. For persistent functions, the name of the project where you are creating the function. Defaults to the project that runs the DDL query. Do not include the project name for temporary functions.dataset_name
. For persistent functions, the name of the dataset where you are creating the function. Defaults to thedefaultDataset
in the request. Do not include the dataset name for temporary functions.function_name
. The name of the function.named_parameter
: A comma-separatedparam_name
andparam_type
pair. The value ofparam_type
is a BigQuery data type. For a SQL UDF, the value ofparam_type
can also beANY TYPE
.determinism_specifier
: Applies only to JavaScript UDFs. Provides a hint to BigQuery as to whether the query result can be cached. Can be one of the following values:DETERMINISTIC
: The function always returns the same result when passed the same arguments. The query result is potentially cacheable. For example, if the functionadd_one(i)
always returnsi + 1
, the function is deterministic.NOT DETERMINISTIC
: The function does not always return the same result when passed the same arguments, and therefore is not cacheable. For example, if the functionjadd_random(i)
returnsi + rand()
, the function is not deterministic and BigQuery does not use cached results.If all of the invoked functions are
DETERMINISTIC
, BigQuery tries to cache the result, unless the results can't be cached for other reasons. For more information, see Using cached query results.
data_type
: The data type that the function returns.- If the function is defined in SQL, then the
RETURNS
clause is optional. If theRETURNS
clause is omitted, then BigQuery infers the result type of the function from the SQL function body when a query calls the function. - If the function is defined in JavaScript, then the
RETURNS
clause is required. For more information about allowed values fordata_type
, see Supported JavaScript UDF data types.
- If the function is defined in SQL, then the
sql_expression
: The SQL expression that defines the function.function_option_list
. A list of options for creating the function.javascript_code
: The definition of a JavaScript function. The value is a string literal. If the code includes quotes and backslashes, it must be either escaped or represented as a raw string. For example, the codereturn "\n";
can be represented as one of the following:- Quoted string
"return \"\\n\";"
. Both quotes and backslashes need to be escaped. - Triple quoted string:
"""return "\\n";"""
. Backslashes need to be escaped while quotes do not. - Raw string:
r"""return "\n";"""
. No escaping is needed.
- Quoted string
connection_name
: Specifies a connection resource that has credentials for accessing the remote endpoint. Specify the connection name in the formproject_name.location.connection_id
. If the project name or location contains a dash, enclose the connection name in backticks (`
).
function_option_list
The option list specifies options for creating a UDF. The following options are supported:
NAME |
VALUE |
Details |
---|---|---|
description |
|
A description of the UDF. |
library |
|
An array of JavaScript libraries to include in the function definition. Applies only to JavaScript UDFs. For more information, see Including JavaScript libraries. Example: |
endpoint |
|
A HTTP endpoint of Cloud Functions. Applies only to remote functions. Example: For more information, see Creating a Remote Function. |
user_defined_context |
|
A list of key-value pairs that will be sent with every HTTP request when the function is invoked. Applies only to remote functions. Example: |
max_batching_rows |
|
The maximum number of rows in each HTTP request. If not specified, BigQuery decides how many rows are included in a HTTP request. Applies only to remote functions. |
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.routines.create |
The dataset where you create the function. |
In addition, the OR REPLACE
clause requires bigquery.routines.update
permission.
To create a remote function, additional IAM permissions are needed:
Permission | Resource |
---|---|
bigquery.connections.delegate
|
The connection which you use to create the remote function. |
Examples
Create a SQL UDF
The following example creates a persistent SQL UDF named multiplyInputs
in
a dataset named mydataset
.
CREATE FUNCTION mydataset.multiplyInputs(x FLOAT64, y FLOAT64)
RETURNS FLOAT64
AS (x * y);
Create a JavaScript UDF
The following example creates a temporary JavaScript UDF named multiplyInputs
and calls it from inside a SELECT
statement.
CREATE TEMP FUNCTION multiplyInputs(x FLOAT64, y FLOAT64)
RETURNS FLOAT64
LANGUAGE js
AS r"""
return x*y;
""";
SELECT multiplyInputs(a, b) FROM (SELECT 3 as a, 2 as b);
Create a remote function
The following example creates a persistent remote function named
remoteMultiplyInputs
in a dataset named mydataset
, assuming mydataset
is
in US
location and there is a connection myconnection
in the same location
and same project.
CREATE FUNCTION mydataset.remoteMultiplyInputs(x FLOAT64, y FLOAT64)
RETURNS FLOAT64
REMOTE WITH CONNECTION us.myconnection
OPTIONS(endpoint="https://us-central1-myproject.cloudfunctions.net/multiply");
CREATE TABLE FUNCTION
statement
Creates a new table function, also called a table-valued function (TVF).
Syntax
CREATE [ OR REPLACE ] TABLE FUNCTION [ IF NOT EXISTS ] [[project_name.]dataset_name.]function_name ( [ function_parameter [, ...] ] ) [RETURNS TABLE < column_declaration [, ...] > ] AS sql_query function_parameter: parameter_name { data_type | ANY TYPE } column_declaration: column_name data_type
Arguments
OR REPLACE
: Replaces any table function with the same name if it exists. Cannot appear withIF NOT EXISTS
.IF NOT EXISTS
: If any table function exists with the same name, theCREATE
statement has no effect. Cannot appear withOR REPLACE
.project_name
: The name of the project where you are creating the function. Defaults to the project that runs this DDL statement.dataset_name
: The name of the dataset where you are creating the function.function_name
: The name of the function to create.function_parameter
: A parameter for the function, specified as a parameter name and a data type. The value ofdata_type
is a scalar BigQuery data type orANY TYPE
.RETURNS TABLE
: The schema of the table that the function returns, specified as a comma-separated list of column name and data type pairs. IfRETURNS TABLE
is absent, BigQuery infers the output schema from the query statement in the function body. IfRETURNS TABLE
is included, the names in the returned table type must match column names from the SQL query.sql_query
: Specifies the SQL query to run. The SQL query must include names for all columns.
Details
BigQuery coerces argument types when possible. For example, if
the parameter type is FLOAT64
and you pass an INT64
value, then
BigQuery coerces it to a FLOAT64
.
If a parameter type is ANY TYPE
, the function accepts an input of any type for
this argument. The type that you pass to the function must be compatible with
the function definition. If you pass an argument with an incompatible type, the
query returns an error. If more than one parameter has type ANY TYPE
,
BigQuery does not enforce any type relationship between them.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.routines.create
|
The dataset where you create the table function. |
In addition, the OR REPLACE
clause requires bigquery.routines.update
permission.
Examples
The following table function takes an INT64
parameter that is used to filter
the results of a query:
CREATE OR REPLACE TABLE FUNCTION mydataset.names_by_year(y INT64)
AS
SELECT year, name, SUM(number) AS total
FROM `bigquery-public-data.usa_names.usa_1910_current`
WHERE year = y
GROUP BY year, name
The following example specifies the return TABLE
type in the RETURNS
clause:
CREATE OR REPLACE TABLE FUNCTION mydataset.names_by_year(y INT64) RETURNS TABLE<name STRING, year INT64, total INT64> AS SELECT year, name, SUM(number) AS total FROM `bigquery-public-data.usa_names.usa_1910_current` WHERE year = y GROUP BY year, name
CREATE PROCEDURE
statement
Creates a new procedure, which is a block of statements that can be called from other queries. Procedures can call themselves recursively.
Syntax
CREATE [OR REPLACE] PROCEDURE [IF NOT EXISTS] [[project_name.]dataset_name.]procedure_name (procedure_argument[, ...] ) [OPTIONS(procedure_option_list)] BEGIN multi_statement_query END; procedure_argument: [procedure_argument_mode] argument_name argument_type
procedure_argument_mode: IN | OUT | INOUT
Arguments
OR REPLACE
: Replaces any procedure with the same name if it exists. Cannot appear withIF NOT EXISTS
.IF NOT EXISTS
: If any procedure exists with the same name, theCREATE
statement has no effect. Cannot appear withOR REPLACE
.project_name
**: The name of the project where you are creating the procedure. Defaults to the project that runs this DDL query. If the project name contains special characters such as colons, it should be quoted in backticks`
(example:`google.com:my_project`
).dataset_name
: The name of the dataset where you are creating the procedure. Defaults to thedefaultDataset
in the request.procedure_name
: The name of the procedure to create.multi_statement_query
: The multi-statement query to run.argument_type
: Any valid BigQuery type.procedure_argument_mode
: Specifies whether an argument is an input, an output, or both.
procedure_option_list
The procedure_option_list
lets you specify procedure options. Procedure
options have the same syntax and requirements as table options but with a
different list of NAME
s and VALUE
s:
NAME |
VALUE |
Details |
---|---|---|
strict_mode |
|
Example: If While If Default value is |
description |
|
Example: A description of the procedure. |
Argument mode
IN
indicates that the argument is only an input to the procedure. You can
specify either a variable or a value expression for IN
arguments.
OUT
indicates that the argument is an output of the procedure. An OUT
argument is initialized to NULL
when the procedure starts. You
must specify a variable for OUT
arguments.
INOUT
indicates that the argument is both an input to and an output from
the procedure. You must specify a variable for INOUT
arguments. An INOUT
argument can be referenced in the body of a procedure as a variable and assigned
new values.
If neither IN
, OUT
, nor INOUT
is specified, the argument is treated as an
IN
argument.
Variable scope
If a variable is declared outside a procedure, passed as an INOUT or OUT argument to a procedure, and the procedure assigns a new value to that variable, that new value is visible outside of the procedure.
Variables declared in a procedure are not visible outside of the procedure, and vice versa.
An OUT
or INOUT
argument can be assigned a value using SET
, in which case
the modified value is visible outside of the procedure. If the procedure exits
successfully, then the value of the OUT
or INOUT
argument is the final value
assigned to that INOUT
variable.
Temporary tables exist for the duration of the script, so if a procedure creates a temporary table, the caller of the procedure will be able to reference the temporary table as well.
Default project in procedure body
Procedure bodies can reference entities without specifying the project; the
default project is the project which owns the procedure, not necessarily the
project used to run the CREATE PROCEDURE
statement. Consider the sample query
below.
CREATE PROCEDURE myProject.myDataset.QueryTable()
BEGIN
SELECT * FROM anotherDataset.myTable;
END;
After creating the above procedure, you can run the query
CALL myProject.myDataset.QueryTable()
. Regardless of the project you
choose to run this CALL
query, the referenced table anotherDataset.myTable
is always resolved against project myProject
.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.routines.create |
The dataset where you create the procedure. |
In addition, the OR REPLACE
clause requires bigquery.routines.update
permission.
Examples
The following example creates a procedure that both takes x
as an input
argument and returns x
as output; because no argument mode is present for the
argument delta
, it is an input argument. The procedure consists of a block
containing a single statement, which assigns the sum of the two input arguments
to x
.
CREATE PROCEDURE mydataset.AddDelta(INOUT x INT64, delta INT64)
BEGIN
SET x = x + delta;
END;
The following example calls the AddDelta
procedure from the example above,
passing it the variable accumulator
both times; because the changes to x
within AddDelta
are visible outside of AddDelta
, these procedure calls
increment accumulator
by a total of 8.
DECLARE accumulator INT64 DEFAULT 0;
CALL mydataset.AddDelta(accumulator, 5);
CALL mydataset.AddDelta(accumulator, 3);
SELECT accumulator;
This returns the following:
+-------------+
| accumulator |
+-------------+
| 8 |
+-------------+
The following example creates the procedure SelectFromTablesAndAppend
, which
takes target_date
as an input argument and returns rows_added
as an output.
The procedure creates a temporary table DataForTargetDate
from a query; then,
it calculates the number of rows in DataForTargetDate
and assigns the result
to rows_added
. Next, it inserts a new row into TargetTable
, passing the
value of target_date
as one of the column names. Finally, it drops the table
DataForTargetDate
and returns rows_added
.
CREATE PROCEDURE mydataset.SelectFromTablesAndAppend(
target_date DATE, OUT rows_added INT64)
BEGIN
CREATE TEMP TABLE DataForTargetDate AS
SELECT t1.id, t1.x, t2.y
FROM dataset.partitioned_table1 AS t1
JOIN dataset.partitioned_table2 AS t2
ON t1.id = t2.id
WHERE t1.date = target_date
AND t2.date = target_date;
SET rows_added = (SELECT COUNT(*) FROM DataForTargetDate);
SELECT id, x, y, target_date -- note that target_date is a parameter
FROM DataForTargetDate;
DROP TABLE DataForTargetDate;
END;
The following example declares a variable rows_added
, then passes it as an
argument to the SelectFromTablesAndAppend
procedure from the previous example,
along with the value of CURRENT_DATE
; then it returns a message stating how
many rows were added.
DECLARE rows_added INT64;
CALL mydataset.SelectFromTablesAndAppend(CURRENT_DATE(), rows_added);
SELECT FORMAT('Added %d rows', rows_added);
CREATE ROW ACCESS POLICY
statement
Creates or replaces a row-level access policy. Row-level access policies on a table must have unique names.
Syntax
CREATE [ OR REPLACE ] ROW ACCESS POLICY [ IF NOT EXISTS ]
row_access_policy_name ON table_name
[GRANT TO (grantee_list)]
FILTER USING (filter_expression);
Arguments
IF NOT EXISTS
: If any row-level access policy exists with the same name, theCREATE
statement has no effect. Cannot appear withOR REPLACE
.row_access_policy_name
: The name of the row-level access policy that you are creating. The row-level access policy name must be unique for each table. The row-level access policy name can contain the following:- Up to 256 characters.
- Letters (upper or lowercase), numbers, and underscores. Must start with a letter.
table_name
: The name of the table that you want to create a row-level access policy for. The table must already exist.GRANT TO grantee_list
: An optional clause that specifies the initial members that the row-level access policy should be created with.grantee_list
is a list ofiam_member
users or groups. Strings must be valid IAM principals, or members, following the format of an IAM Policy Binding member, and must be quoted. The following types are supported:grantee_list
typesuser:{emailid}
An email address that represents a specific Google account.
Example:
user:alice@example.com
serviceAccount:{emailid}
An email address that represents a service account.
Example:
serviceAccount:my-other-app@appspot.gserviceaccount.com
group:{emailid}
An email address that represents a Google group.
Example:
group:admins@example.com
domain:{domain}
The Google Workspace domain (primary) that represents all the users of that domain.
Example:
domain:example.com
allAuthenticatedUsers
A special identifier that represents all service accounts and all users on the internet who have authenticated with a Google Account. This identifier includes accounts that aren't connected to a Google Workspace or Cloud Identity domain, such as personal Gmail accounts. Users who aren't authenticated, such as anonymous visitors, aren't included. allUsers
A special identifier that represents anyone who is on the internet, including authenticated and unauthenticated users. Because BigQuery requires authentication before a user can access the service, allUsers
includes only authenticated users.You can combine a series of
iam_member
values, if they are comma-separated and quoted separately. For example:"user:alice@example.com","group:admins@example.com","user:sales@example.com"
filter_expression
: Defines the subset of table rows to show only to the members of thegrantee_list
. Thefilter_expression
is similar to theWHERE
clause in aSELECT
query.The following are valid filter expressions:
- BigQuery standard SQL scalar functions, aggregate functions, analytic functions.
SESSION_USER()
, to restrict access only to rows that belong to the user running the query. If none of the row-level access policies are applicable to the querying user, then the user has no access to the data in the table.TRUE
. Grants the principals in thegrantee_list
field access to all rows of the table.
The filter expression cannot contain the following:
- A reference to a table.
- Subqueries, or SQL statements such as
SELECT
,CREATE
, orUPDATE
. - User-defined functions.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.rowAccessPolicies.create |
The target table. |
bigquery.rowAccessPolicies.setIamPolicy |
The target table. |
bigquery.tables.getData |
The target table. |
Examples
CREATE CAPACITY
statement
Purchases slots by creating a new capacity commitment.
Syntax
CREATE CAPACITY project_id.location_id.commitment_id AS JSON capacity_json_object
Arguments
project_id
: The project ID of the administration project that will maintain ownership of this commitment.location_id
The location of the project.commitment_id
: Tthe ID of the commitment. The value must be unique to the project and location. It must start and end with a lowercase letter or a number and contain only lowercase letters, numbers and dashes.capacity_json_object
: A JSON string that describes the capacity commitment.
capacity_json_object
Specifies a JSON object that contains the following fields:
NAME |
TYPE |
Details |
---|---|---|
plan |
String | The commitment plan to purchase. Supported values include:
FLEX , MONTHLY , ANNUAL . For more
information, see
Commitment
plans. |
renewal_plan |
String | The commitment renewal plan. Applies only when plan
is ANNUAL . For more information, see
Renewing commitments. |
slot_count |
Integer | The number of slots in the commitment. |
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.capacityCommitments.create
|
The administration project that maintains ownership of the commitments. |
Example
The following example creates a capacity commitment of 100 Flex slots that are
located in the region-us
region and managed by a project admin_project
:
CREATE CAPACITY `admin_project.region-us.my-commitment`
AS JSON """{
"slot_count": 100,
"plan": "FLEX"
}"""
CREATE RESERVATION
statement
Creates a reservation. For more information, see Introduction to Reservations.
Syntax
CREATE RESERVATION project_id.location_id.reservation_id AS JSON reservation_json_object
Arguments
project_id
: The project ID of the administration project where the capacity commitment was created.location_id
: The location of the project.reservation_id
: The reservation ID.reservation_json_object
: A JSON string that describes the reservation.
reservation_json_object
Specifies a JSON object that contains the following fields:
NAME |
TYPE |
Details |
---|---|---|
ignore_idle_slots |
Boolean | If the value is true , then the reservation uses only the
slots that are provisioned to it. The default value is false .
For more information, see
Idle slots. |
slot_capacity |
Integer | The number of slots to allocate to the reservation. |
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.reservations.create
|
The administration project that maintains ownership of the commitments. |
Example
The following example creates a reservation of 100 slots in the project
admin_project
:
CREATE RESERVATION `admin_project.region-us.prod`
AS JSON """{
"slot_capacity": 100
}"""
CREATE ASSIGNMENT
statement
Assigns a project, folder, or organization to a reservation.
Syntax
CREATE ASSIGNMENT project_id.location_id.reservation_id.assignment_id AS JSON assignment_json_object
Arguments
project_id
The project ID of the administration project where the reservation was created.location_id
: The location of the project.reservation_id
: The reservation ID.assignment_id
: The ID of the assignment. The value must be unique to the project and location. It must start and end with a lowercase letter or a number and contain only lowercase letters, numbers and dashes.assignment_json_object
: A JSON string that describes the assignment.
To remove a project from any reservations and use on-demand billing instead, set
reservation_id
to none
.
assignment_json_object
Specifies a JSON object that contains the following fields:
NAME |
TYPE |
Details |
---|---|---|
assignee |
String | The ID of the project, folder, or organization to assign to the reservation. |
job_type |
String | The type of job to assign to this reservation. Supported values include
QUERY , PIPELINE , and ML_EXTERNAL .
For more information, see
Assignments.
|
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.reservationAssignments.create
|
The administration project and the assignee. |
Example
The following example assigns the project my_project
to the prod
reservation
for query jobs:
CREATE ASSIGNMENT `admin_project.region-us.prod.my_assignment`
AS JSON """{
"assignee": "projects/my_project",
"job_type": "QUERY"
}"""
The following example assigns an organization to the prod
reservation for
pipeline jobs, such as load and export jobs:
CREATE ASSIGNMENT `admin_project.region-us.prod.my_assignment`
AS JSON """{
"assignee": "organizations/1234",
"job_type": "PIPELINE"
}"""
CREATE SEARCH INDEX
statement
Creates a new search index on one or more columns of a table.
A search index enables efficient queries using the
SEARCH
function.
Syntax
CREATE SEARCH INDEX [ IF NOT EXISTS ] index_name ON table_name({ALL COLUMNS | column_name [, ...]})
Arguments
IF NOT EXISTS
: If there is already an index by that name on the table, do nothing. If the table has an index by a different name, then return an error.index_name
: The name of the index you're creating. Since the index is always created in the same project and dataset as the base table, there is no need to specify these in the name.table_name
: The name of the table. See Table path syntax.ALL COLUMNS
: Creates an index on every column in the table which contains aSTRING
field.column_name
: The name of a top-level column in the table which is aSTRING
or contains aSTRING
field. The column must be one of the following types:STRING
ARRAY<STRING>
STRUCT
containing at least one nested field of typeSTRING
orARRAY<STRING>
JSON
Details
You can create only one index per base table. You cannot create an index on a
view or materialized view. To modify which columns are
indexed, DROP
the current index and create a new one.
BigQuery returns an error if any column_name
is not a STRING
or does not contain a STRING
field, or if you call CREATE SEARCH INDEX
on
ALL COLUMNS
of a table which contains no STRING
fields.
Creating an index will fail on a table which has column ACLs or row filters; however, these may all be added to the table after creation of the index.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.createIndex |
The base table where you create the index. |
Examples
The following example creates an index called my_index
on all string columns
of my_table
. In this case, the index is only created on column a
.
CREATE TABLE dataset.my_table(a STRING, b INT64); CREATE SEARCH INDEX my_index ON dataset.my_table(ALL COLUMNS);
The following example creates an index on columns a
, my_struct.string_field
,
and b
.
CREATE TABLE dataset.complex_table(
a STRING,
my_struct STRUCT<string_field STRING, int_field INT64>,
b ARRAY<STRING>
);
CREATE SEARCH INDEX my_index
ON dataset.complex_table(a, my_struct, b);
ALTER SCHEMA SET DEFAULT COLLATE
statement
Sets collation specifications on a dataset.
Syntax
ALTER SCHEMA [IF EXISTS] [project_name.]dataset_name SET DEFAULT COLLATE collate_specification
Arguments
IF EXISTS
: If no dataset exists with that name, the statement has no effect.DEFAULT COLLATE collate_specification
: When a new table is created in the schema, the table inherits a default collation specification unless a collation specification is explicitly specified for a column.The updated collation specification only applies to tables created afterwards. If you want to update an existing collation specification, you must alter the column that contains the specification.
project_name
: The name of the project that contains the dataset. Defaults to the project that runs this DDL statement.dataset_name
: The name of the dataset.collate_specification
: Specifies the collation specifications to set.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.datasets.get |
The dataset to alter. |
bigquery.datasets.update |
The dataset to alter. |
Example
Assume you have an existing table, mytable_a
, in a schema called mydataset
.
For example:
CREATE SCHEMA mydataset
CREATE TABLE mydataset.mytable_a ( number INT64, word STRING )
+----------------------+
| mydataset.mytable_a |
| number INT64 |
| word STRING |
+----------------------+
At a later time, you decide to add a collation specification to your schema. For example:
ALTER SCHEMA mydataset SET DEFAULT COLLATE 'und:ci'
If you create a new table for your schema, it inherits COLLATE 'und:ci'
for
all STRING
columns. For example, collation is added to characters
when you create the mytable_b
table in the mydataset
schema:
CREATE TABLE mydataset.mytable_b ( amount INT64, characters STRING )
+--------------------------------------+
| mydataset.mytable_b |
| amount INT64 |
| characters STRING COLLATE 'und:ci' |
+--------------------------------------+
However, although you have updated the collation specification for the schema,
your existing table, mytable_a
, continues to use the previous
collation specification. For example:
+---------------------+
| mydataset.mytable_a |
| number INT64 |
| word STRING |
+---------------------+
ALTER SCHEMA SET OPTIONS
statement
Sets options on a dataset.
The statement runs in the location of the dataset if the dataset exists, unless you specify the location in the query settings. For more information, see Specifying your location.
Syntax
ALTER SCHEMA [IF EXISTS] [project_name.]dataset_name SET OPTIONS(schema_set_options_list)
Arguments
IF EXISTS
: If no dataset exists with that name, the statement has no effect.project_name
: The name of the project that contains the dataset. Defaults to the project that runs this DDL statement.dataset_name
: The name of the dataset.schema_set_options_list
: The list of options to set.
schema_set_options_list
The option list specifies options for the dataset. Specify the options in the
following format: NAME=VALUE, ...
The following options are supported:
NAME |
VALUE |
Details |
---|---|---|
default_kms_key_name |
STRING |
Specifies the default Cloud KMS key for encrypting table data in this dataset. You can override this value when you create a table. |
default_partition_expiration_days |
FLOAT64 |
Specifies the default expiration time, in days, for table partitions in this dataset. You can override this value when you create a table. |
default_table_expiration_days |
FLOAT64 |
Specifies the default expiration time, in days, for tables in this dataset. You can override this value when you create a table. |
description |
STRING |
The description of the dataset. |
friendly_name |
STRING |
A descriptive name for the dataset. |
labels |
<ARRAY<STRUCT<STRING, STRING>>> |
An array of labels for the dataset, expressed as key-value pairs. |
location |
STRING |
The location in which to create the dataset. If you don't specify this option, the dataset is created in the location where the query runs. If you specify this option and also explicitly set the location for the query job, the two values must match; otherwise the query fails. |
max_time_travel_hours |
SMALLINT |
In preview. Specifies the duration in hours of the time travel window for the
dataset. The For more information on the time travel window, see Configuring the time travel window. |
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.datasets.get |
The dataset to alter. |
bigquery.datasets.update |
The dataset to alter. |
Example
The following example sets the default table expiration.
ALTER SCHEMA mydataset SET OPTIONS( default_table_expiration_days=3.75 )
ALTER TABLE SET OPTIONS
statement
Sets the options on a table.
Syntax
ALTER TABLE [IF EXISTS] table_name SET OPTIONS(table_set_options_list)
Arguments
IF EXISTS
: If no table exists with that name, the statement has no effect.table_name
: The name of the table to alter. See Table path syntax.table_set_options_list
: The list of options to set.
Details
This statement is not supported for external tables.
table_set_options_list
The option list allows you to set table options such as a label and an expiration time. You can include multiple options using a comma-separated list.
Specify a table option list in the following format:
NAME=VALUE, ...
NAME
and VALUE
must be one of the following combinations:
NAME |
VALUE |
Details |
---|---|---|
expiration_timestamp |
TIMESTAMP |
Example: This property is equivalent to the expirationTime table resource property. |
partition_expiration_days |
|
Example: Sets the partition expiration in days. For more information, see Set the partition expiration. By default, partitions do not expire. This property is equivalent to the timePartitioning.expirationMs table resource property but uses days instead of milliseconds. One day is equivalent to 86400000 milliseconds, or 24 hours. This property can only be set if the table is partitioned. |
require_partition_filter |
|
Example: Specifies whether queries on this table must include a a predicate
filter that filters on the partitioning column. For more information,
see
Set partition filter requirements. The default value is
This property is equivalent to the timePartitioning.requirePartitionFilter table resource property. This property can only be set if the table is partitioned. |
kms_key_name |
|
Example: This property is equivalent to the encryptionConfiguration.kmsKeyName table resource property. See more details about Protecting data with Cloud KMS keys. |
friendly_name |
|
Example: This property is equivalent to the friendlyName table resource property. |
description |
|
Example: This property is equivalent to the description table resource property. |
labels |
|
Example: This property is equivalent to the labels table resource property. |
VALUE
is a constant expression containing only literals, query parameters,
and scalar functions.
The constant expression cannot contain:
- A reference to a table
- Subqueries or SQL statements such as
SELECT
,CREATE
, orUPDATE
- User-defined functions, aggregate functions, or analytic functions
- The following scalar functions:
ARRAY_TO_STRING
REPLACE
REGEXP_REPLACE
RAND
FORMAT
LPAD
RPAD
REPEAT
SESSION_USER
GENERATE_ARRAY
GENERATE_DATE_ARRAY
Setting the value replaces the existing value of that option for the table, if
there was one. Setting the value to NULL
clears the table's value for that
option.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.get |
The table to alter. |
bigquery.tables.update |
The table to alter. |
Examples
Setting the expiration timestamp and description on a table
The following example sets the expiration timestamp on a table to seven days
from the execution time of the ALTER TABLE
statement, and sets the description
as well:
ALTER TABLE mydataset.mytable SET OPTIONS ( expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 7 DAY), description="Table that expires seven days from now" )
Setting the require partition filter attribute on a partitioned table
The following example sets the
timePartitioning.requirePartitionFilter
attribute on a partitioned table:
ALTER TABLE mydataset.mypartitionedtable SET OPTIONS (require_partition_filter=true)
Queries that reference this table must use a filter on the partitioning column,
or else BigQuery returns an error. Setting this option to true
can help prevent mistakes in querying more data than intended.
Clearing the expiration timestamp on a table
The following example clears the expiration timestamp on a table so that it will not expire:
ALTER TABLE mydataset.mytable SET OPTIONS (expiration_timestamp=NULL)
ALTER TABLE ADD COLUMN
statement
Adds one or more new columns to an existing table schema.
Syntax
ALTER TABLE table_name
ADD COLUMN [IF NOT EXISTS] column[, ...]
Arguments
table_name
: The name of the table. See Table path syntax.IF EXISTS
: If the column name already exists, the statement has no effect.column
: The column to add. This include the name of the column and schema to add. This column name and schema use the same syntax used inCREATE TABLE
statement.
Details
You cannot use this statement to create:
- Partitioned columns.
- Clustered columns.
- Nested columns inside existing
RECORD
fields.
You cannot add a REQUIRED
column to an existing table schema. However, you
can create a nested REQUIRED
column as part of a new RECORD
field.
This statement is not supported for external tables.
Without the IF NOT EXISTS
clause, if the table already contains a column with
that name, the statement returns an error. If the IF NOT EXISTS
clause is
included and the column name already exists, no error is returned, and no
action is taken.
The value of the new column for existing rows is set to one of the following:
NULL
if the new column was added withNULLABLE
mode. This is the default mode.- An empty
ARRAY
if the new column was added withREPEATED
mode.
For more information about schema modifications in BigQuery, see Modifying table schemas.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.get |
The table to alter. |
bigquery.tables.update |
The table to alter. |
Examples
Adding columns
The following example adds the following columns to an existing table named
mytable
:
- Column
A
of typeSTRING
. - Column
B
of typeGEOGRAPHY
. - Column
C
of typeNUMERIC
withREPEATED
mode. - Column
D
of typeDATE
with a description.
ALTER TABLE mydataset.mytable
ADD COLUMN A STRING,
ADD COLUMN IF NOT EXISTS B GEOGRAPHY,
ADD COLUMN C ARRAY<NUMERIC>,
ADD COLUMN D DATE OPTIONS(description="my description")
If any of the columns named A
, C
, or D
already exist, the statement fails.
If column B
already exists, the statement succeeds because of the IF NOT
EXISTS
clause.
Adding a RECORD
column
The following example adds a column named A
of type STRUCT
that contains the
following nested columns:
- Column
B
of typeGEOGRAPHY
. - Column
C
of typeINT64
withREPEATED
mode. - Column
D
of typeINT64
withREQUIRED
mode. - Column
E
of typeTIMESTAMP
with a description.
ALTER TABLE mydataset.mytable
ADD COLUMN A STRUCT<
B GEOGRAPHY,
C ARRAY<INT64>,
D INT64 NOT NULL,
E TIMESTAMP OPTIONS(description="creation time")
>
The query fails if the table already has a column named A
, even if that
column does not contain any of the nested columns that are specified.
The new STRUCT
named A
is nullable, but the nested column D
within A
is
required for any STRUCT
values of A
.
Adding collation support to a column
When you create a new column for your table, you can specifically assign a new collation specification to that column.
ALTER TABLE mydataset.mytable ADD COLUMN word STRING COLLATE 'und:ci'
ALTER TABLE RENAME TO
statement
Renames a clone, snapshot or table.
Syntax
ALTER TABLE [IF EXISTS] table_name RENAME TO new_table_name
Arguments
IF EXISTS
: If no table exists with that name, the statement has no effect.table_name
: The name of the table to rename. See Table path syntax.new_table_name
: The new name of the table. The new name cannot be an existing table name.
Details
- This statement is not supported for external tables.
- If you change table policies or row-level access policies when you rename the table, then those changes might not be effective.
- If you want to rename a table that has data streaming into it, you must stop the streaming and wait for BigQuery to indicate that streaming is not in use.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.get |
The table to alter. |
bigquery.tables.update |
The table to alter. |
Examples
Renaming a table
The following example renames the table mydataset.mytable
to
mydataset.mynewtable
:
ALTER TABLE mydataset.mytable RENAME TO mynewtable
ALTER TABLE DROP COLUMN
statement
Drops one or more columns from an existing table schema.
Syntax
ALTER TABLE table_name
DROP COLUMN [IF EXISTS] column_name [, ...]
Arguments
table_name
: The name of the table to alter. See Table path syntax. The table must already exist and have a schema.IF EXISTS
: If the specified column does not exist, the statement has no effect.column_name
: The name of the column to drop.
Details
The statement does not immediately free up the storage that's associated with the dropped column. Storage is claimed in the background over the period of 7 days from the day that a column is dropped.
For information about immediately reclaiming storage, see Deleting a column from a table schema.
You cannot use this statement to drop the following:
- Partitioned columns
- Clustered columns
- Nested columns inside existing
RECORD
fields
This statement is not supported for external tables.
Without the IF EXISTS
clause, if the table does not contain a column with that
name, then the statement returns an error. If the IF EXISTS
clause is included and
the column name does not exist, then no error is returned, and no action is taken.
This statement only removes the column from the table. Any objects that refer to the column, such as views or materialized views, must be updated or recreated separately.
For more information about schema modifications in BigQuery, see Modifying table schemas.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.get |
The table to alter. |
bigquery.tables.update |
The table to alter. |
Examples
Dropping columns
The following example drops the following columns from an existing table named
mytable
:
- Column
A
- Column
B
ALTER TABLE mydataset.mytable
DROP COLUMN A,
DROP COLUMN IF EXISTS B
If the column named A
does not exist, then the statement fails. If column B
does not exist, then the statement still succeeds because of the IF EXISTS
clause.
ALTER TABLE SET DEFAULT COLLATE
statement
Sets collation specifications on a table.
Syntax
ALTER TABLE table_name SET DEFAULT COLLATE collate_specification
Arguments
table_name
: The name of the table to alter. See Table path syntax. The table must already exist and have a schema.SET DEFAULT COLLATE collate_specification
: When a new column is created in the schema, and if the column does not have an explicit collation specification, the column inherits this colloation specification forSTRING
types. The updated collation specification only applies to columns added afterwards.If you want to update an existing collation specification, you must alter the column that contains the specification. If you want to add a collation specification on a new column in an existing table, you can do this when you add the column. If you add a collation specification directly on a column, the collation specification for the column has precedence over a table's default collation specification.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.get |
The table to alter. |
bigquery.tables.update |
The table to alter. |
Example
Assume you have an existing table, mytable
, in a schema called mydataset
.
CREATE TABLE mydataset.mytable ( number INT64, word STRING ) DEFAULT COLLATE 'und:ci'
When you create mytable
, all STRING
columns inherit COLLATE 'und:ci'
.
The resulting table has this structure:
+--------------------------------+
| mydataset.mytable |
| number INT64 |
| word STRING COLLATE 'und:ci' |
+--------------------------------+
At a later time, you decide to change the collation specification for your table.
ALTER TABLE mydataset.mytable SET DEFAULT COLLATE ''
Although you have updated the collation specification, your existing column,
word
, continues to use the previous collation specification.
+--------------------------------+
| mydataset.mytable |
| number INT64 |
| word STRING COLLATE 'und:ci' |
+--------------------------------+
However, if you create a new column for your table, the new column includes the
new collation specification. In the following example a column called name
is added. Because the new collation specification is empty, the default
collation specification is used.
ALTER TABLE mydataset.mytable ADD COLUMN name STRING
+--------------------------------+
| mydataset.mytable |
| number INT64 |
| word STRING COLLATE 'und:ci' |
| name STRING COLLATE |
+--------------------------------+
ALTER COLUMN SET OPTIONS
statement
Sets options, such as the column description, on a column in a table in BigQuery.
Syntax
ALTER TABLE [IF EXISTS] table_name ALTER COLUMN [IF EXISTS] column_name SET OPTIONS(column_set_options_list)
Arguments
(ALTER TABLE) IF EXISTS
: If no table exists with that name, the statement has no effect.table_name
: The name of the table to alter. See Table path syntax.(ALTER COLUMN) IF EXISTS
: If the specified column does not exist, the statement has no effect.column_name
: The name of the top level column you're altering. Modifying subfields, such as nested columns in aSTRUCT
, is not supported.column_set_options_list
: The list of options to set on the column.
Details
This statement is not supported for external tables.
column_set_options_list
Specify a column option list in the following format:
NAME=VALUE, ...
NAME
and VALUE
must be one of the following combinations:
NAME |
VALUE |
Details |
---|---|---|
description |
|
Example: |
VALUE
is a constant expression containing only literals, query parameters,
and scalar functions.
The constant expression cannot contain:
- A reference to a table
- Subqueries or SQL statements such as
SELECT
,CREATE
, orUPDATE
- User-defined functions, aggregate functions, or analytic functions
- The following scalar functions:
ARRAY_TO_STRING
REPLACE
REGEXP_REPLACE
RAND
FORMAT
LPAD
RPAD
REPEAT
SESSION_USER
GENERATE_ARRAY
GENERATE_DATE_ARRAY
Setting the VALUE
replaces the existing value of that option for the column, if
there was one. Setting the VALUE
to NULL
clears the column's value for that
option.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.get |
The table to alter. |
bigquery.tables.update |
The table to alter. |
Examples
The following example sets a new description on a column called price
:
ALTER TABLE mydataset.mytable ALTER COLUMN price SET OPTIONS ( description="Price per unit" )
ALTER COLUMN DROP NOT NULL
statement
Removes a NOT NULL
constraint from a column in a table in BigQuery.
Syntax
ALTER TABLE [IF EXISTS] table_name ALTER COLUMN [IF EXISTS] column DROP NOT NULL
Arguments
(ALTER TABLE) IF EXISTS
: If no table exists with that name, the statement has no effect.table_name
: The name of the table to alter. See Table path syntax.(ALTER COLUMN) IF EXISTS
: If the specified column does not exist, the statement has no effect.column_name
: The name of the top level column you're altering. Modifying subfields is not supported.
Details
If a column does not have a NOT NULL
constraint the query returns an error.
This statement is not supported for external tables.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.get |
The table to alter. |
bigquery.tables.update |
The table to alter. |
Examples
The following example removes the NOT NULL
constraint from a column called mycolumn
:
ALTER TABLE mydataset.mytable ALTER COLUMN mycolumn DROP NOT NULL
ALTER COLUMN SET DATA TYPE
statement
Changes the data type of a column in a table in BigQuery
to a less restrictive data type. For example, a NUMERIC
data type can be changed
to a BIGNUMERIC
type but not the reverse.
Syntax
ALTER TABLE [IF EXISTS] table_name ALTER COLUMN [IF EXISTS] column_name SET DATA TYPE column_schema
Arguments
(ALTER TABLE) IF EXISTS
: If no table exists with that name, the statement has no effect.table_name
: The name of the table to alter. See Table path syntax.(ALTER COLUMN) IF EXISTS
: If the specified column does not exist, the statement has no effect.column_name
: The name of the top level column you're altering. Modifying subfields is not supported.column_schema
: The schema that you're converting the column to. This schema uses the same syntax used in theCREATE TABLE
statement.
Details
For a table of valid data type coercions, compare the "From Type" column to the "Coercion To" column in the Conversion rules in Standard SQL page.
The following are examples of valid data type coercions:
- INT64 to NUMERIC, BIGNUMERIC, FLOAT64
- NUMERIC to BIGNUMERIC, FLOAT64
This statement is not supported for external tables.
Without the IF EXISTS clause, if the table does not contain a column with that name, the statement returns an error. If the IF EXISTS clause is included and the column name does not exist, no error is returned, and no action is taken.
You can also coerce data types from more restrictive to less restrictive parameterized data types. For example, you can increase the maximum length of a string type or increase the precision or scale of a numeric type.
The following are examples of valid parameterized data type changes:
- NUMERIC(6,10) to NUMERIC(8,12)
- NUMERIC to BIGNUMERIC(40, 20)
- STRING(5) to STRING(7)
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.get |
The table to alter. |
bigquery.tables.update |
The table to alter. |
Examples
Changing the data type for a column
The following example changes the data type of column c1
from an INT64
to NUMERIC
:
CREATE TABLE dataset.table(c1 INT64); ALTER TABLE dataset.table ALTER COLUMN c1 SET DATA TYPE NUMERIC;
Changing the data type for a field
The following example changes the data type of one of the fields in the s1
column:
CREATE TABLE dataset.table(s1 STRUCT<a INT64, b STRING>); ALTER TABLE dataset.table ALTER COLUMN s1 SET DATA TYPE STRUCT<a NUMERIC, b STRING>;
Changing precision
The following example changes the precision of a parameterized data type column:
CREATE TABLE dataset.table (pt NUMERIC(7,2)); ALTER TABLE dataset.table ALTER COLUMN pt SET DATA TYPE NUMERIC(8,2);
ALTER VIEW SET OPTIONS
statement
Sets the options on a view.
Syntax
ALTER VIEW [IF EXISTS] view_name SET OPTIONS(view_set_options_list)
Arguments
IF EXISTS
: If no view exists with that name, the statement has no effect.view_name
: The name of the view to alter. See Table path syntax.view_set_options_list
: The list of options to set.
view_set_options_list
The option list allows you to set view options such as a label and an expiration time. You can include multiple options using a comma-separated list.
Specify a view option list in the following format:
NAME=VALUE, ...
NAME
and VALUE
must be one of the following combinations:
NAME |
VALUE |
Details |
---|---|---|
expiration_timestamp |
TIMESTAMP |
Example: This property is equivalent to the expirationTime table resource property. |
friendly_name |
|
Example: This property is equivalent to the friendlyName table resource property. |
description |
|
Example: This property is equivalent to the description table resource property. |
labels |
|
Example: This property is equivalent to the labels table resource property. |
VALUE
is a constant expression containing only literals, query parameters,
and scalar functions.
The constant expression cannot contain:
- A reference to a table
- Subqueries or SQL statements such as
SELECT
,CREATE
, orUPDATE
- User-defined functions, aggregate functions, or analytic functions
- The following scalar functions:
ARRAY_TO_STRING
REPLACE
REGEXP_REPLACE
RAND
FORMAT
LPAD
RPAD
REPEAT
SESSION_USER
GENERATE_ARRAY
GENERATE_DATE_ARRAY
Setting the value replaces the existing value of that option for the view, if
there was one. Setting the value to NULL
clears the view's value for that
option.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.get |
The view to alter. |
bigquery.tables.update |
The view to alter. |
Examples
Setting the expiration timestamp and description on a view
The following example sets the expiration timestamp on a view to seven days
from the execution time of the ALTER VIEW
statement, and sets the description
as well:
ALTER VIEW mydataset.myview SET OPTIONS ( expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 7 DAY), description="View that expires seven days from now" )
ALTER MATERIALIZED VIEW SET OPTIONS
statement
Sets the options on a materialized view.
Syntax
ALTER MATERIALIZED VIEW [IF EXISTS] materialized_view_name SET OPTIONS(materialized_view_set_options_list)
Arguments
IF EXISTS
: If no materialized view exists with that name, the statement has no effect.materialized_view_name
: The name of the materialized view to alter. See Table path syntax.materialized_view_set_options_list
: The list of options to set.
materialized_view_set_options_list
The option list allows you to set materialized view options such as a whether refresh is enabled. the refresh interval, a label and an expiration time. You can include multiple options using a comma-separated list.
Specify a materialized view option list in the following format:
NAME=VALUE, ...
NAME
and VALUE
must be one of the following combinations:
NAME |
VALUE |
Details |
---|---|---|
enable_refresh |
BOOLEAN |
Example: |
refresh_interval_minutes |
FLOAT64 |
Example: |
expiration_timestamp |
TIMESTAMP |
Example: This property is equivalent to the expirationTime table resource property. |
friendly_name |
|
Example: This property is equivalent to the friendlyName table resource property. |
description |
|
Example: This property is equivalent to the description table resource property. |
labels |
|
Example: This property is equivalent to the labels table resource property. |
Setting the value replaces the existing value of that option for the
materialized view, if there was one. Setting the value to NULL
clears the
materialized view's value for that option.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.get |
The materialized view to alter. |
bigquery.tables.update |
The materialized view to alter. |
Examples
Setting the enable refresh state and refresh interval on a materialized view
The following example enables refresh and sets the refresh interval to 20 minutes on a materialized view:
ALTER MATERIALIZED VIEW mydataset.my_mv
SET OPTIONS (
enable_refresh=true,
refresh_interval_minutes=20
)
ALTER ORGANIZATION SET OPTIONS
statement
Sets the options on an organization.
Syntax
ALTER ORGANIZATION SET OPTIONS ( organization_set_options_list);
Arguments
organization_set_options_list
: The list of options to set.
organization_set_options_list
The option list specifies options for the organization. Specify the options in the
following format: NAME=VALUE, ...
The following options are supported:
NAME |
VALUE |
Details |
---|---|---|
default_kms_key_name |
STRING |
Example: This property is equivalent to the
See more details about protecting data with Cloud KMS keys. |
default_time_zone |
STRING |
Example: `region-us.default_time_zone` = "America/Los_Angeles" . Sets the default time zone to America/Los_Angeles in the us region. |
default_query_job_timeout_ms |
INT64 |
Example: `region-us.default_query_job_timeout_ms` = 1800000 . Sets the default query job timeout time to 30 minutes for an organization in the us region. |
Setting the value replaces the existing value of that option for the
organization, if there is one. Setting the value to NULL
clears the
organization's value for that option.
Required permissions
The ALTER ORGANIZATION SET OPTIONS
statement requires the following
IAM permissions:
Permission | Resource |
---|---|
bigquery.config.update |
The organization to alter. |
Examples
The following example sets the default time zone to America/Chicago and the default query job timeout to one hour for an organization in the US region.
ALTER ORGANIZATION
SET OPTIONS (
`region-us.default_time_zone` = "America/Chicago",
`region-us.default_job_query_timeout_ms` = 3600000
);
ALTER PROJECT SET OPTIONS
statement
Sets the options on a project.
Syntax
ALTER PROJECT project_id SET OPTIONS (project_set_options_list);
Arguments
project_id
: The name of the project you're altering. Defaults to the project that runs this DDL query.project_set_options_list
: The list of options to set.
project_set_options_list
The option list specifies options for the project. Specify the options in the
following format: NAME=VALUE, ...
The following options are supported:
NAME |
VALUE |
Details |
---|---|---|
default_kms_key_name |
STRING |
Example: This property is equivalent to the
See more details about protecting data with Cloud KMS keys. |
default_time_zone |
STRING |
Example: `region-us.default_time_zone` = "America/Los_Angeles" . Sets the default time zone to America/Los_Angeles in the us region. |
default_query_job_timeout_ms |
INT64 |
Example: `region-us.default_query_job_timeout_ms` = 1800000 . Sets the default query job timeout time to 30 minutes for a project in the us region. |
Setting the value replaces the existing value of that option for the project, if there was one. Setting the value to NULL
clears the
project's value for that option.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.config.update |
The project to alter. |
Examples
The following example sets the default time zone to America/New_York
and the default query job timeout to 30 minutes for a project in the us
region.
ALTER PROJECT project_id
SET OPTIONS (
`region-us.default_time_zone` = "America/New_York",
`region-us.default_job_query_timeout_ms` = 1800000
);
ALTER BI_CAPACITY SET OPTIONS
statement
Sets the options on BigQuery BI Engine capacity.
Syntax
ALTER BI_CAPACITY `project_id.location_id.default` SET OPTIONS(bi_capacity_options_list)
Arguments
project_id
: Optional project ID of the project that will benefit from BI Engine acceleration. If omitted, the query project ID is used.location_id
: The location where data needs to be cached, prefixed withregion-
. Examples:region-us
,region-us-central1
.bi_capacity_options_list
: The list of options to set.
bi_capacity_options_list
The option list specifies a set of options for BigQuery BI Engine capacity.
Specify a column option list in the following format:
NAME=VALUE, ...
The following options are supported:
NAME |
VALUE |
Details |
---|---|---|
size_gb |
INT64 |
Specifies the size of the reservation in gigabytes. |
preferred_tables |
<ARRAY<STRING>> |
List of tables that acceleration should be applied to. Format:
project.dataset.table or dataset.table . If project is omitted, query project
is used. |
Setting VALUE
replaces the existing value of that option for the BI Engine
capacity, if there is one. Setting VALUE
to NULL
clears the value
for that option.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.bireservations.update |
BI Engine reservation |
Examples
Allocating BI Engine capacity without preferred tables
ALTER BI_CAPACITY `my-project.region-us.default` SET OPTIONS( size_gb = 250 )
Deallocating BI capacity
ALTER BI_CAPACITY `my-project.region-us.default` SET OPTIONS( size_gb = 0 )
Removing a set of preferred tables from reservation
ALTER BI_CAPACITY `my-project.region-us.default` SET OPTIONS( preferred_tables = NULL )
Allocating BI Capacity with preferred tables list
ALTER BI_CAPACITY `my-project.region-us.default` SET OPTIONS( size_gb = 250, preferred_tables = ["data_project1.dataset1.table1", "data_project2.dataset2.table2"] )
Overwriting list of preferred tables without changing the size
ALTER BI_CAPACITY `region-us.default` SET OPTIONS( preferred_tables = ["dataset1.table1", "data_project2.dataset2.table2"] )
DROP SCHEMA
statement
Deletes a dataset.
Syntax
DROP SCHEMA [IF EXISTS]
[project_name.]dataset_name
[ CASCADE | RESTRICT ]
Arguments
IF EXISTS
: If no dataset exists with that name, the statement has no effect.project_name
: The name of the project that contains the dataset. Defaults to the project that runs this DDL statement.dataset_name
: The name of the dataset to delete.CASCADE
: Deletes the dataset and all resources within the dataset, such as tables, views, and functions. You must have permission to delete the resources, or else the statement returns an error. For a list of BigQuery permissions, see Predefined roles and permissions.RESTRICT
: Deletes the dataset only if it's empty. Otherwise, returns an error. If you don't specify eitherCASCADE
orRESTRICT
, then the default behavior isRESTRICT
.
Details
The statement runs in the location of the dataset if it exists, unless you specify the location in the query settings. For more information, see Specifying your location.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.datasets.delete |
The dataset to delete. |
bigquery.tables.delete
|
The dataset to delete. If the dataset is empty, then this permission is not required. |
Examples
The following example deletes the dataset named mydataset
. If the dataset does
not exist or is not empty, then the statement returns an error.
DROP SCHEMA mydataset
The following example drops the dataset named mydataset
and any resources
in that dataset. If the dataset does not exist, then no error is returned.
DROP SCHEMA IF EXISTS mydataset CASCADE
DROP TABLE
statement
Deletes a table or table clone.
Syntax
DROP TABLE [IF EXISTS] table_name
Arguments
IF EXISTS
: If no table exists with that name, the statement has no effect.table_name
: The name of the table to delete. See Table path syntax.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.delete |
The table to delete. |
bigquery.tables.get |
The table to delete. |
Examples
Deleting a table
The following example deletes a table named mytable
in the mydataset
:
DROP TABLE mydataset.mytable
If the table name does not exist in the dataset, the following error is returned:
Error: Not found: Table myproject:mydataset.mytable
Deleting a table only if the table exists
The following example deletes a table named mytable
in mydataset
only if
the table exists. If the table name does not exist in the dataset, no error is
returned, and no action is taken.
DROP TABLE IF EXISTS mydataset.mytable
DROP SNAPSHOT TABLE
statement
Deletes a table snapshot.
Syntax
DROP SNAPSHOT TABLE [IF EXISTS] table_snapshot_name
Arguments
IF EXISTS
: If no table snapshot exists with that name, then the statement has no effect.table_snapshot_name
: The name of the table snapshot to delete. See Table path syntax.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.deleteSnapshot |
The table snapshot to delete. |
Examples
Delete a table snapshot: fail if it doesn't exist
The following example deletes the table snapshot named mytablesnapshot
in the
mydataset
dataset:
DROP SNAPSHOT TABLE mydataset.mytablesnapshot
If the table snapshot does not exist in the dataset, then the following error is returned:
Error: Not found: Table snapshot myproject:mydataset.mytablesnapshot
Delete a table snapshot: ignore if it doesn't exist
The following example deletes the table snapshot named mytablesnapshot
in the
mydataset
dataset.
DROP SNAPSHOT TABLE IF EXISTS mydataset.mytablesnapshot
If the table snapshot doesn't exist in the dataset, then no action is taken, and no error is returned.
For information about creating table snapshots, see CREATE SNAPSHOT TABLE.
For information about restoring table snapshots, see CREATE TABLE CLONE.
DROP EXTERNAL TABLE
statement
Deletes an external table.
Syntax
DROP EXTERNAL TABLE [IF EXISTS] table_name
Arguments
IF EXISTS
: If no external table exists with that name, then the statement has no effect.table_name
: The name of the external table to delete. See Table path syntax.
Details
If table_name
exists but is not an external table, the statement returns the following
error:
Cannot drop table_name which has type TYPE. An
external table was expected.
The DROP EXTERNAL
statement only removes the external table definition from
BigQuery. The data stored in the external location is not
affected.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.delete |
The external table to delete. |
bigquery.tables.get |
The external table to delete. |
Examples
The following example drops the external table named external_table
from the
dataset mydataset
. It returns an error if the external table does not exist.
DROP EXTERNAL TABLE mydataset.external_table
The following example drops the external table named external_table
from the
dataset mydataset
. If the external table does not exist, no error is returned.
DROP EXTERNAL TABLE IF EXISTS mydataset.external_table
DROP VIEW
statement
Deletes a view.
Syntax
DROP VIEW [IF EXISTS] view_name
Arguments
IF EXISTS
: If no view exists with that name, the statement has no effect.view_name
: The name of the view to delete. See Table path syntax.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.delete |
The view to delete. |
bigquery.tables.get |
The view to delete. |
Examples
Deleting a view
The following example deletes a view named myview
in mydataset
:
DROP VIEW mydataset.myview
If the view name does not exist in the dataset, the following error is returned:
Error: Not found: Table myproject:mydataset.myview
Deleting a view only if the view exists
The following example deletes a view named myview
in mydataset
only if
the view exists. If the view name does not exist in the dataset, no error is
returned, and no action is taken.
DROP VIEW IF EXISTS mydataset.myview
DROP MATERIALIZED VIEW
statement
Deletes a materialized view.
Syntax
DROP MATERIALIZED VIEW [IF EXISTS] mv_name
Arguments
IF EXISTS
: If no materialized view exists with that name, the statement has no effect.mv_name
: The name of the materialized view to delete. See Table path syntax.
Required permissions
This statement requires the following IAM permissions:
Permission | Resource |
---|---|
bigquery.tables.delete |
The materialized view to delete. |
bigquery.tables.get |
The materialized view to delete. |
Examples
Deleting a materialized view
The following example deletes a materialized view named my_mv
in mydataset
:
DROP MATERIALIZED VIEW mydataset.my_mv
If the materialized view name does not exist in the dataset, the following error is returned:
Error: Not found: Table myproject:mydataset.my_mv
If you are deleting a materialized view in another project, you must specify the
project, dataset, and materialized view in the following format:
`project_id.dataset.materialized_view`
(including the backticks if project_id
contains special characters); for example,
`myproject.mydataset.my_mv`
.
Deleting a materialized view only if it exists
The following example deletes a materialized view named my_mv
in mydataset
only if the materialized view exists. If the materialized view name does not
exist in the dataset, no error is returned, and no action is taken.
DROP MATERIALIZED VIEW IF EXISTS mydataset.my_mv
If you are deleting a materialized view in another project, you must specify the
project, dataset, and materialized view in the following format:
`project_id.dataset.materialized_view`,
(including the backticks if project_id
contains special characters); for example,
`myproject.mydataset.my_mv`
.
DROP FUNCTION
statement
Deletes a persistent user-defined function (UDF).
Syntax
DROP FUNCTION [IF EXISTS] [[project_name.]dataset_name.]function_name
Arguments
IF EXISTS
: If no function exists with that name, the statement has no effect.project_name
: The name of the project containing the function to delete. Defaults to the project that runs this DDL query. If the project name contains special characters such as colons, it should be quoted in backticks`
(example:`google.com:my_project`
).