This document describes how to modify the schema definitions for existing BigQuery tables. BigQuery natively supports the following schema modifications:
- Adding columns to a schema definition
- Relaxing a column's mode from REQUIRED to NULLABLE
It is valid to create a table without defining an initial schema and to add a schema definition to the table at a later time.
All other schema modifications are unsupported and require manual workarounds, including:
- Changing a column's name
- Changing a column's data type
- Changing a column's mode (aside from relaxing REQUIRED columns to NULLABLE)
- Deleting a column
For information on unsupported schema changes that require workarounds, see Manually changing table schemas.
Adding columns to a table's schema definition
You can add columns to an existing table's schema definition:
- Manually (creates an empty column)
- When you use a load or query job to overwrite a table
- When you append data to a table using a load or query job
Any column you add must adhere to BigQuery's rules for column names. For more information on creating schema components, see Specifying a schema.
Manually adding an empty column
You can add an empty column to an existing table by:
- Using the Cloud Console
- Using the bq command-line tool's bq update command
- Calling the tables.patch API method
- Using the ALTER TABLE ADD COLUMN data definition language (DDL) statement
- Using the client libraries
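For the DDL route, the following is a minimal sketch that runs the statement through the Python client library; the table and column names are hypothetical examples, not values from this guide:

from google.cloud import bigquery

client = bigquery.Client()

# The added column is NULLABLE; REQUIRED columns cannot be added to an
# existing table. Table and column names below are hypothetical.
ddl = """
ALTER TABLE mydataset.mytable
ADD COLUMN IF NOT EXISTS new_column STRING
"""
client.query(ddl).result()  # Waits for the DDL statement to finish.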
If you add new columns to an existing table schema, the columns must be NULLABLE or REPEATED. You cannot add a REQUIRED column to an existing table schema. If you attempt to add a REQUIRED column to an existing table schema in the API or bq command-line tool, the following error is returned:

BigQuery error in update operation: Provided Schema does not match Table project_id:dataset.table. Cannot add required columns to an existing schema.

REQUIRED columns can be added only when you create a table while loading data, or when you create an empty table with a schema definition.
After adding a new column to your table's schema definition, you can load data into the new column by using a:
- DML statement to perform a bulk update on every row
- Load job that overwrites the table
- Query result that overwrites the table
To add empty columns to a table's schema definition:
Console
In the Explorer panel, expand your project and dataset, then select the table.
In the details panel, click the Schema tab.
Click Edit schema. You might need to scroll to see this button.
In the Current schema page, under New fields, click Add field.
When you are done adding columns, click Save.
bq
Issue the bq update command and provide a JSON schema file. If the table you're updating is in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.
bq update project_id:dataset.table schema
Where:
- project_id is your project ID.
- dataset is the name of the dataset that contains the table you're updating.
- table is the name of the table you're updating.
- schema is the path to the JSON schema file on your local machine.
When you specify the schema using the bq command-line tool, you cannot include a RECORD (STRUCT) type, you cannot include a column description, and you cannot specify the column's mode. All modes default to NULLABLE.
If you attempt to add columns using an inline schema definition, you must supply the entire schema definition, including the new columns. Because you cannot specify column modes using an inline schema definition, the update will attempt to change any existing REPEATED column to NULLABLE, which results in the following error:

BigQuery error in update operation: Provided Schema does not match Table project_id:dataset.table. Field field has changed mode from REPEATED to NULLABLE.

The preferred method of adding columns to an existing table using the bq command-line tool is to supply a JSON schema file.
To add empty columns to a table's schema using a JSON schema file:
First, issue the bq show command with the --schema flag and write the existing table schema to a file. If the table you're updating is in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.

bq show \
--schema \
--format=prettyjson \
project_id:dataset.table > schema_file
Where:
- project_id is your project ID.
- dataset is the name of the dataset that contains the table you're updating.
- table is the name of the table you're updating.
- schema_file is the schema definition file written to your local machine.
For example, to write the schema definition of mydataset.mytable to a file, enter the following command. mydataset.mytable is in your default project.

bq show \
--schema \
--format=prettyjson \
mydataset.mytable > /tmp/myschema.json
Open the schema file in a text editor. The schema should look like the following:
[ { "mode": "REQUIRED", "name": "column1", "type": "STRING" }, { "mode": "REQUIRED", "name": "column2", "type": "FLOAT" }, { "mode": "REPEATED", "name": "column3", "type": "STRING" } ]
Add the new columns to the end of the schema definition. If you attempt to add new columns elsewhere in the array, the following error is returned: BigQuery error in update operation: Precondition Failed.

Using a JSON file, you can specify descriptions, NULLABLE or REPEATED modes, and RECORD types for new columns. For example, using the schema definition from the previous step, your new JSON array would look like the following. In this example, a new NULLABLE column named column4 is added. column4 includes a description.

[
  { "mode": "REQUIRED", "name": "column1", "type": "STRING" },
  { "mode": "REQUIRED", "name": "column2", "type": "FLOAT" },
  { "mode": "REPEATED", "name": "column3", "type": "STRING" },
  { "description": "my new column", "mode": "NULLABLE", "name": "column4", "type": "STRING" }
]
For more information on working with JSON schema files, see Specifying a JSON schema file.
After updating your schema file, issue the following command to update the table's schema. If the table you're updating is in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.

bq update project_id:dataset.table schema
Where:
- project_id is your project ID.
- dataset is the name of the dataset that contains the table you're updating.
- table is the name of the table you're updating.
- schema is the path to the JSON schema file on your local machine.
For example, enter the following command to update the schema definition of mydataset.mytable in your default project. The path to the schema file on your local machine is /tmp/myschema.json.

bq update mydataset.mytable /tmp/myschema.json
API
Call the tables.patch method and use the schema property to add empty columns to your schema definition. Because the tables.update method replaces the entire table resource, the tables.patch method is preferred.
Go
Java
Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.
Node.js
Before trying this sample, follow the Node.js setup instructions in the
BigQuery Quickstart Using Client Libraries.
For more information, see the
BigQuery Node.js API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the
BigQuery Quickstart Using Client Libraries.
For more information, see the
BigQuery Python API reference documentation.
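As a rough illustration of the client-library route, a minimal Python sketch follows; the table ID and the new column name (phone) are hypothetical:

from google.cloud import bigquery

client = bigquery.Client()
table_id = "your-project.your_dataset.your_table"  # hypothetical table path

table = client.get_table(table_id)  # Fetch the current table resource.

# Append the new column as NULLABLE; REQUIRED additions are rejected.
new_schema = list(table.schema)
new_schema.append(bigquery.SchemaField("phone", "STRING", mode="NULLABLE"))

table.schema = new_schema
table = client.update_table(table, ["schema"])  # Sends a tables.patch request.
print("Table now has {} columns.".format(len(table.schema)))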
Adding a nested column to a RECORD
In addition to adding new columns to a table's schema, you can also add new nested columns to a RECORD. The process for adding a new nested column is very similar to the process for adding a new column.
Console
Adding a new nested field to an existing RECORD column is not currently supported by the Cloud Console.
bq
Issue the bq update command and provide a JSON schema file that adds the nested field to the existing RECORD column's schema definition. If the table you're updating is in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.
bq update project_id:dataset.table schema
Where:
- project_id is your project ID.
- dataset is the name of the dataset that contains the table you're updating.
- table is the name of the table you're updating.
- schema is the path to the JSON schema file on your local machine.
When you specify the schema using the bq command-line tool, you cannot include a RECORD (STRUCT) type, you cannot include a column description, and you cannot specify the column's mode. All modes default to NULLABLE. As a result, if you are adding a new nested column to a RECORD, you must supply a JSON schema file.
To add a nested column to a RECORD using a JSON schema file:
First, issue the bq show command with the --schema flag and write the existing table schema to a file. If the table you're updating is in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.

bq show \
--schema \
--format=prettyjson \
project_id:dataset.table > schema_file
Where:
- project_id is your project ID.
- dataset is the name of the dataset that contains the table you're updating.
- table is the name of the table you're updating.
- schema_file is the schema definition file written to your local machine.
For example, to write the schema definition of mydataset.mytable to a file, enter the following command. mydataset.mytable is in your default project.

bq show \
--schema \
--format=prettyjson \
mydataset.mytable > /tmp/myschema.json
Open the schema file in a text editor. The schema should look like the following. In this example, column3 is a nested repeated column. The nested columns are nested1 and nested2. The fields array lists the fields nested within column3.

[
  { "mode": "REQUIRED", "name": "column1", "type": "STRING" },
  { "mode": "REQUIRED", "name": "column2", "type": "FLOAT" },
  {
    "fields": [
      { "mode": "NULLABLE", "name": "nested1", "type": "STRING" },
      { "mode": "NULLABLE", "name": "nested2", "type": "STRING" }
    ],
    "mode": "REPEATED",
    "name": "column3",
    "type": "RECORD"
  }
]
Add the new nested column to the end of the fields array. In this example, nested3 is the new nested column.

[
  { "mode": "REQUIRED", "name": "column1", "type": "STRING" },
  { "mode": "REQUIRED", "name": "column2", "type": "FLOAT" },
  {
    "fields": [
      { "mode": "NULLABLE", "name": "nested1", "type": "STRING" },
      { "mode": "NULLABLE", "name": "nested2", "type": "STRING" },
      { "mode": "NULLABLE", "name": "nested3", "type": "STRING" }
    ],
    "mode": "REPEATED",
    "name": "column3",
    "type": "RECORD"
  }
]
For more information on working with JSON schema files, see Specifying a JSON schema file.
After updating your schema file, issue the following command to update the table's schema. If the table you're updating is in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.

bq update project_id:dataset.table schema
Where:
- project_id is your project ID.
- dataset is the name of the dataset that contains the table you're updating.
- table is the name of the table you're updating.
- schema is the path to the JSON schema file on your local machine.
For example, enter the following command to update the schema definition of mydataset.mytable in your default project. The path to the schema file on your local machine is /tmp/myschema.json.

bq update mydataset.mytable /tmp/myschema.json
API
Call the tables.patch method and use the schema property to add the nested columns to your schema definition. Because the tables.update method replaces the entire table resource, the tables.patch method is preferred.
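The client libraries can make the same change by rebuilding the RECORD field. A minimal Python sketch mirroring the JSON example above (the table path is hypothetical):

from google.cloud import bigquery

client = bigquery.Client()
table_id = "your-project.your_dataset.your_table"  # hypothetical table path

table = client.get_table(table_id)

new_schema = []
for field in table.schema:
    if field.name == "column3" and field.field_type == "RECORD":
        # SchemaField objects are immutable, so rebuild the RECORD with the
        # existing subfields plus the new NULLABLE nested column.
        subfields = list(field.fields)
        subfields.append(
            bigquery.SchemaField("nested3", "STRING", mode="NULLABLE")
        )
        field = bigquery.SchemaField(
            field.name, "RECORD", mode=field.mode, fields=subfields
        )
    new_schema.append(field)

table.schema = new_schema
table = client.update_table(table, ["schema"])  # Sends a tables.patch request.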
Adding a column when you overwrite or append data
You can add new columns to an existing table when you load data into it and choose to overwrite the existing table. When you overwrite an existing table, the schema of the data you're loading is used to overwrite the existing table's schema. For information on overwriting a table using a load job, see:
- Appending to or overwriting a table with Avro data
- Appending to or overwriting a table with Parquet data
- Appending to or overwriting a table with ORC data
- Appending to or overwriting a table with CSV data
- Appending to or overwriting a table with JSON data
You can also add new columns to an existing table when you append data to it using a load or query job.
Adding a column in a load append job
You can add columns to a table while appending data to it in a load job by:
- Using the bq command-line tool's bq load command
- Calling the API's jobs.insert method and configuring a load job
- Using the client libraries
Adding a column to an existing table during an append operation is not currently supported by the Cloud Console.
When you add columns using an append operation in a load job, the updated schema can be:
- Automatically detected (for CSV and JSON files)
- Specified in a JSON schema file (for CSV and JSON files)
- Retrieved from the self-describing source data for Avro, ORC, Parquet, and Datastore export files
If you specify the schema in a JSON file, the new columns must be defined in it. If the new column definitions are missing, the following error is returned when you attempt to append the data: Error while reading data, error message: parsing error in row starting at position int: No such field: field.

When you add new columns during an append operation, the values in the new columns are set to NULL for existing rows.
To add a new column when you append data to a table during a load job:
Console
You cannot add new columns to an existing table when you load data using the Cloud Console.
bq
Use the bq load command to load your data and specify the --noreplace flag to indicate that you are appending the data to an existing table.

If the data you're appending is in CSV or newline-delimited JSON format, specify the --autodetect flag to use schema auto-detection or supply the schema in a JSON schema file. The added columns can be automatically inferred from Avro or Datastore export files.

Set the --schema_update_option flag to ALLOW_FIELD_ADDITION to indicate that the data you're appending contains new columns.

If the table you're appending is in a dataset in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.

(Optional) Supply the --location flag and set the value to your location.

Enter the load command as follows:
bq --location=location load \
--noreplace \
--autodetect \
--schema_update_option=ALLOW_FIELD_ADDITION \
--source_format=format \
project_id:dataset.table \
path_to_source \
schema
Where:
- location is the name of your location. The --location flag is optional. For example, if you are using BigQuery in the Tokyo region, set the flag's value to asia-northeast1. You can set a default value for the location using the .bigqueryrc file.
- format is NEWLINE_DELIMITED_JSON, CSV, AVRO, PARQUET, ORC, or DATASTORE_BACKUP.
- project_id is your project ID.
- dataset is the name of the dataset that contains the table.
- table is the name of the table you're appending.
- path_to_source is a fully-qualified Cloud Storage URI, a comma-separated list of URIs, or the path to a data file on your local machine.
- schema is the path to a local JSON schema file. A schema file is required only for CSV and JSON files when --autodetect is unspecified. Avro and Datastore schemas are inferred from the source data.
Examples:
Enter the following command to append a local Avro data file, /tmp/mydata.avro, to mydataset.mytable using a load job. Because schemas can be automatically inferred from Avro data, you do not need to use the --autodetect flag. mydataset is in your default project.
bq load \
--noreplace \
--schema_update_option=ALLOW_FIELD_ADDITION \
--source_format=AVRO \
mydataset.mytable \
/tmp/mydata.avro
Enter the following command to append a newline-delimited JSON data file in Cloud Storage to mydataset.mytable using a load job. The --autodetect flag is used to detect the new columns. mydataset is in your default project.
bq load \
--noreplace \
--autodetect \
--schema_update_option=ALLOW_FIELD_ADDITION \
--source_format=NEWLINE_DELIMITED_JSON \
mydataset.mytable \
gs://mybucket/mydata.json
Enter the following command to append a newline-delimited JSON data file in Cloud Storage to mydataset.mytable using a load job. The schema containing the new columns is specified in a local JSON schema file, /tmp/myschema.json. mydataset is in myotherproject, not your default project.
bq load \
--noreplace \
--schema_update_option=ALLOW_FIELD_ADDITION \
--source_format=NEWLINE_DELIMITED_JSON \
myotherproject:mydataset.mytable \
gs://mybucket/mydata.json \
/tmp/myschema.json
API
Call the jobs.insert method. Configure a load job and set the following properties:
- Reference your data in Cloud Storage using the sourceUris property.
- Specify the data format by setting the sourceFormat property.
- Specify the schema in the schema property.
- Specify the schema update option using the schemaUpdateOptions property.
- Set the write disposition of the destination table to WRITE_APPEND using the writeDisposition property.
Go
Java
Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.
Node.js
Before trying this sample, follow the Node.js setup instructions in the
BigQuery Quickstart Using Client Libraries.
For more information, see the
BigQuery Node.js API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the
BigQuery Quickstart Using Client Libraries.
For more information, see the
BigQuery Python API reference documentation.
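As a rough Python illustration of this load-and-append flow (the bucket, file, and table names are hypothetical):

from google.cloud import bigquery

client = bigquery.Client()
table_id = "your-project.your_dataset.your_table"  # hypothetical table path

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    schema_update_options=[bigquery.SchemaUpdateOption.ALLOW_FIELD_ADDITION],
    autodetect=True,  # Detect the new columns in the JSON source data.
)

load_job = client.load_table_from_uri(
    "gs://mybucket/mydata.json", table_id, job_config=job_config
)
load_job.result()  # Wait for the load job to complete.

table = client.get_table(table_id)
print("Table now has {} columns.".format(len(table.schema)))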
Adding a column in a query append job
You can add columns to a table while appending query results to it by:
- Using the bq command-line tool's bq query command
- Calling the API's jobs.insert method and configuring a query job
- Using the client libraries
Adding a column during an append operation is not currently supported by the Cloud Console.
When you add columns using an append operation in a query job, the schema of the query results is used to update the schema of the destination table. Note that you cannot query a table in one location and write the results to a table in another location.
To add a new column when you append data to a table during a query job:
Console
You cannot add new columns to an existing table when you append query results using the Cloud Console.
bq
Use the bq query command to query your data and specify the --destination_table flag to indicate which table you're appending.

To specify that you are appending query results to an existing destination table, specify the --append_table flag.

Set the --schema_update_option flag to ALLOW_FIELD_ADDITION to indicate that the query results you're appending contain new columns.

Specify the --use_legacy_sql=false flag to use standard SQL syntax for the query.

If the table you're appending is in a dataset in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset. Note that the table you're querying and the destination table must be in the same location.

(Optional) Supply the --location flag and set the value to your location.
bq --location=location query \
--destination_table project_id:dataset.table \
--append_table \
--schema_update_option=ALLOW_FIELD_ADDITION \
--use_legacy_sql=false \
'query'
Where:
- location is the name of your location. The --location flag is optional. For example, if you are using BigQuery in the Tokyo region, set the flag's value to asia-northeast1. You can set a default value for the location using the .bigqueryrc file. Note that you cannot append query results to a table in another location.
- project_id is your project ID.
- dataset is the name of the dataset that contains the table you're appending.
- table is the name of the table you're appending.
- query is a query in standard SQL syntax.
Examples:
Enter the following command to query mydataset.mytable in your default project and to append the query results to mydataset.mytable2 (also in your default project).
bq query \
--destination_table mydataset.mytable2 \
--append_table \
--schema_update_option=ALLOW_FIELD_ADDITION \
--use_legacy_sql=false \
'SELECT
column1,column2
FROM
mydataset.mytable'
Enter the following command to query mydataset.mytable in your default project and to append the query results to mydataset.mytable2 in myotherproject.
bq query \
--destination_table myotherproject:mydataset.mytable2 \
--append_table \
--schema_update_option=ALLOW_FIELD_ADDITION \
--use_legacy_sql=false \
'SELECT
column1,column2
FROM
mydataset.mytable'
API
Call the jobs.insert method. Configure a query job and set the following properties:
- Specify the destination table using the destinationTable property.
- Set the write disposition of the destination table to WRITE_APPEND using the writeDisposition property.
- Specify the schema update option using the schemaUpdateOptions property.
- Specify the standard SQL query using the query property.
Go
Java
Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.
Node.js
Before trying this sample, follow the Node.js setup instructions in the
BigQuery Quickstart Using Client Libraries.
For more information, see the
BigQuery Node.js API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the
BigQuery Quickstart Using Client Libraries.
For more information, see the
BigQuery Python API reference documentation.
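A minimal Python sketch of the query-and-append flow (table and column names are hypothetical):

from google.cloud import bigquery

client = bigquery.Client()
table_id = "your-project.your_dataset.your_table2"  # hypothetical destination

job_config = bigquery.QueryJobConfig(
    destination=table_id,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    schema_update_options=[bigquery.SchemaUpdateOption.ALLOW_FIELD_ADDITION],
)

# Standard SQL is the Python client's default, so no legacy-SQL flag is needed.
query_job = client.query(
    "SELECT column1, column2, 'value' AS column3 FROM your_dataset.your_table",
    job_config=job_config,
)
query_job.result()  # Wait for the query job to complete.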
Relaxing a column's mode
Currently, the only supported modification you can make to a column's mode is changing it from REQUIRED to NULLABLE. Changing a column's mode from REQUIRED to NULLABLE is also called column relaxation. You can relax REQUIRED columns:
- Manually
- When you overwrite a table using a load or query job
- When you append data to a table using a query job
Manually changing REQUIRED columns to NULLABLE
You can manually change a column's mode from REQUIRED to NULLABLE by:
- Using the bq command-line tool's bq update command
- Calling the tables.patch API method
- Using the client libraries
To manually change a column's mode from REQUIRED to NULLABLE:
Console
You cannot currently relax a column's mode using the Cloud Console.
bq
First, issue the bq show command with the --schema flag and write the existing table schema to a file. If the table you're updating is in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.

bq show \
--schema \
--format=prettyjson \
project_id:dataset.table > schema_file
Where:
- project_id is your project ID.
- dataset is the name of the dataset that contains the table you're updating.
- table is the name of the table you're updating.
- schema_file is the schema definition file written to your local machine.
For example, to write the schema definition of mydataset.mytable to a file, enter the following command. mydataset.mytable is in your default project.

bq show \
--schema \
--format=prettyjson \
mydataset.mytable > /tmp/myschema.json
Open the schema file in a text editor. The schema should look like the following:
[ { "mode": "REQUIRED", "name": "column1", "type": "STRING" }, { "mode": "REQUIRED", "name": "column2", "type": "FLOAT" }, { "mode": "REPEATED", "name": "column3", "type": "STRING" } ]
Change an existing column's mode from REQUIRED to NULLABLE. In this example, the mode for column1 is relaxed.

[
  { "mode": "NULLABLE", "name": "column1", "type": "STRING" },
  { "mode": "REQUIRED", "name": "column2", "type": "FLOAT" },
  { "mode": "REPEATED", "name": "column3", "type": "STRING" }
]
For more information on working with JSON schema files, see Specifying a JSON schema file.
After updating your schema file, issue the following command to update the table's schema. If the table you're updating is in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.

bq update project_id:dataset.table schema
Where:
- project_id is your project ID.
- dataset is the name of the dataset that contains the table you're updating.
- table is the name of the table you're updating.
- schema is the path to the JSON schema file on your local machine.
For example, enter the following command to update the schema definition of mydataset.mytable in your default project. The path to the schema file on your local machine is /tmp/myschema.json.

bq update mydataset.mytable /tmp/myschema.json
API
Call tables.patch and use the schema property to change a REQUIRED column to NULLABLE in your schema definition. Because the tables.update method replaces the entire table resource, the tables.patch method is preferred.
Go
Java
Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.
Node.js
Before trying this sample, follow the Node.js setup instructions in the
BigQuery Quickstart Using Client Libraries.
For more information, see the
BigQuery Node.js API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the
BigQuery Quickstart Using Client Libraries.
For more information, see the
BigQuery Python API reference documentation.
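A minimal Python sketch of manual relaxation (the table path is hypothetical); it rebuilds the schema with every REQUIRED mode switched to NULLABLE:

from google.cloud import bigquery

client = bigquery.Client()
table_id = "your-project.your_dataset.your_table"  # hypothetical table path

table = client.get_table(table_id)

# SchemaField objects are immutable, so build relaxed copies.
relaxed_schema = [
    bigquery.SchemaField(
        f.name,
        f.field_type,
        mode="NULLABLE" if f.mode == "REQUIRED" else f.mode,
        description=f.description,
        fields=f.fields,
    )
    for f in table.schema
]

table.schema = relaxed_schema
table = client.update_table(table, ["schema"])  # Sends a tables.patch request.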
Changing REQUIRED to NULLABLE in a load or query job
You can relax REQUIRED columns to NULLABLE in an existing table's schema when you load data into it and choose to overwrite the existing table. When you overwrite an existing table, the schema of the data you're loading is used to overwrite the existing table's schema. For information on overwriting a table using a load job, see:
- Appending to or overwriting a table with Avro data
- Appending to or overwriting a table with Parquet data
- Appending to or overwriting a table with ORC data
- Appending to or overwriting a table with CSV data
- Appending to or overwriting a table with JSON data
You can also relax REQUIRED columns to NULLABLE in an existing table's schema when you append data to it using a query job.
Changing REQUIRED to NULLABLE in a load append job
You can relax a column's mode while appending data to a table in a load job by:
- Using the bq command-line tool's bq load command
- Calling the API's jobs.insert method and configuring a load job
- Using the client libraries
Changing a column's mode during an append operation is not currently supported by the Cloud Console.
When you relax a column's mode using an append operation in a load job, you can:
- Relax the mode for individual columns by specifying a JSON schema file (when appending data from CSV and JSON files)
- Relax columns to null in your Avro, ORC, or Parquet schema and allow schema inference to detect the relaxed columns
To relax a column from REQUIRED to NULLABLE when you append data to a table during a load job:
Console
You cannot currently relax a column's mode using the Cloud Console.
bq
Use the bq load command to load your data and specify the --noreplace flag to indicate that you are appending the data to an existing table.

If the data you're appending is in CSV or newline-delimited JSON format, specify the relaxed columns in a local JSON schema file or use the --autodetect flag to use schema detection to discover relaxed columns in the source data. For information on relaxing column modes using a JSON schema file, see Manually changing REQUIRED columns to NULLABLE.

Relaxed columns can be automatically inferred from Avro, ORC, and Parquet files. Column relaxation does not apply to Datastore export appends. The columns in tables created by loading Datastore export files are always NULLABLE.

Set the --schema_update_option flag to ALLOW_FIELD_RELAXATION to indicate that the data you're appending contains relaxed columns.

If the table you're appending is in a dataset in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.

(Optional) Supply the --location flag and set the value to your location.

Enter the load command as follows:
bq --location=location load \
--noreplace \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--source_format=format \
project_id:dataset.table \
path_to_source \
schema
Where:
- location is the name of your location. The --location flag is optional. For example, if you are using BigQuery in the Tokyo region, set the flag's value to asia-northeast1. You can set a default value for the location using the .bigqueryrc file.
- format is NEWLINE_DELIMITED_JSON, CSV, PARQUET, ORC, or AVRO. DATASTORE_BACKUP files do not require column relaxation. The columns in tables created from Datastore export files are always NULLABLE.
- project_id is your project ID.
- dataset is the name of the dataset that contains the table.
- table is the name of the table you're appending.
- path_to_source is a fully-qualified Cloud Storage URI, a comma-separated list of URIs, or the path to a data file on your local machine.
- schema is the path to a local JSON schema file. This option is used only for CSV and JSON files. Relaxed columns are automatically inferred from Avro files.
Examples:
Enter the following command to append a local Avro data file, /tmp/mydata.avro, to mydataset.mytable using a load job. Since relaxed columns can be automatically inferred from Avro data, you do not need to specify a schema file. mydataset is in your default project.
bq load \
--noreplace \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--source_format=AVRO \
mydataset.mytable \
/tmp/mydata.avro
Enter the following command to append data from a newline-delimited JSON file in Cloud Storage to mydataset.mytable using a load job. The schema containing the relaxed columns is in a local JSON schema file, /tmp/myschema.json. mydataset is in your default project.
bq load \
--noreplace \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--source_format=NEWLINE_DELIMITED_JSON \
mydataset.mytable \
gs://mybucket/mydata.json \
/tmp/myschema.json
Enter the following command to append data in a CSV file on your local machine to mydataset.mytable using a load job. The command uses schema auto-detection to discover relaxed columns in the source data. mydataset is in myotherproject, not your default project.
bq load \
--noreplace \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--source_format=CSV \
--autodetect \
myotherproject:mydataset.mytable \
mydata.csv
API
Call the jobs.insert method. Configure a load job and set the following properties:
- Reference your data in Cloud Storage using the sourceUris property.
- Specify the data format by setting the sourceFormat property.
- Specify the schema in the schema property.
- Specify the schema update option using the schemaUpdateOptions property.
- Set the write disposition of the destination table to WRITE_APPEND using the writeDisposition property.
Go
Java
Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.
Node.js
Before trying this sample, follow the Node.js setup instructions in the
BigQuery Quickstart Using Client Libraries.
For more information, see the
BigQuery Node.js API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the
BigQuery Quickstart Using Client Libraries.
For more information, see the
BigQuery Python API reference documentation.
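A rough Python sketch of a load append with field relaxation (the URI, table path, and relaxed schema are hypothetical):

from google.cloud import bigquery

client = bigquery.Client()
table_id = "your-project.your_dataset.your_table"  # hypothetical table path

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    schema_update_options=[bigquery.SchemaUpdateOption.ALLOW_FIELD_RELAXATION],
    # The supplied schema lists the relaxed (NULLABLE) column modes.
    schema=[
        bigquery.SchemaField("column1", "STRING", mode="NULLABLE"),
        bigquery.SchemaField("column2", "FLOAT", mode="NULLABLE"),
    ],
)

load_job = client.load_table_from_uri(
    "gs://mybucket/mydata.json", table_id, job_config=job_config
)
load_job.result()  # Wait for the load job to complete.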
Changing REQUIRED to NULLABLE in a query append job
You can relax all columns in a table while appending query results to it by:
- Using the bq command-line tool's bq query command
- Calling the API's jobs.insert method and configuring a query job
- Using the client libraries
Relaxing columns during an append operation is not currently supported by the Cloud Console.
When you relax columns using an append operation in a query job, you can relax all required fields in the destination table by setting the --schema_update_option flag to ALLOW_FIELD_RELAXATION. You cannot relax individual columns in a destination table using a query append.
To relax all columns in a destination table when you append data to it during a query job:
Console
You cannot currently relax a column's mode using the Cloud Console.
bq
Use the bq query command to query your data and specify the --destination_table flag to indicate which table you're appending.

To specify that you are appending query results to an existing destination table, specify the --append_table flag.

Set the --schema_update_option flag to ALLOW_FIELD_RELAXATION to indicate that all REQUIRED columns in the table you're appending should be changed to NULLABLE.

Specify the --use_legacy_sql=false flag to use standard SQL syntax for the query.

If the table you're appending is in a dataset in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.

(Optional) Supply the --location flag and set the value to your location.
bq --location=location query \
--destination_table project_id:dataset.table \
--append_table \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--use_legacy_sql=false \
'query'
Where:
- location is the name of your location. The --location flag is optional. For example, if you are using BigQuery in the Tokyo region, set the flag's value to asia-northeast1. You can set a default value for the location using the .bigqueryrc file.
- project_id is your project ID.
- dataset is the name of the dataset that contains the table you're appending.
- table is the name of the table you're appending.
- query is a query in standard SQL syntax.
Examples:
Enter the following command to query mydataset.mytable in your default project and to append the query results to mydataset.mytable2 (also in your default project). The command changes all REQUIRED columns in the destination table to NULLABLE.
bq query \
--destination_table mydataset.mytable2 \
--append_table \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--use_legacy_sql=false \
'SELECT
column1,column2
FROM
mydataset.mytable'
Enter the following command to query mydataset.mytable in your default project and to append the query results to mydataset.mytable2 in myotherproject. The command changes all REQUIRED columns in the destination table to NULLABLE.
bq query \
--destination_table myotherproject:mydataset.mytable2 \
--append_table \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--use_legacy_sql=false \
'SELECT
column1,column2
FROM
mydataset.mytable'
API
Call the jobs.insert method. Configure a query job and set the following properties:
- Specify the destination table using the destinationTable property.
- Set the write disposition of the destination table to WRITE_APPEND using the writeDisposition property.
- Specify the schema update option using the schemaUpdateOptions property.
- Specify the standard SQL query using the query property.
Go
Java
Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the
BigQuery Quickstart Using Client Libraries.
For more information, see the
BigQuery Python API reference documentation.
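A minimal Python sketch of a query append with field relaxation (table and column names are hypothetical):

from google.cloud import bigquery

client = bigquery.Client()
table_id = "your-project.your_dataset.your_table2"  # hypothetical destination

job_config = bigquery.QueryJobConfig(
    destination=table_id,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    schema_update_options=[bigquery.SchemaUpdateOption.ALLOW_FIELD_RELAXATION],
)

query_job = client.query(
    "SELECT column1, column2 FROM your_dataset.your_table",
    job_config=job_config,
)
query_job.result()  # All REQUIRED columns in the destination become NULLABLE.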