Updating dataset properties
This document describes how to update dataset properties in BigQuery. After you create a dataset, you can update the following dataset properties:
- Access controls
- Billing model
- Default expiration time for new tables
- Default partition expiration for new partitioned tables
- Default rounding mode for new tables
- Description
- Labels
- Time travel windows
Before you begin
Grant Identity and Access Management (IAM) roles that give users the necessary permissions to perform each task in this document.
Required permissions
To update dataset properties, you need the following IAM permissions:
- bigquery.datasets.update
- bigquery.datasets.get
- bigquery.datasets.setIamPolicy (only required when updating dataset access controls in the Google Cloud console)
The roles/bigquery.dataOwner predefined IAM role includes the permissions that you need to update dataset properties.
Additionally, if you have the bigquery.datasets.create permission, you can update properties of the datasets that you create.
For more information on IAM roles and permissions in BigQuery, see Predefined roles and permissions.
Update dataset descriptions
You can update a dataset's description in the following ways:
- Using the Google Cloud console.
- Using the bq command-line tool's bq update command.
- Calling the datasets.patch API method.
- Using the client libraries.
To update a dataset's description:
Console
In the Explorer panel, expand your project and select a dataset.
Expand the Actions option and click Open.
In the Details panel, click Edit details to edit the description text.
In the Edit detail dialog that appears, do the following:
- In the Description field, enter a description or edit the existing description.
- To save the new description text, click Save.
SQL
To update a dataset's description, use the ALTER SCHEMA SET OPTIONS statement to set the description option.
The following example sets the description on a dataset named mydataset:
In the Google Cloud console, go to the BigQuery page.
In the query editor, enter the following statement:
ALTER SCHEMA mydataset
SET OPTIONS (
  description = 'Description of mydataset');
Click Run.
For more information about how to run queries, see Running interactive queries.
bq
Issue the bq update command with the --description flag. If you are updating a dataset in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.
bq update \
--description "string" \
project_id:dataset
Replace the following:
- string: the text that describes the dataset, in quotes
- project_id: your project ID
- dataset: the name of the dataset that you're updating
Examples:
Enter the following command to change the description of mydataset to "Description of mydataset". mydataset is in your default project.
bq update --description "Description of mydataset" mydataset
Enter the following command to change the description of mydataset to "Description of mydataset". The dataset is in myotherproject, not your default project.
bq update \
--description "Description of mydataset" \
myotherproject:mydataset
API
Call datasets.patch and update the description property in the dataset resource. Because the datasets.update method replaces the entire dataset resource, the datasets.patch method is preferred.
Go
Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Go API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
Before trying this sample, follow the Node.js setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Node.js API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
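As a rough illustration of the client-library route, the following Python sketch fetches a dataset, changes its description, and sends back only that field. It assumes that client is a google.cloud.bigquery.Client from the google-cloud-bigquery package. The function name update_dataset_description is made up for this example and is not part of the library.

```python
def update_dataset_description(client, dataset_id, description):
    """Set the description of an existing dataset and return the result.

    Assumptions: `client` is a google.cloud.bigquery.Client and
    `dataset_id` is a string such as "myotherproject.mydataset".
    """
    # Fetch the current dataset resource.
    dataset = client.get_dataset(dataset_id)
    dataset.description = description
    # Naming only "description" requests a partial update, so the
    # other dataset properties are left as they are.
    return client.update_dataset(dataset, ["description"])
```

With a real client, a call might look like update_dataset_description(bigquery.Client(), "mydataset", "Description of mydataset").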
Update default table expiration times
You can update a dataset's default table expiration time in the following ways:
- Using the Google Cloud console.
- Using the bq command-line tool's bq update command.
- Calling the datasets.patch API method.
- Using the client libraries.
You can set a default table expiration time at the dataset level, or you can set a table's expiration time when the table is created. If you set the expiration when the table is created, the dataset's default table expiration is ignored. If you do not set a default table expiration at the dataset level, and you do not set a table expiration when the table is created, the table never expires and you must delete the table manually. When a table expires, it is deleted along with all of the data it contains.
When you update a dataset's default table expiration setting:
- If you change the value from Never to a defined expiration time, any tables that already exist in the dataset will not expire unless the expiration time was set on the table when it was created.
- If you are changing the value for the default table expiration, any tables that already exist expire according to the original table expiration setting. Any new tables created in the dataset have the new table expiration setting applied unless you specify a different table expiration on the table when it is created.
The value for default table expiration is expressed differently depending on where the value is set. Use the method that gives you the appropriate level of granularity:
- In the Google Cloud console, expiration is expressed in days.
- In the bq command-line tool, expiration is expressed in seconds.
- In the API, expiration is expressed in milliseconds.
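Because the same lifetime is written in three different units, a small conversion helper can prevent mistakes when you move between interfaces. The following Python sketch is only an illustration. The function name expiration_from_days and the dictionary keys are invented for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def expiration_from_days(days):
    """Express a lifetime given in days in seconds and milliseconds too."""
    seconds = round(days * SECONDS_PER_DAY)
    return {
        "days": days,                    # unit used in the Google Cloud console
        "seconds": seconds,              # unit used by the bq command-line tool
        "milliseconds": seconds * 1000,  # unit used by the API
    }

print(expiration_from_days(3.75))
# {'days': 3.75, 'seconds': 324000, 'milliseconds': 324000000}
```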
To update the default expiration time for a dataset:
Console
In the Explorer panel, expand your project and select a dataset.
Expand the Actions option and click Open.
In the details panel, click the pencil icon next to Dataset info to edit the expiration.
In the Dataset info dialog, in the Default table expiration section, enter a value for Number of days after table creation.
Click Save.
SQL
To update the default table expiration time, use the ALTER SCHEMA SET OPTIONS statement to set the default_table_expiration_days option.
The following example updates the default table expiration for a dataset named mydataset.
In the Google Cloud console, go to the BigQuery page.
In the query editor, enter the following statement:
ALTER SCHEMA mydataset
SET OPTIONS(
  default_table_expiration_days = 3.75);
Click Run.
For more information about how to run queries, see Running interactive queries.
bq
To update the default expiration time for newly created tables in a dataset, enter the bq update command with the --default_table_expiration flag. If you are updating a dataset in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.
bq update \
--default_table_expiration integer \
project_id:dataset
Replace the following:
- integer: the default lifetime, in seconds, for newly created tables. The minimum value is 3600 seconds (one hour). The expiration time evaluates to the current UTC time plus the integer value. Specify 0 to remove the existing expiration time. Any table created in the dataset is deleted integer seconds after its creation time. This value is applied if you do not set a table expiration when the table is created.
- project_id: your project ID.
- dataset: the name of the dataset that you're updating.
Examples:
Enter the following command to set the default table expiration for new tables created in mydataset to two hours (7200 seconds) from the current time. The dataset is in your default project.
bq update --default_table_expiration 7200 mydataset
Enter the following command to set the default table expiration for new tables created in mydataset to two hours (7200 seconds) from the current time. The dataset is in myotherproject, not your default project.
bq update --default_table_expiration 7200 myotherproject:mydataset
API
Call datasets.patch and update the defaultTableExpirationMs property in the dataset resource. The expiration is expressed in milliseconds in the API. Because the datasets.update method replaces the entire dataset resource, the datasets.patch method is preferred.
Go
Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Go API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Configure the default expiration time with the Dataset.Builder.setDefaultTableLifetime() method.
Node.js
Before trying this sample, follow the Node.js setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Node.js API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
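The following Python sketch shows one way the default table expiration could be set through the client library. It assumes that client is a google.cloud.bigquery.Client. The function name and the ONE_HOUR_MS constant are invented for this example. The one-hour floor mirrors the minimum stated for the bq flag, and applying it here is an assumption.

```python
ONE_HOUR_MS = 60 * 60 * 1000

def update_default_table_expiration(client, dataset_id, lifetime_ms):
    """Set the default lifetime, in milliseconds, for new tables in a dataset.

    Assumption: `client` is a google.cloud.bigquery.Client.
    """
    # Assumed floor, taken from the one-hour minimum of the bq flag.
    if lifetime_ms < ONE_HOUR_MS:
        raise ValueError("lifetime_ms must be at least one hour (3600000 ms)")
    dataset = client.get_dataset(dataset_id)
    dataset.default_table_expiration_ms = lifetime_ms
    # Send only the changed field so the rest of the resource is untouched.
    return client.update_dataset(dataset, ["default_table_expiration_ms"])
```

For example, passing 7200 * 1000 as lifetime_ms corresponds to the two-hour value used in the bq examples.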
Update default partition expiration times
You can update a dataset's default partition expiration in the following ways:
- Using the bq command-line tool's bq update command.
- Calling the datasets.patch API method.
- Using the client libraries.
Setting or updating a dataset's default partition expiration isn't currently supported by the Google Cloud console.
You can set a default partition expiration time at the dataset level that affects all newly created partitioned tables, or you can set a partition expiration time for individual tables when the partitioned tables are created. If you set the default partition expiration at the dataset level, and you set the default table expiration at the dataset level, new partitioned tables will only have a partition expiration. If both options are set, the default partition expiration overrides the default table expiration.
If you set the partition expiration time when the partitioned table is created, that value overrides the dataset-level default partition expiration if it exists.
If you do not set a default partition expiration at the dataset level, and you do not set a partition expiration when the table is created, the partitions never expire and you must delete the partitions manually.
When you set a default partition expiration on a dataset, the expiration applies to all partitions in all partitioned tables created in the dataset. When you set the partition expiration on a table, the expiration applies to all partitions created in the specified table. Currently, you cannot apply different expiration times to different partitions in the same table.
When you update a dataset's default partition expiration setting:
- If you change the value from never to a defined expiration time, any partitions that already exist in partitioned tables in the dataset will not expire unless the partition expiration time was set on the table when it was created.
- If you are changing the value for the default partition expiration, any partitions in existing partitioned tables expire according to the original default partition expiration. Any new partitioned tables created in the dataset have the new default partition expiration setting applied unless you specify a different partition expiration on the table when it is created.
The value for default partition expiration is expressed differently depending on where the value is set. Use the method that gives you the appropriate level of granularity:
- In the bq command-line tool, expiration is expressed in seconds.
- In the API, expiration is expressed in milliseconds.
To update the default partition expiration time for a dataset:
Console
Updating a dataset's default partition expiration is not currently supported by the Google Cloud console.
SQL
To update the default partition expiration time, use the ALTER SCHEMA SET OPTIONS statement to set the default_partition_expiration_days option.
The following example updates the default partition expiration for a dataset named mydataset:
In the Google Cloud console, go to the BigQuery page.
In the query editor, enter the following statement:
ALTER SCHEMA mydataset
SET OPTIONS(
  default_partition_expiration_days = 3.75);
Click Run.
For more information about how to run queries, see Running interactive queries.
bq
To update the default partition expiration time for a dataset, enter the bq update command with the --default_partition_expiration flag. If you are updating a dataset in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.
bq update \
--default_partition_expiration integer \
project_id:dataset
Replace the following:
- integer: the default lifetime, in seconds, for partitions in newly created partitioned tables. This flag has no minimum value. Specify 0 to remove the existing expiration time. Any partitions in newly created partitioned tables are deleted integer seconds after the partition's UTC date. This value is applied if you do not set a partition expiration on the table when it is created.
- project_id: your project ID.
- dataset: the name of the dataset that you're updating.
Examples:
Enter the following command to set the default partition expiration for new partitioned tables created in mydataset to 26 hours (93,600 seconds). The dataset is in your default project.
bq update --default_partition_expiration 93600 mydataset
Enter the following command to set the default partition expiration for new partitioned tables created in mydataset to 26 hours (93,600 seconds). The dataset is in myotherproject, not your default project.
bq update --default_partition_expiration 93600 myotherproject:mydataset
API
Call datasets.patch and update the defaultPartitionExpirationMs property in the dataset resource. The expiration is expressed in milliseconds. Because the datasets.update method replaces the entire dataset resource, the datasets.patch method is preferred.
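To make the unit change concrete, the following Python sketch builds a request body for datasets.patch from a lifetime given in seconds, the unit that the bq flag takes. The function name is invented for this example. Encoding the value as a string follows the usual JSON convention for 64-bit integers in Google REST APIs and is an assumption here.

```python
import json

def partition_expiration_patch_body(seconds):
    """Build a datasets.patch body that sets defaultPartitionExpirationMs."""
    if seconds < 0:
        raise ValueError("seconds must not be negative")
    milliseconds = seconds * 1000  # the API expects milliseconds
    # Assumption: 64-bit integers are sent as JSON strings.
    return json.dumps({"defaultPartitionExpirationMs": str(milliseconds)})

print(partition_expiration_patch_body(93600))
# {"defaultPartitionExpirationMs": "93600000"}
```

Only the field being changed appears in the body, which is what distinguishes a patch from a full datasets.update call.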
Update rounding mode
You can update a dataset's default rounding mode by using the ALTER SCHEMA SET OPTIONS DDL statement.
The following example updates the default rounding mode for mydataset to ROUND_HALF_EVEN.
ALTER SCHEMA mydataset
SET OPTIONS (
  default_rounding_mode = "ROUND_HALF_EVEN");
This sets the default rounding mode for new tables created in the dataset. It has no impact on new columns added to existing tables. Setting the default rounding mode on a table in the dataset overrides this option.
Update dataset access controls
The process for updating a dataset's access controls is very similar to the process for assigning access controls to a dataset. Access controls cannot be applied during dataset creation using the Google Cloud console or the bq command-line tool. You must create the dataset first and then update the dataset's access controls. The API lets you update dataset access controls by calling the datasets.patch method.
When you update access controls on a dataset, you can modify access for the following entities:
IAM principals:
- Google Account email: Grants an individual Google Account access to the dataset.
- Google Group: Grants all members of a Google group access to the dataset.
- Google Workspace domain: Grants all users and groups in a Google domain access to the dataset.
- Service account: Grants a service account access to the dataset.
- Anybody: Enter allUsers to grant access to the general public.
- All Google accounts: Enter allAuthenticatedUsers to grant access to any user signed in to a Google Account.
Resource types:
- Authorized datasets: Grants an authorized dataset access to the dataset.
- Authorized views: Grants an authorized view access to the dataset.
- Authorized functions: Grants an authorized UDF or table function access to the dataset.
To update access controls on a dataset:
Console
In the Explorer panel, expand your project and select a dataset.
Expand the Actions option and click Open.
Click Share Dataset.
In the Share Dataset dialog, to delete existing entries, expand the entry and then click the delete icon (trash can).
In the Share Dataset dialog, to add new entries:
Enter the entity in the Add principals box.
For Select a role, choose an appropriate IAM role from the list. For more information on the permissions assigned to each predefined BigQuery role, see the Predefined roles and permissions page.
Click Add.
To add an authorized view, click the Authorized View tab and enter the project, dataset, and view, and then click Add.
When you are done adding or deleting your access controls, click Done.
bq
Write the existing dataset information (including access controls) to a JSON file using the show command. If the dataset is in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.
bq show \
--format=prettyjson \
project_id:dataset > path_to_file
Replace the following:
- project_id: your project ID.
- dataset: the name of your dataset.
- path_to_file: the path to the JSON file on your local machine.
Examples:
Enter the following command to write the access controls for mydataset to a JSON file. mydataset is in your default project.
bq show --format=prettyjson mydataset > /tmp/mydataset.json
Enter the following command to write the access controls for mydataset to a JSON file. mydataset is in myotherproject.
bq show --format=prettyjson \
myotherproject:mydataset > /tmp/mydataset.json
Make your changes to the "access" section of the JSON file. You can add or remove any of the specialGroup entries: projectOwners, projectWriters, projectReaders, and allAuthenticatedUsers. You can also add, remove, or modify any of the following: userByEmail, groupByEmail, and domain.
For example, the access section of a dataset's JSON file would look like the following:
{
  "access": [
    { "role": "READER", "specialGroup": "projectReaders" },
    { "role": "WRITER", "specialGroup": "projectWriters" },
    { "role": "OWNER", "specialGroup": "projectOwners" },
    { "role": "READER", "specialGroup": "allAuthenticatedUsers" },
    { "role": "READER", "domain": "[DOMAIN_NAME]" },
    { "role": "WRITER", "userByEmail": "[USER_EMAIL]" },
    { "role": "READER", "groupByEmail": "[GROUP_EMAIL]" }
  ]
}
When your edits are complete, use the update command and include the JSON file using the --source flag. If the dataset is in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.
bq update --source path_to_file project_id:dataset
Replace the following:
- path_to_file: the path to the JSON file on your local machine.
- project_id: your project ID.
- dataset: the name of your dataset.
Examples:
Enter the following command to update the access controls for mydataset. mydataset is in your default project.
bq update --source /tmp/mydataset.json mydataset
Enter the following command to update the access controls for mydataset. mydataset is in myotherproject.
bq update --source /tmp/mydataset.json myotherproject:mydataset
To verify your access control changes, enter the show command again without writing the information to a file.
bq show --format=prettyjson dataset
or
bq show --format=prettyjson project_id:dataset
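The manual edit of the "access" section between the bq show and bq update steps can also be scripted. The following Python sketch is one possible approach that uses only the standard library. The function name add_access_entry is invented for this example, and it assumes that the file was written by bq show --format=prettyjson.

```python
import json

def add_access_entry(path, role, entity_type, entity_id):
    """Append one entry to the "access" list of a dataset JSON file.

    `entity_type` is a key such as "userByEmail", "groupByEmail",
    "domain", or "specialGroup".
    """
    with open(path) as f:
        dataset = json.load(f)
    entry = {"role": role, entity_type: entity_id}
    access = dataset.setdefault("access", [])
    # Skip the append when an identical entry already exists.
    if entry not in access:
        access.append(entry)
    with open(path, "w") as f:
        json.dump(dataset, f, indent=2)
    return access
```

After a call such as add_access_entry("/tmp/mydataset.json", "READER", "groupByEmail", "group@example.com"), you would still run bq update --source to apply the change.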
API
Call the datasets.patch method and update the access property in the dataset resource. Because the datasets.update method replaces the entire dataset resource, datasets.patch is the preferred method for updating access controls.
Go
Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Go API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Configure the access controls with the Dataset.Builder.setAcl() method.
Node.js
Before trying this sample, follow the Node.js setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Node.js API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
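The following Python sketch outlines how an access entry might be appended through the client library. It assumes that client is a google.cloud.bigquery.Client and that entry is a google.cloud.bigquery.AccessEntry. The function name grant_dataset_access is invented for this example.

```python
def grant_dataset_access(client, dataset_id, entry):
    """Append one access entry to a dataset and send only that property.

    Assumptions: `client` is a google.cloud.bigquery.Client and `entry`
    is a google.cloud.bigquery.AccessEntry.
    """
    dataset = client.get_dataset(dataset_id)
    # Copy the existing entries so current grants are preserved.
    entries = list(dataset.access_entries)
    entries.append(entry)
    dataset.access_entries = entries
    return client.update_dataset(dataset, ["access_entries"])
```

An entry for a single user could be built as bigquery.AccessEntry(role="READER", entity_type="userByEmail", entity_id="user@example.com"), where the email address is a placeholder.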
Update time travel windows
You can update a dataset's time travel window in the following ways:
- Using the ALTER SCHEMA SET OPTIONS statement.
- Using the bq command-line tool's bq update command.
- Calling the datasets.patch or datasets.update API method. The update method replaces the entire dataset resource, whereas the patch method only replaces fields that are provided in the submitted dataset resource.
For more information on the time travel window, see Configuring the time travel window.
To update the time travel window for a dataset:
SQL
Use the ALTER SCHEMA SET OPTIONS statement with the max_time_travel_hours option to specify the time travel window when altering a dataset. The max_time_travel_hours value must be an integer multiple of 24 between 48 (2 days) and 168 (7 days): 48, 72, 96, 120, 144, or 168.
In the Google Cloud console, go to the BigQuery page.
In the query editor, enter the following statement:
ALTER SCHEMA DATASET_NAME
SET OPTIONS(
  max_time_travel_hours = HOURS);
Replace the following:
- DATASET_NAME: the name of the dataset that you're updating
- HOURS: the time travel window's duration in hours
Click Run.
For more information about how to run queries, see Running interactive queries.
bq
Use the bq update command with the --max_time_travel_hours flag to specify the time travel window when altering a dataset. The --max_time_travel_hours value must be an integer multiple of 24 between 48 (2 days) and 168 (7 days): 48, 72, 96, 120, 144, or 168.
bq update \
--dataset=true --max_time_travel_hours=HOURS \
PROJECT_ID:DATASET_NAME
API
Call the datasets.patch or datasets.update method with a defined dataset resource in which you have specified a value for the maxTimeTravelHours field. The maxTimeTravelHours value must be an integer multiple of 24 between 48 (2 days) and 168 (7 days): 48, 72, 96, 120, 144, or 168.
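The rule for valid time travel windows is easy to check before you submit a value through any of these interfaces. The following Python sketch is a minimal illustration. The function name is_valid_time_travel_hours is invented for this example.

```python
def is_valid_time_travel_hours(hours):
    """Return True if hours is a multiple of 24 between 48 and 168."""
    # Reject non-integers; bool is excluded because it subclasses int.
    if isinstance(hours, bool) or not isinstance(hours, int):
        return False
    return hours % 24 == 0 and 48 <= hours <= 168

print([h for h in range(24, 193, 24) if is_valid_time_travel_hours(h)])
# [48, 72, 96, 120, 144, 168]
```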
Update storage billing models
You can update a dataset's storage billing model to use physical bytes instead of the default logical bytes when calculating storage charges. When you change a dataset's billing model, it takes 24 hours for the change to take effect. If you change a dataset's storage billing model to use physical bytes, you can't change it back to using logical bytes.
For more information, see Dataset storage billing models.
SQL
To update the billing model for a dataset, use the ALTER SCHEMA SET OPTIONS statement and set the storage_billing_model option to physical:
In the Google Cloud console, go to the BigQuery page.
In the query editor, enter the following statement:
ALTER SCHEMA DATASET_NAME
SET OPTIONS(
  storage_billing_model = 'physical');
Replace DATASET_NAME with the name of the dataset that you are changing.
Click Run.
For more information about how to run queries, see Running interactive queries.
To update the storage billing model for all datasets in a project, use the following SQL query:
FOR record IN
  (SELECT CONCAT(catalog_name, '.', schema_name) AS dataset_path
   FROM PROJECT_ID.INFORMATION_SCHEMA.SCHEMATA)
DO
  EXECUTE IMMEDIATE
    "ALTER SCHEMA `" || record.dataset_path || "` SET OPTIONS(storage_billing_model = 'physical')";
END FOR;
Replace PROJECT_ID with your project ID.
bq
To update the billing model for a dataset, use the bq update command and set the --storage_billing_model flag to PHYSICAL:
bq update -d --storage_billing_model=PHYSICAL PROJECT_ID:DATASET_ID
Replace the following:
- PROJECT_ID: your project ID
- DATASET_ID: the ID of the dataset that you're updating
API
Call the datasets.update method with a defined dataset resource where the storageBillingModel field is set to PHYSICAL.
The following example shows how to call datasets.update using curl:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-L -X PUT https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT_ID/datasets/DATASET_ID \
-d '{"datasetReference": {"projectId": "PROJECT_ID", "datasetId": "DATASET_ID"}, "storageBillingModel": "PHYSICAL"}'
Replace the following:
- PROJECT_ID: your project ID
- DATASET_ID: the ID of the dataset that you're updating
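For comparison, the same request can be assembled in Python with the standard library. The following sketch mirrors the curl call and only builds the request. The function name build_billing_model_request is invented for this example, and obtaining the access token (for example, with gcloud auth print-access-token) is left to the caller.

```python
import json
import urllib.request

def build_billing_model_request(project_id, dataset_id, access_token):
    """Build a datasets.update (PUT) request that sets PHYSICAL billing."""
    url = (
        "https://bigquery.googleapis.com/bigquery/v2/"
        f"projects/{project_id}/datasets/{dataset_id}"
    )
    body = {
        "datasetReference": {"projectId": project_id, "datasetId": dataset_id},
        "storageBillingModel": "PHYSICAL",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

To send the request, pass the returned object to urllib.request.urlopen.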
Dataset security
To control access to datasets in BigQuery, see Controlling access to datasets. For information about data encryption, see Encryption at rest.
Next steps
- For more information about creating datasets, see Creating datasets.
- For more information about managing datasets, see Managing datasets.