Introduction to datasets
This page provides an overview of datasets in BigQuery.
A dataset is contained within a specific project. Datasets are top-level containers that are used to organize and control access to your tables and views. A table or view must belong to a dataset, so you need to create at least one dataset before loading data into BigQuery.
Use the format projectname.datasetname to fully qualify a dataset name when using GoogleSQL, or the format projectname:datasetname to fully qualify a dataset name when using the bq command-line tool.
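The two naming formats differ only in the separator. The following minimal Python sketch shows the difference; the helper name qualify_dataset and the sample project and dataset names are hypothetical and are not part of any BigQuery library.

```python
def qualify_dataset(project: str, dataset: str, tool: str = "sql") -> str:
    """Return a fully qualified dataset name.

    tool="sql" -> project.dataset  (GoogleSQL)
    tool="bq"  -> project:dataset  (bq command-line tool)
    """
    # Hypothetical helper: only the separator changes between the two formats.
    separators = {"sql": ".", "bq": ":"}
    if tool not in separators:
        raise ValueError(f"unknown tool: {tool!r}")
    return f"{project}{separators[tool]}{dataset}"


print(qualify_dataset("myproject", "mydataset"))        # myproject.mydataset
print(qualify_dataset("myproject", "mydataset", "bq"))  # myproject:mydataset
```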
You specify a location for storing your BigQuery data when you create a dataset. For a list of BigQuery dataset locations, see BigQuery locations. After you create the dataset, the location cannot be changed, but you can copy datasets to different locations, or manually move (recreate) the dataset in a different location.
BigQuery processes queries in the same location as the dataset that contains the tables you're querying. BigQuery stores your data in the selected location in accordance with the Service Specific Terms.
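As an illustration of choosing the location at creation time, the following Python sketch builds a GoogleSQL CREATE SCHEMA statement that sets the location option. The function name create_dataset_ddl and the sample names are hypothetical. The sketch assumes trusted input and does no escaping, and it only builds the statement text without running it.

```python
def create_dataset_ddl(project: str, dataset: str, location: str) -> str:
    """Build a CREATE SCHEMA statement that sets the dataset location."""
    # Hypothetical helper. Backticks quote the project-qualified dataset name;
    # the location is passed as a string option and is fixed once the dataset exists.
    return (
        f"CREATE SCHEMA `{project}.{dataset}` "
        f"OPTIONS (location = '{location}')"
    )


print(create_dataset_ddl("myproject", "mydataset", "US"))
```

Because the location cannot be changed later, it is worth confirming the value before running a statement like this.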
BigQuery datasets are subject to the following limitations:
- The dataset location can only be set at creation time. After a dataset is created, its location cannot be changed.
- All tables that are referenced in a query must be stored in datasets in the same location.
- When you copy a table, the datasets that contain the source table and destination table must reside in the same location.
- Dataset names must be unique for each project.
For more information on dataset quotas and limits, see Quotas and limits.
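The same-location rule for queries can be checked before a query is submitted. The following Python sketch is a rough illustration under stated assumptions: the DATASET_LOCATIONS mapping, the helper names, and the table names are all hypothetical, and in practice the locations would come from dataset metadata rather than a hard-coded dictionary.

```python
from typing import Dict, Iterable

# Hypothetical mapping of dataset -> location, for illustration only.
DATASET_LOCATIONS: Dict[str, str] = {
    "myproject.sales": "US",
    "myproject.marketing": "US",
    "myproject.eu_orders": "EU",
}


def dataset_of(table: str) -> str:
    """Return 'project.dataset' from a 'project.dataset.table' name."""
    project, dataset, _table = table.split(".")
    return f"{project}.{dataset}"


def same_location(tables: Iterable[str]) -> bool:
    """Return True if all referenced tables are in datasets in one location."""
    locations = {DATASET_LOCATIONS[dataset_of(t)] for t in tables}
    return len(locations) <= 1


print(same_location(["myproject.sales.orders", "myproject.marketing.leads"]))   # True
print(same_location(["myproject.sales.orders", "myproject.eu_orders.orders"]))  # False
```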
Dataset storage billing models
When you create a dataset, the storage used by that dataset is billed to you using logical bytes as the default unit of consumption. However, when you create a dataset using SQL or the BigQuery API, you can choose to use physical bytes for billing instead. You can also change an existing dataset's storage billing model to use physical bytes.
Once you change a dataset's storage billing model to use physical bytes, you can't change it back to using logical bytes again.
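The one-way nature of this change can be sketched in Python as a guard around building an ALTER SCHEMA statement that sets the storage_billing_model option. The function name alter_billing_model_ddl and the sample dataset name are hypothetical. The sketch assumes trusted input and only builds the statement text without running it.

```python
def alter_billing_model_ddl(dataset: str, current: str, requested: str) -> str:
    """Build an ALTER SCHEMA statement, refusing a physical -> logical switch."""
    current, requested = current.upper(), requested.upper()
    valid = {"LOGICAL", "PHYSICAL"}
    if current not in valid or requested not in valid:
        raise ValueError("billing model must be LOGICAL or PHYSICAL")
    if current == "PHYSICAL" and requested == "LOGICAL":
        # Per the text above, the change to physical bytes cannot be reverted.
        raise ValueError("cannot switch from PHYSICAL back to LOGICAL")
    return (
        f"ALTER SCHEMA `{dataset}` "
        f"SET OPTIONS (storage_billing_model = '{requested}')"
    )


print(alter_billing_model_ddl("myproject.mydataset", "logical", "physical"))
```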
When you set your storage billing model to use physical bytes, the total storage costs you are billed for include the bytes used for time travel storage. You can configure the time travel window to balance storage costs with your data retention needs. For more information on forecasting your storage costs, see Forecast storage billing.
The physical storage billing model is only available for your datasets if your organization does not have any active flat-rate slot commitments. You can't enroll datasets in physical storage billing while any flat-rate commitment for your organization is still active.
You are not charged for creating, updating, or deleting a dataset.
For more information on BigQuery pricing, see Pricing.
To control access to datasets in BigQuery, see Controlling access to datasets. For information about data encryption, see Encryption at rest.
- For more information on creating datasets, see Creating datasets.
- For more information on assigning access controls to datasets, see Controlling access to datasets.