Introduction to datasets
This page provides an overview of datasets in BigQuery.
A dataset is contained within a specific project. Datasets
are top-level containers that are used to organize and control access to your
tables and views. A table
or view must belong to a dataset, so you need to create at least one dataset before
loading data into BigQuery.
Use the format
projectname.datasetname to fully qualify a dataset name when
using GoogleSQL, or the format
projectname:datasetname to fully qualify
a dataset name when using the bq command-line tool.
You specify a location for storing your BigQuery data when you create a dataset. For a list of BigQuery dataset locations, see BigQuery locations. After you create the dataset, the location cannot be changed, but you can copy datasets to different locations, or manually move (recreate) the dataset in a different location.
BigQuery processes queries in the same location as the dataset that contains the tables you're querying. BigQuery stores your data in the selected location in accordance with the Service Specific Terms.
BigQuery datasets are subject to the following limitations:
- The dataset location can only be set at creation time. After a dataset is created, its location cannot be changed.
All tables that are referenced in a query must be stored in datasets in the same location.
When you copy a table, the datasets that contain the source table and destination table must reside in the same location.
Dataset names must be unique for each project.
For more information on dataset quotas and limits, see Quotas and limits.
Datasets use time travel in conjunction with the fail-safe period to retain deleted and modified data for a short time, in case you need to recover it. For more information, see Data retention with time travel and fail-safe.
Storage billing models
When you create a dataset, the storage used by that dataset is billed to you using logical bytes as the default unit of consumption. However, you can choose to use physical bytes for billing instead. You can also change an existing dataset's storage billing model to use physical bytes.
When you change a dataset's billing model, it takes 24 hours for the change to take effect.
Once you change a dataset's storage billing model, you must wait 14 days before you can change the storage billing model again.
When you set your storage billing model to use physical bytes, the total active storage costs you are billed for include the bytes used for time travel and fail-safe storage. You can configure the time travel window to balance storage costs with your data retention needs. For more information on forecasting your storage costs, see Forecast storage billing.
The dataset storage billing model is only available for your datasets if your organization does not have any existing flat-rate slot commitments located in the same region as the dataset. Your organization can enroll datasets for physical storage billing when there are no flat-rate commitments located in the same region as the dataset.
You are not charged for creating, updating, or deleting a dataset.
For more information on BigQuery pricing, see Pricing.
- For more information on creating datasets, see Creating datasets.
- For more information on assigning access controls to datasets, see Controlling access to datasets.