Introduction to datasets

This page provides an overview of datasets in BigQuery.

Datasets

A dataset is contained within a specific project. Within that project, datasets are the top-level containers that you use to organize and control access to your tables and views. A table or view must belong to a dataset, so you must create at least one dataset before you load data into BigQuery. To fully qualify a dataset name, use the format projectname.datasetname in GoogleSQL, or the format projectname:datasetname with the bq command-line tool.
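The two naming formats differ only in the separator, which can be sketched in a few lines of Python. The function name qualify_dataset and the style labels "sql" and "bq" are illustrative assumptions, not part of any BigQuery client library.

```python
def qualify_dataset(project: str, dataset: str, style: str = "sql") -> str:
    """Return a fully qualified dataset name.

    style="sql" -> projectname.datasetname  (GoogleSQL)
    style="bq"  -> projectname:datasetname  (bq command-line tool)

    Hypothetical helper for illustration only.
    """
    separators = {"sql": ".", "bq": ":"}
    if style not in separators:
        raise ValueError(f"unknown style: {style!r}")
    return f"{project}{separators[style]}{dataset}"


print(qualify_dataset("myproject", "mydataset"))        # myproject.mydataset
print(qualify_dataset("myproject", "mydataset", "bq"))  # myproject:mydataset
```

The project and dataset names here are placeholders; substitute your own.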

Dataset location

You specify a location for storing your BigQuery data when you create a dataset. For a list of BigQuery dataset locations, see BigQuery locations. After you create the dataset, you cannot change its location, but you can copy the dataset to a different location or manually move (recreate) it in a different location.

BigQuery processes queries in the same location as the dataset that contains the tables you're querying. BigQuery stores your data in the selected location in accordance with the Service Specific Terms.

Dataset limitations

BigQuery datasets are subject to the following limitations:

  • The dataset location can be set only at creation time. After a dataset is created, its location cannot be changed.
  • All tables that are referenced in a query must be stored in datasets in the same location.
  • When you copy a table, the datasets that contain the source table and destination table must reside in the same location.
  • Dataset names must be unique for each project.
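The same-location rule for queries can be illustrated with a small pre-flight check. This is a minimal sketch under stated assumptions: the dataset-to-location mapping is supplied by the caller, the table references use the project.dataset.table form, and the function names and sample data are hypothetical. It does not call BigQuery.

```python
def query_locations(table_refs, dataset_locations):
    """Return the set of locations touched by the referenced tables.

    table_refs: iterable of "project.dataset.table" strings.
    dataset_locations: dict mapping "project.dataset" to a location name.
    """
    locations = set()
    for ref in table_refs:
        project, dataset, _table = ref.split(".")
        locations.add(dataset_locations[f"{project}.{dataset}"])
    return locations


def same_location(table_refs, dataset_locations):
    """True when every referenced table lives in a single location."""
    return len(query_locations(table_refs, dataset_locations)) <= 1


# Hypothetical datasets and locations, for illustration only.
locations = {"myproject.sales": "US", "myproject.eu_sales": "EU"}

print(same_location(
    ["myproject.sales.orders", "myproject.sales.customers"], locations))  # True
print(same_location(
    ["myproject.sales.orders", "myproject.eu_sales.orders"], locations))  # False
```

The same check applies to a table copy if you pass the source and destination tables as the two references.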

Dataset quotas

For more information on dataset quotas and limits, see Quotas and limits.

Dataset storage billing models

When you create a dataset, the storage used by that dataset is billed to you using logical bytes as the default unit of consumption. However, when you create a dataset using SQL or the BigQuery API, you can choose to use physical bytes for billing instead. You can also change an existing dataset's storage billing model to use physical bytes.

After you change a dataset's storage billing model to use physical bytes, you can't change it back to logical bytes.
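As a rough sketch of the SQL route, the following Python builds the GoogleSQL statements that set the storage_billing_model dataset option at creation time or on an existing dataset. The function names are hypothetical and the project and dataset names are placeholders. The code only produces statement text and runs nothing against BigQuery, so check the statements against the DDL reference before use.

```python
VALID_MODELS = {"LOGICAL", "PHYSICAL"}


def _check_model(billing_model: str) -> str:
    """Normalize and validate the billing model name."""
    model = billing_model.upper()
    if model not in VALID_MODELS:
        raise ValueError(f"unknown storage billing model: {billing_model!r}")
    return model


def create_schema_ddl(project: str, dataset: str, billing_model: str = "LOGICAL") -> str:
    """Build a CREATE SCHEMA statement with an explicit billing model."""
    model = _check_model(billing_model)
    return (
        f"CREATE SCHEMA `{project}.{dataset}` "
        f"OPTIONS (storage_billing_model = '{model}')"
    )


def alter_schema_ddl(project: str, dataset: str, billing_model: str) -> str:
    """Build an ALTER SCHEMA statement that changes the billing model."""
    model = _check_model(billing_model)
    return (
        f"ALTER SCHEMA `{project}.{dataset}` "
        f"SET OPTIONS (storage_billing_model = '{model}')"
    )


print(create_schema_ddl("myproject", "mydataset", "PHYSICAL"))
print(alter_schema_ddl("myproject", "mydataset", "PHYSICAL"))
```

Validating the model name up front keeps a typo from reaching the statement text, which matters because the change to physical bytes is described above as irreversible.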

When you set your storage billing model to use physical bytes, the total storage costs you are billed for include the bytes used for time travel storage. You can configure the time travel window to balance storage costs with your data retention needs. For more information on forecasting your storage costs, see Forecast storage billing.

Eligibility criteria:

The physical storage billing model is only available for your datasets if your organization does not have any active flat-rate slot commitments. You will not be able to enroll datasets for physical storage billing until all flat-rate commitments for your organization are no longer active.

Dataset pricing

You are not charged for creating, updating, or deleting a dataset.

For more information on BigQuery pricing, see Pricing.

Dataset security

To control access to datasets in BigQuery, see Controlling access to datasets. For information about data encryption, see Encryption at rest.
