Optimize storage in BigQuery

This page provides best practices for optimizing BigQuery storage.

Use the expiration settings to remove unneeded tables and partitions

Best practice: Configure the default table expiration for your datasets, configure the expiration time for your tables, and configure the partition expiration for partitioned tables.

You can control storage costs and optimize storage usage by setting the default table expiration for newly created tables in a dataset. When a table expires, it gets deleted along with all of the data that the table contains. If you set the property when the dataset is created, any table created in the dataset is deleted after the expiration period. If you set the property after the dataset is created, only new tables are deleted after the expiration period.

For example, if you set the default table expiration to 7 days, older data is automatically deleted after 1 week.

This option is useful if you need access to only the most recent data. It is also useful if you are experimenting with data and do not need to preserve it.

If your tables are partitioned by date, the dataset's default table expiration applies to the individual partitions. You can also control partition expiration using the time_partitioning_expiration flag in the bq command-line tool or the expirationMs configuration setting in the API. When a partition expires, data in the partition is deleted but the partitioned table is not dropped even if the table is empty. For example, the following command expires partitions after 3 days:

bq mk \
--time_partitioning_type=DAY \
--time_partitioning_expiration=259200 \

Take advantage of long-term storage

Best practice: Keep your data in BigQuery.

When you load data into BigQuery from Cloud Storage, you are not charged for the load operation, but you do incur charges for storing the data in Cloud Storage. After the data is loaded into BigQuery, the data is subject to BigQuery's storage pricing.

Rather than exporting your older data to another storage option (such as Cloud Storage), take advantage of BigQuery's long-term storage pricing.

If you have a table that is not edited for 90 consecutive days, the price of storage for that table automatically drops by 50 percent to $0.01 per GB, per month.

Each partition of a partitioned table is considered separately for long-term storage pricing. If a partition hasn't been modified in the last 90 days, the data in that partition is considered long term storage and is charged at the discounted price.

Use the pricing calculator to estimate storage costs

Best practice: Estimate your storage costs using the Google Cloud Pricing Calculator.

To estimate storage costs in the Google Cloud Platform Pricing Calculator, enter the number of bytes that are stored as MB, GB, TB, or PB. BigQuery provides 10 GB of storage free per month.

Pricing calculator.

Use the same organization for managing BigQuery operations

Best practice: Before running a BigQuery job, check that both the source and destination tables belong to the same organization.

You can significantly improve the latency of read and write operations in running BigQuery operations by ensuring that both the source and destination tables are in the same organization.