[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-08-17。"],[[["\u003cp\u003eThis document outlines best practices for optimizing BigQuery storage to enhance query performance, which also impacts cost control.\u003c/p\u003e\n"],["\u003cp\u003eClustering frequently used columns in tables reduces the amount of data scanned, improving query performance.\u003c/p\u003e\n"],["\u003cp\u003ePartitioning large tables by characteristics like time or integer columns improves performance and reduces costs by limiting the data read during a query.\u003c/p\u003e\n"],["\u003cp\u003eSetting expiration policies for datasets, tables, and partitions automatically deletes old data, optimizing storage usage and costs.\u003c/p\u003e\n"],["\u003cp\u003eAggregating long-term data instead of storing all row-level details can significantly reduce storage charges while keeping key metrics available.\u003c/p\u003e\n"]]],[],null,["# Optimize storage for query performance\n======================================\n\nThis page provides best practices for optimizing BigQuery\nstorage for query performance. You can also [optimize storage for cost](/bigquery/docs/best-practices-costs#control-storage-cost).\nWhile these best practices are primarily focused on tables using BigQuery\nstorage, they can be applied to external tables as well.\n\nBigQuery stores data in columnar format. Column-oriented\ndatabases are optimized for analytic workloads that aggregate data over a very\nlarge number of records. As columns have typically more redundancy than rows,\nthis characteristic allows for greater data compression by using techniques such\nas run-length encoding. For more information about how BigQuery\nstores data, see [Overview of BigQuery storage](/bigquery/docs/storage_overview).\nOptimizing BigQuery storage improves\n[query performance](/bigquery/docs/best-practices-performance-compute)\nand [controls cost](/bigquery/docs/best-practices-costs).\n\nBigQuery provides details about the storage consumption of your\nresources.\nTo view the table storage metadata, query the following `INFORMATION_SCHEMA` views:\n\n- [`INFORMATION_SCHEMA.TABLE_STORAGE`](/bigquery/docs/information-schema-table-storage)\n- [`INFORMATION_SCHEMA.TABLE_STORAGE_BY_ORGANIZATION`](/bigquery/docs/information-schema-table-storage-by-organization)\n- [`INFORMATION_SCHEMA.TABLE_STORAGE_USAGE_TIMELINE`](/bigquery/docs/information-schema-table-storage-usage)\n- [`INFORMATION_SCHEMA.TABLE_STORAGE_USAGE_TIMELINE_BY_ORGANIZATION`](/bigquery/docs/information-schema-table-storage-usage-by-organization)\n\nCluster table data\n------------------\n\n**Best practice:** Create clustered tables.\n\nTo optimize storage for queries, start by clustering table data. By clustering\nfrequently used columns, you can reduce the total volume of data scanned by the\nquery. For information about how to create clusters, see\n[Create and use clustered tables](/bigquery/docs/creating-clustered-tables).\n\nPartition table data\n--------------------\n\n**Best practice:** Divide large tables with partitions.\n\nWith partitions, you can group and sort your data by a set of defined\ncolumn characteristics, such as an integer column, a time-unit column, or the\ningestion time. Partitioning improves the query performance and control\ncosts by reducing the number of bytes read by a query.\n\nFor more information about partitions, see [Introduction to partitioned tables](/bigquery/docs/partitioned-tables).\n\nUse the table and partition expiration settings\n-----------------------------------------------\n\n**Best practice:** To optimize storage, configure the default expiration\nsettings for [datasets](/bigquery/docs/updating-datasets#table-expiration),\n[tables](/bigquery/docs/managing-tables#updating_a_tables_expiration_time),\nand [partitioned tables](/bigquery/docs/managing-partitioned-tables#partition-expiration).\n\nYou can control storage costs and optimize storage usage by setting the default\ntable expiration for newly created tables in a dataset. When a table expires,\nit gets deleted along with all of the data that the table contains. If you set\nthe property when the dataset is created, any table created in the dataset is\ndeleted after the expiration period. If you set the property after the dataset\nis created, only new tables are deleted after the expiration period.\n\nFor example, if you set the default table expiration to seven days, older data is\nautomatically deleted after one week.\n\nThis option is useful if you need access to only the most recent data. It is\nalso useful if you are experimenting with data and don't need to preserve it.\n\nIf your tables are partitioned by date, the dataset's default table expiration\napplies to the individual partitions. You can also control partition expiration\nusing the `time_partitioning_expiration` flag in the bq command-line tool or\nthe `expirationMs` configuration setting in the API. When a partition expires,\ndata in the partition is deleted but the partitioned table is not dropped even\nif the table is empty.\n\nFor example, the following command expires partitions after three days: \n\n```bash\nbq mk \\\n--time_partitioning_type=DAY \\\n--time_partitioning_expiration=259200 \\\nproject_id:dataset.table\n```\n\nAggregate long-term data\n------------------------\n\n**Best practice:** Identify if row-level data needs to be stored long term, and\nif not, only store aggregated data long term.\n\nIn many cases, details contained in transactional or row-level data are useful\nin the short term, but are referenced less over the long term. In these\nsituations, you can build aggregation queries to compute and store the metrics\nassociated with this data, and then use table or partition expiration to\nsystematically remove the row-level data. This reduces storage charges\nwhile keeping metrics available for long-term consumption.\n\nWhat's next\n-----------\n\n- Learn how to [optimize cost](/bigquery/docs/best-practices-costs).\n- Learn how to [optimize query](/bigquery/docs/best-practices-performance-compute).\n- Learn how to [optimize functions](/bigquery/docs/best-practices-performance-functions)."]]