How to manage query performance for clustered tables

Problem

How can you manage query performance when choosing between partitioned clustered tables and unpartitioned clustered tables?

Environment

  • BigQuery

Solution

Choose one of the following options depending on the needs of your clustered tables:
  • If a given submeterid has only a few rows, clustering may be beneficial: it groups those rows together so that queries can skip the large swaths of data belonging to other submeterid values.
  • If a given submeterid has a lot of data spread across many years, partitioning may be beneficial: it limits the scan to the partitions for the relevant days and ignores long periods of time.
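
As an illustration, the two layouts could be declared with BigQuery DDL along the following lines. This is only a sketch: the table names, the reading_ts and value columns, and the helper function are hypothetical, and only submeterid comes from the scenario above.

```python
# Hypothetical schema; only submeterid appears in the article itself.
COLUMNS = "submeterid STRING, reading_ts TIMESTAMP, value FLOAT64"


def create_table_ddl(table_name: str, partition_by_day: bool) -> str:
    """Build a CREATE TABLE statement that is always clustered by submeterid
    and optionally partitioned by the day of reading_ts."""
    lines = [f"CREATE TABLE {table_name} ({COLUMNS})"]
    if partition_by_day:
        # One partition per calendar day of the (assumed) timestamp column.
        lines.append("PARTITION BY DATE(reading_ts)")
    lines.append("CLUSTER BY submeterid")
    return "\n".join(lines)


# Unpartitioned clustered table (first option).
clustered_only = create_table_ddl("mydataset.readings_clustered", False)
# Partitioned clustered table (second option).
partitioned_clustered = create_table_ddl("mydataset.readings_partitioned", True)

print(clustered_only)
print(partitioned_clustered)
```

Queries against the partitioned variant benefit most when they filter on the partitioning column (here, the assumed reading_ts) in addition to submeterid.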

There is a sweet spot between these two cases, but where it lies depends on the data layout.
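
To make the dependence on data layout concrete, here is a deliberately simplified toy model, not BigQuery's actual storage engine. It assumes rows are stored in fixed-size blocks sorted by submeterid, and that a query must read every block whose submeterid range could contain the requested value. The block size, the sample data, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

BLOCK_SIZE = 100  # hypothetical number of rows per storage block


def blocks(rows, size=BLOCK_SIZE):
    """Sort rows by submeterid and split them into fixed-size blocks."""
    ordered = sorted(rows, key=lambda r: r[1])
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def scanned_in_blocks(block_list, meter):
    """Count rows in blocks whose submeterid range could contain `meter`."""
    return sum(len(b) for b in block_list if b[0][1] <= meter <= b[-1][1])


def scan_clustered_only(rows, meter):
    """Unpartitioned clustered table: one set of blocks for all rows."""
    return scanned_in_blocks(blocks(rows), meter)


def scan_partitioned_clustered(rows, meter, first_day, last_day):
    """Day-partitioned clustered table: prune partitions, then blocks."""
    by_day = defaultdict(list)
    for row in rows:
        by_day[row[0]].append(row)
    return sum(scanned_in_blocks(blocks(part), meter)
               for day, part in by_day.items()
               if first_day <= day <= last_day)


# Sample data as (day, submeterid) pairs: submeterid 1 has 50 rows on each
# of 1000 days; submeterids 2..51 have one row on every 100th day.
rows = [(day, 1) for day in range(1000) for _ in range(50)]
rows += [(day, m) for day in range(0, 1000, 100) for m in range(2, 52)]

# A submeterid with few rows, queried over all days.
print(scan_clustered_only(rows, 7),
      scan_partitioned_clustered(rows, 7, 0, 999))
# A submeterid with many rows, queried for the first ten days only.
print(scan_clustered_only(rows, 1),
      scan_partitioned_clustered(rows, 1, 0, 9))
```

In this toy model the small submeterid is cheaper to read from the clustered-only layout, while the large submeterid with a narrow date range is cheaper to read from the partitioned layout. Real tables will behave differently, so compare the bytes processed by representative queries on your own data.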

Cause

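Partitioning and clustering reduce the amount of data scanned in different ways, so choosing between them is a BigQuery query optimization trade-off.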