Keeping only the most recent value

All Cloud Bigtable client libraries let you use filters to read the most recent value, or cell, at a given row and column. In some cases, you might not ever need to read older versions of your data. To avoid paying to store older data that you don't need, you can delete the data using the strategy on this page.

Before you read this page, see the garbage collection overview.

Timestamp of zero

If you only ever want to read the most recent value in a column family's columns, and you don't want to wait for garbage collection to remove older cells, you can set the timestamp to zero (1970-01-01 00:00:00 UTC) every time you write data to the column family. A cell is identified by its row, column, and timestamp, so a write that reuses the same timestamp replaces the existing cell instead of adding a new version. In this scenario, new writes immediately hide old ones, and reads always return a single value for each column. Depending on how recently the previous data was written, you might still need to wait for a compaction before older cells stop taking up space in the table and incurring storage costs.
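The following sketch illustrates the overwrite behavior described above. It is a toy in-memory model, not the Bigtable client API: the `ToyTable` class, its methods, and the row and column names are all hypothetical and exist only to show how reusing timestamp zero differs from writing with real timestamps.

```python
class ToyTable:
    """Toy in-memory model of cell versioning (not the Bigtable client API)."""

    def __init__(self):
        # Maps (row_key, family, qualifier) -> {timestamp_micros: value}.
        self._cells = {}

    def write(self, row_key, family, qualifier, value, timestamp_micros):
        # A write with the same coordinates and the same timestamp
        # replaces the existing cell instead of adding a new version.
        versions = self._cells.setdefault((row_key, family, qualifier), {})
        versions[timestamp_micros] = value

    def read_all_versions(self, row_key, family, qualifier):
        # Return every stored version as (timestamp_micros, value), newest first.
        versions = self._cells.get((row_key, family, qualifier), {})
        return sorted(versions.items(), reverse=True)


table = ToyTable()

# Real timestamps: each write adds another version of the cell.
table.write("user#1", "stats", "score", b"10", 1_700_000_000_000_000)
table.write("user#1", "stats", "score", b"20", 1_700_000_001_000_000)
versioned = table.read_all_versions("user#1", "stats", "score")

# Timestamp zero on every write: the second write replaces the first.
table.write("user#2", "stats", "score", b"10", 0)
table.write("user#2", "stats", "score", b"20", 0)
latest_only = table.read_all_versions("user#2", "stats", "score")
```

In this model, `versioned` holds two cells while `latest_only` holds one, which is why a read of the zero-timestamp column needs no filter to get the most recent value.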

Pros

  • You don't need to use filters when you read the data, because you can only read the single most recent value of a column.
  • You don't need to set a garbage collection policy in this case, because you're already removing old data every time you write over an existing cell.

Cons

  • Previous values for a cell are immediately overwritten and cannot be retrieved.
  • Because the timestamps no longer represent a real date and time, you cannot use the cells' timestamps for any other purpose, such as determining how old a value is. As a workaround, you can write a real timestamp to a separate column, but this increases the amount of data you store.
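The workaround in the last point can be sketched as follows. This is a minimal illustration under stated assumptions, not a client-library call: the `build_mutations` and `decode_written_at` helpers, the `score` and `score_written_at` column names, and the 8-byte big-endian encoding of microseconds are all hypothetical choices made for this example.

```python
import struct
import time


def build_mutations(value, now_micros=None):
    """Return (qualifier, value, timestamp_micros) tuples for one logical write.

    Both cells use timestamp zero, so each new write replaces the old pair.
    The real write time travels in the value of a second, hypothetical column.
    """
    if now_micros is None:
        now_micros = time.time_ns() // 1_000
    return [
        ("score", value, 0),
        # Assumed encoding: signed 64-bit big-endian microseconds since epoch.
        ("score_written_at", struct.pack(">q", now_micros), 0),
    ]


def decode_written_at(raw):
    """Decode the microsecond timestamp stored by build_mutations."""
    return struct.unpack(">q", raw)[0]


mutations = build_mutations(b"20", now_micros=1_700_000_000_000_000)
written_at = decode_written_at(mutations[1][1])
```

The extra column costs 8 bytes of value plus its qualifier for every row, which is the storage increase the point above refers to.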

What's next