Garbage collection for sequential numbers in timestamps

You may have a reason, not related to garbage collection, to assign sequential numbers to the timestamp property for a cell, rather than assigning a date and time. This page describes Bigtable garbage collection for data with this type of artificial timestamp.

Before you read this page, you should read the overview of garbage collection, including the description of real and artificial timestamps.

Number of versions

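If your timestamps are sequence numbers, base your garbage collection policy on the number of versions. This means that you specify the number of cells to retain. An age-based garbage collection policy is unsafe if you use sequential numbers instead of real timestamps, because age-based policies remove data based on the timestamp. Bigtable interprets a timestamp as a number of microseconds since the Unix epoch, so a small sequence number looks like a date far in the past, and the cell can become eligible for deletion much earlier than you intend.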
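The following is a minimal sketch of that reasoning, not Bigtable code. It models the two policy types in plain Python, assuming that timestamps are read as microseconds since the Unix epoch. The helper names (as_datetime, expired_by_age, keep_newest), the 30-day age limit, and the fixed "now" are hypothetical and exist only for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def as_datetime(timestamp_micros):
    """Interpret a cell timestamp as microseconds since the Unix epoch."""
    return EPOCH + timedelta(microseconds=timestamp_micros)


def expired_by_age(timestamp_micros, max_age, now):
    """Model of an age-based rule: a cell older than max_age is eligible for removal."""
    return now - as_datetime(timestamp_micros) > max_age


def keep_newest(timestamps, max_versions):
    """Model of a version-based rule: retain the max_versions highest timestamps."""
    return sorted(timestamps, reverse=True)[:max_versions]


# Hypothetical data: five cells in one column, stamped with sequence numbers.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
sequence_numbers = [1, 2, 3, 4, 5]

# Every sequence number maps to a moment just after 1970-01-01, so a
# 30-day age limit treats all five cells as expired.
age_survivors = [
    t for t in sequence_numbers
    if not expired_by_age(t, timedelta(days=30), now)
]

# A version-based rule ignores what the number means as a date and
# keeps the two highest sequence numbers.
version_survivors = keep_newest(sequence_numbers, 2)
```

In this model the age-based rule retains nothing, while the version-based rule retains the cells numbered 5 and 4.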

Advantages of storing sequential numbers in timestamps

  • You can use monotonically increasing timestamps if you need them.

Disadvantages of storing sequential numbers in timestamps

  • You cannot switch to age-based garbage collection.

  • Because your timestamps aren't an actual date and time, you cannot use the cells' timestamps for any other use case, such as determining how old a value is. As a workaround, you can write a real timestamp to a separate column, but this will increase the amount of data you store.

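  • Because garbage collection is asynchronous, cells beyond the version limit can remain readable for some time after they become eligible for removal, so you should still always use filters when you read the data.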
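The last two points can be sketched together in plain Python. This is an illustrative model under stated assumptions, not client library code: the row layout, the column names "payload" and "written_at", and the latest_cells helper are all hypothetical. The helper stands in for a read filter that limits the number of cells returned per column.

```python
def latest_cells(cells, limit):
    """Return at most `limit` cells per column, highest timestamp first."""
    return {
        column: sorted(versions, key=lambda cell: cell[0], reverse=True)[:limit]
        for column, versions in cells.items()
    }


# Hypothetical row: each column maps to a list of (sequence_number, value) cells.
# Assume the policy retains 2 versions, but the cell with sequence number 1
# has not been garbage collected yet, so an unfiltered read would return it.
row = {
    "payload": [(1, "a"), (2, "b"), (3, "c")],
    # Workaround column: the real write time is stored as a value,
    # under the same sequence number as the payload it describes.
    "written_at": [
        (1, "2024-01-01T00:00:00Z"),
        (2, "2024-01-02T00:00:00Z"),
        (3, "2024-01-03T00:00:00Z"),
    ],
}

# Filter at read time so the result matches what the policy promises.
visible = latest_cells(row, limit=2)

# Look up the real write time of the newest payload cell.
write_times = dict(visible["written_at"])
newest_seq, newest_value = visible["payload"][0]
newest_written_at = write_times[newest_seq]
```

The extra column roughly doubles the number of cells written per update, which is the storage cost that the workaround carries.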
