Column Families

When creating a ColumnFamily, it is possible to set garbage collection rules for expired data.

When a rule is set, cells in the table that match it are deleted during periodic garbage collection (which executes opportunistically in the background).

The types MaxAgeGCRule, MaxVersionsGCRule, GCRuleUnion and GCRuleIntersection can all be used as the optional gc_rule argument in the ColumnFamily constructor. This value is then used in the create() and update() methods.

These rules can be nested arbitrarily, with a MaxAgeGCRule or MaxVersionsGCRule at the lowest level of the nesting:

import datetime

from google.cloud.bigtable.column_family import (
    GCRuleIntersection,
    GCRuleUnion,
    MaxAgeGCRule,
    MaxVersionsGCRule,
)

max_age = datetime.timedelta(days=3)
rule1 = MaxAgeGCRule(max_age)
rule2 = MaxVersionsGCRule(1)

# Make a composite that matches anything older than 3 days **AND**
# with more than 1 version.
rule3 = GCRuleIntersection(rules=[rule1, rule2])

# Make another composite that matches our previous intersection
# **OR** anything that has more than 3 versions.
rule4 = MaxVersionsGCRule(3)
rule5 = GCRuleUnion(rules=[rule3, rule4])
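To make the AND/OR semantics above concrete, here is a minimal, library-free sketch that models which versions of a single cell a nested rule would mark for deletion. The helper names (max_age, max_versions, intersection, union) and the representation of a cell as a list of timestamps are assumptions made for illustration. They are not part of google-cloud-bigtable, and the real service evaluates rules on the server side.

```python
import datetime

# Hypothetical model: a "rule" is a function that takes the version
# timestamps of one cell (plus the current time) and returns the set of
# timestamps that the rule marks for deletion.

def max_age(age):
    def rule(versions, now):
        # Matches every version older than the allowed age.
        return {ts for ts in versions if now - ts > age}
    return rule

def max_versions(n):
    def rule(versions, now):
        # Matches everything except the n most recent versions.
        newest_first = sorted(versions, reverse=True)
        return set(newest_first[n:])
    return rule

def intersection(*rules):
    def rule(versions, now):
        # A version is deleted only if every sub-rule matches it.
        return set.intersection(*[r(versions, now) for r in rules])
    return rule

def union(*rules):
    def rule(versions, now):
        # A version is deleted if any sub-rule matches it.
        return set.union(*[r(versions, now) for r in rules])
    return rule

now = datetime.datetime(2024, 1, 10)
# Five versions of one cell, written 0, 1, 2, 2.5 and 5 days ago.
versions = [now - datetime.timedelta(days=d) for d in (0, 1, 2, 2.5, 5)]

rule3 = intersection(max_age(datetime.timedelta(days=3)), max_versions(1))
rule5 = union(rule3, max_versions(3))

deleted_by_rule3 = sorted(rule3(versions, now))
deleted_by_rule5 = sorted(rule5(versions, now))
```

In this model the intersection only removes the 5-day-old version, while the union additionally removes the 2.5-day-old version because it falls outside the three most recent.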

User-friendly container for Google Cloud Bigtable Column Family.

class google.cloud.bigtable.column_family.ColumnFamily(column_family_id, table, gc_rule=None)

Bases: object

Representation of a Google Cloud Bigtable Column Family.

We can use a ColumnFamily to:

  • create() itself

  • update() itself

  • delete() itself

  • Parameters

    • column_family_id (str) – The ID of the column family. Must be of the form [_a-zA-Z0-9][-_.a-zA-Z0-9]*.

    • table (Table) – The table that owns the column family.

    • gc_rule (GarbageCollectionRule) – (Optional) The garbage collection settings for this column family.
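As a quick illustration of the ID format described above, the following sketch checks candidate IDs against that pattern with Python's re module. The helper name is_valid_column_family_id is hypothetical. The client library may not perform this check locally, and the service remains the final authority on what it accepts.

```python
import re

# Pattern taken from the column_family_id parameter description.
_COLUMN_FAMILY_ID_RE = re.compile(r"[_a-zA-Z0-9][-_.a-zA-Z0-9]*")

def is_valid_column_family_id(column_family_id):
    """Return True if the whole string matches the documented form."""
    return _COLUMN_FAMILY_ID_RE.fullmatch(column_family_id) is not None
```

For instance, "cf1" and "_stats.v2" pass this check, while "-bad" (leading hyphen) and the empty string do not.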

create()

Create this column family.

For example:

from google.cloud.bigtable import Client
from google.cloud.bigtable import column_family

client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)

column_family_id = "column_family_id1"
gc_rule = column_family.MaxVersionsGCRule(2)
column_family_obj = table.column_family(column_family_id, gc_rule=gc_rule)
column_family_obj.create()

delete()

Delete this column family.

For example:

from google.cloud.bigtable import Client
from google.cloud.bigtable import column_family

client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)

column_family_id = "column_family_id1"
column_family_obj = table.column_family(column_family_id)
column_family_obj.delete()

property name()

Column family name used in requests.

For example:

from google.cloud.bigtable import Client

client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)

column_families = table.list_column_families()
column_family_obj = column_families[COLUMN_FAMILY_ID]
column_family_name = column_family_obj.name

NOTE: This property will not change if column_family_id does not, but the return value is not cached.

The column family name is of the form

"projects/../instances/../tables/../columnFamilies/.."

  • Return type

    str

  • Returns

    The column family name.
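Because the value is not cached, it is presumably rebuilt on each access from the owning table's name and the column family ID. A minimal sketch of that composition follows; the helper function and the sample table name (written in the instance-based form that matches the client.instance(...) calls in these examples) are hypothetical.

```python
def column_family_name(table_name, column_family_id):
    # Append the column family segment to the owning table's full name.
    return table_name + "/columnFamilies/" + column_family_id

# Hypothetical table name, for illustration only.
table_name = "projects/my-project/instances/my-instance/tables/my-table"
name = column_family_name(table_name, "cf1")
```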

to_pb()

Converts the column family to a protobuf.

  • Return type

    table_v2_pb2.ColumnFamily

  • Returns

    The converted current object.

update()

Update this column family.

For example:

import datetime

from google.cloud.bigtable import Client
from google.cloud.bigtable import column_family

client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)

# Already existing column family id
column_family_id = "column_family_id1"
# Define the GC rule to retain data with max age of 5 days
max_age_rule = column_family.MaxAgeGCRule(datetime.timedelta(days=5))
column_family_obj = table.column_family(column_family_id, gc_rule=max_age_rule)
column_family_obj.update()

NOTE: Only the GC rule can be updated. By changing the column family ID, you will simply be referring to a different column family.

class google.cloud.bigtable.column_family.GCRuleIntersection(rules)

Bases: google.cloud.bigtable.column_family.GarbageCollectionRule

Intersection of garbage collection rules.

For example:

import datetime

from google.cloud.bigtable import Client
from google.cloud.bigtable import column_family

client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)

max_versions_rule = column_family.MaxVersionsGCRule(2)
max_age_rule = column_family.MaxAgeGCRule(datetime.timedelta(days=5))

intersection_rule = column_family.GCRuleIntersection(
    [max_versions_rule, max_age_rule]
)

column_family_obj = table.column_family("cf4", intersection_rule)
column_family_obj.create()

  • Parameters

    rules (list) – List of GarbageCollectionRule.

to_pb()

Converts the intersection into a single GC rule as a protobuf.

  • Return type

    table_v2_pb2.GcRule

  • Returns

    The converted current object.

class google.cloud.bigtable.column_family.GCRuleUnion(rules)

Bases: google.cloud.bigtable.column_family.GarbageCollectionRule

Union of garbage collection rules.

For example:

import datetime

from google.cloud.bigtable import Client
from google.cloud.bigtable import column_family

client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)

max_versions_rule = column_family.MaxVersionsGCRule(2)
max_age_rule = column_family.MaxAgeGCRule(datetime.timedelta(days=5))

union_rule = column_family.GCRuleUnion([max_versions_rule, max_age_rule])

column_family_obj = table.column_family("cf3", union_rule)
column_family_obj.create()

  • Parameters

    rules (list) – List of GarbageCollectionRule.

to_pb()

Converts the union into a single GC rule as a protobuf.

  • Return type

    table_v2_pb2.GcRule

  • Returns

    The converted current object.
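To visualize what a nested rule looks like once converted, the sketch below builds a plain-dictionary stand-in for the GcRule message, using that message's field names (max_num_versions, max_age, intersection, union, with each composite holding a rules list). The helper functions are hypothetical and only mimic the shape. The real to_pb() returns a protobuf object, not a dict.

```python
import datetime

def max_versions_dict(n):
    return {"max_num_versions": n}

def max_age_dict(age):
    # Protobuf Durations are rendered in JSON as "<seconds>s".
    return {"max_age": f"{int(age.total_seconds())}s"}

def intersection_dict(rules):
    return {"intersection": {"rules": list(rules)}}

def union_dict(rules):
    return {"union": {"rules": list(rules)}}

# A union whose first member is itself an intersection.
nested = union_dict([
    intersection_dict([
        max_age_dict(datetime.timedelta(days=5)),
        max_versions_dict(2),
    ]),
    max_versions_dict(10),
])
```

Each composite wraps its children in a single outer rule, which is why an arbitrarily nested tree still converts to one GcRule.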

class google.cloud.bigtable.column_family.GarbageCollectionRule()

Bases: object

Garbage collection rule for column families within a table.

Cells in the column family (within a table) fitting the rule will be deleted during garbage collection.

NOTE: This class is a do-nothing base class for all GC rules.

NOTE: A string gc_expression can also be used with API requests, but that value would be superseded by a gc_rule. As a result, we don’t support that feature and instead support garbage collection rules via native classes.

class google.cloud.bigtable.column_family.MaxAgeGCRule(max_age)

Bases: google.cloud.bigtable.column_family.GarbageCollectionRule

Garbage collection limiting the age of a cell.

For example:

import datetime

from google.cloud.bigtable import Client
from google.cloud.bigtable import column_family

client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)

# Define the GC rule to retain data with max age of 5 days
max_age_rule = column_family.MaxAgeGCRule(datetime.timedelta(days=5))

column_family_obj = table.column_family("cf1", max_age_rule)
column_family_obj.create()

  • Parameters

    max_age (datetime.timedelta) – The maximum age allowed for a cell in the table.

to_pb()

Converts the garbage collection rule to a protobuf.

  • Return type

    table_v2_pb2.GcRule

  • Returns

    The converted current object.
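In the resulting GcRule, the max_age timedelta is carried as a protobuf Duration, which stores whole seconds plus nanoseconds. A minimal sketch of that conversion, written without the library so it runs anywhere, might look like this. The function name is hypothetical, and the library's own helper may differ in detail.

```python
import datetime

def timedelta_to_seconds_nanos(delta):
    """Split a non-negative timedelta into (seconds, nanos)."""
    seconds = delta.days * 86400 + delta.seconds
    nanos = delta.microseconds * 1000
    return seconds, nanos

five_days = timedelta_to_seconds_nanos(datetime.timedelta(days=5))
fractional = timedelta_to_seconds_nanos(
    datetime.timedelta(seconds=1, milliseconds=500)
)
```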

class google.cloud.bigtable.column_family.MaxVersionsGCRule(max_num_versions)

Bases: google.cloud.bigtable.column_family.GarbageCollectionRule

Garbage collection limiting the number of versions of a cell.

For example:

from google.cloud.bigtable import Client
from google.cloud.bigtable import column_family

client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)

# Define the GC policy to retain only the most recent 2 versions
max_versions_rule = column_family.MaxVersionsGCRule(2)

column_family_obj = table.column_family("cf2", max_versions_rule)
column_family_obj.create()

  • Parameters

    max_num_versions (int) – The maximum number of versions of a cell to retain.

to_pb()

Converts the garbage collection rule to a protobuf.

  • Return type

    table_v2_pb2.GcRule

  • Returns

    The converted current object.