Column Families
When creating a
ColumnFamily
, it is
possible to set garbage collection rules for expired data.
By setting a rule, cells in the table matching the rule will be deleted during periodic garbage collection (which executes opportunistically in the background).
The types
MaxAgeGCRule
,
MaxVersionsGCRule
,
GarbageCollectionRuleUnion
and
GarbageCollectionRuleIntersection
can all be used as the optional gc_rule
argument in the
ColumnFamily
constructor. This value is then used in the
create()
and
update()
methods.
These rules can be nested arbitrarily, with a
MaxAgeGCRule
or
MaxVersionsGCRule
at the lowest level of the nesting:
import datetime
max_age = datetime.timedelta(days=3)
rule1 = MaxAgeGCRule(max_age)
rule2 = MaxVersionsGCRule(1)
# Make a composite that matches anything older than 3 days **AND**
# with more than 1 version.
rule3 = GarbageCollectionIntersection(rules=[rule1, rule2])
# Make another composite that matches our previous intersection
# **OR** anything that has more than 3 versions.
rule4 = GarbageCollectionRule(max_num_versions=3)
rule5 = GarbageCollectionUnion(rules=[rule3, rule4])
User friendly container for Google Cloud Bigtable Column Family.
class google.cloud.bigtable.column_family.ColumnFamily(column_family_id, table, gc_rule=None)
Bases: object
Representation of a Google Cloud Bigtable Column Family.
We can use a ColumnFamily
to:
create()
itselfupdate()
itselfdelete()
itselfParameters
create()
Create this column family.
For example:
from google.cloud.bigtable import Client
from google.cloud.bigtable import column_family
client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)
column_family_id = "column_family_id1"
gc_rule = column_family.MaxVersionsGCRule(2)
column_family_obj = table.column_family(column_family_id, gc_rule=gc_rule)
column_family_obj.create()
delete()
Delete this column family.
For example:
from google.cloud.bigtable import Client
from google.cloud.bigtable import column_family
client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)
column_family_id = "column_family_id1"
column_family_obj = table.column_family(column_family_id)
column_family_obj.delete()
property name()
Column family name used in requests.
For example:
from google.cloud.bigtable import Client
client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)
column_families = table.list_column_families()
column_family_obj = column_families[COLUMN_FAMILY_ID]
column_family_name = column_family_obj.name
NOTE: This property will not change if column_family_id
does not, but
the return value is not cached.
The Column family name is of the form
"projects/../zones/../clusters/../tables/../columnFamilies/.."
Return type
Returns
The column family name.
to_pb()
Converts the column family to a protobuf.
Return type
table_v2_pb2.ColumnFamily
Returns
The converted current object.
update()
Update this column family.
For example:
from google.cloud.bigtable import Client
from google.cloud.bigtable import column_family
client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)
# Already existing column family id
column_family_id = "column_family_id1"
# Define the GC rule to retain data with max age of 5 days
max_age_rule = column_family.MaxAgeGCRule(datetime.timedelta(days=5))
column_family_obj = table.column_family(column_family_id, gc_rule=max_age_rule)
column_family_obj.update()
NOTE: Only the GC rule can be updated. By changing the column family ID, you will simply be referring to a different column family.
class google.cloud.bigtable.column_family.GCRuleIntersection(rules)
Bases: google.cloud.bigtable.column_family.GarbageCollectionRule
Intersection of garbage collection rules.
For example:
from google.cloud.bigtable import Client
from google.cloud.bigtable import column_family
client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)
max_versions_rule = column_family.MaxVersionsGCRule(2)
max_age_rule = column_family.MaxAgeGCRule(datetime.timedelta(days=5))
intersection_rule = column_family.GCRuleIntersection(
[max_versions_rule, max_age_rule]
)
column_family_obj = table.column_family("cf4", intersection_rule)
column_family_obj.create()
Parameters
rules (list) – List of
GarbageCollectionRule
.
to_pb()
Converts the intersection into a single GC rule as a protobuf.
Return type
table_v2_pb2.GcRule
Returns
The converted current object.
class google.cloud.bigtable.column_family.GCRuleUnion(rules)
Bases: google.cloud.bigtable.column_family.GarbageCollectionRule
Union of garbage collection rules.
For example:
from google.cloud.bigtable import Client
from google.cloud.bigtable import column_family
client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)
max_versions_rule = column_family.MaxVersionsGCRule(2)
max_age_rule = column_family.MaxAgeGCRule(datetime.timedelta(days=5))
union_rule = column_family.GCRuleUnion([max_versions_rule, max_age_rule])
column_family_obj = table.column_family("cf3", union_rule)
column_family_obj.create()
Parameters
rules (list) – List of
GarbageCollectionRule
.
to_pb()
Converts the union into a single GC rule as a protobuf.
Return type
table_v2_pb2.GcRule
Returns
The converted current object.
class google.cloud.bigtable.column_family.GarbageCollectionRule()
Bases: object
Garbage collection rule for column families within a table.
Cells in the column family (within a table) fitting the rule will be deleted during garbage collection.
NOTE: This class is a do-nothing base class for all GC rules.
NOTE: A string gc_expression
can also be used with API requests, but
that value would be superceded by a gc_rule
. As a result, we
don’t support that feature and instead support via native classes.
class google.cloud.bigtable.column_family.MaxAgeGCRule(max_age)
Bases: google.cloud.bigtable.column_family.GarbageCollectionRule
Garbage collection limiting the age of a cell.
For example:
from google.cloud.bigtable import Client
from google.cloud.bigtable import column_family
client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)
# Define the GC rule to retain data with max age of 5 days
max_age_rule = column_family.MaxAgeGCRule(datetime.timedelta(days=5))
column_family_obj = table.column_family("cf1", max_age_rule)
column_family_obj.create()
Parameters
max_age (
datetime.timedelta
) – The maximum age allowed for a cell in the table.
to_pb()
Converts the garbage collection rule to a protobuf.
Return type
table_v2_pb2.GcRule
Returns
The converted current object.
class google.cloud.bigtable.column_family.MaxVersionsGCRule(max_num_versions)
Bases: google.cloud.bigtable.column_family.GarbageCollectionRule
Garbage collection limiting the number of versions of a cell.
For example:
from google.cloud.bigtable import Client
from google.cloud.bigtable import column_family
client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)
# Define the GC policy to retain only the most recent 2 versions
max_versions_rule = column_family.MaxVersionsGCRule(2)
column_family_obj = table.column_family("cf2", max_versions_rule)
column_family_obj.create()
Parameters
max_num_versions (int) – The maximum number of versions
to_pb()
Converts the garbage collection rule to a protobuf.
Return type
table_v2_pb2.GcRule
Returns
The converted current object.