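Example: Python "Hello World" application

This example is a very simple "hello world" application, written in Python, that shows how to do the following:

  • Connect to a Cloud Bigtable instance.
  • Create a new table.
  • Write data to the table.
  • Read the data back.
  • Delete the table.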

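Running the sample

This example uses the Cloud Bigtable package of the Google Cloud Client Library for Python to communicate with Cloud Bigtable. The Cloud Bigtable package is the best choice for new applications. If you need to move an existing HBase workload to Cloud Bigtable, see the "hello world" example that uses the HappyBase package.

To run this sample program, follow the instructions for the sample on GitHub.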

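Using the Cloud Client Libraries with Cloud Bigtable

The sample application connects to Cloud Bigtable and demonstrates some simple operations.

Installing and importing the client library

Use pip to install the required Python packages into a virtualenv environment. The sample includes a requirements file that defines the packages it needs.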

google-cloud-bigtable==1.5.1
google-cloud-core==1.4.3

Import the modules.

import datetime

from google.cloud import bigtable
from google.cloud.bigtable import column_family
from google.cloud.bigtable import row_filters

Connecting to Cloud Bigtable

Use bigtable.Client to connect to Cloud Bigtable.

# The client must be created with admin=True because it will create a
# table.
client = bigtable.Client(project=project_id, admin=True)
instance = client.instance(instance_id)

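Creating a table

Instantiate a table object with Instance.table(). Create a column family and set its garbage-collection policy, then pass the column family to Table.create() to create the table.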

print('Creating the {} table.'.format(table_id))
table = instance.table(table_id)

print('Creating column family cf1 with Max Version GC rule...')
# Create a column family with GC policy : most recent N versions
# Define the GC policy to retain only the most recent 2 versions
max_versions_rule = column_family.MaxVersionsGCRule(2)
column_family_id = 'cf1'
column_families = {column_family_id: max_versions_rule}
if not table.exists():
    table.create(column_families=column_families)
else:
    print("Table {} already exists.".format(table_id))

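Writing rows to a table

Loop through a list of greeting strings to create some new rows for the table. In each iteration, use Table.direct_row() to define a row and assign it a row key, call DirectRow.set_cell() to set a value for the current cell, and append the new row to a list of rows. Finally, call Table.mutate_rows() to add the rows to the table.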

print('Writing some greetings to the table.')
greetings = ['Hello World!', 'Hello Cloud Bigtable!', 'Hello Python!']
rows = []
column = 'greeting'.encode()
for i, value in enumerate(greetings):
    # Note: This example uses sequential numeric IDs for simplicity,
    # but this can result in poor performance in a production
    # application.  Since rows are stored in sorted order by key,
    # sequential keys can result in poor distribution of operations
    # across nodes.
    #
    # For more information about how to design a Bigtable schema for
    # the best performance, see the documentation:
    #
    #     https://cloud.google.com/bigtable/docs/schema-design
    row_key = 'greeting{}'.format(i).encode()
    row = table.direct_row(row_key)
    row.set_cell(column_family_id,
                 column,
                 value,
                 timestamp=datetime.datetime.now(datetime.timezone.utc))
    rows.append(row)
table.mutate_rows(rows)
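
Table.mutate_rows() returns a list of statuses, one per row, in the order the rows were sent, so a batch can fail for some rows while succeeding for others. The following is a minimal sketch of checking those statuses, assuming each status exposes `code` (0 means OK) and `message`, as google.rpc Status objects do. The helper name `failed_mutations` and the stand-in statuses are illustrative and are not part of the sample.

```python
from types import SimpleNamespace


def failed_mutations(statuses):
    """Return (index, message) pairs for every status whose code is not 0 (OK)."""
    return [(i, s.message) for i, s in enumerate(statuses) if s.code != 0]


# Stand-in statuses for illustration only; in the sample they would come from
# `statuses = table.mutate_rows(rows)`.
statuses = [
    SimpleNamespace(code=0, message=''),
    SimpleNamespace(code=4, message='deadline exceeded'),
]
for index, message in failed_mutations(statuses):
    print('Row {} failed: {}'.format(index, message))
```

The index in each pair matches the position of the row in the list passed to Table.mutate_rows(), which makes it possible to retry or log only the rows that failed.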

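Creating a filter

Before you read the data you wrote, create a filter with row_filters.CellsColumnLimitFilter() to limit the data that Cloud Bigtable returns. This filter tells Cloud Bigtable to return only the most recent version of each value, even if the table contains older versions that have not yet been garbage collected.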

row_filter = row_filters.CellsColumnLimitFilter(1)
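
To see why the filter is useful even with a garbage-collection rule in place, the plain-Python sketch below models the cells of one column as (timestamp, value) pairs. It is a simulation only and does not call the client library; `newest_cells` is a hypothetical helper and the stored values are made up.

```python
def newest_cells(cells, limit):
    """Return the `limit` most recent (timestamp, value) cells, newest first."""
    return sorted(cells, key=lambda cell: cell[0], reverse=True)[:limit]


# Three versions written to one column. Garbage collection has not run yet,
# so all three are still stored even though the rule keeps only two.
stored = [(1, 'v1'), (3, 'v3'), (2, 'v2')]

# What a max-versions rule of 2 would eventually retain.
after_gc = newest_cells(stored, 2)
# What a cells-per-column limit of 1 returns at read time.
read_result = newest_cells(stored, 1)

print(after_gc)
print(read_result)
```

The garbage-collection rule bounds what is eventually stored, while the filter bounds what a single read returns, so the two settings are independent of each other.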

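Reading a row by its row key

Call the table's Table.read_row() method with a row key and the filter to fetch a specific row with one version of each value in that row.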

print('Getting a single greeting by row key.')
key = 'greeting0'.encode()

row = table.read_row(key, row_filter)
cell = row.cells[column_family_id][column][0]
print(cell.value.decode('utf-8'))
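
The lookup above assumes that the row and the column both exist. Table.read_row() returns None when no cells come back, and indexing `row.cells` with a missing family or column raises KeyError. The following is a defensive sketch; `cell_value_or_default` is a hypothetical helper, and the stand-in row only mimics the `cells` mapping of the object the client returns.

```python
from types import SimpleNamespace


def cell_value_or_default(row, family_id, qualifier, default=None):
    """Return the first cell's decoded value, or `default` if it is absent."""
    if row is None:  # no such row, or the filter removed every cell
        return default
    cells = row.cells.get(family_id, {}).get(qualifier, [])
    return cells[0].value.decode('utf-8') if cells else default


# Stand-in for the row object returned by table.read_row(key, row_filter).
fake_row = SimpleNamespace(
    cells={'cf1': {b'greeting': [SimpleNamespace(value=b'Hello World!')]}})

print(cell_value_or_default(fake_row, 'cf1', b'greeting'))
print(cell_value_or_default(None, 'cf1', b'greeting', default='<missing>'))
```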

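Scanning all table rows

Use Table.read_rows() to read a range of rows from the table. Called without a start or end key, as it is here, it scans every row.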

print('Scanning for all greetings:')
partial_rows = table.read_rows(filter_=row_filter)

for row in partial_rows:
    cell = row.cells[column_family_id][column][0]
    print(cell.value.decode('utf-8'))
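
Cloud Bigtable stores rows sorted by row key, compared byte by byte, and a scan returns them in that order. With unpadded numeric suffixes like the ones in this example, that order stops matching numeric order once there are ten or more rows. The sketch below uses only Python's built-in bytes ordering to illustrate the effect; the twelve keys and the zero-padded width of 4 are arbitrary choices for illustration.

```python
# Row keys in the style used by this example, for twelve rows.
plain_keys = ['greeting{}'.format(i).encode() for i in range(12)]
# The same keys with the number zero-padded to a fixed width.
padded_keys = ['greeting{:04d}'.format(i).encode() for i in range(12)]

# sorted() on bytes compares byte by byte, like a lexicographic row-key sort.
print(sorted(plain_keys)[:4])
print(sorted(padded_keys)[:4])
```

Padding only makes the scan order match numeric order. It does not address the concern, noted in the comment in the write step, that sequential keys can distribute operations poorly across nodes.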

Deleting a table

Use Table.delete() to delete the table.

print('Deleting the {} table.'.format(table_id))
table.delete()

Putting it all together

Here is the complete example without comments.



"""Demonstrates how to connect to Cloud Bigtable and run some basic operations.

Prerequisites:

- Create a Cloud Bigtable cluster.
  https://cloud.google.com/bigtable/docs/creating-cluster
- Set your Google Application Default Credentials.
  https://developers.google.com/identity/protocols/application-default-credentials
"""

import argparse
import datetime

from google.cloud import bigtable
from google.cloud.bigtable import column_family
from google.cloud.bigtable import row_filters

def main(project_id, instance_id, table_id):
    client = bigtable.Client(project=project_id, admin=True)
    instance = client.instance(instance_id)

    print('Creating the {} table.'.format(table_id))
    table = instance.table(table_id)

    print('Creating column family cf1 with Max Version GC rule...')
    max_versions_rule = column_family.MaxVersionsGCRule(2)
    column_family_id = 'cf1'
    column_families = {column_family_id: max_versions_rule}
    if not table.exists():
        table.create(column_families=column_families)
    else:
        print("Table {} already exists.".format(table_id))

    print('Writing some greetings to the table.')
    greetings = ['Hello World!', 'Hello Cloud Bigtable!', 'Hello Python!']
    rows = []
    column = 'greeting'.encode()
    for i, value in enumerate(greetings):
        row_key = 'greeting{}'.format(i).encode()
        row = table.direct_row(row_key)
        row.set_cell(column_family_id,
                     column,
                     value,
                     timestamp=datetime.datetime.now(datetime.timezone.utc))
        rows.append(row)
    table.mutate_rows(rows)

    row_filter = row_filters.CellsColumnLimitFilter(1)

    print('Getting a single greeting by row key.')
    key = 'greeting0'.encode()

    row = table.read_row(key, row_filter)
    cell = row.cells[column_family_id][column][0]
    print(cell.value.decode('utf-8'))

    print('Scanning for all greetings:')
    partial_rows = table.read_rows(filter_=row_filter)

    for row in partial_rows:
        cell = row.cells[column_family_id][column][0]
        print(cell.value.decode('utf-8'))

    print('Deleting the {} table.'.format(table_id))
    table.delete()

if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description=__doc__,
        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('project_id', help='Your Cloud Platform project ID.')
    parser.add_argument(
        'instance_id', help='ID of the Cloud Bigtable instance to connect to.')
    parser.add_argument(
        '--table',
        help='Table to create and destroy.',
        default='Hello-Bigtable')

    args = parser.parse_args()
    main(args.project_id, args.instance_id, args.table)