Python 版 Hello World

本示例是一个“hello world”应用,采用 Python 编写而成,旨在说明如何完成以下操作:

  • 设置身份验证。
  • 连接到 Bigtable 实例。
  • 新建一个表。
  • 将数据写入表中。
  • 重新读取这些数据。
  • 删除表。

Bigtable 的 Python 客户端库提供两个 API,asyncio 和同步 API。如果您的应用是异步的,请使用 asyncio

设置身份验证

如需在本地开发环境中使用本页面上的 Python 示例,请安装并初始化 gcloud CLI,然后使用您的用户凭据设置应用默认凭据。

  1. Install the Google Cloud CLI.
  2. To initialize the gcloud CLI, run the following command:

    gcloud init
  3. If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

如需了解详情,请参阅 Set up authentication for a local development environment

运行示例

本示例使用 Python 版 Cloud 客户端库Bigtable 软件包与 Bigtable 通信。Bigtable 软件包是新应用的最佳选择。如果您需要将现有 HBase 工作负载移至 Bigtable,请参阅使用 HappyBase 软件包的“hello world”示例

要运行此示例程序,请按照 GitHub 上的示例说明执行操作。

将 Cloud 客户端库与 Bigtable 搭配使用

示例应用会连接到 Bigtable 并演示一些操作。

安装和导入客户端库

使用 PIP 将所需的 Python 软件包安装到 virtualenv 环境中。该示例包含一个需求文件,其中定义了所需的软件包。

google-cloud-bigtable==2.25.0
google-cloud-core==2.4.1

导入模块。

Asyncio

如需了解如何安装和使用 Bigtable 的客户端库,请参阅 Bigtable 客户端库

如需向 Bigtable 进行身份验证,请设置应用默认凭据。 如需了解详情,请参阅为本地开发环境设置身份验证

from google.cloud import bigtable
from google.cloud.bigtable.data import row_filters
from google.cloud.bigtable.data import RowMutationEntry
from google.cloud.bigtable.data import SetCell
from google.cloud.bigtable.data import ReadRowsQuery

同步

如需了解如何安装和使用 Bigtable 的客户端库,请参阅 Bigtable 客户端库

如需向 Bigtable 进行身份验证,请设置应用默认凭据。 如需了解详情,请参阅为本地开发环境设置身份验证

import datetime

from google.cloud import bigtable
from google.cloud.bigtable import column_family
from google.cloud.bigtable import row_filters

连接到 Bigtable

使用 bigtable.Client 连接到 Bigtable。

Asyncio

如需了解如何安装和使用 Bigtable 的客户端库,请参阅 Bigtable 客户端库

如需向 Bigtable 进行身份验证,请设置应用默认凭据。 如需了解详情,请参阅为本地开发环境设置身份验证

client = bigtable.data.BigtableDataClientAsync(project=project_id)
table = client.get_table(instance_id, table_id)

同步

如需了解如何安装和使用 Bigtable 的客户端库,请参阅 Bigtable 客户端库

如需向 Bigtable 进行身份验证,请设置应用默认凭据。 如需了解详情,请参阅为本地开发环境设置身份验证

# The client must be created with admin=True because it will create a
# table.
client = bigtable.Client(project=project_id, admin=True)
instance = client.instance(instance_id)

创建表

使用 Instance.table() 实例化表对象。创建列族并设置其垃圾回收政策,然后将列族传递到 Table.create() 以创建表。

print("Creating the {} table.".format(table_id))
table = instance.table(table_id)

print("Creating column family cf1 with Max Version GC rule...")
# Create a column family with GC policy : most recent N versions
# Define the GC policy to retain only the most recent 2 versions
max_versions_rule = column_family.MaxVersionsGCRule(2)
column_family_id = "cf1"
column_families = {column_family_id: max_versions_rule}
if not table.exists():
    table.create(column_families=column_families)
else:
    print("Table {} already exists.".format(table_id))

将行写入表

循环遍历一系列问候语字符串,从而为该表创建一些新行。 在每次迭代中,使用 Table.row() 来定义行并为其分配一个行键,调用 Row.set_cell() 来为当前单元设置值,并将新行附加到行数组中。最后,调用 Table.mutate_rows() 将行添加到表中。

Asyncio

如需了解如何安装和使用 Bigtable 的客户端库,请参阅 Bigtable 客户端库

如需向 Bigtable 进行身份验证,请设置应用默认凭据。 如需了解详情,请参阅为本地开发环境设置身份验证

print("Writing some greetings to the table.")
greetings = ["Hello World!", "Hello Cloud Bigtable!", "Hello Python!"]
mutations = []
column = "greeting"
for i, value in enumerate(greetings):
    # Note: This example uses sequential numeric IDs for simplicity,
    # but this can result in poor performance in a production
    # application.  Since rows are stored in sorted order by key,
    # sequential keys can result in poor distribution of operations
    # across nodes.
    #
    # For more information about how to design a Bigtable schema for
    # the best performance, see the documentation:
    #
    #     https://cloud.google.com/bigtable/docs/schema-design
    row_key = "greeting{}".format(i).encode()
    row_mutation = RowMutationEntry(
        row_key, SetCell(column_family_id, column, value)
    )
    mutations.append(row_mutation)
await table.bulk_mutate_rows(mutations)

同步

如需了解如何安装和使用 Bigtable 的客户端库,请参阅 Bigtable 客户端库

如需向 Bigtable 进行身份验证,请设置应用默认凭据。 如需了解详情,请参阅为本地开发环境设置身份验证

print("Writing some greetings to the table.")
greetings = ["Hello World!", "Hello Cloud Bigtable!", "Hello Python!"]
rows = []
column = "greeting".encode()
for i, value in enumerate(greetings):
    # Note: This example uses sequential numeric IDs for simplicity,
    # but this can result in poor performance in a production
    # application.  Since rows are stored in sorted order by key,
    # sequential keys can result in poor distribution of operations
    # across nodes.
    #
    # For more information about how to design a Bigtable schema for
    # the best performance, see the documentation:
    #
    #     https://cloud.google.com/bigtable/docs/schema-design
    row_key = "greeting{}".format(i).encode()
    row = table.direct_row(row_key)
    row.set_cell(
        column_family_id, column, value, timestamp=datetime.datetime.utcnow()
    )
    rows.append(row)
table.mutate_rows(rows)

创建过滤器

在读取您写入的数据之前,请使用 row_filters.CellsColumnLimitFilter() 创建过滤条件,以限制 Bigtable 返回的数据。此过滤条件指示 Bigtable 仅返回每列中的最新单元,即使表包含在垃圾回收期间尚未移除的旧单元也是如此。

Asyncio

如需了解如何安装和使用 Bigtable 的客户端库,请参阅 Bigtable 客户端库

如需向 Bigtable 进行身份验证,请设置应用默认凭据。 如需了解详情,请参阅为本地开发环境设置身份验证

# Create a filter to only retrieve the most recent version of the cell
# for each column across entire row.
row_filter = row_filters.CellsColumnLimitFilter(1)

同步

如需了解如何安装和使用 Bigtable 的客户端库,请参阅 Bigtable 客户端库

如需向 Bigtable 进行身份验证,请设置应用默认凭据。 如需了解详情,请参阅为本地开发环境设置身份验证

row_filter = row_filters.CellsColumnLimitFilter(1)

按行键读取行

调用表的 Table.read_row() 方法以通过特定行键引用行,传入行键和过滤条件,以获取该行中每个值的一个版本。

Asyncio

如需了解如何安装和使用 Bigtable 的客户端库,请参阅 Bigtable 客户端库

如需向 Bigtable 进行身份验证,请设置应用默认凭据。 如需了解详情,请参阅为本地开发环境设置身份验证

print("Getting a single greeting by row key.")
key = "greeting0".encode()

row = await table.read_row(key, row_filter=row_filter)
cell = row.cells[0]
print(cell.value.decode("utf-8"))

同步

如需了解如何安装和使用 Bigtable 的客户端库,请参阅 Bigtable 客户端库

如需向 Bigtable 进行身份验证,请设置应用默认凭据。 如需了解详情,请参阅为本地开发环境设置身份验证

print("Getting a single greeting by row key.")
key = "greeting0".encode()

row = table.read_row(key, row_filter)
cell = row.cells[column_family_id][column][0]
print(cell.value.decode("utf-8"))

扫描所有表行

使用 Table.read_rows() 从表中读取一系列行。

Asyncio

如需了解如何安装和使用 Bigtable 的客户端库,请参阅 Bigtable 客户端库

如需向 Bigtable 进行身份验证,请设置应用默认凭据。 如需了解详情,请参阅为本地开发环境设置身份验证

print("Scanning for all greetings:")
query = ReadRowsQuery(row_filter=row_filter)
async for row in await table.read_rows_stream(query):
    cell = row.cells[0]
    print(cell.value.decode("utf-8"))

同步

如需了解如何安装和使用 Bigtable 的客户端库,请参阅 Bigtable 客户端库

如需向 Bigtable 进行身份验证,请设置应用默认凭据。 如需了解详情,请参阅为本地开发环境设置身份验证

print("Scanning for all greetings:")
partial_rows = table.read_rows(filter_=row_filter)

for row in partial_rows:
    cell = row.cells[column_family_id][column][0]
    print(cell.value.decode("utf-8"))

删除表

使用 Table.delete() 删除表。

print("Deleting the {} table.".format(table_id))
table.delete()

总结

以下为不包含注释的完整示例。

Asyncio

如需了解如何安装和使用 Bigtable 的客户端库,请参阅 Bigtable 客户端库

如需向 Bigtable 进行身份验证,请设置应用默认凭据。 如需了解详情,请参阅为本地开发环境设置身份验证



"""Demonstrates how to connect to Cloud Bigtable and run some basic operations with the async APIs

Prerequisites:

- Create a Cloud Bigtable instance.
  https://cloud.google.com/bigtable/docs/creating-instance
- Set your Google Application Default Credentials.
  https://developers.google.com/identity/protocols/application-default-credentials
"""

import argparse
import asyncio
from ..utils import wait_for_table

from google.cloud import bigtable
from google.cloud.bigtable.data import row_filters
from google.cloud.bigtable.data import RowMutationEntry
from google.cloud.bigtable.data import SetCell
from google.cloud.bigtable.data import ReadRowsQuery


async def main(project_id, instance_id, table_id):
    client = bigtable.data.BigtableDataClientAsync(project=project_id)
    table = client.get_table(instance_id, table_id)

    from google.cloud.bigtable import column_family

    print("Creating the {} table.".format(table_id))
    admin_client = bigtable.Client(project=project_id, admin=True)
    admin_instance = admin_client.instance(instance_id)
    admin_table = admin_instance.table(table_id)

    print("Creating column family cf1 with Max Version GC rule...")
    max_versions_rule = column_family.MaxVersionsGCRule(2)
    column_family_id = "cf1"
    column_families = {column_family_id: max_versions_rule}
    if not admin_table.exists():
        admin_table.create(column_families=column_families)
    else:
        print("Table {} already exists.".format(table_id))

    try:
        wait_for_table(admin_table)
        print("Writing some greetings to the table.")
        greetings = ["Hello World!", "Hello Cloud Bigtable!", "Hello Python!"]
        mutations = []
        column = "greeting"
        for i, value in enumerate(greetings):
            row_key = "greeting{}".format(i).encode()
            row_mutation = RowMutationEntry(
                row_key, SetCell(column_family_id, column, value)
            )
            mutations.append(row_mutation)
        await table.bulk_mutate_rows(mutations)

        row_filter = row_filters.CellsColumnLimitFilter(1)

        print("Getting a single greeting by row key.")
        key = "greeting0".encode()

        row = await table.read_row(key, row_filter=row_filter)
        cell = row.cells[0]
        print(cell.value.decode("utf-8"))

        print("Scanning for all greetings:")
        query = ReadRowsQuery(row_filter=row_filter)
        async for row in await table.read_rows_stream(query):
            cell = row.cells[0]
            print(cell.value.decode("utf-8"))
    finally:
        print("Deleting the {} table.".format(table_id))
        admin_table.delete()


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description=__doc__, formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )
    parser.add_argument("project_id", help="Your Cloud Platform project ID.")
    parser.add_argument(
        "instance_id", help="ID of the Cloud Bigtable instance to connect to."
    )
    parser.add_argument(
        "--table", help="Table to create and destroy.", default="Hello-Bigtable"
    )

    args = parser.parse_args()
    asyncio.run(main(args.project_id, args.instance_id, args.table))

同步

如需了解如何安装和使用 Bigtable 的客户端库,请参阅 Bigtable 客户端库

如需向 Bigtable 进行身份验证,请设置应用默认凭据。 如需了解详情,请参阅为本地开发环境设置身份验证



"""Demonstrates how to connect to Cloud Bigtable and run some basic operations.

Prerequisites:

- Create a Cloud Bigtable instance.
  https://cloud.google.com/bigtable/docs/creating-instance
- Set your Google Application Default Credentials.
  https://developers.google.com/identity/protocols/application-default-credentials
"""

import argparse
from ..utils import wait_for_table

import datetime

from google.cloud import bigtable
from google.cloud.bigtable import column_family
from google.cloud.bigtable import row_filters



def main(project_id, instance_id, table_id):
    client = bigtable.Client(project=project_id, admin=True)
    instance = client.instance(instance_id)

    print("Creating the {} table.".format(table_id))
    table = instance.table(table_id)

    print("Creating column family cf1 with Max Version GC rule...")
    max_versions_rule = column_family.MaxVersionsGCRule(2)
    column_family_id = "cf1"
    column_families = {column_family_id: max_versions_rule}
    if not table.exists():
        table.create(column_families=column_families)
    else:
        print("Table {} already exists.".format(table_id))

    try:
        wait_for_table(table)

        print("Writing some greetings to the table.")
        greetings = ["Hello World!", "Hello Cloud Bigtable!", "Hello Python!"]
        rows = []
        column = "greeting".encode()
        for i, value in enumerate(greetings):
            row_key = "greeting{}".format(i).encode()
            row = table.direct_row(row_key)
            row.set_cell(
                column_family_id, column, value, timestamp=datetime.datetime.utcnow()
            )
            rows.append(row)
        table.mutate_rows(rows)

        row_filter = row_filters.CellsColumnLimitFilter(1)

        print("Getting a single greeting by row key.")
        key = "greeting0".encode()

        row = table.read_row(key, row_filter)
        cell = row.cells[column_family_id][column][0]
        print(cell.value.decode("utf-8"))

        print("Scanning for all greetings:")
        partial_rows = table.read_rows(filter_=row_filter)

        for row in partial_rows:
            cell = row.cells[column_family_id][column][0]
            print(cell.value.decode("utf-8"))

    finally:
        print("Deleting the {} table.".format(table_id))
        table.delete()


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description=__doc__, formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )
    parser.add_argument("project_id", help="Your Cloud Platform project ID.")
    parser.add_argument(
        "instance_id", help="ID of the Cloud Bigtable instance to connect to."
    )
    parser.add_argument(
        "--table", help="Table to create and destroy.", default="Hello-Bigtable"
    )

    args = parser.parse_args()
    main(args.project_id, args.instance_id, args.table)