异步二级索引查询

本文档提供了在 Bigtable 中构建异步二级索引的常见查询模式示例。这些示例使用的 ID 和值与示例数据中的类似,您可以创建测试表来测试查询。

在阅读本页内容之前,请先熟悉创建异步二级索引GoogleSQL for Bigtable

示例查询

以下示例展示了如何使用不同的 GoogleSQL 查询创建异步二级索引。

基于列限定符和值创建倒排索引

以下查询通过从具有灵活架构的列族中提取所有键值对来创建倒排索引。它使用 MAP_ENTRIES 获取 stats_summary 列族存储的所有数据的数组,然后使用 UNNEST 将每个键值对转换为单独的行。生成的索引支持基于 stat_descriptionstat_value 的快速查找。如果您向源表中添加了新的统计信息类型,Bigtable 会自动将其纳入异步索引中。

SELECT
 stats.key as stat_description,
 stats.value as stat_value,
 _key as original_key,
 cell_plan as cell_plan
FROM test_table
CROSS JOIN UNNEST(MAP_ENTRIES(stats_summary)) stats
ORDER BY stat_description, stat_value, original_key

根据现有数据创建新的行键

以下示例通过对 cell_plan 列族存储的数据流量方案进行分类,创建新的索引键 total_plan_capacity。该查询使用 MAP_KEYS 获取 cell_plan 中的所有键,并使用 ARRAY_INCLUDES_ANY 过滤特定数据方案键。然后,CASE 语句定义了根据这些数据方案的存在情况和值来分配 total_plan_capacity 类别的逻辑。这样一来,便可根据组合的数据方案容量进行高效的查找。

SELECT
 CASE
    WHEN
     cell_plan['data_plan_01gb'] = "true"
     AND (cell_plan['data_plan_05gb'] = "false" OR cell_plan['data_plan_05gb'] IS NULL)
     AND (cell_plan['data_plan_10gb'] = "false" OR cell_plan['data_plan_10gb'] IS NULL)
   THEN 'x-small'

   WHEN
     cell_plan['data_plan_01gb'] = "true"
     AND (cell_plan['data_plan_05gb'] = "true")
     AND (cell_plan['data_plan_10gb'] = "false" OR cell_plan['data_plan_10gb'] IS NULL)
   THEN 'small'

   WHEN
     cell_plan['data_plan_01gb'] = "true"
     AND (cell_plan['data_plan_05gb'] = "false" OR cell_plan['data_plan_05gb'] IS NULL)
     AND (cell_plan['data_plan_10gb'] = "true")
   THEN 'medium'

   WHEN
     (cell_plan['data_plan_01gb'] = "false" OR cell_plan['data_plan_01gb'] IS NULL)
     AND (cell_plan['data_plan_05gb'] = "true")
     AND (cell_plan['data_plan_10gb'] = "true")
   THEN 'large'

   WHEN
     cell_plan['data_plan_01gb'] = "true"
     AND (cell_plan['data_plan_05gb'] = "true")
     AND (cell_plan['data_plan_10gb'] = "true")
   THEN 'x-large'
 END as total_plan_capacity,
 _key as original_key,
 stats_summary
FROM test_table
WHERE ARRAY_INCLUDES_ANY(MAP_KEYS(cell_plan), ["data_plan_01gb", "data_plan_05gb","data_plan_10gb"])
ORDER BY total_plan_capacity, original_key

创建基于时序的指数

Bigtable 会存储每个单元格值及其关联的时间戳。通过在 FROM 子句中使用 WITH_HISTORY=>TRUE 标志,您可以检索所有版本的数据及其时间戳。然后,UNPACK 表函数会展开查询结果,使每个带时间戳的值都显示在单独的行中,并带有单独的 _timestamp 列。这样一来,您就可以创建异步辅助索引,其中新行键基于这些时间戳值,从而实现基于时间范围的快速查找。

SELECT
 _timestamp as stats_timestamp,
 _key as original_key,
 stats_summary
FROM UNPACK((
 select
   _key,
   stats_summary
  FROM
 test_table (WITH_HISTORY=>TRUE)
))
ORDER BY stats_timestamp, original_key

为 JSON 单元格中的项和值创建索引条目

Bigtable 通常用作大规模工作负载(包括 JSON 等格式的数据)的键值存储区。以下示例展示了如何创建异步二级索引,以便快速查找 JSON 存储的数据。该查询使用 JSON_EXTRACT 从名为 sales 的表中 transaction_information 列族内的 JSON 字符串中提取 store_id。此索引可用于按特定商店 ID 快速检索销售交易。

SELECT
JSON_EXTRACT(transaction_information["json"], "$.store_id") as store_id,
_key as original_key,
transaction_information
from sales
ORDER BY store_id, original_key;

后续步骤