WRITE_API_TIMELINE 视图
INFORMATION_SCHEMA.WRITE_API_TIMELINE
视图包含当前项目的每分钟汇总 BigQuery Storage Write API 注入统计信息。
您可以查询 INFORMATION_SCHEMA
Write API 视图以检索有关使用 BigQuery Storage Write API 将数据注入到 BigQuery 的历史信息和实时信息。如需了解详情,请参阅 BigQuery Storage Write API。
所需权限
要查询 INFORMATION_SCHEMA.WRITE_API_TIMELINE
视图,您需要拥有项目的 bigquery.tables.list
Identity and Access Management (IAM) 权限。
以下每个预定义的 IAM 角色均可提供所需的权限:
roles/bigquery.user
roles/bigquery.dataViewer
roles/bigquery.dataEditor
roles/bigquery.dataOwner
roles/bigquery.metadataViewer
roles/bigquery.resourceAdmin
roles/bigquery.admin
如需详细了解 BigQuery 权限,请参阅使用 IAM 进行访问权限控制。
架构
查询 INFORMATION_SCHEMA
BigQuery Storage Write API 视图时,查询结果包含有关使用 BigQuery Storage Write API 将数据注入到 BigQuery 的历史信息和实时信息。以下视图中的每一行均表示注入特定表的统计信息,这些信息从 start_timestamp
开始以一分钟为间隔进行了汇总。统计信息按流类型和错误代码分组,因此对于每个时间戳和表的组合,在一分钟间隔时间内每种流类型和每个遇到的错误代码都会有对应的一行。如果请求成功,则错误代码设置为 OK
。如果在特定时间段内没有数据注入到表中,则该表对应的时间戳没有行。
INFORMATION_SCHEMA.WRITE_API_TIMELINE_BY_*
视图具有以下架构:
列名 | 数据类型 | 值 |
---|---|---|
start_timestamp |
TIMESTAMP |
(分区列)汇总统计信息的 1 分钟间隔的开始时间戳。 |
project_id |
STRING |
(聚簇列)项目的 ID。 |
project_number |
INTEGER |
项目编号。 |
dataset_id |
STRING |
(聚簇列)数据集的 ID。 |
table_id |
STRING |
(聚簇列)表的 ID。 |
stream_type |
STRING |
用于通过 BigQuery Storage Write API 注入数据的流类型。它应该是“DEFAULT”“COMMITTED”“BUFFERED”或“PENDING”之一。 |
error_code |
STRING |
为此行指定的请求返回的错误代码。如果请求成功,则为“OK”。 |
total_requests |
INTEGER |
1 分钟间隔内的请求总数。 |
total_rows |
INTEGER |
1 分钟间隔内所有请求的总行数。 |
total_input_bytes |
INTEGER |
1 分钟间隔内所有行的总字节数。 |
数据保留
此视图包含过去 180 天的 BigQuery Storage Write API 注入历史记录。
范围和语法
针对此视图的查询必须包含区域限定符。如果您未指定区域限定符,则会从所有区域检索元数据。下表说明了此视图的区域范围:
视图名称 | 资源范围 | 区域范围 |
---|---|---|
[PROJECT_ID.]`region-REGION`.INFORMATION_SCHEMA.WRITE_API_TIMELINE[_BY_PROJECT] |
项目级 | REGION |
可选:PROJECT_ID
:您的Google Cloud 项目的 ID。如果未指定,则使用默认项目。
REGION
:任何数据集区域名称。例如 `region-us`
。
示例
- 如需查询位于美国多区域的数据,请使用
`region-us`.INFORMATION_SCHEMA.WRITE_API_TIMELINE_BY_PROJECT
- 如需查询位于欧盟多区域的数据,请使用
`region-eu`.INFORMATION_SCHEMA.WRITE_API_TIMELINE_BY_PROJECT
- 如需查询位于 asia-northeast1 区域的数据,请使用
`region-asia-northeast1`.INFORMATION_SCHEMA.WRITE_API_TIMELINE_BY_PROJECT
如需查看可用区域的列表,请参阅数据集位置。
示例
示例 1:最近的 BigQuery Storage Write API 注入失败
以下示例计算最近 30 分钟内项目中所有表的失败请求总数的每分钟明细,按数据流类型和错误代码划分:
SELECT start_timestamp, stream_type, error_code, SUM(total_requests) AS num_failed_requests FROM `region-us`.INFORMATION_SCHEMA.WRITE_API_TIMELINE WHERE error_code != 'OK' AND start_timestamp > TIMESTAMP_SUB(CURRENT_TIMESTAMP, INTERVAL 30 MINUTE) GROUP BY start_timestamp, stream_type, error_code ORDER BY start_timestamp DESC;
结果类似于以下内容:
+---------------------+-------------+------------------+---------------------+ | start_timestamp | stream_type | error_code | num_failed_requests | +---------------------+-------------+------------------+---------------------+ | 2023-02-24 00:25:00 | PENDING | NOT_FOUND | 5 | | 2023-02-24 00:25:00 | DEFAULT | INVALID_ARGUMENT | 1 | | 2023-02-24 00:25:00 | DEFAULT | DEADLINE_EXCEEDED| 4 | | 2023-02-24 00:24:00 | PENDING | INTERNAL | 3 | | 2023-02-24 00:24:00 | DEFAULT | INVALID_ARGUMENT | 1 | | 2023-02-24 00:24:00 | DEFAULT | DEADLINE_EXCEEDED| 2 | +---------------------+-------------+------------------+---------------------+
示例 2:所有产生错误代码的请求的每分钟明细
以下示例计算成功和失败的附加请求的每分钟明细,按错误代码划分。此查询可用于填充信息中心。
SELECT start_timestamp, SUM(total_requests) AS total_requests, SUM(total_rows) AS total_rows, SUM(total_input_bytes) AS total_input_bytes, SUM( IF( error_code IN ( 'INVALID_ARGUMENT', 'NOT_FOUND', 'CANCELLED', 'RESOURCE_EXHAUSTED', 'ALREADY_EXISTS', 'PERMISSION_DENIED', 'UNAUTHENTICATED', 'FAILED_PRECONDITION', 'OUT_OF_RANGE'), total_requests, 0)) AS user_error, SUM( IF( error_code IN ( 'DEADLINE_EXCEEDED','ABORTED', 'INTERNAL', 'UNAVAILABLE', 'DATA_LOSS', 'UNKNOWN'), total_requests, 0)) AS server_error, SUM(IF(error_code = 'OK', 0, total_requests)) AS total_error, FROM `region-us`.INFORMATION_SCHEMA.WRITE_API_TIMELINE GROUP BY start_timestamp ORDER BY start_timestamp DESC;
结果类似于以下内容:
+---------------------+----------------+------------+-------------------+------------+--------------+-------------+ | start_timestamp | total_requests | total_rows | total_input_bytes | user_error | server_error | total_error | +---------------------+----------------+------------+-------------------+------------+--------------+-------------+ | 2020-04-15 22:00:00 | 441854 | 441854 | 23784853118 | 0 | 17 | 17 | | 2020-04-15 21:59:00 | 355627 | 355627 | 26101982742 | 8 | 0 | 13 | | 2020-04-15 21:58:00 | 354603 | 354603 | 26160565341 | 0 | 0 | 0 | | 2020-04-15 21:57:00 | 298823 | 298823 | 23877821442 | 2 | 0 | 2 | +---------------------+----------------+------------+-------------------+------------+--------------+-------------+
示例 3:传入流量最多的表
以下示例返回传入流量最多的前 10 个表的 BigQuery Storage Write API 注入统计信息:
SELECT project_id, dataset_id, table_id, SUM(total_rows) AS num_rows, SUM(total_input_bytes) AS num_bytes, SUM(total_requests) AS num_requests FROM `region-us`.INFORMATION_SCHEMA.WRITE_API_TIMELINE_BY_PROJECT GROUP BY project_id, dataset_id, table_id ORDER BY num_bytes DESC LIMIT 10;
结果类似于以下内容:
+----------------------+------------+-------------------------------+------------+----------------+--------------+ | project_id | dataset_id | table_id | num_rows | num_bytes | num_requests | +----------------------+------------+-------------------------------+------------+----------------+--------------+ | my-project | dataset1 | table1 | 8016725532 | 73787301876979 | 8016725532 | | my-project | dataset1 | table2 | 26319580 | 34199853725409 | 26319580 | | my-project | dataset2 | table1 | 38355294 | 22879180658120 | 38355294 | | my-project | dataset1 | table3 | 270126906 | 17594235226765 | 270126906 | | my-project | dataset2 | table2 | 95511309 | 17376036299631 | 95511309 | | my-project | dataset2 | table3 | 46500443 | 12834920497777 | 46500443 | | my-project | dataset2 | table4 | 25846270 | 7487917957360 | 25846270 | | my-project | dataset1 | table4 | 18318404 | 5665113765882 | 18318404 | | my-project | dataset1 | table5 | 42829431 | 5343969665771 | 42829431 | | my-project | dataset1 | table6 | 8771021 | 5119004622353 | 8771021 | +----------------------+------------+-------------------------------+------------+----------------+--------------+
示例 4:表的 BigQuery Storage Write API 注入错误率
以下示例计算某个特定表的错误的每日明细,按错误代码划分:
SELECT TIMESTAMP_TRUNC(start_timestamp, DAY) as day, project_id, dataset_id, table_id, error_code, SUM(total_rows) AS num_rows, SUM(total_input_bytes) AS num_bytes, SUM(total_requests) AS num_requests FROM `region-us`.INFORMATION_SCHEMA.WRITE_API_TIMELINE_BY_PROJECT WHERE table_id LIKE 'my_table' GROUP BY project_id, dataset_id, table_id, error_code, day ORDER BY day, project_id, dataset_id DESC;
结果类似于以下内容:
+---------------------+-------------+------------+----------+----------------+----------+-----------+--------------+ | day | project_id | dataset_id | table_id | error_code | num_rows | num_bytes | num_requests | +---------------------+-------------+------------+----------+----------------+----------+-----------+--------------+ | 2020-04-21 00:00:00 | my_project | my_dataset | my_table | OK | 41 | 252893 | 41 | | 2020-04-20 00:00:00 | my_project | my_dataset | my_table | OK | 2798 | 10688286 | 2798 | | 2020-04-19 00:00:00 | my_project | my_dataset | my_table | OK | 2005 | 7979495 | 2005 | | 2020-04-18 00:00:00 | my_project | my_dataset | my_table | OK | 2054 | 7972378 | 2054 | | 2020-04-17 00:00:00 | my_project | my_dataset | my_table | OK | 2056 | 6978079 | 2056 | | 2020-04-17 00:00:00 | my_project | my_dataset | my_table | INTERNAL | 4 | 10825 | 4 | +---------------------+-------------+------------+----------+----------------+----------+-----------+--------------+