Getting streaming metadata using INFORMATION_SCHEMA

INFORMATION_SCHEMA is a series of views that provide access to metadata about datasets, routines, tables, views, jobs, reservations, and streaming data.

You can query the INFORMATION_SCHEMA streaming views to retrieve historical and real-time information about streaming data into BigQuery. These views contain per-minute aggregated statistics for each table that has data streamed into it. For more information about streaming data into BigQuery, see Streaming data into BigQuery.

Required permissions

Retrieving streaming metadata by using the INFORMATION_SCHEMA views requires appropriately scoped permissions:

  • STREAMING_TIMELINE_BY_PROJECT requires bigquery.tables.list for the project and is available to the BigQuery User, BigQuery Data Viewer, BigQuery Data Editor, BigQuery Data Owner, BigQuery Metadata Viewer, BigQuery Resource Admin, and BigQuery Admin roles.
  • STREAMING_TIMELINE_BY_ORGANIZATION requires bigquery.tables.list for the organization and is available to the BigQuery User, BigQuery Data Viewer, BigQuery Data Editor, BigQuery Data Owner, BigQuery Metadata Viewer, BigQuery Resource Admin, and BigQuery Admin roles.

Schemas

When you query the INFORMATION_SCHEMA streaming views, the query results contain historical and real-time information about streaming data into BigQuery. Each row in the following views represents statistics for streaming into a specific table, aggregated over a one-minute interval starting at start_timestamp. Statistics are grouped by error code, so there is one row for each error code encountered during the one-minute interval for each timestamp and table combination. Successful requests have the error code set to NULL. If no data was streamed into a table during a certain time period, then no rows are present for the corresponding timestamps for that table.

  • INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT contains per-minute aggregated streaming statistics for the current project.
  • INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_ORGANIZATION contains per-minute aggregated streaming statistics for the whole organization associated with the current project.

The INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT and INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_ORGANIZATION views have the following schema:

Column name | Data type | Value
start_timestamp | TIMESTAMP | (Partitioning column) Start timestamp of the one-minute interval for the aggregated statistics.
project_id | STRING | (Clustering column) ID of the project.
project_number | INTEGER | Number of the project.
dataset_id | STRING | (Clustering column) ID of the dataset.
table_id | STRING | (Clustering column) ID of the table.
error_code | STRING | Error code returned for the requests specified by this row. NULL for successful requests.
total_requests | INTEGER | Total number of requests within the one-minute interval.
total_rows | INTEGER | Total number of rows from all requests within the one-minute interval.
total_input_bytes | INTEGER | Total number of bytes from all rows within the one-minute interval.

Data retention

Currently, only the last 180 days of streaming history is retained in the INFORMATION_SCHEMA streaming views.

Regionality

BigQuery's INFORMATION_SCHEMA streaming views are regionalized. To query these views, you must prefix the view name with a supported region in the format `region-region-name`.INFORMATION_SCHEMA.view.

For example:

  • To query data in the US multi-region, use `region-us`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT
  • To query data in the EU multi-region, use `region-eu`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT
  • To query data in the asia-northeast1 region, use `region-asia-northeast1`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT

For a list of available regions, see Dataset locations.
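
As a quick illustration, the region qualifier can also be assembled programmatically. The following Python helper is a sketch of our own (it is not part of any BigQuery client library):

```python
def streaming_view(region, view="STREAMING_TIMELINE_BY_PROJECT", project=None):
    """Build a region-qualified INFORMATION_SCHEMA view reference.

    Pass project to target a project other than your default project.
    """
    prefix = f"`{project}`." if project else ""
    return f"{prefix}`region-{region}`.INFORMATION_SCHEMA.{view}"

print(streaming_view("us"))
# `region-us`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT
print(streaming_view("asia-northeast1", project="myproject"))
# `myproject`.`region-asia-northeast1`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT
```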

Examples

Example 1: Recent streaming failures

The following example calculates the per-minute breakdown of total failed requests for all tables in the project over the last 30 minutes, split by error code.

To run the query against a project other than your default project, add the project ID in the following format: `project_id`.`region-region_name`.INFORMATION_SCHEMA.view; for example, `myproject`.`region-us`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT.

To run the query:

Console

  1. In the Cloud Console, open the BigQuery page.

    Go to the BigQuery page

  2. In the Query editor box, enter the following standard SQL query. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the Cloud Console.

    SELECT
     start_timestamp,
     error_code,
     SUM(total_requests) AS num_failed_requests
    FROM
     `region-us`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT
    WHERE
     error_code IS NOT NULL
     AND start_timestamp > TIMESTAMP_SUB(CURRENT_TIMESTAMP, INTERVAL 30 MINUTE)
    GROUP BY
     start_timestamp,
     error_code
    ORDER BY
     1 DESC
    
    
  3. Click Run.

bq

Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.

To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT
     start_timestamp,
     error_code,
     SUM(total_requests) AS num_failed_requests
   FROM
     `region-us`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT
   WHERE
     error_code IS NOT NULL
     AND start_timestamp > TIMESTAMP_SUB(CURRENT_TIMESTAMP, INTERVAL 30 MINUTE)
   GROUP BY
     start_timestamp,
     error_code
   ORDER BY
     1 DESC'

The results should look like the following:

  +---------------------+------------------+---------------------+
  |   start_timestamp   |    error_code    | num_failed_requests |
  +---------------------+------------------+---------------------+
  | 2020-04-15 20:55:00 | INTERNAL_ERROR   |                  41 |
  | 2020-04-15 20:41:00 | CONNECTION_ERROR |                   5 |
  | 2020-04-15 20:30:00 | INTERNAL_ERROR   |                 115 |
  +---------------------+------------------+---------------------+
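
The shape of this aggregation can be illustrated in plain Python over rows shaped like the view; the sample values below are invented, not real view output:

```python
from collections import defaultdict

# Hypothetical per-minute view rows: (start_timestamp, error_code, total_requests).
# error_code is None for successful requests, mirroring NULL in the view.
rows = [
    ("2020-04-15 20:55:00", "INTERNAL_ERROR", 41),
    ("2020-04-15 20:55:00", None, 1200),
    ("2020-04-15 20:41:00", "CONNECTION_ERROR", 5),
    ("2020-04-15 20:30:00", "INTERNAL_ERROR", 115),
]

# WHERE error_code IS NOT NULL ... GROUP BY start_timestamp, error_code
num_failed_requests = defaultdict(int)
for ts, code, requests in rows:
    if code is not None:
        num_failed_requests[(ts, code)] += requests

# ORDER BY start_timestamp DESC
for (ts, code), n in sorted(num_failed_requests.items(), reverse=True):
    print(ts, code, n)
```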
  

Example 2: Per-minute breakdown for all requests with error codes

The following example calculates a per-minute breakdown of successful and failed streaming requests, split into error code categories. This query could be used to populate a dashboard.

To run the query against a project other than your default project, add the project ID in the following format: `project_id`.`region-region_name`.INFORMATION_SCHEMA.view; for example, `myproject`.`region-us`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT.

To run the query:

Console

  1. In the Cloud Console, open the BigQuery page.

    Go to the BigQuery page

  2. Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the Cloud Console.

    SELECT
     start_timestamp,
     SUM(total_requests) AS total_requests,
     SUM(total_rows) AS total_rows,
     SUM(total_input_bytes) AS total_input_bytes,
     SUM(IF(error_code IN ('QUOTA_EXCEEDED', 'RATE_LIMIT_EXCEEDED'),
         total_requests, 0)) AS quota_error,
     SUM(IF(error_code IN ('INVALID_VALUE', 'NOT_FOUND', 'SCHEMA_INCOMPATIBLE',
                           'BILLING_NOT_ENABLED', 'ACCESS_DENIED', 'UNAUTHENTICATED'),
         total_requests, 0)) AS user_error,
     SUM(IF(error_code IN ('CONNECTION_ERROR','INTERNAL_ERROR'),
         total_requests, 0)) AS server_error,
     SUM(IF(error_code IS NULL, 0, total_requests)) AS total_error
    FROM
     `region-us`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT
    GROUP BY
     start_timestamp
    ORDER BY
     1 DESC
    
  3. Click Run.

bq

Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.

To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT
     start_timestamp,
     SUM(total_requests) AS total_requests,
     SUM(total_rows) AS total_rows,
     SUM(total_input_bytes) AS total_input_bytes,
     SUM(IF(error_code IN ("QUOTA_EXCEEDED", "RATE_LIMIT_EXCEEDED"),
         total_requests, 0)) AS quota_error,
     SUM(IF(error_code IN ("INVALID_VALUE", "NOT_FOUND", "SCHEMA_INCOMPATIBLE",
                           "BILLING_NOT_ENABLED", "ACCESS_DENIED", "UNAUTHENTICATED"),
         total_requests, 0)) AS user_error,
     SUM(IF(error_code IN ("CONNECTION_ERROR", "INTERNAL_ERROR"),
         total_requests, 0)) AS server_error,
     SUM(IF(error_code IS NULL, 0, total_requests)) AS total_error
   FROM
     `region-us`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT
   GROUP BY
     start_timestamp
   ORDER BY
     1 DESC'

The results should look like the following:

+---------------------+----------------+------------+-------------------+-------------+------------+--------------+-------------+
|   start_timestamp   | total_requests | total_rows | total_input_bytes | quota_error | user_error | server_error | total_error |
+---------------------+----------------+------------+-------------------+-------------+------------+--------------+-------------+
| 2020-04-15 22:00:00 |         441854 |     441854 |       23784853118 |           0 |          0 |           17 |          17 |
| 2020-04-15 21:59:00 |         355627 |     355627 |       26101982742 |           0 |          0 |            0 |           0 |
| 2020-04-15 21:58:00 |         354603 |     354603 |       26160565341 |           0 |          0 |            0 |           0 |
| 2020-04-15 21:57:00 |         298823 |     298823 |       23877821442 |           0 |          0 |            0 |           0 |
+---------------------+----------------+------------+-------------------+-------------+------------+--------------+-------------+
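
When post-processing results, for example in a dashboard backend, the same error-code buckets can be expressed as a small classifier. This Python sketch mirrors the query's aliases; the bucket names are ours, not an official taxonomy:

```python
QUOTA_ERRORS = {"QUOTA_EXCEEDED", "RATE_LIMIT_EXCEEDED"}
USER_ERRORS = {"INVALID_VALUE", "NOT_FOUND", "SCHEMA_INCOMPATIBLE",
               "BILLING_NOT_ENABLED", "ACCESS_DENIED", "UNAUTHENTICATED"}
SERVER_ERRORS = {"CONNECTION_ERROR", "INTERNAL_ERROR"}

def categorize(error_code):
    """Map a view error_code to the bucket used by the dashboard query."""
    if error_code is None:        # NULL: the request succeeded
        return "success"
    if error_code in QUOTA_ERRORS:
        return "quota_error"
    if error_code in USER_ERRORS:
        return "user_error"
    if error_code in SERVER_ERRORS:
        return "server_error"
    return "other_error"          # any code outside the buckets above
```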
  

Example 3: Tables with the most incoming traffic

The following example returns the streaming statistics for the 10 tables with the most incoming traffic.

To run the query against a project other than your default project, add the project ID in the following format: `project_id`.`region-region_name`.INFORMATION_SCHEMA.view; for example, `myproject`.`region-us`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT.

To run the query:

Console

  1. In the Cloud Console, open the BigQuery page.

    Go to the BigQuery page

  2. In the Query editor box, enter the following standard SQL query. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the Cloud Console.

    SELECT
     project_id,
     dataset_id,
     table_id,
     SUM(total_rows) AS num_rows,
     SUM(total_input_bytes) AS num_bytes,
     SUM(total_requests) AS num_requests
    FROM
     `region-us`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT
    GROUP BY 1, 2, 3
    ORDER BY num_bytes DESC
    LIMIT 10
    
  3. Click Run.

bq

Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.

To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT
   project_id,
   dataset_id,
   table_id,
   SUM(total_rows) AS num_rows,
   SUM(total_input_bytes) AS num_bytes,
   SUM(total_requests) AS num_requests
 FROM
   `region-us`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT
 GROUP BY 1, 2, 3
 ORDER BY num_bytes DESC
 LIMIT 10'

The results should look like the following:

  +----------------------+------------+-------------------------------+------------+----------------+--------------+
  |      project_id      | dataset_id |           table_id            |  num_rows  |   num_bytes    | num_requests |
  +----------------------+------------+-------------------------------+------------+----------------+--------------+
  | my-project           | dataset1   | table1                        | 8016725532 | 73787301876979 |   8016725532 |
  | my-project           | dataset1   | table2                        |   26319580 | 34199853725409 |     26319580 |
  | my-project           | dataset2   | table1                        |   38355294 | 22879180658120 |     38355294 |
  | my-project           | dataset1   | table3                        |  270126906 | 17594235226765 |    270126906 |
  | my-project           | dataset2   | table2                        |   95511309 | 17376036299631 |     95511309 |
  | my-project           | dataset2   | table3                        |   46500443 | 12834920497777 |     46500443 |
  | my-project           | dataset2   | table4                        |   25846270 |  7487917957360 |     25846270 |
  | my-project           | dataset1   | table4                        |   18318404 |  5665113765882 |     18318404 |
  | my-project           | dataset1   | table5                        |   42829431 |  5343969665771 |     42829431 |
  | my-project           | dataset1   | table6                        |    8771021 |  5119004622353 |      8771021 |
  +----------------------+------------+-------------------------------+------------+----------------+--------------+
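
The GROUP BY / ORDER BY / LIMIT logic of this query can be sketched in Python over hypothetical rows (the values are made up):

```python
# Hypothetical view rows: (project_id, dataset_id, table_id,
#                          total_rows, total_input_bytes, total_requests).
rows = [
    ("my-project", "dataset1", "table1", 100, 5_000, 100),
    ("my-project", "dataset1", "table1", 200, 7_000, 200),
    ("my-project", "dataset2", "table1",  50, 9_000,  50),
]

# GROUP BY project_id, dataset_id, table_id, summing each metric
totals = {}
for proj, ds, tbl, n_rows, n_bytes, n_req in rows:
    key = (proj, ds, tbl)
    r, b, q = totals.get(key, (0, 0, 0))
    totals[key] = (r + n_rows, b + n_bytes, q + n_req)

# ORDER BY num_bytes DESC LIMIT 10
top = sorted(totals.items(), key=lambda kv: kv[1][1], reverse=True)[:10]
```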
  

Example 4: Streaming error ratio for a table

The following example calculates a per-day breakdown of errors for a specific table, split by error code.

To run the query against a project other than your default project, add the project ID in the following format: `project_id`.`region-region_name`.INFORMATION_SCHEMA.view; for example, `myproject`.`region-us`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT.

To run the query:

Console

  1. In the Cloud Console, open the BigQuery page.

    Go to the BigQuery page

  2. In the Query editor box, enter the following standard SQL query. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the Cloud Console.

    SELECT
     TIMESTAMP_TRUNC(start_timestamp, DAY) AS day,
     project_id,
     dataset_id,
     table_id,
     error_code,
     SUM(total_rows) AS num_rows,
     SUM(total_input_bytes) AS num_bytes,
     SUM(total_requests) AS num_requests
    FROM
     `region-us`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT
    WHERE table_id LIKE "my_table"
    GROUP BY project_id, dataset_id, table_id, error_code, day
    ORDER BY day, project_id, dataset_id DESC
    
  3. Click Run.

bq

Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.

To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT
   TIMESTAMP_TRUNC(start_timestamp, DAY) AS day,
   project_id,
   dataset_id,
   table_id,
   error_code,
   SUM(total_rows) AS num_rows,
   SUM(total_input_bytes) AS num_bytes,
   SUM(total_requests) AS num_requests
 FROM
   `region-us`.INFORMATION_SCHEMA.STREAMING_TIMELINE_BY_PROJECT
 WHERE table_id LIKE "my_table"
 GROUP BY project_id, dataset_id, table_id, error_code, day
 ORDER BY day, project_id, dataset_id DESC'

The results should look like the following:

+---------------------+-------------+------------+----------+----------------+----------+-----------+--------------+
|         day         |  project_id | dataset_id | table_id |   error_code   | num_rows | num_bytes | num_requests |
+---------------------+-------------+------------+----------+----------------+----------+-----------+--------------+
| 2020-04-21 00:00:00 | my_project  | my_dataset | my_table | NULL           |       41 |    252893 |           41 |
| 2020-04-20 00:00:00 | my_project  | my_dataset | my_table | NULL           |     2798 |  10688286 |         2798 |
| 2020-04-19 00:00:00 | my_project  | my_dataset | my_table | NULL           |     2005 |   7979495 |         2005 |
| 2020-04-18 00:00:00 | my_project  | my_dataset | my_table | NULL           |     2054 |   7972378 |         2054 |
| 2020-04-17 00:00:00 | my_project  | my_dataset | my_table | NULL           |     2056 |   6978079 |         2056 |
| 2020-04-17 00:00:00 | my_project  | my_dataset | my_table | INTERNAL_ERROR |        4 |     10825 |            4 |
+---------------------+-------------+------------+----------+----------------+----------+-----------+--------------+
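
The query returns per-day counts split by error code; the error ratio itself is a small follow-up computation. The Python sketch below runs over invented rows shaped like the result above:

```python
from collections import defaultdict

# Hypothetical query results: (day, error_code, num_requests).
# error_code is None for successful requests, mirroring NULL.
daily_rows = [
    ("2020-04-17", None, 2056),
    ("2020-04-17", "INTERNAL_ERROR", 4),
    ("2020-04-18", None, 2054),
]

total = defaultdict(int)
failed = defaultdict(int)
for day, code, n in daily_rows:
    total[day] += n
    if code is not None:
        failed[day] += n

for day in sorted(total):
    print(f"{day}: {failed[day] / total[day]:.2%} of {total[day]} requests failed")
```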
  

Next steps