JOBS_BY_USER view

The INFORMATION_SCHEMA.JOBS_BY_USER view contains near real-time metadata about the BigQuery jobs submitted by the current user in the current project.

Required role

To get the permission that you need to query the INFORMATION_SCHEMA.JOBS_BY_USER view, ask your administrator to grant you the BigQuery User (roles/bigquery.user) IAM role on your project. For more information about granting roles, see Manage access to projects, folders, and organizations.

This predefined role contains the bigquery.jobs.list permission, which is required to query the INFORMATION_SCHEMA.JOBS_BY_USER view.

You might also be able to get this permission with custom roles or other predefined roles.

For more information about BigQuery permissions, see Access control with IAM.

Schema

The underlying data is partitioned by the creation_time column and clustered by project_id and user_email.

The INFORMATION_SCHEMA.JOBS_BY_USER view has the following schema:

Column name Data type Value
bi_engine_statistics RECORD If the project is configured to use the BI Engine SQL Interface, then this field contains BiEngineStatistics. Otherwise NULL.
cache_hit BOOLEAN Whether the query results of this job were from a cache. If you have a multi-query statement job, cache_hit for your parent query is NULL.
creation_time TIMESTAMP (Partitioning column) Creation time of this job. Partitioning is based on the UTC time of this timestamp.
destination_table RECORD Destination table for results, if any.
dml_statistics RECORD If the job is a query with a DML statement, the value is a record with the following fields:
  • inserted_row_count: The number of rows that were inserted.
  • deleted_row_count: The number of rows that were deleted.
  • updated_row_count: The number of rows that were updated.
For all other jobs, the value is NULL.
This column is present in the INFORMATION_SCHEMA.JOBS_BY_USER and INFORMATION_SCHEMA.JOBS_BY_PROJECT views.
end_time TIMESTAMP The end time of this job, in milliseconds since the epoch. This field represents the time when the job enters the DONE state.
error_result RECORD Details of any errors as ErrorProto objects.
job_creation_reason.code STRING Specifies the high level reason why a job was created.
Possible values are:
  • REQUESTED: job creation was requested.
  • LONG_RUNNING: the query request ran beyond a system defined timeout specified by the timeoutMs field in the QueryRequest. As a result it was considered a long running operation for which a job was created.
  • LARGE_RESULTS: the results from the query cannot fit in the in-line response.
  • OTHER: the system has determined that the query needs to be executed as a job.
job_id STRING The ID of the job if a job was created. Otherwise, the query ID of a query using short query mode. For example, bquxjob_1234.
job_stages RECORD Query stages of the job.

Note: This column's values are empty for queries that read from tables with row-level access policies. For more information, see best practices for row-level security in BigQuery.

job_type STRING The type of the job. Can be QUERY, LOAD, EXTRACT, COPY, or NULL. A NULL value indicates an internal job, such as a script job statement evaluation or a materialized view refresh.
labels RECORD Array of labels applied to the job as key-value pairs.
parent_job_id STRING ID of the parent job, if any.
priority STRING The priority of this job. Valid values include INTERACTIVE and BATCH.
project_id STRING (Clustering column) The ID of the project.
project_number INTEGER The number of the project.
query STRING SQL query text. Only the JOBS_BY_PROJECT view has the query column.
referenced_tables RECORD Array of tables referenced by the job. Only populated for query jobs that are not cache hits.
reservation_id STRING Name of the primary reservation assigned to this job, in the format RESERVATION_ADMIN_PROJECT:RESERVATION_LOCATION.RESERVATION_NAME.
In this output:
  • RESERVATION_ADMIN_PROJECT: the name of the Google Cloud project that administers the reservation
  • RESERVATION_LOCATION: the location of the reservation
  • RESERVATION_NAME: the name of the reservation
edition STRING The edition associated with the reservation assigned to this job. For more information about editions, see Introduction to BigQuery editions.
session_info RECORD Details about the session in which this job ran, if any.
start_time TIMESTAMP The start time of this job, in milliseconds since the epoch. This field represents the time when the job transitions from the PENDING state to either RUNNING or DONE.
state STRING Running state of the job. Valid states include PENDING, RUNNING, and DONE.
statement_type STRING The type of query statement. For example, DELETE, INSERT, SCRIPT, SELECT, or UPDATE. See QueryStatementType for list of valid values.
timeline RECORD Query timeline of the job. Contains snapshots of query execution.
total_bytes_billed INTEGER If the project is configured to use on-demand pricing, then this field contains the total bytes billed for the job. If the project is configured to use flat-rate pricing, then you are not billed for bytes and this field is informational only.

Note: This column's values are empty for queries that read from tables with row-level access policies. For more information, see best practices for row-level security in BigQuery.

total_bytes_processed INTEGER

Total bytes processed by the job.

Note: This column's values are empty for queries that read from tables with row-level access policies. For more information, see best practices for row-level security in BigQuery.

total_modified_partitions INTEGER The total number of partitions the job modified. This field is populated for LOAD and QUERY jobs.
total_slot_ms INTEGER Slot milliseconds for the job over its entire duration in the RUNNING state, including retries.
transaction_id STRING ID of the transaction in which this job ran, if any. (Preview)
user_email STRING (Clustering column) Email address or service account of the user who ran the job.
query_info.resource_warning STRING The warning message that appears if the resource usage during query processing is above the internal threshold of the system.
A successful query job can have the resource_warning field populated. With resource_warning, you get additional data points to optimize your queries and to set up monitoring for performance trends of an equivalent set of queries by using query_hashes.
query_info.query_hashes.normalized_literals STRING Contains the hashes of the query. normalized_literals is a hexadecimal STRING hash that ignores comments, parameter values, UDFs, and literals. The hash value will differ when underlying views change, or if the query implicitly references columns, such as SELECT *, and the table schema changes.
This field appears for successful GoogleSQL queries that are not cache hits.
query_info.performance_insights RECORD Performance insights for the job.
query_info.optimization_details STRUCT The history-based optimizations for the job.
transferred_bytes INTEGER Total bytes transferred for cross-cloud queries, such as BigQuery Omni cross-cloud transfer jobs.
materialized_view_statistics RECORD Statistics of materialized views considered in a query job. (Preview)

Data retention

This view contains currently running jobs and the job history of the past 180 days.

Scope and syntax

Queries against this view must include a region qualifier. The following table explains the region scope for this view:

View name Resource scope Region scope
[PROJECT_ID.]`region-REGION`.INFORMATION_SCHEMA.JOBS_BY_USER Jobs submitted by the current user in the specified project. REGION
Replace the following:

  • Optional: PROJECT_ID: the ID of your Google Cloud project. If not specified, the default project is used.

Example

To run the query against a project other than your default project, add the project ID in the following format:

`PROJECT_ID`.`region-REGION_NAME`.INFORMATION_SCHEMA.JOBS_BY_USER
Replace the following:

  • PROJECT_ID: the ID of the project
  • REGION_NAME: the region for your project

For example, `myproject`.`region-us`.INFORMATION_SCHEMA.JOBS_BY_USER.

View pending or running jobs

SELECT
  job_id,
  creation_time,
  query
FROM
  `region-REGION_NAME`.INFORMATION_SCHEMA.JOBS_BY_USER
WHERE
  state != 'DONE';

The result is similar to the following:

+--------------+---------------------------+---------------------------------+
| job_id       |  creation_time            |  query                          |
+--------------+---------------------------+---------------------------------+
| bquxjob_1    |  2019-10-10 00:00:00 UTC  |  SELECT ... FROM dataset.table1 |
| bquxjob_2    |  2019-10-10 00:00:01 UTC  |  SELECT ... FROM dataset.table2 |
| bquxjob_3    |  2019-10-10 00:00:02 UTC  |  SELECT ... FROM dataset.table3 |
| bquxjob_4    |  2019-10-10 00:00:03 UTC  |  SELECT ... FROM dataset.table4 |
| bquxjob_5    |  2019-10-10 00:00:04 UTC  |  SELECT ... FROM dataset.table5 |
+--------------+---------------------------+---------------------------------+

View performance insights for queries

The following example returns all your query jobs that have performance insights in the last 30 days, along with a URL that links to the query execution graph in the Google Cloud console.

SELECT
  `bigquery-public-data`.persistent_udfs.job_url(
    project_id || ':us.' || job_id) AS job_url,
  query_info.performance_insights
FROM
  `region-REGION_NAME`.INFORMATION_SCHEMA.JOBS_BY_USER
WHERE
  DATE(creation_time) >= CURRENT_DATE - 30 -- scan 30 days of query history
  AND job_type = 'QUERY'
  AND state = 'DONE'
  AND error_result IS NULL
  AND statement_type != 'SCRIPT'
  AND EXISTS ( -- Only include queries which had performance insights
    SELECT 1
    FROM UNNEST(
      query_info.performance_insights.stage_performance_standalone_insights
    )
    WHERE slot_contention OR insufficient_shuffle_quota
    UNION ALL
    SELECT 1
    FROM UNNEST(
      query_info.performance_insights.stage_performance_change_insights
    )
    WHERE input_data_change.records_read_diff_percentage IS NOT NULL
  );