QueryJob(job_id, query, client, job_config=None)
Asynchronous job: query tables.
Parameters
Name | Description |
job_id |
str
the job's ID, within the project belonging to |
query |
str
SQL query string. |
client |
google.cloud.bigquery.client.Client
A client which holds credentials and project configuration for the dataset (which requires a project). |
job_config |
Optional[google.cloud.bigquery.job.QueryJobConfig]
Extra configuration options for the query job. |
Inheritance
builtins.object > google.api_core.future.base.Future > google.api_core.future.polling.PollingFuture > google.cloud.bigquery.job.base._AsyncJob > QueryJob
Properties
allow_large_results
See allow_large_results.
billing_tier
Return billing tier from job statistics, if present.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.billing_tier
Type | Description |
Optional[int] | Billing tier used by the job, or None if job is not yet complete. |
cache_hit
Return whether or not query results were served from cache.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.cache_hit
Type | Description |
Optional[bool] | whether the query results were returned from cache, or None if job is not yet complete. |
clustering_fields
See clustering_fields.
create_disposition
See create_disposition.
created
Datetime at which the job was created.
Type | Description |
Optional[datetime.datetime] | the creation time (None until set from the server). |
ddl_operation_performed
Optional[str]: Return the DDL operation performed.
ddl_target_routine
Optional[google.cloud.bigquery.routine.RoutineReference]: Return the DDL target routine, present for CREATE/DROP FUNCTION/PROCEDURE queries.
ddl_target_table
Optional[google.cloud.bigquery.table.TableReference]: Return the DDL target table, present for CREATE/DROP TABLE/VIEW queries.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.ddl_target_table
default_dataset
See default_dataset.
destination
See destination.
destination_encryption_configuration
google.cloud.bigquery.encryption_configuration.EncryptionConfiguration: Custom encryption configuration for the destination table.
Custom encryption configuration (e.g., Cloud KMS keys) or None if using default encryption.
dry_run
See dry_run.
ended
Datetime at which the job finished.
Type | Description |
Optional[datetime.datetime] | the end time (None until set from the server). |
error_result
Error information about the job as a whole.
Type | Description |
Optional[Mapping] | the error information (None until set from the server). |
errors
Information about individual errors generated by the job.
Type | Description |
Optional[List[Mapping]] | the error information (None until set from the server). |
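A minimal sketch of turning error_result and errors into readable log lines. It assumes each mapping carries the "reason", "location", and "message" keys used by the BigQuery REST error format; the keys are read with .get() so a missing key does not raise. format_job_errors and _describe are hypothetical helper names, not part of the library.

```python
from typing import List


def _describe(err) -> str:
    # Assumed keys: "reason", "location", "message" (all optional here).
    reason = err.get("reason", "unknown")
    message = err.get("message", "")
    location = err.get("location")
    where = f" at {location}" if location else ""
    return f"[{reason}]{where} {message}".rstrip()


def format_job_errors(job) -> List[str]:
    """Return one readable line per error reported on a job.

    ``job`` is any object exposing ``error_result`` and ``errors``;
    both may be None until set from the server.
    """
    lines = []
    if job.error_result is not None:
        lines.append("job failed: " + _describe(job.error_result))
    for err in job.errors or []:
        lines.append("error: " + _describe(err))
    return lines
```

The helper returns an empty list for a job that has reported no errors, so it can be called unconditionally after a job finishes.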
estimated_bytes_processed
Return the estimated number of bytes processed by the query.
Type | Description |
Optional[int] | estimated number of bytes processed by the query, or None if job is not yet complete. |
etag
ETag for the job resource.
Type | Description |
Optional[str] | the ETag (None until set from the server). |
flatten_results
See flatten_results.
job_id
str: ID of the job.
job_type
Type of job.
Type | Description |
str | one of 'load', 'copy', 'extract', 'query'. |
labels
Dict[str, str]: Labels for the job.
location
str: Location where the job runs.
maximum_billing_tier
See maximum_billing_tier.
maximum_bytes_billed
See maximum_bytes_billed.
num_child_jobs
The number of child jobs executed.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.num_child_jobs
num_dml_affected_rows
Return the number of DML rows affected by the job.
Type | Description |
Optional[int] | number of DML rows affected by the job, or None if job is not yet complete. |
parent_job_id
Return the ID of the parent job.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.parent_job_id
Type | Description |
Optional[str] | parent job id. |
path
URL path for the job's APIs.
Type | Description |
str | the path based on project and job ID. |
priority
See priority.
project
Project bound to the job.
Type | Description |
str | the project (derived from the client). |
query
str: The query text used in this query job.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery.FIELDS.query
query_parameters
See query_parameters.
query_plan
Return query plan from job statistics, if present.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.query_plan
Type | Description |
List[google.cloud.bigquery.job.QueryPlanEntry] | mappings describing the query plan, or an empty list if the query has not yet completed. |
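A short sketch of ranking query plan stages by the number of records they read. It assumes each QueryPlanEntry exposes name and records_read attributes (these are not listed on this page), and that records_read may be None. busiest_stages is a hypothetical helper name.

```python
from typing import List, Tuple


def busiest_stages(job, top: int = 3) -> List[Tuple[str, int]]:
    """Return up to ``top`` (stage name, records read) pairs, largest first."""
    stages = [
        # Assumed attributes: ``name`` and ``records_read``.
        (entry.name, entry.records_read or 0)
        for entry in job.query_plan  # empty list until the query completes
    ]
    stages.sort(key=lambda pair: pair[1], reverse=True)
    return stages[:top]
```

Because query_plan is an empty list until the query has completed, the helper returns an empty list for a job that is still running.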
range_partitioning
See range_partitioning.
referenced_tables
Return referenced tables from job statistics, if present.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.referenced_tables
Type | Description |
List[Dict] | mappings describing the tables referenced by the query, or an empty list if the query has not yet completed. |
reservation_usage
Job resource usage breakdown by reservation.
Type | Description |
List[google.cloud.bigquery.job.ReservationUsage] | Reservation usage stats. Can be empty if not set from the server. |
schema_update_options
See schema_update_options.
self_link
URL for the job resource.
Type | Description |
Optional[str] | the URL (None until set from the server). |
slot_millis
Union[int, None]: Slot-milliseconds used by this query job.
started
Datetime at which the job was started.
Type | Description |
Optional[datetime.datetime] | the start time (None until set from the server). |
state
Status of the job.
Type | Description |
Optional[str] | the state (None until set from the server). |
statement_type
Return statement type from job statistics, if present.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.statement_type
Type | Description |
Optional[str] | type of statement used by the job, or None if job is not yet complete. |
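A sketch of reporting what a finished job did, using statement_type together with num_dml_affected_rows and ddl_operation_performed. All three may be None, so each is checked before use. describe_outcome is a hypothetical helper name.

```python
def describe_outcome(job) -> str:
    """Summarize a job from its DML and DDL statistics."""
    kind = job.statement_type or "unknown statement"
    if job.num_dml_affected_rows is not None:
        return f"{kind}: {job.num_dml_affected_rows} rows affected"
    if job.ddl_operation_performed is not None:
        return f"{kind}: DDL operation {job.ddl_operation_performed}"
    return f"{kind}: no DML or DDL statistics"
```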
table_definitions
See table_definitions.
time_partitioning
See time_partitioning.
timeline
List(TimelineEntry): Return the query execution timeline from job statistics.
total_bytes_billed
Return total bytes billed from job statistics, if present.
Type | Description |
Optional[int] | Total bytes billed for the job, or None if job is not yet complete. |
total_bytes_processed
Return total bytes processed from job statistics, if present.
Type | Description |
Optional[int] | Total bytes processed by the job, or None if job is not yet complete. |
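A minimal sketch of summarizing the byte statistics once a job has finished. It reads total_bytes_processed, total_bytes_billed, and cache_hit, and treats None as "not available yet". summarize_usage and _human_bytes are hypothetical helper names, and the binary units are a presentation choice, not something the API prescribes.

```python
def _human_bytes(n: int) -> str:
    """Format a byte count using binary units (KiB, MiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(n)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024


def summarize_usage(job) -> str:
    """Describe how many bytes a job processed and was billed for."""
    processed = job.total_bytes_processed
    billed = job.total_bytes_billed
    if processed is None or billed is None:
        # Statistics are None until the job is complete.
        return "statistics not available yet"
    source = "cache" if job.cache_hit else "execution"
    return (
        f"processed {_human_bytes(processed)}, "
        f"billed {_human_bytes(billed)} (served from {source})"
    )
```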
transaction_info
Information of the multi-statement transaction if this job is part of one.
Added in version 2.24.0.
udf_resources
See udf_resources.
undeclared_query_parameters
Return undeclared query parameters from job statistics, if present.
Type | Description |
List[Union[ google.cloud.bigquery.query.ArrayQueryParameter, google.cloud.bigquery.query.ScalarQueryParameter, google.cloud.bigquery.query.StructQueryParameter ]] | Undeclared parameters, or an empty list if the query has not yet completed. |
use_legacy_sql
See use_legacy_sql.
use_query_cache
See use_query_cache.
user_email
E-mail address of user who submitted the job.
Type | Description |
Optional[str] | the e-mail address (None until set from the server). |
write_disposition
See write_disposition.
dml_stats
API documentation for bigquery.job.QueryJob.dml_stats property.
script_statistics
API documentation for bigquery.job.QueryJob.script_statistics property.
Methods
add_done_callback
add_done_callback(fn)
Add a callback to be executed when the operation is complete.
If the operation is not already complete, this will start a helper thread to poll for the status of the operation in the background.
Name | Description |
fn |
Callable[Future]
The callback to execute when the operation is complete. |
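A small sketch of using add_done_callback to signal completion to other code. The callback receives the finished future (the job itself) and sets a threading.Event that callers can wait on. notify_when_done is a hypothetical helper name.

```python
import threading


def notify_when_done(job) -> threading.Event:
    """Register a callback and return an Event set once the job completes."""
    finished = threading.Event()

    def _on_done(future):
        # ``future`` is the completed job; only the signal is needed here.
        finished.set()

    job.add_done_callback(_on_done)
    return finished
```

A caller could then use finished.wait(timeout=30) to block for a bounded time while the polling runs in the background thread.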
cancel
cancel(client=None, retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None)
API call: cancel job via a POST request
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/cancel
Name | Description |
timeout |
Optional[float]
The number of seconds to wait for the underlying HTTP transport before using |
client |
Optional[google.cloud.bigquery.client.Client]
the client to use. If not passed, falls back to the |
retry |
Optional[google.api_core.retry.Retry]
How to retry the RPC. |
Type | Description |
bool | Boolean indicating that the cancel request was sent. |
cancelled
cancelled()
Check if the job has been cancelled.
This always returns False. It's not possible to check if a job was cancelled in the API. This method is here to satisfy the interface for google.api_core.future.Future.
Type | Description |
bool | False |
done
done(retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None, reload: bool = True)
Checks if the job is complete.
Name | Description |
timeout |
Optional[float]
The number of seconds to wait for the underlying HTTP transport before using |
reload |
Optional[bool]
If |
retry |
Optional[google.api_core.retry.Retry]
How to retry the RPC. If the job state is |
Type | Description |
bool | True if the job is complete, False otherwise. |
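A sketch of a bounded polling loop built on done(). In most cases result() is the simpler way to wait; this variant is only useful when the caller wants to do other work between checks. wait_until_done, poll_interval, and max_wait are hypothetical names introduced for illustration.

```python
import time


def wait_until_done(job, poll_interval: float = 1.0, max_wait: float = 60.0) -> bool:
    """Poll ``job.done()`` until it returns True or ``max_wait`` seconds pass.

    Returns True if the job completed, False if the deadline was reached.
    """
    deadline = time.monotonic() + max_wait
    while True:
        if job.done():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```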
exception
exception(timeout=None)
Get the exception from the operation, blocking if necessary.
Name | Description |
timeout |
int
How long to wait for the operation to complete. If None, wait indefinitely. |
Type | Description |
Optional[google.api_core.GoogleAPICallError] | The operation's error. |
exists
exists(client=None, retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None)
API call: test for the existence of the job via a GET request
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get
Name | Description |
timeout |
Optional[float]
The number of seconds to wait for the underlying HTTP transport before using |
client |
Optional[google.cloud.bigquery.client.Client]
the client to use. If not passed, falls back to the |
retry |
Optional[google.api_core.retry.Retry]
How to retry the RPC. |
Type | Description |
bool | Boolean indicating existence of the job. |
from_api_repr
from_api_repr(resource: dict, client)
Factory: construct a job given its API representation
Name | Description |
resource |
Dict
dataset job representation returned from the API |
client |
google.cloud.bigquery.client.Client
Client which holds credentials and project configuration for the dataset. |
Type | Description |
google.cloud.bigquery.job.QueryJob | Job parsed from ``resource``. |
reload
reload(client=None, retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None)
API call: refresh job properties via a GET request.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get
Name | Description |
timeout |
Optional[float]
The number of seconds to wait for the underlying HTTP transport before using |
client |
Optional[google.cloud.bigquery.client.Client]
the client to use. If not passed, falls back to the |
retry |
Optional[google.api_core.retry.Retry]
How to retry the RPC. |
result
result(page_size: int = None, max_results: int = None, retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None, start_index: int = None, job_retry: retries.Retry = <google.api_core.retry.Retry object>)
Start the job and wait for it to complete and get the result.
Name | Description |
page_size |
Optional[int]
The maximum number of rows in each page of results from this request. Non-positive values are ignored. |
max_results |
Optional[int]
The maximum total number of rows from this request. |
timeout |
Optional[float]
The number of seconds to wait for the underlying HTTP transport before using |
start_index |
Optional[int]
The zero-based index of the starting row to read. |
retry |
Optional[google.api_core.retry.Retry]
How to retry the call that retrieves rows. This only applies to making RPC calls. It isn't used to retry failed jobs. This has a reasonable default that should only be overridden with care. If the job state is |
job_retry |
Optional[google.api_core.retry.Retry]
How to retry failed jobs. The default retries rate-limit-exceeded errors. Passing |
Type | Description |
google.cloud.exceptions.GoogleAPICallError | If the job failed and retries aren't successful. |
concurrent.futures.TimeoutError | If the job did not complete in the given timeout. |
TypeError | If Non-``None`` and non-default ``job_retry`` is provided and the job is not retryable. |
Type | Description |
google.cloud.bigquery.table.RowIterator | Iterator of row data Row-s. During each page, the iterator will have the ``total_rows`` attribute set, which counts the total number of rows **in the result set** (this is distinct from the total number of rows in the current page: ``iterator.page.num_items``). If the query is a special query that produces no results, e.g. a DDL query, an ``_EmptyRowIterator`` instance is returned. |
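A minimal sketch of waiting for results with a timeout and converting the rows to plain dictionaries. It catches the concurrent.futures.TimeoutError listed above and assumes each row offers a dict-like items() method. fetch_rows is a hypothetical helper name.

```python
import concurrent.futures
from typing import Any, Dict, List, Optional


def fetch_rows(
    job,
    max_results: Optional[int] = None,
    timeout: Optional[float] = None,
) -> Optional[List[Dict[str, Any]]]:
    """Return result rows as plain dicts, or None if the wait timed out."""
    try:
        rows = job.result(max_results=max_results, timeout=timeout)
    except concurrent.futures.TimeoutError:
        # The job did not complete within ``timeout`` seconds.
        return None
    # Assumes each row exposes ``items()`` yielding (column, value) pairs.
    return [dict(row.items()) for row in rows]
```

Other failures, such as the GoogleAPICallError raised when the job itself fails, are deliberately left to propagate to the caller.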
running
running()
True if the operation is currently running.
set_exception
set_exception(exception)
Set the Future's exception.
set_result
set_result(result)
Set the Future's result.
to_api_repr
to_api_repr()
Generate a resource for _begin.
to_arrow
to_arrow(
progress_bar_type: str = None,
bqstorage_client: bigquery_storage.BigQueryReadClient = None,
create_bqstorage_client: bool = True,
max_results: Optional[int] = None,
)
[Beta] Create a pyarrow.Table by loading all pages of a table or query.
Name | Description |
progress_bar_type |
Optional[str]
If set, use the |
create_bqstorage_client |
Optional[bool]
If |
max_results |
Optional[int]
Maximum number of rows to include in the result. No limit by default. Added in version 2.21.0. |
bqstorage_client |
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires the |
Type | Description |
ValueError | If the `pyarrow` library cannot be imported. Added in version 1.17.0. |
to_dataframe
to_dataframe(
bqstorage_client: bigquery_storage.BigQueryReadClient = None,
dtypes: Dict[str, Any] = None,
progress_bar_type: str = None,
create_bqstorage_client: bool = True,
date_as_object: bool = True,
max_results: Optional[int] = None,
geography_as_object: bool = False,
)
Return a pandas DataFrame from a QueryJob
Name | Description |
dtypes |
Optional[Map[str, Union[str, pandas.Series.dtype]]]
A dictionary of column names pandas |
progress_bar_type |
Optional[str]
If set, use the |
create_bqstorage_client |
Optional[bool]
If |
date_as_object |
Optional[bool]
If |
max_results |
Optional[int]
Maximum number of rows to include in the result. No limit by default. Added in version 2.21.0. |
geography_as_object |
Optional[bool]
If |
bqstorage_client |
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires the |
Type | Description |
ValueError | If the `pandas` library cannot be imported, or the bigquery_storage_v1 module is required but cannot be imported. Also if `geography_as_object` is `True`, but the `shapely` library cannot be imported. |
Type | Description |
pandas.DataFrame | A `pandas.DataFrame` populated with row data and column headers from the query results. The column headers are derived from the destination table's schema. |
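A sketch of falling back to plain dictionaries when to_dataframe raises ValueError, for example because pandas is not installed. Note that ValueError is also raised for the other import problems listed above, so this fallback treats them all the same way. to_frame_or_rows is a hypothetical helper name, and the rows are assumed to offer a dict-like items() method.

```python
from typing import Optional


def to_frame_or_rows(job, max_results: Optional[int] = None):
    """Return a DataFrame if possible, otherwise a list of row dicts."""
    try:
        return job.to_dataframe(max_results=max_results)
    except ValueError:
        # A required library (such as pandas) could not be imported.
        rows = job.result(max_results=max_results)
        return [dict(row.items()) for row in rows]
```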
to_geodataframe
to_geodataframe(
bqstorage_client: bigquery_storage.BigQueryReadClient = None,
dtypes: Dict[str, Any] = None,
progress_bar_type: str = None,
create_bqstorage_client: bool = True,
date_as_object: bool = True,
max_results: Optional[int] = None,
geography_column: Optional[str] = None,
)
Return a GeoPandas GeoDataFrame from a QueryJob
Name | Description |
dtypes |
Optional[Map[str, Union[str, pandas.Series.dtype]]]
A dictionary of column names pandas |
progress_bar_type |
Optional[str]
If set, use the |
create_bqstorage_client |
Optional[bool]
If |
date_as_object |
Optional[bool]
If |
max_results |
Optional[int]
Maximum number of rows to include in the result. No limit by default. Added in version 2.21.0. |
geography_column |
Optional[str]
If there is more than one GEOGRAPHY column, identifies which one to use to construct a GeoPandas GeoDataFrame. This option can be omitted if there's only one GEOGRAPHY column. |
bqstorage_client |
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires the |
Type | Description |
ValueError | If the `geopandas` library cannot be imported, or the bigquery_storage_v1 module is required but cannot be imported. Added in version 2.24.0. |
Type | Description |
geopandas.GeoDataFrame | A `geopandas.GeoDataFrame` populated with row data and column headers from the query results. The column headers are derived from the destination table's schema. |
__init__
__init__(job_id, query, client, job_config=None)
Initialize self. See help(type(self)) for accurate signature.