Class ExtractJob (2.28.1)

ExtractJob(job_id, source, destination_uris, client, job_config=None)

Asynchronous job: extract data from a table into Cloud Storage.

Parameters

Name	Description
job_id	`str` the job's ID.
source	`Union[ google.cloud.bigquery.table.TableReference, google.cloud.bigquery.model.ModelReference ]` Table or Model from which data is to be loaded or extracted.
destination_uris	`List[str]` URIs describing where the extracted data will be written in Cloud Storage, using the format `gs://<bucket_name>/<object_name_or_glob>`.
client	`google.cloud.bigquery.client.Client` A client which holds credentials and project configuration.
job_config	`Optional[google.cloud.bigquery.job.ExtractJobConfig]` Extra configuration options for the extract job.

Inheritance

builtins.object > google.api_core.future.base.Future > google.api_core.future.polling.PollingFuture > google.cloud.bigquery.job.base._AsyncJob > ExtractJob

Properties

compression

See compression.

created

Datetime at which the job was created.

Returns

Type	Description
Optional[datetime.datetime]	the creation time (None until set from the server).

destination_format

See destination_format.

destination_uri_file_counts

Return file counts from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics4.FIELDS.destination_uri_file_counts

Returns

Type	Description
List[int]	A list of integer counts, each representing the number of files per destination URI or URI pattern specified in the extract configuration. These values will be in the same order as the URIs specified in the 'destinationUris' field. Returns None if job is not yet complete.
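
Because the counts follow the order of the destination URIs, the two properties can be paired up. The sketch below is illustrative only: `files_per_uri` is a hypothetical helper, and a `SimpleNamespace` stand-in with made-up values replaces a real `ExtractJob` so the snippet runs without credentials.

```python
from types import SimpleNamespace
from typing import Dict, Optional


def files_per_uri(job) -> Optional[Dict[str, int]]:
    """Map each destination URI to its file count, or None if counts are unavailable."""
    counts = job.destination_uri_file_counts
    if counts is None:  # documented as None while the job is not yet complete
        return None
    # Counts are documented to be in the same order as the destination URIs.
    return dict(zip(job.destination_uris, counts))


# Stand-in exposing the two documented properties (values are invented).
stub_job = SimpleNamespace(
    destination_uris=["gs://my-bucket/a-*.csv", "gs://my-bucket/b.csv"],
    destination_uri_file_counts=[3, 1],
)
per_uri = files_per_uri(stub_job)
```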

destination_uris

List[str]: URIs describing where the extracted data will be written in Cloud Storage, using the format gs://<bucket_name>/<object_name_or_glob>.
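
To make the URI format concrete, here is a minimal sketch that splits a destination URI into its bucket and object parts. `split_gcs_uri` is a hypothetical helper, not part of the library, and the bucket and object names are invented.

```python
from typing import Tuple


def split_gcs_uri(uri: str) -> Tuple[str, str]:
    """Split 'gs://<bucket_name>/<object_name_or_glob>' into (bucket, object)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a Cloud Storage URI: {uri!r}")
    bucket, sep, object_name = uri[len(prefix):].partition("/")
    if not bucket or not sep or not object_name:
        raise ValueError(
            f"expected gs://<bucket_name>/<object_name_or_glob>: {uri!r}"
        )
    return bucket, object_name


# The object part may contain a glob such as '*'.
example = split_gcs_uri("gs://my-bucket/exports/part-*.csv")
```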

ended

Datetime at which the job finished.

Returns

Type	Description
Optional[datetime.datetime]	the end time (None until set from the server).

error_result

Error information about the job as a whole.

Returns

Type	Description
Optional[Mapping]	the error information (None until set from the server).

errors

Information about individual errors generated by the job.

Returns

Type	Description
Optional[List[Mapping]]	the error information (None until set from the server).
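
A short sketch of how the error mappings might be turned into readable lines. It assumes each mapping carries the `reason` and `message` keys used by the REST API's error structure; `describe_errors` is a hypothetical helper and the stand-in data is invented.

```python
from types import SimpleNamespace
from typing import List


def describe_errors(job) -> List[str]:
    """Return one 'reason: message' line per error; empty if none are set."""
    lines = []
    for err in job.errors or []:  # errors is None until set from the server
        # 'reason' and 'message' are assumed keys; .get() tolerates their absence.
        reason = err.get("reason", "unknown")
        message = err.get("message", "")
        lines.append(f"{reason}: {message}")
    return lines


# Stand-in for a job whose `errors` property has been populated.
failed_stub = SimpleNamespace(
    errors=[{"reason": "notFound", "message": "Not found: Table x"}]
)
error_lines = describe_errors(failed_stub)
```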

etag

ETag for the job resource.

Returns

Type	Description
Optional[str]	the ETag (None until set from the server).

field_delimiter

See field_delimiter.

job_id

str: ID of the job.

job_type

Type of job.

Returns

Type	Description
str	one of 'load', 'copy', 'extract', 'query'.

labels

Dict[str, str]: Labels for the job.

location

str: Location where the job runs.

num_child_jobs

The number of child jobs executed.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.num_child_jobs

parent_job_id

Return the ID of the parent job.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.parent_job_id

Returns

Type	Description
Optional[str]	parent job id.

path

URL path for the job's APIs.

Returns

Type	Description
str	the path based on project and job ID.

print_header

See print_header.

project

Project bound to the job.

Returns

Type	Description
str	the project (derived from the client).

reservation_usage

Job resource usage breakdown by reservation.

Returns

Type	Description
List[google.cloud.bigquery.job.ReservationUsage]	Reservation usage stats. Can be empty if not set from the server.

self_link

URL for the job resource.

Returns

Type	Description
Optional[str]	the URL (None until set from the server).

source

Union[ google.cloud.bigquery.table.TableReference, google.cloud.bigquery.model.ModelReference ]: Table or Model from which data is to be loaded or extracted.

started

Datetime at which the job was started.

Returns

Type	Description
Optional[datetime.datetime]	the start time (None until set from the server).
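
Since `started` and `ended` are both `None` until set from the server, code that computes a run time needs to guard against missing values. A minimal sketch, with a hypothetical helper `run_duration` and a stand-in object carrying invented timestamps:

```python
import datetime
from types import SimpleNamespace
from typing import Optional


def run_duration(job) -> Optional[datetime.timedelta]:
    """Return ended - started, or None if either timestamp is not set yet."""
    if job.started is None or job.ended is None:
        return None
    return job.ended - job.started


utc = datetime.timezone.utc
timed_stub = SimpleNamespace(
    started=datetime.datetime(2021, 10, 1, 12, 0, 0, tzinfo=utc),
    ended=datetime.datetime(2021, 10, 1, 12, 0, 42, tzinfo=utc),
)
duration = run_duration(timed_stub)
```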

state

Status of the job.

Returns

Type	Description
Optional[str]	the state (None until set from the server).
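
A finished job is not necessarily a successful one, so `state` is usually read together with `error_result`. The sketch below assumes the terminal state string is `DONE`, as used in the `done` documentation on this page; `finished_without_error` is a hypothetical helper and the stand-ins are invented.

```python
from types import SimpleNamespace


def finished_without_error(job) -> bool:
    """True only if the job reached DONE and reported no job-level error."""
    return job.state == "DONE" and job.error_result is None


ok_stub = SimpleNamespace(state="DONE", error_result=None)
failed_stub = SimpleNamespace(
    state="DONE", error_result={"reason": "invalid", "message": "bad URI"}
)
running_stub = SimpleNamespace(state="RUNNING", error_result=None)

checks = [
    finished_without_error(ok_stub),
    finished_without_error(failed_stub),
    finished_without_error(running_stub),
]
```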

transaction_info

Information of the multi-statement transaction if this job is part of one.

Since a scripting query job can execute multiple transactions, this property is only expected on child jobs. Use the list_jobs method with the parent_job parameter to iterate over child jobs.

New in version 2.24.0.
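
The child-job iteration described above could look like the following sketch. It assumes `transaction_info` is `None` for jobs that are not part of a transaction; `transaction_info_by_child` is a hypothetical helper, and a stand-in client with invented jobs replaces a real `Client` so the snippet runs offline.

```python
from types import SimpleNamespace
from typing import Any, Dict


def transaction_info_by_child(client, parent_job_id: str) -> Dict[str, Any]:
    """Collect transaction_info for each child job of the given parent job."""
    infos = {}
    # list_jobs with parent_job yields the child jobs of that parent.
    for child in client.list_jobs(parent_job=parent_job_id):
        if child.transaction_info is not None:
            infos[child.job_id] = child.transaction_info
    return infos


# Stand-in client returning two invented child jobs.
stub_children = [
    SimpleNamespace(job_id="child-1", transaction_info="txn-a"),
    SimpleNamespace(job_id="child-2", transaction_info=None),
]
stub_client = SimpleNamespace(list_jobs=lambda parent_job: stub_children)
found = transaction_info_by_child(stub_client, "parent-job")
```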

user_email

E-mail address of user who submitted the job.

Returns

Type	Description
Optional[str]	the e-mail address (None until set from the server).

script_statistics

API documentation for bigquery.job.ExtractJob.script_statistics property.

Methods

add_done_callback

add_done_callback(fn)

Add a callback to be executed when the operation is complete.

If the operation is not already complete, this will start a helper thread to poll for the status of the operation in the background.

Parameter

Name	Description
fn	`Callable[Future]` The callback to execute when the operation is complete.
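
As a rough sketch of a suitable callback: the function receives the completed job (the future) as its only argument and can read properties such as `job_id` and `state` from it. `make_logging_callback` is a hypothetical helper; the callback is invoked directly on a stand-in here only to show what it records.

```python
from types import SimpleNamespace


def make_logging_callback(log: list):
    """Return a callback that appends (job_id, state) when the job completes."""

    def _on_done(future):
        # `future` is the job the callback was registered on.
        log.append((future.job_id, future.state))

    return _on_done


completed = []
callback = make_logging_callback(completed)

# With a real job this would be: job.add_done_callback(callback)
callback(SimpleNamespace(job_id="job-123", state="DONE"))
```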

cancel

cancel(client=None, retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None)

API call: cancel job via a POST request

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/cancel

Parameters

Name	Description
timeout	`Optional[float]` The number of seconds to wait for the underlying HTTP transport before using `retry`.
client	`Optional[google.cloud.bigquery.client.Client]` the client to use. If not passed, falls back to the `client` stored on the current job.
retry	`Optional[google.api_core.retry.Retry]` How to retry the RPC.

Returns

Type	Description
bool	Boolean indicating that the cancel request was sent.

cancelled

cancelled()

Check if the job has been cancelled.

This always returns False. It's not possible to check if a job was cancelled in the API. This method is here to satisfy the interface for google.api_core.future.Future.

Returns

Type	Description
bool	False

done

done(retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None, reload: bool = True)

Checks if the job is complete.

Parameters

Name	Description
timeout	`Optional[float]` The number of seconds to wait for the underlying HTTP transport before using `retry`.
reload	`Optional[bool]` If `True`, make an API call to refresh the job state of unfinished jobs before checking. Default `True`.
retry	`Optional[google.api_core.retry.Retry]` How to retry the RPC. If the job state is `DONE`, retrying is aborted early, as the job will not change anymore.

Returns

Type	Description
bool	True if the job is complete, False otherwise.

exception

exception(timeout=None)

Get the exception from the operation, blocking if necessary.

Parameter

Name	Description
timeout	`int` How long to wait for the operation to complete. If None, wait indefinitely.

Returns

Type	Description
Optional[google.api_core.GoogleAPICallError]	The operation's error.

exists

exists(client=None, retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None)

API call: test for the existence of the job via a GET request

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get

Parameters

Name	Description
timeout	`Optional[float]` The number of seconds to wait for the underlying HTTP transport before using `retry`.
client	`Optional[google.cloud.bigquery.client.Client]` the client to use. If not passed, falls back to the `client` stored on the current job.
retry	`Optional[google.api_core.retry.Retry]` How to retry the RPC.

Returns

Type	Description
bool	Boolean indicating existence of the job.

from_api_repr

from_api_repr(resource: dict, client)

Factory: construct a job given its API representation

Parameters

Name	Description
resource	`Dict` job representation returned from the API.
client	`google.cloud.bigquery.client.Client` Client which holds credentials and project configuration for the job.

Returns

Type	Description
google.cloud.bigquery.job.ExtractJob	Job parsed from `resource`.
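
For orientation, the sketch below builds a resource dictionary in the shape the REST API uses for an extract job (`jobReference` plus `configuration.extract`). The field names are taken from the REST Job resource rather than from this page, `extract_job_resource` is a hypothetical helper, and all identifiers are invented.

```python
from typing import Any, Dict, List


def extract_job_resource(
    project: str,
    job_id: str,
    dataset_id: str,
    table_id: str,
    destination_uris: List[str],
) -> Dict[str, Any]:
    """Build a minimal extract-job resource in REST API form."""
    return {
        "jobReference": {"projectId": project, "jobId": job_id},
        "configuration": {
            "extract": {
                "sourceTable": {
                    "projectId": project,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
                "destinationUris": list(destination_uris),
            }
        },
    }


resource = extract_job_resource(
    "my-project", "job-123", "my_dataset", "my_table",
    ["gs://my-bucket/out-*.csv"],
)
# With the library installed and a configured client:
#   job = ExtractJob.from_api_repr(resource, client)
```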

reload

reload(client=None, retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None)

API call: refresh job properties via a GET request.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get

Parameters

Name	Description
timeout	`Optional[float]` The number of seconds to wait for the underlying HTTP transport before using `retry`.
client	`Optional[google.cloud.bigquery.client.Client]` the client to use. If not passed, falls back to the `client` stored on the current job.
retry	`Optional[google.api_core.retry.Retry]` How to retry the RPC.

result

result(retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None)

Start the job and wait for it to complete and get the result.

Parameters

Name	Description
timeout	`Optional[float]` The number of seconds to wait for the underlying HTTP transport before using `retry`. If multiple requests are made under the hood, `timeout` applies to each individual request.
retry	`Optional[google.api_core.retry.Retry]` How to retry the RPC. If the job state is `DONE`, retrying is aborted early, as the job will not change anymore.

Exceptions

Type	Description
google.cloud.exceptions.GoogleAPICallError	if the job failed.
concurrent.futures.TimeoutError	if the job did not complete in the given timeout.

Returns

Type	Description
_AsyncJob	This instance.

running

running()

True if the operation is currently running.

set_exception

set_exception(exception)

Set the Future's exception.

set_result

set_result(result)

Set the Future's result.

to_api_repr

to_api_repr()

Generate a resource for _begin.

__init__

__init__(job_id, source, destination_uris, client, job_config=None)

Initialize self. See help(type(self)) for accurate signature.
