Class Client

Client to bundle configuration needed for API requests.

Inheritance

builtins.object > google.cloud.client._ClientFactoryMixin > google.cloud.client.Client > builtins.object > google.cloud.client._ClientProjectMixin > google.cloud.client.ClientWithProject > Client

Properties

location

Default location for jobs / datasets / tables.

Methods

__getstate__

__getstate__()

Explicitly state that clients are not pickleable.

cancel_job

cancel_job(job_id: str, project: Optional[str] = None, location: Optional[str] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)
Parameters
Name Description
job_id Union[ str, google.cloud.bigquery.job.LoadJob, google.cloud.bigquery.job.CopyJob, google.cloud.bigquery.job.ExtractJob, google.cloud.bigquery.job.QueryJob ]

Job identifier.

project Optional[str]

ID of the project which owns the job (defaults to the client's project).

location Optional[str]

Location where the job was run. Ignored if job_id is a job object.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Returns
Type Description
Union[ google.cloud.bigquery.job.LoadJob, google.cloud.bigquery.job.CopyJob, google.cloud.bigquery.job.ExtractJob, google.cloud.bigquery.job.QueryJob, ]

Job instance, based on the resource returned by the API.
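
A minimal sketch of calling cancel_job, assuming `client` is an already constructed bigquery.Client. The helper name `cancel_and_report` is hypothetical; `job_id` and `state` are attributes read from the returned job object.

```python
def cancel_and_report(client, job_id, location=None):
    """Request cancellation of a job and return its (job_id, state) pair."""
    # cancel_job accepts a job ID string or a job object; `location` is
    # ignored when a job object is passed.
    job = client.cancel_job(job_id, location=location)
    # The returned job is built from the resource sent back by the API, so
    # its state may not yet be final when this function returns.
    return job.job_id, job.state


# Usage (requires google-cloud-bigquery and valid credentials):
#   from google.cloud import bigquery
#   print(cancel_and_report(bigquery.Client(), "my-job-id", location="US"))
```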

close

close()

Close the underlying transport objects, releasing system resources.

Note: The client instance can be used for making additional requests even after closing, in which case the underlying connections are automatically re-created.

copy_table

copy_table(sources: Union[google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str, Sequence[Union[google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str]]], destination: Union[google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str], job_id: Optional[str] = None, job_id_prefix: Optional[str] = None, location: Optional[str] = None, project: Optional[str] = None, job_config: Optional[google.cloud.bigquery.job.copy_.CopyJobConfig] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)
Parameters
Name Description
sources Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str, Sequence[ Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str, ] ], ]

Table or tables to be copied.

destination Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str, ]

Table into which data is to be copied.

job_id Optional[str]

The ID of the job.

job_id_prefix Optional[str]

The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a job_id is also given.

location Optional[str]

Location where to run the job. Must match the location of any source table as well as the destination table.

project Optional[str]

ID of the project in which to run the job. Defaults to the client's project.

job_config Optional[google.cloud.bigquery.job.CopyJobConfig]

Extra configuration options for the job.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Exceptions
Type Description
TypeError

If ``job_config`` is not an instance of the CopyJobConfig class.

Returns
Type Description
google.cloud.bigquery.job.CopyJob

A new copy job instance.

create_dataset

create_dataset(dataset: Union[str, google.cloud.bigquery.dataset.Dataset, google.cloud.bigquery.dataset.DatasetReference, google.cloud.bigquery.dataset.DatasetListItem], exists_ok: bool = False, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

API call: create the dataset via a POST request.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert

Parameters
Name Description
dataset Union[ google.cloud.bigquery.dataset.Dataset, google.cloud.bigquery.dataset.DatasetReference, google.cloud.bigquery.dataset.DatasetListItem, str, ]

A Dataset to create. If dataset is a reference, an empty dataset is created with the specified ID and client's default location.

exists_ok Optional[bool]

Defaults to False. If True, ignore "already exists" errors when creating the dataset.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Exceptions
Type Description
google.cloud.exceptions.Conflict

If the dataset already exists.

Returns
Type Description
google.cloud.bigquery.dataset.Dataset

A new ``Dataset`` returned from the API.

Example

>>> from google.cloud import bigquery
>>> client = bigquery.Client()
>>> dataset = bigquery.Dataset('my_project.my_dataset')
>>> dataset = client.create_dataset(dataset)
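
The exists_ok flag can make dataset creation safe to repeat. The sketch below assumes `client` is a bigquery.Client; the helper name `ensure_dataset` and the 30-second default timeout are illustrative choices, not part of the library.

```python
def ensure_dataset(client, dataset_id, timeout=30.0):
    """Create the dataset if needed and return the Dataset from the API."""
    # exists_ok=True ignores "already exists" errors, so calling this twice
    # does not raise google.cloud.exceptions.Conflict.
    return client.create_dataset(dataset_id, exists_ok=True, timeout=timeout)


# Usage (requires google-cloud-bigquery and valid credentials):
#   from google.cloud import bigquery
#   dataset = ensure_dataset(bigquery.Client(), "my_project.my_dataset")
```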

create_job

create_job(job_config: dict, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

Create a new job.

Parameters
Name Description
job_config dict

Job configuration, in the representation returned from the API.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Returns
Type Description
Union[ google.cloud.bigquery.job.LoadJob, google.cloud.bigquery.job.CopyJob, google.cloud.bigquery.job.ExtractJob, google.cloud.bigquery.job.QueryJob ]

A new job instance.

create_routine

create_routine(routine: google.cloud.bigquery.routine.routine.Routine, exists_ok: bool = False, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

[Beta] Create a routine via a POST request.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/routines/insert

Parameters
Name Description
routine google.cloud.bigquery.routine.Routine

A Routine to create. The dataset that the routine belongs to must already exist.

exists_ok Optional[bool]

Defaults to False. If True, ignore "already exists" errors when creating the routine.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Exceptions
Type Description
google.cloud.exceptions.Conflict

If the routine already exists.

Returns
Type Description
google.cloud.bigquery.routine.Routine

A new ``Routine`` returned from the service.

create_table

create_table(table: Union[str, google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem], exists_ok: bool = False, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

API call: create a table via a POST request.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/insert

Parameters
Name Description
table Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str, ]

A Table to create. If table is a reference, an empty table is created with the specified ID. The dataset that the table belongs to must already exist.

exists_ok Optional[bool]

Defaults to False. If True, ignore "already exists" errors when creating the table.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Exceptions
Type Description
google.cloud.exceptions.Conflict

If the table already exists.

Returns
Type Description
google.cloud.bigquery.table.Table

A new ``Table`` returned from the service.

dataset

dataset(dataset_id: str, project: Optional[str] = None)

Deprecated: Construct a reference to a dataset.

Deprecated since version 1.24.0: Construct a DatasetReference using its constructor, or use a string where previously a reference object was used.

As of google-cloud-bigquery version 1.7.0, all client methods that take a DatasetReference or TableReference also take a string in standard SQL format, e.g. project.dataset_id or project.dataset_id.table_id.

Parameters
Name Description
dataset_id str

ID of the dataset.

project Optional[str]

Project ID for the dataset (defaults to the project of the client).

Returns
Type Description
google.cloud.bigquery.dataset.DatasetReference

A new ``DatasetReference`` instance.
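
Since string IDs are accepted wherever a reference was, code that used this deprecated method can usually pass a string instead. A minimal sketch, assuming `client` is a bigquery.Client; the helper name `fetch_dataset` is hypothetical.

```python
def fetch_dataset(client, dataset_id, project=None):
    """Fetch a dataset using a "project.dataset_id" string, not client.dataset()."""
    # Fall back to the client's own project, mirroring the default of the
    # deprecated helper.
    project = project or client.project
    return client.get_dataset(f"{project}.{dataset_id}")


# Usage (requires google-cloud-bigquery and valid credentials):
#   from google.cloud import bigquery
#   dataset = fetch_dataset(bigquery.Client(), "my_dataset")
```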

delete_dataset

delete_dataset(dataset: Union[google.cloud.bigquery.dataset.Dataset, google.cloud.bigquery.dataset.DatasetReference, google.cloud.bigquery.dataset.DatasetListItem, str], delete_contents: bool = False, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, not_found_ok: bool = False)
Parameters
Name Description
dataset Union[ google.cloud.bigquery.dataset.Dataset, google.cloud.bigquery.dataset.DatasetReference, google.cloud.bigquery.dataset.DatasetListItem, str, ]

A reference to the dataset to delete. If a string is passed in, this method attempts to create a dataset reference from a string using from_string.

delete_contents Optional[bool]

If True, delete all the tables in the dataset. If False and the dataset contains tables, the request will fail. Default is False.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

not_found_ok Optional[bool]

Defaults to False. If True, ignore "not found" errors when deleting the dataset.
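
The two flags above combine naturally in cleanup code. The following sketch assumes `client` is a bigquery.Client; `drop_dataset` is a hypothetical helper name.

```python
def drop_dataset(client, dataset_id):
    """Delete a dataset and its tables, tolerating an already-missing dataset."""
    client.delete_dataset(
        dataset_id,
        delete_contents=True,  # also delete the tables the dataset contains
        not_found_ok=True,     # ignore "not found" errors
    )


# Usage (requires google-cloud-bigquery and valid credentials):
#   from google.cloud import bigquery
#   drop_dataset(bigquery.Client(), "my_project.scratch_dataset")
```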

delete_job_metadata

delete_job_metadata(job_id: Union[str, google.cloud.bigquery.job.load.LoadJob, google.cloud.bigquery.job.copy_.CopyJob, google.cloud.bigquery.job.extract.ExtractJob, google.cloud.bigquery.job.query.QueryJob], project: Optional[str] = None, location: Optional[str] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, not_found_ok: bool = False)

[Beta] Delete job metadata from job history.

Note: This does not stop a running job. Use cancel_job instead.

Parameters
Name Description
job_id Union[ str, google.cloud.bigquery.job.load.LoadJob, google.cloud.bigquery.job.copy_.CopyJob, google.cloud.bigquery.job.extract.ExtractJob, google.cloud.bigquery.job.query.QueryJob, ]

Job or job identifier.

project Optional[str]

ID of the project which owns the job (defaults to the client's project).

location Optional[str]

Location where the job was run. Ignored if job_id is a job object.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

not_found_ok Optional[bool]

Defaults to False. If True, ignore "not found" errors when deleting the job.

delete_model

delete_model(model: Union[google.cloud.bigquery.model.Model, google.cloud.bigquery.model.ModelReference, str], retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, not_found_ok: bool = False)
Parameters
Name Description
model Union[ google.cloud.bigquery.model.Model, google.cloud.bigquery.model.ModelReference, str, ]

A reference to the model to delete. If a string is passed in, this method attempts to create a model reference from a string using from_string.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

not_found_ok Optional[bool]

Defaults to False. If True, ignore "not found" errors when deleting the model.

delete_routine

delete_routine(routine: Union[google.cloud.bigquery.routine.routine.Routine, google.cloud.bigquery.routine.routine.RoutineReference, str], retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, not_found_ok: bool = False)
Parameters
Name Description
routine Union[ google.cloud.bigquery.routine.Routine, google.cloud.bigquery.routine.RoutineReference, str, ]

A reference to the routine to delete. If a string is passed in, this method attempts to create a routine reference from a string using from_string.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

not_found_ok Optional[bool]

Defaults to False. If True, ignore "not found" errors when deleting the routine.

delete_table

delete_table(table: Union[google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str], retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, not_found_ok: bool = False)
Parameters
Name Description
table Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str, ]

A reference to the table to delete. If a string is passed in, this method attempts to create a table reference from a string using from_string.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

not_found_ok Optional[bool]

Defaults to False. If True, ignore "not found" errors when deleting the table.

extract_table

extract_table(source: Union[google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, google.cloud.bigquery.model.Model, google.cloud.bigquery.model.ModelReference, str], destination_uris: Union[str, Sequence[str]], job_id: Optional[str] = None, job_id_prefix: Optional[str] = None, location: Optional[str] = None, project: Optional[str] = None, job_config: Optional[google.cloud.bigquery.job.extract.ExtractJobConfig] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, source_type: str = 'Table')

Start a job to extract a table into Cloud Storage files.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#jobconfigurationextract

Parameters
Name Description
source Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, google.cloud.bigquery.model.Model, google.cloud.bigquery.model.ModelReference, str, ]

Table or Model to be extracted.

destination_uris Union[str, Sequence[str]]

URIs of Cloud Storage file(s) into which table data is to be extracted; in format gs://<bucket_name>/<object_name_or_glob>.

job_id Optional[str]

The ID of the job.

job_id_prefix Optional[str]

The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a job_id is also given.

location Optional[str]

Location where to run the job. Must match the location of the source table.

project Optional[str]

ID of the project in which to run the job. Defaults to the client's project.

job_config Optional[google.cloud.bigquery.job.ExtractJobConfig]

Extra configuration options for the job.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

source_type Optional[str]

Type of source to be extracted: Table or Model. Defaults to Table.

Exceptions
Type Description
TypeError

If ``job_config`` is not an instance of the ExtractJobConfig class.

ValueError

If ``source_type`` is neither ``Table`` nor ``Model``.

Returns
Type Description
google.cloud.bigquery.job.ExtractJob

A new extract job instance.
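
As an illustration of the destination URI format, the sketch below builds a gs:// glob and starts an extract job. It assumes `client` is a bigquery.Client and that the export is written as CSV (hence the `.csv` suffix); `export_table_to_gcs`, `bucket_name`, and `prefix` are hypothetical names.

```python
def export_table_to_gcs(client, table_id, bucket_name, prefix, location=None):
    """Start an extract job for `table_id` and wait for it to finish."""
    # Destination URIs use the gs://<bucket_name>/<object_name_or_glob> form;
    # the "*" glob allows the export to be written to more than one file.
    destination_uri = f"gs://{bucket_name}/{prefix}-*.csv"
    extract_job = client.extract_table(
        table_id,
        destination_uri,
        location=location,  # must match the location of the source table
    )
    extract_job.result()  # block until the job completes; raises on failure
    return destination_uri


# Usage (requires google-cloud-bigquery and valid credentials):
#   from google.cloud import bigquery
#   export_table_to_gcs(bigquery.Client(), "proj.ds.tbl", "my-bucket", "exports/tbl")
```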

from_service_account_info

from_service_account_info(info, *args, **kwargs)

Factory to retrieve JSON credentials while creating client.

Parameters
Name Description
args tuple

Remaining positional arguments to pass to constructor.

kwargs

Remaining keyword arguments to pass to constructor.

info dict

The JSON object with a private key and other credentials information (downloaded from the Google APIs console).

Exceptions
Type Description
TypeError

If there is a conflict between the kwargs and the credentials created by the factory.

Returns
Type Description
`_ClientFactoryMixin`

The client created with the retrieved JSON credentials.

from_service_account_json

from_service_account_json(json_credentials_path, *args, **kwargs)

Factory to retrieve JSON credentials while creating client.

Parameters
Name Description
args tuple

Remaining positional arguments to pass to constructor.

kwargs

Remaining keyword arguments to pass to constructor.

json_credentials_path str

The path to a private key file (this file was given to you when you created the service account). This file must contain a JSON object with a private key and other credentials information (downloaded from the Google APIs console).

Exceptions
Type Description
TypeError

If there is a conflict between the kwargs and the credentials created by the factory.

Returns
Type Description
`_ClientFactoryMixin`

The client created with the retrieved JSON credentials.

get_dataset

get_dataset(dataset_ref: Union[google.cloud.bigquery.dataset.DatasetReference, str], retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

Fetch the dataset referenced by dataset_ref.

Parameters
Name Description
dataset_ref Union[ google.cloud.bigquery.dataset.DatasetReference, str, ]

A reference to the dataset to fetch from the BigQuery API. If a string is passed in, this method attempts to create a dataset reference from a string using from_string.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Returns
Type Description
google.cloud.bigquery.dataset.Dataset

A ``Dataset`` instance.

get_job

get_job(job_id: Union[str, google.cloud.bigquery.job.load.LoadJob, google.cloud.bigquery.job.copy_.CopyJob, google.cloud.bigquery.job.extract.ExtractJob, google.cloud.bigquery.job.query.QueryJob], project: Optional[str] = None, location: Optional[str] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

Fetch a job for the project associated with this client.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get

Parameters
Name Description
job_id Union[ str, google.cloud.bigquery.job.load.LoadJob, google.cloud.bigquery.job.copy_.CopyJob, google.cloud.bigquery.job.extract.ExtractJob, google.cloud.bigquery.job.query.QueryJob, ]

Job identifier.

project Optional[str]

ID of the project which owns the job (defaults to the client's project).

location Optional[str]

Location where the job was run. Ignored if job_id is a job object.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

get_model

get_model(model_ref: Union[google.cloud.bigquery.model.ModelReference, str], retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

[Beta] Fetch the model referenced by model_ref.

Parameters
Name Description
model_ref Union[ google.cloud.bigquery.model.ModelReference, str, ]

A reference to the model to fetch from the BigQuery API. If a string is passed in, this method attempts to create a model reference from a string using from_string.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Returns
Type Description
google.cloud.bigquery.model.Model

A ``Model`` instance.

get_routine

get_routine(routine_ref: Union[google.cloud.bigquery.routine.routine.Routine, google.cloud.bigquery.routine.routine.RoutineReference, str], retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

[Beta] Get the routine referenced by routine_ref.

Parameters
Name Description
routine_ref Union[ google.cloud.bigquery.routine.Routine, google.cloud.bigquery.routine.RoutineReference, str, ]

A reference to the routine to fetch from the BigQuery API. If a string is passed in, this method attempts to create a reference from a string using from_string.

retry Optional[google.api_core.retry.Retry]

How to retry the API call.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Returns
Type Description
google.cloud.bigquery.routine.Routine

A ``Routine`` instance.

get_service_account_email

get_service_account_email(project: Optional[str] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

Get the email address of the project's BigQuery service account.

Note: This is the service account that BigQuery uses to manage tables encrypted by a key in KMS.

Parameters
Name Description
project Optional[str]

Project ID to use for retrieving the service account email. Defaults to the client's project.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Returns
Type Description
str

Service account email address.

Example

>>> from google.cloud import bigquery
>>> client = bigquery.Client()
>>> client.get_service_account_email()
my_service_account@my-project.iam.gserviceaccount.com

get_table

get_table(table: Union[google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str], retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

Fetch the table referenced by table.

Parameters
Name Description
table Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str, ]

A reference to the table to fetch from the BigQuery API. If a string is passed in, this method attempts to create a table reference from a string using from_string.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Returns
Type Description
google.cloud.bigquery.table.Table

A ``Table`` instance.
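
A short sketch of inspecting the returned Table, assuming `client` is a bigquery.Client. The helper name `describe_table` is hypothetical; `num_rows` and `schema` are read from the Table, and `name` and `field_type` from each schema field.

```python
def describe_table(client, table_id):
    """Return the row count and (column name, type) pairs for a table."""
    # A "project.dataset_id.table_id" string is accepted in place of a
    # Table or TableReference object.
    table = client.get_table(table_id)
    columns = [(field.name, field.field_type) for field in table.schema]
    return {"rows": table.num_rows, "columns": columns}


# Usage (requires google-cloud-bigquery and valid credentials):
#   from google.cloud import bigquery
#   print(describe_table(bigquery.Client(), "my_project.my_dataset.my_table"))
```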

insert_rows

insert_rows(
    table: Union[
        google.cloud.bigquery.table.Table,
        google.cloud.bigquery.table.TableReference,
        str,
    ],
    rows: Union[Iterable[Tuple], Iterable[Dict]],
    selected_fields: Optional[
        Sequence[google.cloud.bigquery.schema.SchemaField]
    ] = None,
    **kwargs
)

Insert rows into a table via the streaming API.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll

Parameters
Name Description
table Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, str, ]

The destination table for the row data, or a reference to it.

rows Union[Sequence[Tuple], Sequence[Dict]]

Row data to be inserted. If a list of tuples is given, each tuple should contain data for each schema field on the current table and in the same order as the schema fields. If a list of dictionaries is given, the keys must include all required fields in the schema. Keys which do not correspond to a field in the schema are ignored.

selected_fields Sequence[google.cloud.bigquery.schema.SchemaField]

The fields to return. Required if table is a TableReference.

kwargs dict

Keyword arguments to insert_rows_json.

Exceptions
Type Description
ValueError

If the table's schema is not set or `rows` is not a `Sequence`.

Returns
Type Description
Sequence[Mappings]

One mapping per row with insert errors: the "index" key identifies the row, and the "errors" key contains a list of the mappings describing one or more problems with the row.

insert_rows_from_dataframe

insert_rows_from_dataframe(
    table: Union[
        google.cloud.bigquery.table.Table,
        google.cloud.bigquery.table.TableReference,
        str,
    ],
    dataframe,
    selected_fields: Optional[
        Sequence[google.cloud.bigquery.schema.SchemaField]
    ] = None,
    chunk_size: int = 500,
    **kwargs: Dict
)

Insert rows into a table from a dataframe via the streaming API.

Parameters
Name Description
table Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, str, ]

The destination table for the row data, or a reference to it.

selected_fields Sequence[google.cloud.bigquery.schema.SchemaField]

The fields to return. Required if table is a TableReference.

chunk_size int

The number of rows to stream in a single chunk. Must be positive.

kwargs Dict

Keyword arguments to insert_rows_json.

dataframe pandas.DataFrame

A pandas.DataFrame containing the data to load. Any NaN values present in the dataframe are omitted from the streaming API request(s).

Exceptions
Type Description
ValueError

If the table's schema is not set.

Returns
Type Description
Sequence[Sequence[Mappings]]

A list with insert errors for each insert chunk. Each element is a list containing one mapping per row with insert errors: the "index" key identifies the row, and the "errors" key contains a list of the mappings describing one or more problems with the row.
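
To make the chunked return value concrete, here is a plain-Python sketch of the idea. It does not reproduce the library's implementation: `split_into_chunks` and `absolute_error_rows` are hypothetical helpers, and the second one assumes that each "index" counts rows within its own chunk.

```python
def split_into_chunks(rows, chunk_size=500):
    """Split a list of rows into consecutive chunks of at most `chunk_size`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [rows[start:start + chunk_size] for start in range(0, len(rows), chunk_size)]


def absolute_error_rows(chunk_errors, chunk_size=500):
    """Return (row_number, problems) pairs numbered across all chunks."""
    failures = []
    for chunk_number, errors in enumerate(chunk_errors):
        for entry in errors:
            # Assumption: "index" is relative to the chunk that was sent.
            row_number = chunk_number * chunk_size + entry["index"]
            failures.append((row_number, entry["errors"]))
    return failures
```

An empty inner list means that chunk was inserted without errors, so a fully successful call yields only empty lists.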

insert_rows_json

insert_rows_json(table: Union[google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str], json_rows: Sequence[Dict], row_ids: Optional[Union[Iterable[str], google.cloud.bigquery.enums.AutoRowIDs]] = <AutoRowIDs.GENERATE_UUID: 2>, skip_invalid_rows: Optional[bool] = None, ignore_unknown_values: Optional[bool] = None, template_suffix: Optional[str] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

Insert rows into a table without applying local type conversions.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll

Parameters
Name Description
table Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str, ]

The destination table for the row data, or a reference to it.

json_rows Sequence[Dict]

Row data to be inserted. Keys must match the table schema fields and values must be JSON-compatible representations.

row_ids Union[Iterable[str], AutoRowIDs, None]

Unique IDs, one per row being inserted. An ID can also be None, indicating that an explicit insert ID should not be used for that row. If the argument is omitted altogether, unique IDs are created automatically.

Changed in version 2.21.0: Can also be an iterable, not just a sequence, or an AutoRowIDs enum member.

Deprecated since version 2.21.0: Passing None to explicitly request autogenerating insert IDs is deprecated; use AutoRowIDs.GENERATE_UUID instead.

skip_invalid_rows Optional[bool]

Insert all valid rows of a request, even if invalid rows exist. The default value is False, which causes the entire request to fail if any invalid rows exist.

ignore_unknown_values Optional[bool]

Accept rows that contain values that do not match the schema. The unknown values are ignored. Default is False, which treats unknown values as errors.

template_suffix Optional[str]

Treat name as a template table and provide a suffix. BigQuery will create the table <name> + <template_suffix> based on the schema of the template table. See https://cloud.google.com/bigquery/streaming-data-into-bigquery#template-tables

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Exceptions
Type Description
TypeError

If `json_rows` is not a `Sequence`.

Returns
Type Description
Sequence[Mappings]

One mapping per row with insert errors: the "index" key identifies the row, and the "errors" key contains a list of the mappings describing one or more problems with the row.
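
The returned error list can be condensed for logging. The sketch below is plain Python over the structure described above; `summarize_insert_errors` is a hypothetical helper, and the "message" and "reason" keys inside each problem mapping are assumptions about the API's error payload.

```python
def summarize_insert_errors(insert_errors):
    """Map each failed row index to a list of human-readable problems."""
    summary = {}
    for entry in insert_errors:
        # Assumed keys: prefer "message", fall back to "reason".
        messages = [
            problem.get("message") or problem.get("reason", "unknown error")
            for problem in entry.get("errors", [])
        ]
        summary[entry["index"]] = messages
    return summary


# Usage (requires google-cloud-bigquery and valid credentials):
#   errors = client.insert_rows_json("proj.ds.tbl", [{"id": 1}])
#   if errors:
#       print(summarize_insert_errors(errors))
```

An empty list from insert_rows_json means every row was accepted, so the summary is an empty dict in that case.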

job_from_resource

job_from_resource(resource: dict)

Detect correct job type from resource and instantiate.

Parameter
Name Description
resource Dict

One job resource from an API response.

list_datasets

list_datasets(project: Optional[str] = None, include_all: bool = False, filter: Optional[str] = None, max_results: Optional[int] = None, page_token: Optional[str] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, page_size: Optional[int] = None)

List datasets for the project associated with this client.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/list

Parameters
Name Description
project Optional[str]

Project ID to use for retrieving datasets. Defaults to the client's project.

include_all Optional[bool]

True if results include hidden datasets. Defaults to False.

filter Optional[str]

An expression for filtering the results by label. For syntax, see https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/list#body.QUERY_PARAMETERS.filter

max_results Optional[int]

Maximum number of datasets to return.

page_token Optional[str]

Token representing a cursor into the datasets. If not passed, the API will return the first page of datasets. The token marks the beginning of the iterator to be returned and the value of the page_token can be accessed at next_page_token of the google.api_core.page_iterator.HTTPIterator.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

page_size Optional[int]

Maximum number of datasets to return per page.

Returns
Type Description
google.api_core.page_iterator.Iterator

Iterator of DatasetListItem instances associated with the project.
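
A minimal sketch of consuming the returned iterator, assuming `client` is a bigquery.Client. The helper name `list_dataset_ids` is hypothetical; `dataset_id` is read from each DatasetListItem.

```python
def list_dataset_ids(client, project=None, page_size=None):
    """Collect the IDs of the datasets listed for a project."""
    iterator = client.list_datasets(project=project, page_size=page_size)
    # Iterating the returned iterator walks through every page of results.
    return [item.dataset_id for item in iterator]


# Usage (requires google-cloud-bigquery and valid credentials):
#   from google.cloud import bigquery
#   print(list_dataset_ids(bigquery.Client()))
```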

list_jobs

list_jobs(project: Optional[str] = None, parent_job: Optional[Union[google.cloud.bigquery.job.query.QueryJob, str]] = None, max_results: Optional[int] = None, page_token: Optional[str] = None, all_users: Optional[bool] = None, state_filter: Optional[str] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, min_creation_time: Optional[datetime.datetime] = None, max_creation_time: Optional[datetime.datetime] = None, page_size: Optional[int] = None)

List jobs for the project associated with this client.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/list

Parameters
Name Description
project Optional[str]

Project ID to use for retrieving jobs. Defaults to the client's project.

parent_job Optional[Union[ google.cloud.bigquery.job._AsyncJob, str, ]]

If set, retrieve only child jobs of the specified parent.

max_results Optional[int]

Maximum number of jobs to return.

page_token Optional[str]

Opaque marker for the next "page" of jobs. If not passed, the API will return the first page of jobs. The token marks the beginning of the iterator to be returned and the value of the page_token can be accessed at next_page_token of google.api_core.page_iterator.HTTPIterator.

all_users Optional[bool]

If true, include jobs owned by all users in the project. Defaults to False.

state_filter Optional[str]

If set, include only jobs matching the given state. One of "done", "pending", or "running".

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

min_creation_time Optional[datetime.datetime]

Minimum value for job creation time. If set, only jobs created at or after this timestamp are returned. If the datetime has no time zone, UTC is assumed.

max_creation_time Optional[datetime.datetime]

Maximum value for job creation time. If set, only jobs created at or before this timestamp are returned. If the datetime has no time zone, UTC is assumed.

page_size Optional[int]

Maximum number of jobs to return per page.

Returns
TypeDescription
google.api_core.page_iterator.Iterator Iterable of job instances.
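
The time-window and state parameters can be combined. The following is a minimal sketch, not a definitive recipe: recent_finished_jobs is a hypothetical helper, client is assumed to be an existing Client, and the cutoff is built as a timezone-aware UTC datetime so that no time zone has to be assumed.

```python
import datetime


def recent_finished_jobs(client, hours=24, limit=100):
    """List (job_id, state) for finished jobs created in the last `hours` hours."""
    now = datetime.datetime.now(datetime.timezone.utc)
    cutoff = now - datetime.timedelta(hours=hours)
    jobs = client.list_jobs(
        state_filter="done",        # one of "done", "pending", "running"
        min_creation_time=cutoff,   # jobs created at or after this time
        max_results=limit,
    )
    return [(job.job_id, job.state) for job in jobs]
```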

list_models

list_models(dataset: Union[google.cloud.bigquery.dataset.Dataset, google.cloud.bigquery.dataset.DatasetReference, google.cloud.bigquery.dataset.DatasetListItem, str], max_results: Optional[int] = None, page_token: Optional[str] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, page_size: Optional[int] = None)
Parameters
NameDescription
dataset Union[ google.cloud.bigquery.dataset.Dataset, google.cloud.bigquery.dataset.DatasetReference, google.cloud.bigquery.dataset.DatasetListItem, str, ]

A reference to the dataset whose models to list from the BigQuery API. If a string is passed in, this method attempts to create a dataset reference from a string using from_string.

max_results Optional[int]

Maximum number of models to return. Defaults to a value set by the API.

page_token Optional[str]

Token representing a cursor into the models. If not passed, the API will return the first page of models. The token marks the beginning of the iterator to be returned and the value of the page_token can be accessed at next_page_token of the google.api_core.page_iterator.HTTPIterator.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

page_size Optional[int]

Maximum number of models to return per page. Defaults to a value set by the API.

Returns
TypeDescription
google.api_core.page_iterator.Iterator Iterator of Model contained within the requested dataset.

list_partitions

list_partitions(table: Union[google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str], retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

List the partitions in a table.

Parameters
NameDescription
table Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str, ]

The table, or a reference to it, from which to get partition information.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry. If multiple requests are made under the hood, timeout applies to each individual request.

Returns
TypeDescription
List[str] A list of the partition IDs present in the partitioned table.

list_projects

list_projects(max_results: Optional[int] = None, page_token: Optional[str] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, page_size: Optional[int] = None)

List projects for the project associated with this client.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/projects/list

Parameters
NameDescription
max_results Optional[int]

Maximum number of projects to return. Defaults to a value set by the API.

page_token Optional[str]

Token representing a cursor into the projects. If not passed, the API will return the first page of projects. The token marks the beginning of the iterator to be returned and the value of the page_token can be accessed at next_page_token of the google.api_core.page_iterator.HTTPIterator.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

page_size Optional[int]

Maximum number of projects to return in each page. Defaults to a value set by the API.

Returns
TypeDescription
google.api_core.page_iterator.Iterator Iterator of Project accessible to the current client.

list_routines

list_routines(dataset: Union[google.cloud.bigquery.dataset.Dataset, google.cloud.bigquery.dataset.DatasetReference, google.cloud.bigquery.dataset.DatasetListItem, str], max_results: Optional[int] = None, page_token: Optional[str] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, page_size: Optional[int] = None)
Parameters
NameDescription
dataset Union[ google.cloud.bigquery.dataset.Dataset, google.cloud.bigquery.dataset.DatasetReference, google.cloud.bigquery.dataset.DatasetListItem, str, ]

A reference to the dataset whose routines to list from the BigQuery API. If a string is passed in, this method attempts to create a dataset reference from a string using from_string.

max_results Optional[int]

Maximum number of routines to return. Defaults to a value set by the API.

page_token Optional[str]

Token representing a cursor into the routines. If not passed, the API will return the first page of routines. The token marks the beginning of the iterator to be returned and the value of the page_token can be accessed at next_page_token of the google.api_core.page_iterator.HTTPIterator.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

page_size Optional[int]

Maximum number of routines to return per page. Defaults to a value set by the API.

Returns
TypeDescription
google.api_core.page_iterator.Iterator Iterator of all Routines contained within the requested dataset, limited by max_results.

list_rows

list_rows(table: Union[google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableListItem, google.cloud.bigquery.table.TableReference, str], selected_fields: Optional[Sequence[google.cloud.bigquery.schema.SchemaField]] = None, max_results: Optional[int] = None, page_token: Optional[str] = None, start_index: Optional[int] = None, page_size: Optional[int] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

List the rows of the table.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/list

.. note::

This method assumes that the provided schema is up-to-date with the schema as defined on the back-end: if the two schemas are not identical, the values returned may be incomplete. To ensure that the local copy of the schema is up-to-date, call client.get_table.

Parameters
NameDescription
table Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableListItem, google.cloud.bigquery.table.TableReference, str, ]

The table to list, or a reference to it. When the table object does not contain a schema and selected_fields is not supplied, this method calls get_table to fetch the table schema.

selected_fields Sequence[google.cloud.bigquery.schema.SchemaField]

The fields to return. If not supplied, data for all columns are downloaded.

max_results Optional[int]

Maximum number of rows to return.

page_token Optional[str]

Token representing a cursor into the table's rows. If not passed, the API will return the first page of the rows. The token marks the beginning of the iterator to be returned and the value of the page_token can be accessed at next_page_token of the RowIterator.

start_index Optional[int]

The zero-based index of the starting row to read.

page_size Optional[int]

The maximum number of rows in each page of results from this request. Non-positive values are ignored. Defaults to a sensible value set by the API.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry. If multiple requests are made under the hood, timeout applies to each individual request.

Returns
TypeDescription
google.cloud.bigquery.table.RowIterator Iterator of row data as Row objects. During each page, the iterator will have the total_rows attribute set, which counts the total number of rows in the table (this is distinct from the total number of rows in the current page: iterator.page.num_items).
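
To illustrate selected_fields and max_results together, here is a small sketch that previews a few columns of a table. preview_rows is a hypothetical helper and client is assumed to be an existing Client; the schema is fetched with get_table first so that the local copy is current, as the note above recommends.

```python
def preview_rows(client, table_id, columns=None, limit=10):
    """Return up to `limit` rows as dicts, optionally restricted to `columns`."""
    table = client.get_table(table_id)  # refresh the schema from the back-end
    fields = None
    if columns is not None:
        # Keep only the SchemaField objects whose names were requested.
        fields = [field for field in table.schema if field.name in columns]
    rows = client.list_rows(table, selected_fields=fields, max_results=limit)
    return [dict(row.items()) for row in rows]
```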

list_tables

list_tables(dataset: Union[google.cloud.bigquery.dataset.Dataset, google.cloud.bigquery.dataset.DatasetReference, google.cloud.bigquery.dataset.DatasetListItem, str], max_results: Optional[int] = None, page_token: Optional[str] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, page_size: Optional[int] = None)
Parameters
NameDescription
dataset Union[ google.cloud.bigquery.dataset.Dataset, google.cloud.bigquery.dataset.DatasetReference, google.cloud.bigquery.dataset.DatasetListItem, str, ]

A reference to the dataset whose tables to list from the BigQuery API. If a string is passed in, this method attempts to create a dataset reference from a string using from_string.

max_results Optional[int]

Maximum number of tables to return. Defaults to a value set by the API.

page_token Optional[str]

Token representing a cursor into the tables. If not passed, the API will return the first page of tables. The token marks the beginning of the iterator to be returned and the value of the page_token can be accessed at next_page_token of the google.api_core.page_iterator.HTTPIterator.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

page_size Optional[int]

Maximum number of tables to return per page. Defaults to a value set by the API.

Returns
TypeDescription
google.api_core.page_iterator.Iterator Iterator of TableListItem contained within the requested dataset.

load_table_from_dataframe

load_table_from_dataframe(
    dataframe: pandas.DataFrame,
    destination: Union[
        google.cloud.bigquery.table.Table,
        google.cloud.bigquery.table.TableReference,
        str,
    ],
    num_retries: int = 6,
    job_id: str = None,
    job_id_prefix: str = None,
    location: str = None,
    project: str = None,
    job_config: google.cloud.bigquery.job.load.LoadJobConfig = None,
    parquet_compression: str = "snappy",
    timeout: Union[None, float, Tuple[float, float]] = None,
)

Upload the contents of a table from a pandas DataFrame.

Similar to load_table_from_uri, this method creates, starts and returns a LoadJob.

.. note::

REPEATED fields are NOT supported when using the CSV source format.
They are supported when using the PARQUET source format, but
due to the way they are encoded in the ``parquet`` file,
a mismatch with the existing table schema can occur, so
REPEATED fields are not properly supported when using ``pyarrow<4.0.0``
using the parquet format.

https://github.com/googleapis/python-bigquery/issues/19
Parameters
NameDescription
dataframe pandas.DataFrame

A pandas.DataFrame containing the data to load.

destination Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, str, ]

The destination table to use for loading the data. If it is an existing table, the schema of the pandas.DataFrame must match the schema of the destination table. If the table does not yet exist, the schema is inferred from the pandas.DataFrame. If a string is passed in, this method attempts to create a table reference from a string using from_string.

num_retries int

Number of upload retries.

job_id str

Name of the job.

job_id_prefix str

The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a job_id is also given.

location str

Location where to run the job. Must match the location of the destination table.

project str

Project ID of the project of where to run the job. Defaults to the client's project.

job_config LoadJobConfig

Extra configuration options for the job. To override the default pandas data type conversions, supply a value for schema with column names matching those of the dataframe. The BigQuery schema is used to determine the correct data type conversion. Indexes are not loaded. Requires the pyarrow library. By default, this method uses the parquet source format. To override this, supply a value for source_format with the format name. Currently only CSV and PARQUET are supported.

parquet_compression str

[Beta] The compression method to use if intermittently serializing dataframe to a parquet file. The argument is directly passed as the compression argument to the underlying pyarrow.parquet.write_table() method (the default value "snappy" gets converted to uppercase). https://arrow.apache.org/docs/python/generated/pyarrow.parquet.write_table.html#pyarrow-parquet-write-table If the job config schema is missing, the argument is directly passed as the compression argument to the underlying DataFrame.to_parquet() method. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_parquet.html#pandas.DataFrame.to_parquet

timeout Union[None, float, Tuple[float, float]]

The number of seconds to wait for the underlying HTTP transport before using retry. Depending on the retry strategy, a request may be repeated several times using the same timeout each time. Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request documentation for details.

Exceptions
TypeDescription
ValueError If a usable parquet engine cannot be found. This method requires pyarrow to be installed.
TypeError If job_config is not an instance of LoadJobConfig class.
Returns
TypeDescription
google.cloud.bigquery.job.LoadJob A new load job.
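
Because the method returns a started LoadJob rather than a finished one, callers usually wait on it. The sketch below shows one way this might look; load_dataframe is a hypothetical helper, client is assumed to be an existing Client, and dataframe a pandas.DataFrame prepared by the caller.

```python
def load_dataframe(client, dataframe, table_id, job_config=None):
    """Load `dataframe` into `table_id` and return the number of rows written."""
    job = client.load_table_from_dataframe(
        dataframe, table_id, job_config=job_config
    )
    job.result()  # block until the load job finishes; raises if the job failed
    return job.output_rows
```

To override the inferred column types, a LoadJobConfig with an explicit schema can be passed as job_config, as described in the parameter table above.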

load_table_from_file

load_table_from_file(
    file_obj: IO[bytes],
    destination: Union[
        google.cloud.bigquery.table.Table,
        google.cloud.bigquery.table.TableReference,
        google.cloud.bigquery.table.TableListItem,
        str,
    ],
    rewind: bool = False,
    size: Optional[int] = None,
    num_retries: int = 6,
    job_id: Optional[str] = None,
    job_id_prefix: Optional[str] = None,
    location: Optional[str] = None,
    project: Optional[str] = None,
    job_config: Optional[google.cloud.bigquery.job.load.LoadJobConfig] = None,
    timeout: Union[None, float, Tuple[float, float]] = None,
)

Upload the contents of this table from a file-like object.

Similar to load_table_from_uri, this method creates, starts and returns a LoadJob.

Parameters
NameDescription
file_obj IO[bytes]

A file handle opened in binary mode for reading.

destination Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str, ]

Table into which data is to be loaded. If a string is passed in, this method attempts to create a table reference from a string using from_string.

rewind bool

If True, seek to the beginning of the file handle before reading the file.

size int

The number of bytes to read from the file handle. If size is None or large, resumable upload will be used. Otherwise, multipart upload will be used.

num_retries int

Number of upload retries. Defaults to 6.

job_id str

Name of the job.

job_id_prefix str

The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a job_id is also given.

location str

Location where to run the job. Must match the location of the destination table.

project str

Project ID of the project of where to run the job. Defaults to the client's project.

job_config LoadJobConfig

Extra configuration options for the job.

timeout Union[None, float, Tuple[float, float]]

The number of seconds to wait for the underlying HTTP transport before using retry. Depending on the retry strategy, a request may be repeated several times using the same timeout each time. Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request documentation for details.

Exceptions
TypeDescription
ValueError If size is not passed in and can not be determined, or if the file_obj can be detected to be a file opened in text mode.
TypeError If job_config is not an instance of LoadJobConfig class.
Returns
TypeDescription
google.cloud.bigquery.job.LoadJob A new load job.
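
The binary-mode requirement and the rewind flag can be illustrated with a short sketch. load_local_file is a hypothetical helper and client is assumed to be an existing Client; the file is opened with mode "rb" because a text-mode handle may be rejected with ValueError.

```python
def load_local_file(client, path, table_id, job_config=None):
    """Upload a local file to `table_id` and wait for the load job."""
    with open(path, "rb") as file_obj:  # binary mode, as the method requires
        job = client.load_table_from_file(
            file_obj,
            table_id,
            rewind=True,  # seek to the start of the handle before reading
            job_config=job_config,
        )
    job.result()  # block until the load job completes
    return job
```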

load_table_from_json

load_table_from_json(
    json_rows: Iterable[Dict[str, Any]],
    destination: Union[
        google.cloud.bigquery.table.Table,
        google.cloud.bigquery.table.TableReference,
        google.cloud.bigquery.table.TableListItem,
        str,
    ],
    num_retries: int = 6,
    job_id: Optional[str] = None,
    job_id_prefix: Optional[str] = None,
    location: Optional[str] = None,
    project: Optional[str] = None,
    job_config: Optional[google.cloud.bigquery.job.load.LoadJobConfig] = None,
    timeout: Union[None, float, Tuple[float, float]] = None,
)

Upload the contents of a table from a JSON string or dict.

Parameters
NameDescription
json_rows Iterable[Dict[str, Any]]

Row data to be inserted. Keys must match the table schema fields and values must be JSON-compatible representations.

.. note::

If your data is already a newline-delimited JSON string, it is best to wrap it into a file-like object and pass it to load_table_from_file::

    import io
    from google.cloud import bigquery

    data = '{"foo": "bar"}'
    # load_table_from_file expects a file object opened in binary mode.
    data_as_file = io.BytesIO(data.encode("utf-8"))

    client = bigquery.Client()
    client.load_table_from_file(data_as_file, ...)

destination Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str, ]

Table into which data is to be loaded. If a string is passed in, this method attempts to create a table reference from a string using from_string.

num_retries int

Number of upload retries.

job_id str

Name of the job.

job_id_prefix str

The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a job_id is also given.

location str

Location where to run the job. Must match the location of the destination table.

project str

Project ID of the project of where to run the job. Defaults to the client's project.

job_config LoadJobConfig

Extra configuration options for the job. The source_format setting is always set to NEWLINE_DELIMITED_JSON.

timeout Union[None, float, Tuple[float, float]]

The number of seconds to wait for the underlying HTTP transport before using retry. Depending on the retry strategy, a request may be repeated several times using the same timeout each time. Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request documentation for details.

Exceptions
TypeDescription
TypeError If job_config is not an instance of LoadJobConfig class.
Returns
TypeDescription
google.cloud.bigquery.job.LoadJob A new load job.
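
Since the values in json_rows must be JSON-compatible, rows holding Python dates or decimals need converting first. The following is one possible sketch under stated assumptions: to_json_rows is a hypothetical helper, and it assumes that ISO 8601 strings and decimal strings are acceptable for the destination columns, which depends on the table schema.

```python
import datetime
import decimal
import json


def to_json_rows(records):
    """Convert dict-like records into JSON-compatible dicts."""

    def convert(value):
        if isinstance(value, (datetime.datetime, datetime.date, datetime.time)):
            return value.isoformat()  # assumption: ISO 8601 text is accepted
        if isinstance(value, decimal.Decimal):
            return str(value)         # keep full precision as text
        if isinstance(value, dict):
            return {key: convert(item) for key, item in value.items()}
        if isinstance(value, (list, tuple)):
            return [convert(item) for item in value]
        return value

    rows = [convert(dict(record)) for record in records]
    json.dumps(rows)  # raises TypeError if anything is still not serializable
    return rows
```

The result can then be passed as the json_rows argument, for example client.load_table_from_json(to_json_rows(records), table_id).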

load_table_from_uri

load_table_from_uri(source_uris: Union[str, Sequence[str]], destination: Union[google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str], job_id: Optional[str] = None, job_id_prefix: Optional[str] = None, location: Optional[str] = None, project: Optional[str] = None, job_config: Optional[google.cloud.bigquery.job.load.LoadJobConfig] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

Starts a job for loading data into a table from Cloud Storage.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#jobconfigurationload

Parameters
NameDescription
source_uris Union[str, Sequence[str]]

URIs of data files to be loaded; in format gs://<bucket_name>/<object_name_or_glob>.

destination Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, str, ]

Table into which data is to be loaded. If a string is passed in, this method attempts to create a table reference from a string using from_string.

job_id Optional[str]

Name of the job.

job_id_prefix Optional[str]

The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a job_id is also given.

location Optional[str]

Location where to run the job. Must match the location of the destination table.

project Optional[str]

Project ID of the project of where to run the job. Defaults to the client's project.

job_config Optional[google.cloud.bigquery.job.LoadJobConfig]

Extra configuration options for the job.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Exceptions
TypeDescription
TypeError If job_config is not an instance of LoadJobConfig class.
Returns
TypeDescription
google.cloud.bigquery.job.LoadJob A new load job.

query

query(query: str, job_config: Optional[google.cloud.bigquery.job.query.QueryJobConfig] = None, job_id: Optional[str] = None, job_id_prefix: Optional[str] = None, location: Optional[str] = None, project: Optional[str] = None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, job_retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>)
Parameters
NameDescription
query str

SQL query to be executed. Defaults to the standard SQL dialect. Use the job_config parameter to change dialects.

job_config Optional[google.cloud.bigquery.job.QueryJobConfig]

Extra configuration options for the job. To override any options that were previously set in the default_query_job_config given to the Client constructor, manually set those options to None, or whatever value is preferred.

job_id Optional[str]

ID to use for the query job.

job_id_prefix Optional[str]

The prefix to use for a randomly generated job ID. This parameter will be ignored if a job_id is also given.

location Optional[str]

Location where to run the job. Must match the location of any table used in the query as well as the destination table.

project Optional[str]

Project ID of the project of where to run the job. Defaults to the client's project.

retry Optional[google.api_core.retry.Retry]

How to retry the RPC. This only applies to making RPC calls. It isn't used to retry failed jobs. This has a reasonable default that should only be overridden with care.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

job_retry Optional[google.api_core.retry.Retry]

How to retry failed jobs. The default retries rate-limit-exceeded errors. Passing None disables job retry. Not all jobs can be retried. If job_id is provided, then the job returned by the query will not be retryable, and an exception will be raised if a non-None (and non-default) value for job_retry is also provided. Note that errors aren't detected until result() is called on the job returned. The job_retry specified here becomes the default job_retry for result(), where it can also be specified.

Exceptions
TypeDescription
TypeError If job_config is not an instance of QueryJobConfig class, or if both job_id and non-None non-default job_retry are provided.
Returns
TypeDescription
google.cloud.bigquery.job.QueryJob A new query job instance.
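
The interaction between job_id and job_retry is easy to get wrong, so a small sketch may help. run_query is a hypothetical wrapper and client is assumed to be an existing Client; it passes job_retry=None whenever an explicit job_id is given, which is one way to avoid the TypeError described above.

```python
def run_query(client, sql, job_id=None, job_id_prefix=None, location=None):
    """Run `sql` and return all result rows as a list."""
    kwargs = {"location": location}
    if job_id is not None:
        kwargs["job_id"] = job_id
        # A job with an explicit ID is not retryable, so disable job retry.
        kwargs["job_retry"] = None
    elif job_id_prefix is not None:
        kwargs["job_id_prefix"] = job_id_prefix
    job = client.query(sql, **kwargs)
    # Job errors surface here, when result() is called, not in query().
    return list(job.result())
```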

schema_from_json

schema_from_json(file_or_path: PathType)

Takes a file object or file path that contains json that describes a table schema.

schema_to_json

schema_to_json(
    schema_list: Sequence[google.cloud.bigquery.schema.SchemaField],
    destination: PathType,
)

Serializes a list of schema field objects as JSON to a file.

destination is a file path or a file object.

Parameter
NameDescription
schema_list Sequence[google.cloud.bigquery.schema.SchemaField]
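
For orientation, the sketch below writes a schema file by hand using only the standard library. It assumes the file is a JSON array of objects with name, type and mode keys, mirroring the REST representation of a table field; write_schema_file and the column tuples are hypothetical.

```python
import json


def write_schema_file(path, columns):
    """Write (name, type, mode) tuples to `path` as a JSON schema file."""
    fields = [
        {"name": name, "type": field_type, "mode": mode}
        for name, field_type, mode in columns
    ]
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(fields, handle, indent=2)
    return fields
```

A file produced this way could then be read back with client.schema_from_json(path), given an existing Client named client.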

update_dataset

update_dataset(dataset: google.cloud.bigquery.dataset.Dataset, fields: Sequence[str], retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

Change some fields of a dataset.

Use fields to specify which fields to update. At least one field must be provided. If a field is listed in fields and is None in dataset, it will be deleted.

If dataset.etag is not None, the update will only succeed if the dataset on the server has the same ETag. Thus reading a dataset with get_dataset, changing its fields, and then passing it to update_dataset will ensure that the changes will only be saved if no modifications to the dataset occurred since the read.

Parameters
NameDescription
dataset google.cloud.bigquery.dataset.Dataset

The dataset to update.

fields Sequence[str]

The properties of dataset to change. These are strings corresponding to the properties of Dataset. For example, to update the default expiration times, specify both properties in the fields argument:

.. code-block:: python

    bigquery_client.update_dataset(
        dataset,
        [
            "default_partition_expiration_ms",
            "default_table_expiration_ms",
        ]
    )

retry Optional[google.api_core.retry.Retry]

How to retry the RPC.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Returns
TypeDescription
google.cloud.bigquery.dataset.Dataset The modified Dataset instance.
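
The read, modify, write cycle described above can be sketched as follows. set_dataset_description is a hypothetical helper and client is assumed to be an existing Client; because the dataset comes from get_dataset it carries an ETag, so the update should only succeed if the dataset was not changed on the server in the meantime.

```python
def set_dataset_description(client, dataset_id, description):
    """Update only the description of a dataset, guarded by its ETag."""
    dataset = client.get_dataset(dataset_id)  # fetch current state and ETag
    dataset.description = description
    # List only the changed property; other properties are left untouched.
    return client.update_dataset(dataset, ["description"])
```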

update_model

update_model(model: google.cloud.bigquery.model.Model, fields: Sequence[str], retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

[Beta] Change some fields of a model.

Use fields to specify which fields to update. At least one field must be provided. If a field is listed in fields and is None in model, the field value will be deleted.

If model.etag is not None, the update will only succeed if the model on the server has the same ETag. Thus reading a model with get_model, changing its fields, and then passing it to update_model will ensure that the changes will only be saved if no modifications to the model occurred since the read.

Parameters
NameDescription
model google.cloud.bigquery.model.Model

The model to update.

fields Sequence[str]

The properties of model to change. These are strings corresponding to the properties of Model. For example, to update the descriptive properties of the model, specify them in the fields argument:

.. code-block:: python

    bigquery_client.update_model(
        model, ["description", "friendly_name"]
    )

retry Optional[google.api_core.retry.Retry]

A description of how to retry the API call.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Returns
TypeDescription
google.cloud.bigquery.model.Model The model resource returned from the API call.

update_routine

update_routine(routine: google.cloud.bigquery.routine.routine.Routine, fields: Sequence[str], retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

[Beta] Change some fields of a routine.

Use fields to specify which fields to update. At least one field must be provided. If a field is listed in fields and is None in routine, the field value will be deleted.

.. warning::

During beta, partial updates are not supported. You must provide all fields in the resource.

If routine.etag is not None, the update will only succeed if the resource on the server has the same ETag. Thus reading a routine with get_routine, changing its fields, and then passing it to this method will ensure that the changes will only be saved if no modifications to the resource occurred since the read.

Parameters
NameDescription
routine google.cloud.bigquery.routine.Routine

The routine to update.

fields Sequence[str]

The fields of routine to change, spelled as the Routine properties. For example, to update the description property of the routine, specify it in the fields argument:

.. code-block:: python

    bigquery_client.update_routine(
        routine, ["description"]
    )

retry Optional[google.api_core.retry.Retry]

A description of how to retry the API call.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Returns
TypeDescription
google.cloud.bigquery.routine.Routine The routine resource returned from the API call.

update_table

update_table(table: google.cloud.bigquery.table.Table, fields: Sequence[str], retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

Change some fields of a table.

Use fields to specify which fields to update. At least one field must be provided. If a field is listed in fields and is None in table, the field value will be deleted.

If table.etag is not None, the update will only succeed if the table on the server has the same ETag. Thus reading a table with get_table, changing its fields, and then passing it to update_table will ensure that the changes will only be saved if no modifications to the table occurred since the read.

Parameters
NameDescription
table google.cloud.bigquery.table.Table

The table to update.

fields Sequence[str]

The fields of table to change, spelled as the Table properties. For example, to update the descriptive properties of the table, specify them in the fields argument:

.. code-block:: python

    bigquery_client.update_table(
        table, ["description", "friendly_name"]
    )

retry Optional[google.api_core.retry.Retry]

A description of how to retry the API call.

timeout Optional[float]

The number of seconds to wait for the underlying HTTP transport before using retry.

Returns
TypeDescription
google.cloud.bigquery.table.Table The table resource returned from the API call.

__init__

__init__(
    project=None,
    credentials=None,
    _http=None,
    location=None,
    default_query_job_config=None,
    client_info=None,
    client_options=None,
)

Initialize self. See help(type(self)) for accurate signature.

get_iam_policy

get_iam_policy(table, requested_policy_version=1, retry=<google.api_core.retry.Retry object>, timeout=None)

API documentation for bigquery.client.Client.get_iam_policy method.

Parameters
NameDescription
table Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, ]
requested_policy_version int
retry Retry
timeout None(

set_iam_policy

set_iam_policy(table, policy, updateMask=None, retry=<google.api_core.retry.Retry object>, timeout=None)

API documentation for bigquery.client.Client.set_iam_policy method.

Parameters
NameDescription
table Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, ]
policy Policy
updateMask str
retry Retry
timeout None(

test_iam_permissions

test_iam_permissions(table, permissions, retry=<google.api_core.retry.Retry object>, timeout=None)

API documentation for bigquery.client.Client.test_iam_permissions method.

Parameters
NameDescription
table Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, google.cloud.bigquery.table.TableListItem, ]
permissions Sequence(
retry Retry
timeout None(