Client(
project=None,
credentials=None,
_http=None,
location=None,
default_query_job_config=None,
client_info=None,
client_options=None,
)
Client to bundle configuration needed for API requests.
Parameters
Name | Description |
project |
str
Project ID for the project which the client acts on behalf of. Will be passed when creating a dataset / job. If not passed, falls back to the default inferred from the environment. |
credentials |
google.auth.credentials.Credentials
(Optional) The OAuth2 Credentials to use for this client. If not passed (and if no |
_http |
requests.Session
(Optional) HTTP object to make requests. Can be any object that defines |
location |
str
(Optional) Default location for jobs / datasets / tables. |
default_query_job_config |
google.cloud.bigquery.job.QueryJobConfig
(Optional) Default |
client_info |
google.api_core.client_info.ClientInfo
The client info used to send a user-agent string along with API requests. If |
client_options |
Union[`google.api_core.client_options.ClientOptions`, dict]
(Optional) Client options used to set user options on the client. API Endpoint should be set through client_options. |
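The constructor parameters above can be gathered into a keyword dictionary before the client is created. The sketch below is only an illustration: `build_client_kwargs` is a hypothetical helper, not part of the library, and the project and location values are placeholders. It relies on the documented points that `client_options` may be a plain dict and that the API endpoint is set through it; the `api_endpoint` key mirrors the `ClientOptions.api_endpoint` attribute.

```python
def build_client_kwargs(project=None, location=None, api_endpoint=None):
    """Collect keyword arguments for bigquery.Client(), skipping unset values.

    Hypothetical helper for illustration only.
    """
    kwargs = {}
    if project is not None:
        kwargs["project"] = project
    if location is not None:
        kwargs["location"] = location
    if api_endpoint is not None:
        # The endpoint is not a direct constructor argument; it travels
        # inside client_options, which may be given as a dict.
        kwargs["client_options"] = {"api_endpoint": api_endpoint}
    return kwargs


kwargs = build_client_kwargs(project="my-project", location="US")
# With google-cloud-bigquery installed and credentials available:
#     from google.cloud import bigquery
#     client = bigquery.Client(**kwargs)
```

Leaving unset values out of the dictionary lets the client fall back to the defaults inferred from the environment.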
Inheritance
builtins.object > google.cloud.client._ClientFactoryMixin > google.cloud.client.Client > builtins.object > google.cloud.client._ClientProjectMixin > google.cloud.client.ClientWithProject > ClientProperties
location
Default location for jobs / datasets / tables.
Methods
__init__
__init__(
project=None,
credentials=None,
_http=None,
location=None,
default_query_job_config=None,
client_info=None,
client_options=None,
)
Initialize self. See help(type(self)) for accurate signature.
cancel_job
cancel_job(job_id, project=None, location=None, retry=<google.api_core.retry.Retry object>)
Attempt to cancel a job from a job ID.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/cancel
Name | Description |
job_id |
str
Unique job identifier. |
project |
str
(Optional) ID of the project which owns the job (defaults to the client's project). |
location |
str
Location where the job was run. |
retry |
google.api_core.retry.Retry
(Optional) How to retry the RPC. |
Type | Description |
Union[google.cloud.bigquery.job.LoadJob, google.cloud.bigquery.job.CopyJob, google.cloud.bigquery.job.ExtractJob, google.cloud.bigquery.job.QueryJob] | Job instance, based on the resource returned by the API. |
close
close()
Clean up transport, if set.
Suggested use:
import contextlib
with contextlib.closing(client):  # closes on exit
    do_something_with(client)
copy_table
copy_table(sources, destination, job_id=None, job_id_prefix=None, location=None, project=None, job_config=None, retry=<google.api_core.retry.Retry object>)
Copy one or more tables to another table.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#jobconfigurationtablecopy
Name | Description |
sources |
Union[ Table, TableReference, str, Sequence[ Union[ Table, TableReference, str, ] ], ]
Table or tables to be copied. |
Type | Description |
google.cloud.bigquery.job.CopyJob | A new copy job instance. |
create_dataset
create_dataset(dataset, exists_ok=False, retry=<google.api_core.retry.Retry object>)
API call: create the dataset via a POST request.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert
Name | Description |
dataset |
Union[ Dataset, DatasetReference, str, ]
A Dataset to create. If |
exists_ok |
bool
Defaults to |
retry |
google.api_core.retry.Retry
Optional. How to retry the RPC. |
Type | Description |
google.cloud.bigquery.dataset.Dataset | A new ``Dataset`` returned from the API. |
Example
>>> from google.cloud import bigquery
>>> client = bigquery.Client()
>>> dataset = bigquery.Dataset(client.dataset('my_dataset'))
>>> dataset = client.create_dataset(dataset)
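For idempotent setup scripts, the `exists_ok` flag can be wrapped in a small helper. This is a minimal sketch, assuming `client` is a `google.cloud.bigquery.Client`; `ensure_dataset` is a hypothetical name and the dataset ID shown in the comment is a placeholder.

```python
def ensure_dataset(client, dataset_id):
    """Create the dataset, or return the existing one if it is already there.

    `client` is expected to be a google.cloud.bigquery.Client; the helper
    name is hypothetical.
    """
    # exists_ok=True asks create_dataset not to raise when the dataset
    # already exists.
    return client.create_dataset(dataset_id, exists_ok=True)


# Example call (needs credentials and network access):
#     dataset = ensure_dataset(client, "my-project.my_dataset")
```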
create_routine
create_routine(routine, exists_ok=False, retry=<google.api_core.retry.Retry object>)
[Beta] Create a routine via a POST request.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/routines/insert
Name | Description |
routine |
Routine
A Routine to create. The dataset that the routine belongs to must already exist. |
exists_ok |
bool
Defaults to |
retry |
google.api_core.retry.Retry
Optional. How to retry the RPC. |
Type | Description |
google.cloud.bigquery.routine.Routine | A new ``Routine`` returned from the service. |
create_table
create_table(table, exists_ok=False, retry=<google.api_core.retry.Retry object>)
API call: create a table via a POST request.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/insert
Name | Description |
table |
Union[ Table, TableReference, str, ]
A Table to create. If |
exists_ok |
bool
Defaults to |
retry |
google.api_core.retry.Retry
Optional. How to retry the RPC. |
Type | Description |
google.cloud.bigquery.table.Table | A new ``Table`` returned from the service. |
dataset
dataset(dataset_id, project=None)
Construct a reference to a dataset.
Name | Description |
dataset_id |
str
ID of the dataset. |
project |
str
(Optional) project ID for the dataset (defaults to the project of the client). |
Type | Description |
DatasetReference | a new ``DatasetReference`` instance |
delete_dataset
delete_dataset(dataset, delete_contents=False, retry=<google.api_core.retry.Retry object>, not_found_ok=False)
Delete a dataset.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
Name | Description |
dataset |
Union[ Dataset, DatasetReference, str, ]
A reference to the dataset to delete. If a string is passed in, this method attempts to create a dataset reference from a string using from_string. |
delete_contents |
bool
(Optional) If True, delete all the tables in the dataset. If False and the dataset contains tables, the request will fail. Default is False. |
retry |
google.api_core.retry.Retry
(Optional) How to retry the RPC. |
not_found_ok |
bool
Defaults to False. If True, ignore "not found" errors when deleting the dataset. |
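The two flags described above combine naturally in teardown code. The following is a sketch under the assumption that `client` is a `google.cloud.bigquery.Client`; `drop_dataset` is a hypothetical helper name.

```python
def drop_dataset(client, dataset_id):
    """Delete a dataset and its tables; do nothing if it is already gone.

    Hypothetical helper; `client` is expected to be a bigquery Client.
    """
    client.delete_dataset(
        dataset_id,
        delete_contents=True,  # remove the tables too, instead of failing
        not_found_ok=True,     # ignore "not found" errors
    )


# Example call (needs credentials and network access):
#     drop_dataset(client, "my-project.scratch_dataset")
```

Because `delete_contents=True` removes every table in the dataset, a helper like this is best kept to scratch or test datasets.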
delete_model
delete_model(model, retry=<google.api_core.retry.Retry object>, not_found_ok=False)
[Beta] Delete a model
See https://cloud.google.com/bigquery/docs/reference/rest/v2/models/delete
Name | Description |
model |
Union[ Model, ModelReference, str, ]
A reference to the model to delete. If a string is passed in, this method attempts to create a model reference from a string using from_string. |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the RPC. |
not_found_ok |
bool
Defaults to |
delete_routine
delete_routine(routine, retry=<google.api_core.retry.Retry object>, not_found_ok=False)
[Beta] Delete a routine.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/routines/delete
Name | Description |
routine |
Union[ Routine, RoutineReference, str, ]
A reference to the routine to delete. If a string is passed in, this method attempts to create a routine reference from a string using from_string. |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the RPC. |
not_found_ok |
bool
Defaults to |
delete_table
delete_table(table, retry=<google.api_core.retry.Retry object>, not_found_ok=False)
Delete a table
See https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete
Name | Description |
table |
Union[ Table, TableReference, str, ]
A reference to the table to delete. If a string is passed in, this method attempts to create a table reference from a string using from_string. |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the RPC. |
not_found_ok |
bool
Defaults to |
extract_table
extract_table(source, destination_uris, job_id=None, job_id_prefix=None, location=None, project=None, job_config=None, retry=<google.api_core.retry.Retry object>)
Start a job to extract a table into Cloud Storage files.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#jobconfigurationextract
Name | Description |
source |
TableReference
table to be extracted. |
destination_uris |
Union[str, Sequence[str]]
URIs of Cloud Storage file(s) into which table data is to be extracted; in format |
job_id |
str
(Optional) The ID of the job. |
job_id_prefix |
str
(Optional) The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a ``job_id`` is also given. |
location |
str
Location where to run the job. Must match the location of the source table. |
project |
str
Project ID of the project of where to run the job. Defaults to the client's project. |
job_config |
google.cloud.bigquery.job.ExtractJobConfig
(Optional) Extra configuration options for the job. |
retry |
google.api_core.retry.Retry
(Optional) How to retry the RPC. |
Type | Description |
google.cloud.bigquery.job.ExtractJob | A new extract job instance. |
from_service_account_info
from_service_account_info(info, *args, **kwargs)
Factory to retrieve JSON credentials while creating client.
Name | Description |
args |
tuple
Remaining positional arguments to pass to constructor. |
info |
str
The JSON object with a private key and other credentials information (downloaded from the Google APIs console). |
Type | Description |
TypeError | if there is a conflict with the kwargs and the credentials created by the factory. |
Type | Description |
`_ClientFactoryMixin` | The client created with the retrieved JSON credentials. |
from_service_account_json
from_service_account_json(json_credentials_path, *args, **kwargs)
Factory to retrieve JSON credentials while creating client.
Name | Description |
args |
tuple
Remaining positional arguments to pass to constructor. |
json_credentials_path |
str
The path to a private key file (this file was given to you when you created the service account). This file must contain a JSON object with a private key and other credentials information (downloaded from the Google APIs console). |
Type | Description |
TypeError | if there is a conflict with the kwargs and the credentials created by the factory. |
Type | Description |
`_ClientFactoryMixin` | The client created with the retrieved JSON credentials. |
get_dataset
get_dataset(dataset_ref, retry=<google.api_core.retry.Retry object>)
Fetch the dataset referenced by dataset_ref
Name | Description |
dataset_ref |
Union[ DatasetReference, str, ]
A reference to the dataset to fetch from the BigQuery API. If a string is passed in, this method attempts to create a dataset reference from a string using from_string. |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the RPC. |
Type | Description |
google.cloud.bigquery.dataset.Dataset | A ``Dataset`` instance. |
get_job
get_job(job_id, project=None, location=None, retry=<google.api_core.retry.Retry object>)
Fetch a job for the project associated with this client.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get
Name | Description |
job_id |
str
Unique job identifier. |
project |
str
(Optional) ID of the project which owns the job (defaults to the client's project). |
location |
str
Location where the job was run. |
retry |
google.api_core.retry.Retry
(Optional) How to retry the RPC. |
Type | Description |
Union[google.cloud.bigquery.job.LoadJob, google.cloud.bigquery.job.CopyJob, google.cloud.bigquery.job.ExtractJob, google.cloud.bigquery.job.QueryJob] | Job instance, based on the resource returned by the API. |
get_model
get_model(model_ref, retry=<google.api_core.retry.Retry object>)
[Beta] Fetch the model referenced by model_ref.
Name | Description |
model_ref |
Union[ ModelReference, str, ]
A reference to the model to fetch from the BigQuery API. If a string is passed in, this method attempts to create a model reference from a string using from_string. |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the RPC. |
Type | Description |
google.cloud.bigquery.model.Model | A ``Model`` instance. |
get_routine
get_routine(routine_ref, retry=<google.api_core.retry.Retry object>)
[Beta] Get the routine referenced by routine_ref.
Name | Description |
routine_ref |
Union[ Routine, RoutineReference, str, ]
A reference to the routine to fetch from the BigQuery API. If a string is passed in, this method attempts to create a reference from a string using from_string. |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the API call. |
Type | Description |
google.cloud.bigquery.routine.Routine | A ``Routine`` instance. |
get_service_account_email
get_service_account_email(project=None)
Get the email address of the project's BigQuery service account.
Note: This is the service account that BigQuery uses to manage tables encrypted by a key in KMS.
Name | Description |
project |
str, optional
Project ID to use for retrieving the service account email. Defaults to the client's project. |
Type | Description |
str | service account email address |
Example
>>> from google.cloud import bigquery
>>> client = bigquery.Client()
>>> client.get_service_account_email()
my_service_account@my-project.iam.gserviceaccount.com
get_table
get_table(table, retry=<google.api_core.retry.Retry object>)
Fetch the table referenced by table.
Name | Description |
table |
Union[ Table, TableReference, str, ]
A reference to the table to fetch from the BigQuery API. If a string is passed in, this method attempts to create a table reference from a string using from_string. |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the RPC. |
Type | Description |
google.cloud.bigquery.table.Table | A ``Table`` instance. |
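Several methods on this page accept a string in place of a table reference and parse it with from_string. A minimal sketch of building such a string follows, assuming the fully qualified `project.dataset.table` form; `qualified_table_id` is a hypothetical helper and the IDs are placeholders.

```python
def qualified_table_id(project, dataset_id, table_id):
    """Join the three parts into a 'project.dataset.table' string.

    Hypothetical helper; the dotted form is the one assumed to be parsed
    by from_string.
    """
    parts = (project, dataset_id, table_id)
    if not all(parts):
        raise ValueError("project, dataset_id and table_id must all be non-empty")
    return ".".join(parts)


table_id = qualified_table_id("my-project", "my_dataset", "my_table")
# With a real client:
#     table = client.get_table(table_id)
```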
insert_rows
insert_rows(table, rows, selected_fields=None, **kwargs)
Insert rows into a table via the streaming API.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll
Name | Description |
kwargs |
dict
Keyword arguments to insert_rows_json. |
table |
Union[ Table, TableReference, str, ]
The destination table for the row data, or a reference to it. |
rows |
Union[ Sequence[Tuple], Sequence[dict], ]
Row data to be inserted. If a list of tuples is given, each tuple should contain data for each schema field on the current table and in the same order as the schema fields. If a list of dictionaries is given, the keys must include all required fields in the schema. Keys which do not correspond to a field in the schema are ignored. |
selected_fields |
Sequence[ SchemaField, ]
The fields to return. Required if |
Type | Description |
ValueError | if table's schema is not set |
Type | Description |
Sequence[Mappings] | One mapping per row with insert errors: the "index" key identifies the row, and the "errors" key contains a list of the mappings describing one or more problems with the row. |
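The return value described above is a plain sequence of mappings, so it can be inspected without any library support. The sketch below shows one way to summarize it; the helper names are hypothetical, and the `reason` and `message` keys inside each problem mapping are illustrative sample data rather than a guaranteed shape.

```python
def failed_row_indexes(insert_errors):
    """Return the sorted indexes of rows that were reported as failed."""
    return sorted(entry["index"] for entry in insert_errors)


def describe_insert_errors(insert_errors):
    """Build one human-readable line per failed row."""
    lines = []
    for entry in insert_errors:
        problems = entry["errors"]  # list of mappings, one per problem
        lines.append("row {}: {} problem(s)".format(entry["index"], len(problems)))
    return lines


# Sample data shaped like the documented return value (contents invented).
sample = [
    {"index": 2, "errors": [{"reason": "invalid", "message": "bad value"}]},
    {"index": 0, "errors": [{"reason": "invalid"}, {"reason": "stopped"}]},
]
summary = describe_insert_errors(sample)
```

An empty sequence means no row was reported as failed, so `if insert_errors:` is enough to detect problems.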
insert_rows_from_dataframe
insert_rows_from_dataframe(
table, dataframe, selected_fields=None, chunk_size=500, **kwargs
)
Insert rows into a table from a dataframe via the streaming API.
Name | Description |
kwargs |
dict
Keyword arguments to insert_rows_json. |
table |
Union[ Table, TableReference, str, ]
The destination table for the row data, or a reference to it. |
dataframe |
pandas.DataFrame
A |
selected_fields |
Sequence[ SchemaField, ]
The fields to return. Required if |
chunk_size |
int
The number of rows to stream in a single chunk. Must be positive. |
Type | Description |
ValueError | if table's schema is not set |
Type | Description |
Sequence[Sequence[Mappings]] | A list with insert errors for each insert chunk. Each element is a list containing one mapping per row with insert errors: the "index" key identifies the row, and the "errors" key contains a list of the mappings describing one or more problems with the row. |
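Because the result is nested per chunk, it often helps to know how many chunks to expect and to flatten the errors. The following sketch assumes the dataframe is split into consecutive chunks of at most `chunk_size` rows; both helper names are hypothetical.

```python
import math


def expected_chunk_count(n_rows, chunk_size=500):
    """Number of chunks needed for n_rows, assuming chunks of chunk_size rows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return math.ceil(n_rows / chunk_size)


def flatten_chunk_errors(chunk_errors):
    """Turn the per-chunk error lists into (chunk_number, error_mapping) pairs."""
    return [
        (chunk_number, entry)
        for chunk_number, errors in enumerate(chunk_errors)
        for entry in errors
    ]


# Sample result for two chunks: the first clean, the second with one bad row.
sample = [[], [{"index": 1, "errors": [{"reason": "invalid"}]}]]
flat = flatten_chunk_errors(sample)
```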
insert_rows_json
insert_rows_json(table, json_rows, row_ids=None, skip_invalid_rows=None, ignore_unknown_values=None, template_suffix=None, retry=<google.api_core.retry.Retry object>)
Insert rows into a table without applying local type conversions.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll
Name | Description |
table |
Union[ Table, TableReference, str, ]
The destination table for the row data, or a reference to it. |
json_rows |
Sequence[dict]
Row data to be inserted. Keys must match the table schema fields and values must be JSON-compatible representations. |
row_ids |
Sequence[str]
(Optional) Unique IDs, one per row being inserted. If omitted, unique IDs are created. |
skip_invalid_rows |
bool
(Optional) Insert all valid rows of a request, even if invalid rows exist. The default value is False, which causes the entire request to fail if any invalid rows exist. |
ignore_unknown_values |
bool
(Optional) Accept rows that contain values that do not match the schema. The unknown values are ignored. Default is False, which treats unknown values as errors. |
template_suffix |
str
(Optional) Treat name as a template table and provide a suffix. BigQuery will create the table <name> + <template_suffix> based on the schema of the template table. See https://cloud.google.com/bigquery/streaming-data-into-bigquery#template-tables |
retry |
google.api_core.retry.Retry
(Optional) How to retry the RPC. |
Type | Description |
Sequence[Mappings] | One mapping per row with insert errors: the "index" key identifies the row, and the "errors" key contains a list of the mappings describing one or more problems with the row. |
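Since this method applies no local type conversions, values such as datetimes or decimals have to be made JSON-compatible by the caller. The sketch below shows one possible conversion; the helper names are hypothetical, and the chosen encodings (ISO 8601 strings for dates and timestamps, plain strings for decimals) are assumptions that should be checked against the column types of the destination table.

```python
import datetime
import decimal
import json


def to_json_compatible(value):
    """Convert one Python value to something json.dumps can serialize."""
    if isinstance(value, (datetime.datetime, datetime.date)):
        return value.isoformat()  # assumption: ISO 8601 text is acceptable
    if isinstance(value, decimal.Decimal):
        return str(value)  # assumption: numeric values may be sent as text
    return value


def to_json_row(row):
    """Apply the conversion to every value of a row dictionary."""
    return {key: to_json_compatible(value) for key, value in row.items()}


row = to_json_row(
    {
        "name": "widget",
        "created": datetime.datetime(2020, 1, 2, 3, 4, 5, tzinfo=datetime.timezone.utc),
        "price": decimal.Decimal("1.50"),
    }
)
encoded = json.dumps(row)  # succeeds because every value is now JSON-compatible
# With a real client:
#     errors = client.insert_rows_json("my-project.my_dataset.my_table", [row])
```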
job_from_resource
job_from_resource(resource)
Detect correct job type from resource and instantiate.
Name | Description |
resource |
dict
one job resource from API response |
Type | Description |
One of: LoadJob, CopyJob, ExtractJob, or QueryJob | the job instance, constructed via the resource |
list_datasets
list_datasets(project=None, include_all=False, filter=None, max_results=None, page_token=None, retry=<google.api_core.retry.Retry object>)
List datasets for the project associated with this client.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/list
Name | Description |
project |
str
Optional. Project ID to use for retrieving datasets. Defaults to the client's project. |
include_all |
bool
Optional. True if results include hidden datasets. Defaults to False. |
filter |
str
Optional. An expression for filtering the results by label. For syntax, see https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/list#body.QUERY_PARAMETERS.filter |
max_results |
int
Optional. Maximum number of datasets to return. |
page_token |
str
Optional. Token representing a cursor into the datasets. If not passed, the API will return the first page of datasets. The token marks the beginning of the iterator to be returned and the value of the |
retry |
google.api_core.retry.Retry
Optional. How to retry the RPC. |
Type | Description |
google.api_core.page_iterator.Iterator | Iterator of DatasetListItem associated with the project. |
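The returned iterator can be consumed with an ordinary loop or comprehension. This is a minimal sketch, assuming `client` is a `google.cloud.bigquery.Client` and that each listed item exposes a `dataset_id` attribute; `dataset_ids` is a hypothetical helper, and the label filter shown in the comment is only an example of the syntax described at the linked filter reference.

```python
def dataset_ids(client, label_filter=None):
    """Return the IDs of all datasets yielded by list_datasets.

    Hypothetical helper; `client` is expected to be a bigquery Client.
    """
    # Iterating the returned iterator walks through every page of results.
    return [item.dataset_id for item in client.list_datasets(filter=label_filter)]


# Example call (needs credentials and network access):
#     ids = dataset_ids(client, label_filter="labels.env:prod")
```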
list_jobs
list_jobs(project=None, parent_job=None, max_results=None, page_token=None, all_users=None, state_filter=None, retry=<google.api_core.retry.Retry object>, min_creation_time=None, max_creation_time=None)
List jobs for the project associated with this client.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/list
Name | Description |
project |
str, optional
Project ID to use for retrieving jobs. Defaults to the client's project. |
parent_job |
Optional[Union[ _AsyncJob, str, ]]
If set, retrieve only child jobs of the specified parent. |
max_results |
int, optional
Maximum number of jobs to return. |
page_token |
str, optional
Opaque marker for the next "page" of jobs. If not passed, the API will return the first page of jobs. The token marks the beginning of the iterator to be returned and the value of the |
all_users |
bool, optional
If true, include jobs owned by all users in the project. Defaults to :data: |
state_filter |
str, optional
If set, include only jobs matching the given state. One of: * |
retry |
google.api_core.retry.Retry, optional
How to retry the RPC. |
min_creation_time |
datetime.datetime, optional
Min value for job creation time. If set, only jobs created after or at this timestamp are returned. If the datetime has no time zone assumes UTC time. |
max_creation_time |
datetime.datetime, optional
Max value for job creation time. If set, only jobs created before or at this timestamp are returned. If the datetime has no time zone assumes UTC time. |
Type | Description |
google.api_core.page_iterator.Iterator | Iterable of job instances. |
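Since a datetime without a time zone is treated as UTC, passing timezone-aware values for `min_creation_time` and `max_creation_time` avoids ambiguity. The sketch below builds such a window; `creation_window` and `recent_jobs` are hypothetical helpers, and `client` is assumed to be a `google.cloud.bigquery.Client`.

```python
import datetime


def creation_window(hours, now=None):
    """Return (start, end) timezone-aware UTC datetimes covering the last `hours`."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    return now - datetime.timedelta(hours=hours), now


def recent_jobs(client, hours=24):
    """List the jobs created within the last `hours` hours."""
    start, end = creation_window(hours)
    return list(client.list_jobs(min_creation_time=start, max_creation_time=end))


# Example call (needs credentials and network access):
#     for job in recent_jobs(client, hours=6):
#         print(job.job_id)
```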
list_models
list_models(dataset, max_results=None, page_token=None, retry=<google.api_core.retry.Retry object>)
[Beta] List models in the dataset.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/models/list
Name | Description |
dataset |
Union[ Dataset, DatasetReference, str, ]
A reference to the dataset whose models to list from the BigQuery API. If a string is passed in, this method attempts to create a dataset reference from a string using from_string. |
max_results |
int
(Optional) Maximum number of models to return. If not passed, defaults to a value set by the API. |
page_token |
str
(Optional) Token representing a cursor into the models. If not passed, the API will return the first page of models. The token marks the beginning of the iterator to be returned and the value of the |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the RPC. |
Type | Description |
google.api_core.page_iterator.Iterator | Iterator of Model contained within the requested dataset. |
list_partitions
list_partitions(table, retry=<google.api_core.retry.Retry object>)
List the partitions in a table.
Name | Description |
table |
Union[ Table, TableReference, str, ]
The table or reference from which to get partition info |
retry |
google.api_core.retry.Retry
(Optional) How to retry the RPC. |
Type | Description |
List[str] | A list of the partition ids present in the partitioned table |
list_projects
list_projects(max_results=None, page_token=None, retry=<google.api_core.retry.Retry object>)
List projects accessible to this client.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/projects/list
Name | Description |
max_results |
int
(Optional) Maximum number of projects to return. If not passed, defaults to a value set by the API. |
page_token |
str
(Optional) Token representing a cursor into the projects. If not passed, the API will return the first page of projects. The token marks the beginning of the iterator to be returned and the value of the |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the RPC. |
Type | Description |
`google.api_core.page_iterator.Iterator` | Iterator of Project accessible to the current client. |
list_routines
list_routines(dataset, max_results=None, page_token=None, retry=<google.api_core.retry.Retry object>)
[Beta] List routines in the dataset.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/routines/list
Name | Description |
dataset |
Union[ Dataset, DatasetReference, str, ]
A reference to the dataset whose routines to list from the BigQuery API. If a string is passed in, this method attempts to create a dataset reference from a string using from_string. |
max_results |
int
(Optional) Maximum number of routines to return. If not passed, defaults to a value set by the API. |
page_token |
str
(Optional) Token representing a cursor into the routines. If not passed, the API will return the first page of routines. The token marks the beginning of the iterator to be returned and the value of the |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the RPC. |
Type | Description |
google.api_core.page_iterator.Iterator | Iterator of all Routines contained within the requested dataset, limited by ``max_results``. |
list_rows
list_rows(table, selected_fields=None, max_results=None, page_token=None, start_index=None, page_size=None, retry=<google.api_core.retry.Retry object>)
List the rows of the table.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/list
Note: This method assumes that the provided schema is up-to-date with the schema as defined on the back-end: if the two schemas are not identical, the values returned may be incomplete. To ensure that the local copy of the schema is up-to-date, call client.get_table.
Name | Description |
table |
Union[ Table, TableListItem, TableReference, str, ]
The table to list, or a reference to it. When the table object does not contain a schema and |
selected_fields |
Sequence[ SchemaField ]
The fields to return. If not supplied, data for all columns are downloaded. |
max_results |
int
(Optional) maximum number of rows to return. |
page_token |
str
(Optional) Token representing a cursor into the table's rows. If not passed, the API will return the first page of the rows. The token marks the beginning of the iterator to be returned and the value of the |
start_index |
int
(Optional) The zero-based index of the starting row to read. |
page_size |
int
Optional. The maximum number of rows in each page of results from this request. Non-positive values are ignored. Defaults to a sensible value set by the API. |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the RPC. |
Type | Description |
google.cloud.bigquery.table.RowIterator | Iterator of Row objects holding the row data. During each page, the iterator will have the ``total_rows`` attribute set, which counts the total number of rows **in the table** (this is distinct from the total number of rows in the current page: ``iterator.page.num_items``). |
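Combining `start_index` and `max_results` gives a simple way to read a slice of a table. This is a sketch, assuming `client` is a `google.cloud.bigquery.Client` and that each returned row offers a dict-like `items()` method; `fetch_rows` is a hypothetical helper.

```python
def fetch_rows(client, table, start_index=0, max_results=100):
    """Read up to max_results rows starting at start_index, as plain dicts.

    Hypothetical helper; `client` is expected to be a bigquery Client.
    """
    rows = client.list_rows(table, start_index=start_index, max_results=max_results)
    # Copy each row into a dict so the result does not depend on the iterator.
    return [dict(row.items()) for row in rows]


# Example call (needs credentials and network access):
#     page = fetch_rows(client, "my-project.my_dataset.my_table", start_index=200)
```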
list_tables
list_tables(dataset, max_results=None, page_token=None, retry=<google.api_core.retry.Retry object>)
List tables in the dataset.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/list
Name | Description |
dataset |
Union[ Dataset, DatasetReference, str, ]
A reference to the dataset whose tables to list from the BigQuery API. If a string is passed in, this method attempts to create a dataset reference from a string using from_string. |
max_results |
int
(Optional) Maximum number of tables to return. If not passed, defaults to a value set by the API. |
page_token |
str
(Optional) Token representing a cursor into the tables. If not passed, the API will return the first page of tables. The token marks the beginning of the iterator to be returned and the value of the |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the RPC. |
Type | Description |
google.api_core.page_iterator.Iterator | Iterator of TableListItem contained within the requested dataset. |
load_table_from_dataframe
load_table_from_dataframe(
dataframe,
destination,
num_retries=6,
job_id=None,
job_id_prefix=None,
location=None,
project=None,
job_config=None,
parquet_compression="snappy",
)
Upload the contents of a table from a pandas DataFrame.
Similar to load_table_from_uri, this method creates, starts and returns a LoadJob.
Name | Description |
dataframe |
pandas.DataFrame
A |
destination |
google.cloud.bigquery.table.TableReference
The destination table to use for loading the data. If it is an existing table, the schema of the |
num_retries |
int, optional
Number of upload retries. |
job_id |
str, optional
Name of the job. |
job_id_prefix |
str, optional
The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a ``job_id`` is also given. |
location |
str
Location where to run the job. Must match the location of the destination table. |
project |
str, optional
Project ID of the project of where to run the job. Defaults to the client's project. |
job_config |
LoadJobConfig, optional
Extra configuration options for the job. To override the default pandas data type conversions, supply a value for schema with column names matching those of the dataframe. The BigQuery schema is used to determine the correct data type conversion. Indexes are not loaded. Requires the `pyarrow` library. |
parquet_compression |
str
[Beta] The compression method to use if intermittently serializing ``dataframe`` to a parquet file. If ``pyarrow`` and job config schema are used, the argument is directly passed as the ``compression`` argument to the underlying ``pyarrow.parquet.write_table()`` method (the default value "snappy" gets converted to uppercase). https://arrow.apache.org/docs/python/generated/pyarrow.parquet.write_table.html#pyarrow-parquet-write-table If either ``pyarrow`` or job config schema are missing, the argument is directly passed as the ``compression`` argument to the underlying ``DataFrame.to_parquet()`` method. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_parquet.html#pandas.DataFrame.to_parquet |
Type | Description |
ImportError | If a usable parquet engine cannot be found. This method requires `pyarrow` or `fastparquet` to be installed. |
Type | Description |
google.cloud.bigquery.job.LoadJob | A new load job. |
load_table_from_file
load_table_from_file(
file_obj,
destination,
rewind=False,
size=None,
num_retries=6,
job_id=None,
job_id_prefix=None,
location=None,
project=None,
job_config=None,
)
Upload the contents of this table from a file-like object.
Similar to load_table_from_uri, this method creates, starts and returns a LoadJob.
Name | Description |
file_obj |
file
A file handle opened in binary mode for reading. |
destination |
Union[ Table, TableReference, str, ]
Table into which data is to be loaded. If a string is passed in, this method attempts to create a table reference from a string using from_string. |
rewind |
bool
If True, seek to the beginning of the file handle before reading the file. |
size |
int
The number of bytes to read from the file handle. If size is ``None`` or large, resumable upload will be used. Otherwise, multipart upload will be used. |
num_retries |
int
Number of upload retries. Defaults to 6. |
job_id |
str
(Optional) Name of the job. |
job_id_prefix |
str
(Optional) The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a ``job_id`` is also given. |
location |
str
Location where to run the job. Must match the location of the destination table. |
project |
str
Project ID of the project of where to run the job. Defaults to the client's project. |
job_config |
google.cloud.bigquery.job.LoadJobConfig
(Optional) Extra configuration options for the job. |
Type | Description |
ValueError | If ``size`` is not passed in and can not be determined, or if the ``file_obj`` can be detected to be a file opened in text mode. |
Type | Description |
google.cloud.bigquery.job.LoadJob | A new load job. |
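Because the file handle must be opened in binary mode, in-memory data is best wrapped in `io.BytesIO` rather than a text buffer. The sketch below serializes rows as newline-delimited JSON into such a buffer; `rows_to_ndjson_file` is a hypothetical helper, and the job configuration mentioned in the comment is an assumption about how the source format would be declared.

```python
import io
import json


def rows_to_ndjson_file(rows):
    """Serialize rows as newline-delimited JSON into a binary file-like object."""
    payload = "\n".join(json.dumps(row) for row in rows)
    # Encode to bytes so the buffer behaves like a file opened in binary mode.
    return io.BytesIO(payload.encode("utf-8"))


buffer = rows_to_ndjson_file([{"foo": "bar"}, {"foo": "baz"}])
# With a real client, and a job_config whose source format is set to
# newline-delimited JSON (assumption):
#     job = client.load_table_from_file(
#         buffer, "my-project.my_dataset.my_table", rewind=True, job_config=job_config
#     )
```

Passing `rewind=True` is a defensive choice here: it makes the upload start from the beginning even if the buffer has been read before.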
load_table_from_json
load_table_from_json(
json_rows,
destination,
num_retries=6,
job_id=None,
job_id_prefix=None,
location=None,
project=None,
job_config=None,
)
Upload the contents of a table from a JSON string or dict.
Name | Description |
json_rows |
Iterable[Dict[str, Any]]
Row data to be inserted. Keys must match the table schema fields and values must be JSON-compatible representations.
Note: If your data is already a newline-delimited JSON string, it is best to wrap it into a file-like object and pass it to load_table_from_file:
    import io
    from google.cloud import bigquery
    data = u'{"foo": "bar"}'
    # load_table_from_file expects a binary file object, so encode the text first
    data_as_file = io.BytesIO(data.encode("utf-8"))
    client = bigquery.Client()
    client.load_table_from_file(data_as_file, ...) |
destination |
Union[Table, TableReference, str]
Table into which data is to be loaded. If a string is passed in, this method attempts to create a table reference from a string using from_string. |
num_retries |
int, optional
Number of upload retries. |
job_id |
str
(Optional) Name of the job. |
job_id_prefix |
str
(Optional) The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a ``job_id`` is also given. |
location |
str
Location where to run the job. Must match the location of the destination table. |
project |
str
Project ID of the project in which to run the job. Defaults to the client's project. |
job_config |
google.cloud.bigquery.job.LoadJobConfig
(Optional) Extra configuration options for the job. The ``source_format`` setting is always set to NEWLINE_DELIMITED_JSON. |
Type | Description |
google.cloud.bigquery.job.LoadJob | A new load job. |
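A minimal sketch of calling this method, with an early check that every row really is JSON-compatible. ``load_rows`` is a hypothetical helper and ``client`` is assumed to be a ``bigquery.Client``. The ``json.dumps`` pre-check is an addition for illustration, not something the method requires.

```python
import json


def load_rows(client, rows, destination):
    """Validate `rows` locally, then load them with load_table_from_json.

    `client` is assumed to be a google.cloud.bigquery.Client instance.
    """
    # Materialise the iterable so it can be checked first and sent afterwards.
    rows = list(rows)

    for row in rows:
        # json.dumps raises TypeError for values that have no JSON
        # representation, such as datetime objects.
        json.dumps(row)

    job = client.load_table_from_json(rows, destination)
    job.result()  # block until the load job finishes
    return job
```

Failing locally keeps a bad row from starting a load job at all. Values such as timestamps would need to be converted to strings before the call.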
load_table_from_uri
load_table_from_uri(source_uris, destination, job_id=None, job_id_prefix=None, location=None, project=None, job_config=None, retry=<google.api_core.retry.Retry object>)
Starts a job for loading data into a table from Cloud Storage.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#jobconfigurationload
Name | Description |
source_uris |
Union[str, Sequence[str]]
URIs of data files to be loaded; in format |
destination |
Union[Table, TableReference, str]
Table into which data is to be loaded. If a string is passed in, this method attempts to create a table reference from a string using from_string. |
job_id |
str
(Optional) Name of the job. |
job_id_prefix |
str
(Optional) The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a ``job_id`` is also given. |
location |
str
Location where to run the job. Must match the location of the destination table. |
project |
str
Project ID of the project in which to run the job. Defaults to the client's project. |
job_config |
google.cloud.bigquery.job.LoadJobConfig
(Optional) Extra configuration options for the job. |
retry |
google.api_core.retry.Retry
(Optional) How to retry the RPC. |
Type | Description |
google.cloud.bigquery.job.LoadJob | A new load job. |
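A short sketch of starting a load from Cloud Storage and waiting for it. ``load_from_cloud_storage`` is a hypothetical helper, ``client`` is assumed to be a ``bigquery.Client``, and any bucket or table names passed in are the caller's own.

```python
def load_from_cloud_storage(client, source_uris, destination, location=None):
    """Start a load job from one or more Cloud Storage URIs and wait for it.

    `client` is assumed to be a google.cloud.bigquery.Client instance.
    `source_uris` may be a single string or a sequence of strings.
    """
    job = client.load_table_from_uri(
        source_uris,
        destination,
        location=location,  # must match the destination table's location
    )
    job.result()  # block until the load job finishes
    return job
```

Because the method itself only starts the job, calling ``result()`` on the returned job is what surfaces load errors to the caller.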
query
query(query, job_config=None, job_id=None, job_id_prefix=None, location=None, project=None, retry=<google.api_core.retry.Retry object>)
Run a SQL query.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#jobconfigurationquery
Name | Description |
query |
str
SQL query to be executed. Defaults to the standard SQL dialect. Use the |
job_config |
google.cloud.bigquery.job.QueryJobConfig
(Optional) Extra configuration options for the job. To override any options that were previously set in the ``default_query_job_config`` given to the ``Client`` constructor, manually set those options to ``None``, or whatever value is preferred. |
job_id |
str
(Optional) ID to use for the query job. |
job_id_prefix |
str
(Optional) The prefix to use for a randomly generated job ID. This parameter will be ignored if a ``job_id`` is also given. |
location |
str
Location where to run the job. Must match the location of any table used in the query as well as the destination table. |
project |
str
Project ID of the project in which to run the job. Defaults to the client's project. |
retry |
google.api_core.retry.Retry
(Optional) How to retry the RPC. |
Type | Description |
google.cloud.bigquery.job.QueryJob | A new query job instance. |
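A minimal sketch of running a query and collecting its rows. ``run_query`` and the ``adhoc_`` prefix are hypothetical, and ``client`` is assumed to be a ``bigquery.Client``.

```python
def run_query(client, sql, location=None):
    """Run `sql` and return all result rows as a list.

    `client` is assumed to be a google.cloud.bigquery.Client instance.
    """
    job = client.query(
        sql,
        job_id_prefix="adhoc_",  # hypothetical prefix; ignored if job_id is given
        location=location,
    )
    # result() waits for the query job to finish and returns an iterator of rows.
    return list(job.result())
```

Collecting into a list is convenient for small results. For large results, iterating over ``job.result()`` directly avoids holding every row in memory.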
schema_from_json
schema_from_json(file_or_path)
Takes a file object or file path that contains JSON describing a table schema.
schema_to_json
schema_to_json(schema_list, destination)
Takes a list of schema field objects and serializes it as JSON to a file.
``destination`` is a file path or a file object.
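The two schema helpers are natural inverses, which can be sketched as a round trip through a temporary file. ``roundtrip_schema`` is a hypothetical helper, ``client`` is assumed to be a ``bigquery.Client``, and ``schema`` a list of schema field objects.

```python
import os
import tempfile


def roundtrip_schema(client, schema):
    """Write `schema` to a JSON file and read it back through the client.

    `client` is assumed to be a google.cloud.bigquery.Client instance.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "schema.json")
        client.schema_to_json(schema, path)  # destination given as a file path
        return client.schema_from_json(path)  # same path used as the source
```

Keeping the schema in a JSON file this way lets it be versioned alongside code and reloaded when creating tables or load jobs.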
update_dataset
update_dataset(dataset, fields, retry=<google.api_core.retry.Retry object>)
Change some fields of a dataset.
Use ``fields`` to specify which fields to update. At least one field must be provided. If a field is listed in ``fields`` and is ``None`` in ``dataset``, it will be deleted.
If ``dataset.etag`` is not ``None``, the update will only succeed if the dataset on the server has the same ETag. Thus reading a dataset with ``get_dataset``, changing its fields, and then passing it to ``update_dataset`` will ensure that the changes will only be saved if no modifications to the dataset occurred since the read.
Name | Description |
dataset |
google.cloud.bigquery.dataset.Dataset
The dataset to update. |
fields |
Sequence[str]
The properties of |
retry |
google.api_core.retry.Retry, optional
How to retry the RPC. |
Type | Description |
google.cloud.bigquery.dataset.Dataset | The modified ``Dataset`` instance. |
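The read, modify, write pattern described above can be sketched as follows. ``set_dataset_description`` is a hypothetical helper and ``client`` is assumed to be a ``bigquery.Client``.

```python
def set_dataset_description(client, dataset_id, description):
    """Update only the description of a dataset.

    `client` is assumed to be a google.cloud.bigquery.Client instance.
    """
    # Reading first populates dataset.etag, so the update below only succeeds
    # if nobody else modified the dataset in the meantime.
    dataset = client.get_dataset(dataset_id)
    dataset.description = description

    # Only the properties named here are sent in the update.
    return client.update_dataset(dataset, ["description"])
```

Properties changed on the local object but not named in ``fields`` are not part of the update.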
update_model
update_model(model, fields, retry=<google.api_core.retry.Retry object>)
[Beta] Change some fields of a model.
Use ``fields`` to specify which fields to update. At least one field must be provided. If a field is listed in ``fields`` and is ``None`` in ``model``, the field value will be deleted.
If ``model.etag`` is not ``None``, the update will only succeed if the model on the server has the same ETag. Thus reading a model with ``get_model``, changing its fields, and then passing it to ``update_model`` will ensure that the changes will only be saved if no modifications to the model occurred since the read.
Name | Description |
model |
google.cloud.bigquery.model.Model
The model to update. |
fields |
Sequence[str]
The fields of |
retry |
google.api_core.retry.Retry
(Optional) A description of how to retry the API call. |
Type | Description |
google.cloud.bigquery.model.Model | The model resource returned from the API call. |
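A minimal sketch of the same pattern for models. ``set_model_friendly_name`` is a hypothetical helper and ``client`` is assumed to be a ``bigquery.Client``.

```python
def set_model_friendly_name(client, model_id, friendly_name):
    """Update only the friendly name of a model.

    `client` is assumed to be a google.cloud.bigquery.Client instance.
    """
    # get_model returns the current resource, including its ETag.
    model = client.get_model(model_id)
    model.friendly_name = friendly_name

    # Name exactly the properties that should change.
    return client.update_model(model, ["friendly_name"])
```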
update_routine
update_routine(routine, fields, retry=<google.api_core.retry.Retry object>)
[Beta] Change some fields of a routine.
Use ``fields`` to specify which fields to update. At least one field must be provided. If a field is listed in ``fields`` and is ``None`` in ``routine``, the field value will be deleted.
Warning: During beta, partial updates are not supported. You must provide all fields in the resource.
If ``routine.etag`` is not ``None``, the update will only succeed if the resource on the server has the same ETag. Thus reading a routine with ``get_routine``, changing its fields, and then passing it to this method will ensure that the changes will only be saved if no modifications to the resource occurred since the read.
Name | Description |
routine |
google.cloud.bigquery.routine.Routine
The routine to update. |
fields |
Sequence[str]
The fields of |
retry |
google.api_core.retry.Retry
(Optional) A description of how to retry the API call. |
Type | Description |
google.cloud.bigquery.routine.Routine | The routine resource returned from the API call. |
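Because partial updates are not supported during beta, a sketch of changing a routine's body lists the other definition fields as well. ``update_routine_body`` is a hypothetical helper, ``client`` is assumed to be a ``bigquery.Client``, and the field names in the list are an assumption based on the ``Routine`` property names; they may need adjusting for a given routine.

```python
def update_routine_body(client, routine_id, new_body):
    """Replace the body of a routine while resending its other fields.

    `client` is assumed to be a google.cloud.bigquery.Client instance.
    """
    # Reading first means the unchanged fields keep their current values.
    routine = client.get_routine(routine_id)
    routine.body = new_body

    # Assumed field names: the changed field plus the rest of the definition,
    # since partial updates are not supported during beta.
    fields = ["body", "arguments", "language", "type_", "return_type"]
    return client.update_routine(routine, fields)
```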
update_table
update_table(table, fields, retry=<google.api_core.retry.Retry object>)
Change some fields of a table.
Use ``fields`` to specify which fields to update. At least one field must be provided. If a field is listed in ``fields`` and is ``None`` in ``table``, the field value will be deleted.
If ``table.etag`` is not ``None``, the update will only succeed if the table on the server has the same ETag. Thus reading a table with ``get_table``, changing its fields, and then passing it to ``update_table`` will ensure that the changes will only be saved if no modifications to the table occurred since the read.
Name | Description |
table |
google.cloud.bigquery.table.Table
The table to update. |
fields |
Sequence[str]
The fields of |
retry |
google.api_core.retry.Retry
(Optional) A description of how to retry the API call. |
Type | Description |
google.cloud.bigquery.table.Table | The table resource returned from the API call. |
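The deletion rule above (a field listed in ``fields`` whose value is ``None``) can be sketched by clearing a table's expiration time. ``clear_table_expiration`` is a hypothetical helper and ``client`` is assumed to be a ``bigquery.Client``.

```python
def clear_table_expiration(client, table_id):
    """Remove the expiration time from a table.

    `client` is assumed to be a google.cloud.bigquery.Client instance.
    """
    # Reading first populates table.etag for the optimistic concurrency check.
    table = client.get_table(table_id)

    # A listed field that is None on the local object is deleted on the server.
    table.expires = None
    return client.update_table(table, ["expires"])
```

Leaving ``"expires"`` out of ``fields`` would leave the server-side value untouched, even though the local object holds ``None``.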