RowIterator(
client,
api_request,
path,
schema,
page_token=None,
max_results=None,
page_size=None,
extra_params=None,
table=None,
selected_fields=None,
total_rows=None,
first_page_response=None,
location: typing.Optional[str] = None,
job_id: typing.Optional[str] = None,
query_id: typing.Optional[str] = None,
project: typing.Optional[str] = None,
num_dml_affected_rows: typing.Optional[int] = None,
)
A class for iterating through HTTP/JSON API row list responses.
Parameters | |
---|---|
Name | Description |
client |
Optional[google.cloud.bigquery.Client]
The API client instance. This should always be non- |
api_request |
Callable[google.cloud._http.JSONConnection.api_request]
The function to use to make API requests. |
path |
str
The method path to query for the list of items. |
schema |
Sequence[Union[ SchemaField, Mapping[str, Any] ]]
The table's schema. If any item is a mapping, its content must be compatible with from_api_repr. |
page_token |
str
A token identifying a page in a result set to start fetching results from. |
max_results |
Optional[int]
The maximum number of results to fetch. |
page_size |
Optional[int]
The maximum number of rows in each page of results from this request. Non-positive values are ignored. Defaults to a sensible value set by the API. |
extra_params |
Optional[Dict[str, object]]
Extra query string parameters for the API call. |
table |
Optional[Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, ]]
The table which these rows belong to, or a reference to it. Used to call the BigQuery Storage API to fetch rows. |
selected_fields |
Optional[Sequence[google.cloud.bigquery.schema.SchemaField]]
A subset of columns to select from this table. |
total_rows |
Optional[int]
Total number of rows in the table. |
first_page_response |
Optional[dict]
API response for the first page of results. These are returned when the first page is requested. |
Properties
job_id
ID of the query job (if applicable).
To get the job metadata, call job = client.get_job(rows.job_id, location=rows.location).
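As a minimal sketch of the call above, the helper below skips the lookup when no job ID is available. The name `job_for_rows` is hypothetical and not part of the library; `client` is assumed to be a google.cloud.bigquery.Client and `rows` a RowIterator.

```python
def job_for_rows(client, rows):
    """Return job metadata for `rows`, or None when no job ID is set.

    Hypothetical helper for illustration, not part of the library.
    """
    if rows.job_id is None:
        # job_id is only populated "if applicable", so it may be missing.
        return None
    # Pass the location along so the job is looked up where it ran.
    return client.get_job(rows.job_id, location=rows.location)
```

Rows that did not come from a query job would be expected to take the `None` branch.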
location
Location where the query executed (if applicable).
num_dml_affected_rows
If this RowIterator is the result of a DML query, the number of rows that were affected.
pages
Iterator of pages in the response.
Exceptions | |
---|---|
Type | Description |
ValueError |
If the iterator has already been started. |
Returns | |
---|---|
Type | Description |
types.GeneratorType[google.api_core.page_iterator.Page] |
A generator of page instances. |
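Page-level iteration can be sketched as follows. The helper name `rows_per_page` is hypothetical; it only assumes that `rows.pages` yields iterable page objects, as described above. Because it consumes the iterator, accessing `pages` again afterwards would be expected to raise the ValueError listed in the table.

```python
def rows_per_page(rows):
    """Count the rows in each page of `rows`; consumes the iterator.

    Hypothetical helper: `rows` is assumed to be a RowIterator whose
    `pages` property yields iterable page objects.
    """
    return [sum(1 for _ in page) for page in rows.pages]
```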
project
GCP Project ID where these rows are read from.
query_id
[Preview] ID of a completed query.
This ID is auto-generated and not guaranteed to be populated.
schema
List[google.cloud.bigquery.schema.SchemaField]: The subset of columns to be read from the table.
total_rows
int: The total number of rows in the table or query results.
Methods
__iter__
__iter__()
Iterator for each item returned.
Exceptions | |
---|---|
Type | Description |
ValueError |
If the iterator has already been started. |
Returns | |
---|---|
Type | Description |
types.GeneratorType[Any] |
A generator of items from the API. |
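Since the iterator can only be started once, a common pattern is to materialize the rows in a single pass. The sketch below assumes each yielded item exposes a mapping-style `items()` method, as google.cloud.bigquery.table.Row does; `rows_to_dicts` is a hypothetical name.

```python
def rows_to_dicts(rows):
    """Materialize every row as a plain dict in one pass.

    Hypothetical helper: each item yielded by `rows` is assumed to
    provide items() returning (column name, value) pairs.
    """
    return [dict(row.items()) for row in rows]
```

Keeping the resulting list around avoids iterating the same RowIterator a second time.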
to_arrow
to_arrow(
progress_bar_type: typing.Optional[str] = None,
bqstorage_client: typing.Optional[bigquery_storage.BigQueryReadClient] = None,
create_bqstorage_client: bool = True,
) -> pyarrow.Table
[Beta] Create a pyarrow.Table by loading all pages of a table or query.
Parameters | |
---|---|
Name | Description |
progress_bar_type |
Optional[str]
If set, use the |
bqstorage_client |
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires |
create_bqstorage_client |
Optional[bool]
If |
Exceptions | |
---|---|
Type | Description |
ValueError |
If the pyarrow library cannot be imported. New in version 1.17.0. |
to_arrow_iterable
to_arrow_iterable(bqstorage_client: typing.Optional[bigquery_storage.BigQueryReadClient] = None, max_queue_size: int = <object object>) -> typing.Iterator[pyarrow.RecordBatch]
[Beta] Create an iterable of pyarrow.RecordBatch, to process the table as a stream.
Parameters | |
---|---|
Name | Description |
bqstorage_client |
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This method requires the |
max_queue_size |
Optional[int]
The maximum number of result pages to hold in the internal queue when streaming query results over the BigQuery Storage API. Ignored if Storage API is not used. By default, the max queue size is set to the number of BQ Storage streams created by the server. If |
Returns | |
---|---|
Type | Description |
pyarrow.RecordBatch |
A generator of pyarrow.RecordBatch. New in version 2.31.0. |
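Streaming lets a result be processed without holding the whole table in memory. The sketch below tallies rows batch by batch using the `num_rows` attribute of pyarrow.RecordBatch; `count_streamed_rows` is a hypothetical name and `rows` is assumed to be a RowIterator.

```python
def count_streamed_rows(rows):
    """Sum the row counts of all record batches, one batch at a time.

    Hypothetical helper: only one batch is referenced at any moment,
    so memory use stays bounded by the batch size.
    """
    total = 0
    for batch in rows.to_arrow_iterable():
        total += batch.num_rows
    return total
```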
to_dataframe
to_dataframe(
bqstorage_client: typing.Optional[bigquery_storage.BigQueryReadClient] = None,
dtypes: typing.Optional[typing.Dict[str, typing.Any]] = None,
progress_bar_type: typing.Optional[str] = None,
create_bqstorage_client: bool = True,
geography_as_object: bool = False,
bool_dtype: typing.Optional[typing.Any] = DefaultPandasDTypes.BOOL_DTYPE,
int_dtype: typing.Optional[typing.Any] = DefaultPandasDTypes.INT_DTYPE,
float_dtype: typing.Optional[typing.Any] = None,
string_dtype: typing.Optional[typing.Any] = None,
date_dtype: typing.Optional[typing.Any] = DefaultPandasDTypes.DATE_DTYPE,
datetime_dtype: typing.Optional[typing.Any] = None,
time_dtype: typing.Optional[typing.Any] = DefaultPandasDTypes.TIME_DTYPE,
timestamp_dtype: typing.Optional[typing.Any] = None,
range_date_dtype: typing.Optional[
typing.Any
] = DefaultPandasDTypes.RANGE_DATE_DTYPE,
range_datetime_dtype: typing.Optional[
typing.Any
] = DefaultPandasDTypes.RANGE_DATETIME_DTYPE,
range_timestamp_dtype: typing.Optional[
typing.Any
] = DefaultPandasDTypes.RANGE_TIMESTAMP_DTYPE,
) -> pandas.DataFrame
Create a pandas DataFrame by loading all pages of a query.
Parameters | |
---|---|
Name | Description |
bqstorage_client |
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This method requires |
dtypes |
Optional[Map[str, Union[str, pandas.Series.dtype]]]
A dictionary of column names pandas |
progress_bar_type |
Optional[str]
If set, use the |
create_bqstorage_client |
Optional[bool]
If |
geography_as_object |
Optional[bool]
If |
bool_dtype |
Optional[pandas.Series.dtype, None]
If set, indicate a pandas ExtensionDtype (e.g. |
int_dtype |
Optional[pandas.Series.dtype, None]
If set, indicate a pandas ExtensionDtype (e.g. |
float_dtype |
Optional[pandas.Series.dtype, None]
If set, indicate a pandas ExtensionDtype (e.g. |
string_dtype |
Optional[pandas.Series.dtype, None]
If set, indicate a pandas ExtensionDtype (e.g. |
date_dtype |
Optional[pandas.Series.dtype, None]
If set, indicate a pandas ExtensionDtype (e.g. |
datetime_dtype |
Optional[pandas.Series.dtype, None]
If set, indicate a pandas ExtensionDtype (e.g. |
time_dtype |
Optional[pandas.Series.dtype, None]
If set, indicate a pandas ExtensionDtype (e.g. |
timestamp_dtype |
Optional[pandas.Series.dtype, None]
If set, indicate a pandas ExtensionDtype (e.g. |
range_date_dtype |
Optional[pandas.Series.dtype, None]
If set, indicate a pandas ExtensionDtype, such as pandas.ArrowDtype(pyarrow.struct([("start", pyarrow.date32()), ("end", pyarrow.date32())])), to convert BigQuery RANGE
|
range_datetime_dtype |
Optional[pandas.Series.dtype, None]
If set, indicate a pandas ExtensionDtype, such as pandas.ArrowDtype(pyarrow.struct([("start", pyarrow.timestamp("us")), ("end", pyarrow.timestamp("us"))])), to convert BigQuery RANGE
|
range_timestamp_dtype |
Optional[pandas.Series.dtype, None]
If set, indicate a pandas ExtensionDtype, such as pandas.ArrowDtype(pyarrow.struct([("start", pyarrow.timestamp("us", tz="UTC")), ("end", pyarrow.timestamp("us", tz="UTC"))])), to convert BigQuery RANGE
|
Exceptions | |
---|---|
Type | Description |
ValueError |
If the pandas library cannot be imported, or the bigquery_storage_v1 module is required but cannot be imported. Also if geography_as_object is True but the shapely library cannot be imported, or if bool_dtype, int_dtype, or another dtype parameter is not a supported dtype. |
Returns | |
---|---|
Type | Description |
pandas.DataFrame |
A pandas.DataFrame populated with row data and column headers from the query results. The column headers are derived from the destination table's schema. |
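A typical call overrides the dtype of a few columns and leaves the rest at their defaults. The sketch below is illustrative only: `load_frame` and its `float32_columns` argument are hypothetical, and passing `create_bqstorage_client=False` is assumed here to keep the download on the HTTP/JSON API rather than the BigQuery Storage API.

```python
def load_frame(rows, float32_columns=()):
    """Load all pages into a DataFrame, downcasting selected columns.

    Hypothetical helper: `rows` is assumed to be a RowIterator and
    `float32_columns` an iterable of column names to read as float32.
    """
    dtypes = {name: "float32" for name in float32_columns}
    return rows.to_dataframe(dtypes=dtypes, create_bqstorage_client=False)
```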
to_dataframe_iterable
to_dataframe_iterable(bqstorage_client: typing.Optional[bigquery_storage.BigQueryReadClient] = None, dtypes: typing.Optional[typing.Dict[str, typing.Any]] = None, max_queue_size: int = <object object>) -> pandas.DataFrame
Create an iterable of pandas DataFrames, to process the table as a stream.
Parameters | |
---|---|
Name | Description |
bqstorage_client |
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This method requires |
dtypes |
Optional[Map[str, Union[str, pandas.Series.dtype]]]
A dictionary of column names pandas |
max_queue_size |
Optional[int]
The maximum number of result pages to hold in the internal queue when streaming query results over the BigQuery Storage API. Ignored if Storage API is not used. By default, the max queue size is set to the number of BQ Storage streams created by the server. If |
Exceptions | |
---|---|
Type | Description |
ValueError |
If the pandas library cannot be imported. |
Returns | |
---|---|
Type | Description |
pandas.DataFrame |
A generator of pandas.DataFrame. |
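Chunked processing might look like the following sketch, which hands each DataFrame to a caller-supplied callback and discards it before the next one arrives. Both `process_in_chunks` and `handle` are hypothetical names; `rows` is assumed to be a RowIterator.

```python
def process_in_chunks(rows, handle):
    """Apply `handle` to each streamed DataFrame; return the chunk count.

    Hypothetical helper: `handle` is any callable that accepts one
    pandas.DataFrame, for example a function that appends to a file.
    """
    chunks = 0
    for frame in rows.to_dataframe_iterable():
        handle(frame)
        chunks += 1
    return chunks
```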
to_geodataframe
to_geodataframe(
bqstorage_client: typing.Optional[bigquery_storage.BigQueryReadClient] = None,
dtypes: typing.Optional[typing.Dict[str, typing.Any]] = None,
progress_bar_type: typing.Optional[str] = None,
create_bqstorage_client: bool = True,
geography_column: typing.Optional[str] = None,
) -> geopandas.GeoDataFrame
Create a GeoPandas GeoDataFrame by loading all pages of a query.
Parameters | |
---|---|
Name | Description |
bqstorage_client |
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This method requires the |
dtypes |
Optional[Map[str, Union[str, pandas.Series.dtype]]]
A dictionary of column names pandas |
progress_bar_type |
Optional[str]
If set, use the |
create_bqstorage_client |
Optional[bool]
If |
geography_column |
Optional[str]
If there is more than one GEOGRAPHY column, identifies which one to use to construct a geopandas GeoDataFrame. This option can be omitted if there is only one GEOGRAPHY column. |
Exceptions | |
---|---|
Type | Description |
ValueError |
If the geopandas library cannot be imported, or the bigquery_storage_v1 module is required but cannot be imported. New in version 2.24.0. |
Returns | |
---|---|
Type | Description |
geopandas.GeoDataFrame |
A geopandas.GeoDataFrame populated with row data and column headers from the query results. The column headers are derived from the destination table's schema. |