ReadRowsStream(wrapped, client, read_position, read_rows_kwargs)
A stream of results from a read rows request.
This stream is an iterable of ReadRowsResponse. Iterate over it to fetch all row messages.
If the fastavro library is installed, use the rows() method to parse all messages into a stream of row dictionaries.
If the pandas and fastavro libraries are installed, use the to_dataframe() method to parse all messages into a pandas.DataFrame.
Methods
ReadRowsStream
ReadRowsStream(wrapped, client, read_position, read_rows_kwargs)
Construct a ReadRowsStream.
Name | Type | Description |
wrapped | Iterable[ReadRowsResponse] | The ReadRows stream to read. |
client | google.cloud.bigquery_storage_v1beta1.gapic.big_query_storage_client.BigQueryStorageClient | A GAPIC client used to reconnect to a ReadRows stream. This must be the GAPIC client to avoid a circular dependency on this class. |
read_position | Union[dict, google.cloud.bigquery_storage_v1beta1.types.StreamPosition] | Required. Identifier of the position in the stream to start reading from. The offset requested must be less than the last row read from ReadRows. Requesting a larger offset is undefined. If a dict is provided, it must be of the same form as the protobuf message google.cloud.bigquery_storage_v1beta1.types.StreamPosition. |
read_rows_kwargs | dict | Keyword arguments to use when reconnecting to a ReadRows stream. |
Type | Description |
Iterable[ ReadRowsResponse ] | A sequence of row messages. |
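The read_position and read_rows_kwargs arguments exist so that the stream can reconnect after an interruption. The sketch below illustrates that idea in isolation and is not the library's implementation. ResumableStream, reconnect, flaky, and the choice of ConnectionError are hypothetical stand-ins, and each message is assumed to expose a row_count attribute.

```python
from types import SimpleNamespace


class ResumableStream:
    """Illustrative only: resume an interrupted message stream by row offset."""

    def __init__(self, wrapped, reconnect, offset=0):
        self._wrapped = iter(wrapped)
        self._reconnect = reconnect  # hypothetical callable: offset -> iterable
        self._offset = offset        # number of rows delivered so far

    def __iter__(self):
        while True:
            try:
                for message in self._wrapped:
                    # Advance the offset before yielding so that a later
                    # reconnect starts after this message.
                    self._offset += message.row_count
                    yield message
                return  # the stream ended normally
            except ConnectionError:
                # Reopen the stream just past the rows already delivered.
                self._wrapped = iter(self._reconnect(self._offset))


def make_messages():
    # Three fake messages holding two rows each (rows 0-1, 2-3, 4-5).
    return [SimpleNamespace(row_count=2, first_row=i * 2) for i in range(3)]


def flaky(messages, fail_after):
    """Yield fail_after messages, then simulate a dropped connection."""
    for index, message in enumerate(messages):
        if index == fail_after:
            raise ConnectionError("stream dropped")
        yield message


def reconnect(offset):
    # Each fake message holds two rows, so skip offset // 2 messages.
    return make_messages()[offset // 2:]


stream = ResumableStream(flaky(make_messages(), fail_after=1), reconnect)
received = [message.first_row for message in stream]
```

The caller sees one uninterrupted sequence of messages even though the underlying stream failed once. The real class would need the GAPIC client and the stored keyword arguments to reopen the stream.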
__iter__
__iter__()
An iterable of messages.
Type | Description |
Iterable[ ReadRowsResponse ] | A sequence of row messages. |
rows
rows(read_session)
Iterate over all rows in the stream.
This method requires the fastavro library in order to parse row messages.
Name | Type | Description |
read_session | ReadSession | The read session associated with this read rows stream. This contains the schema, which is required to parse the data messages. |
Type | Description |
Iterable[Mapping] | A sequence of rows, represented as dictionaries. |
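Because rows() yields one mapping per row, the result can be consumed lazily with ordinary Python tools. The following is a minimal sketch with a hypothetical helper, count_by_column, and a hypothetical column named state. The list of dictionaries stands in for stream.rows(read_session), which would need a live stream and the fastavro library.

```python
from collections import Counter


def count_by_column(rows, column):
    """Count rows per distinct value of `column` in an iterable of mappings."""
    counts = Counter()
    for row in rows:
        # Each row maps column names to values.
        counts[row[column]] += 1
    return counts


# Stand-in data; with a real stream this would be stream.rows(read_session).
sample_rows = [
    {"name": "Ada", "state": "WA"},
    {"name": "Grace", "state": "NY"},
    {"name": "Alan", "state": "WA"},
]
counts = count_by_column(sample_rows, "state")
```

The helper makes a single pass over the iterable, so it never holds more than one row in memory at a time.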
to_arrow
to_arrow(read_session)
Create a pyarrow.Table of all rows in the stream.
This method requires the pyarrow library and a stream using the Arrow format.
Name | Type | Description |
read_session | ReadSession | The read session associated with this read rows stream. This contains the schema, which is required to parse the data messages. |
Type | Description |
pyarrow.Table | A table of all rows in the stream. |
to_dataframe
to_dataframe(read_session, dtypes=None)
Create a pandas.DataFrame of all rows in the stream.
This method requires the pandas library to create a data frame and the fastavro library to parse row messages.
Name | Type | Description |
read_session | ReadSession | The read session associated with this read rows stream. This contains the schema, which is required to parse the data messages. |
dtypes | Map[str, Union[str, pandas.Series.dtype]] | Optional. A dictionary mapping column names to pandas dtypes. |
Type | Description |
pandas.DataFrame | A data frame of all rows in the stream. |
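The dtypes argument is a plain dictionary keyed by column name. The sketch below builds one that requests float32 for selected columns. The helper narrow_float_dtypes and the column names latitude and longitude are hypothetical. The to_dataframe() call itself appears only in a comment because it needs a live stream plus the pandas and fastavro libraries.

```python
def narrow_float_dtypes(column_names):
    """Build a dtypes mapping that requests float32 for the given columns."""
    # Values may be dtype strings, matching the Union[str, ...] type above.
    return {name: "float32" for name in column_names}


dtypes = narrow_float_dtypes(["latitude", "longitude"])

# With a live stream, the mapping would be passed as:
#     frame = stream.to_dataframe(read_session, dtypes=dtypes)
```

Columns left out of the mapping are unaffected by it.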