Class ReadRowsStream (2.12.0)

ReadRowsStream(client, name, offset, read_rows_kwargs, retry_delay_callback=None)

A stream of results from a read rows request.

This stream is an iterable of ReadRowsResponse. Iterate over it to fetch all row messages.

If the fastavro library is installed, use the rows() method to parse all messages into a stream of row dictionaries.

If the pandas and fastavro libraries are installed, use the to_dataframe() method to parse all messages into a pandas.DataFrame.

This object should not be created directly, but is returned by other methods in this library.

Methods

ReadRowsStream

ReadRowsStream(client, name, offset, read_rows_kwargs, retry_delay_callback=None)

Construct a ReadRowsStream.

Parameters
Name Description
client google.cloud.bigquery_storage_v1.services big_query_read.BigQueryReadClient

A GAPIC client used to reconnect to a ReadRows stream. This must be the GAPIC client to avoid a circular dependency on this class.

name str

Required. Stream ID from which rows are being read.

offset int

Required. Position in the stream to start reading from. The offset requested must be less than the last row read from ReadRows. Requesting a larger offset is undefined.

read_rows_kwargs dict

Keyword arguments to use when reconnecting to a ReadRows stream.

retry_delay_callback Optional[Callable[[float], None]]

If the client receives a retryable error that asks the client to delay its next attempt and retry_delay_callback is not None, ReadRowsStream will call retry_delay_callback with the delay duration (in seconds) before it starts sleeping until the next attempt.

Returns
Type Description
Iterable[ google.cloud.bigquery_storage_v1.servicesReadRowsResponse ] A sequence of row messages.

__iter__

__iter__()

An iterable of messages.

Returns
Type Description
Iterable[ ReadRowsResponse ] A sequence of row messages.

rows

rows(read_session=None)

Iterate over all rows in the stream.

This method requires the fastavro library in order to parse row messages in avro format. For arrow format messages, the pyarrow library is required.

Parameter
Name Description
read_session Optional[ReadSession]

DEPRECATED. This argument was used to specify the schema of the rows in the stream, but now the first message in a read stream contains this information.

Returns
Type Description
Iterable[Mapping] A sequence of rows, represented as dictionaries.

to_arrow

to_arrow(read_session=None)

Create a pyarrow.Table of all rows in the stream.

This method requires the pyarrow library and a stream using the Arrow format.

Parameter
Name Description
read_session ReadSession

DEPRECATED. This argument was used to specify the schema of the rows in the stream, but now the first message in a read stream contains this information.

Returns
Type Description
pyarrow.Table A table of all rows in the stream.

to_dataframe

to_dataframe(read_session=None, dtypes=None)

Create a pandas.DataFrame of all rows in the stream.

This method requires the pandas libary to create a data frame and the fastavro library to parse row messages.

Parameters
Name Description
read_session ReadSession

DEPRECATED. This argument was used to specify the schema of the rows in the stream, but now the first message in a read stream contains this information.

dtypes Map[str, Union[str, pandas.Series.dtype]]

Optional. A dictionary of column names pandas dtypes. The provided dtype is used when constructing the series for the column specified. Otherwise, the default pandas behavior is used.

Returns
Type Description
pandas.DataFrame A data frame of all rows in the stream.