Row Data

Container for Google Cloud Bigtable Cells and Streaming Row Contents.

class google.cloud.bigtable.row_data.Cell(value, timestamp_micros, labels=None)

Bases: object

Representation of a Google Cloud Bigtable Cell.

  • Parameters

    • value (bytes) – The value stored in the cell.

    • timestamp_micros (int) – The timestamp (in microseconds) when the cell was stored.

    • labels (list) – (Optional) List of strings. Labels applied to the cell.
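
To illustrate the timestamp_micros parameter, the sketch below converts a microsecond timestamp into a timezone-aware datetime. It assumes the value counts microseconds since the Unix epoch. The helper name micros_to_datetime is hypothetical and is not part of the library.

```python
from datetime import datetime, timedelta, timezone

# Reference point for the conversion (assumed: Unix epoch, UTC).
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_datetime(timestamp_micros):
    """Convert microseconds since the epoch to an aware UTC datetime.

    Hypothetical helper for illustration. timedelta keeps the arithmetic
    in integers, so no precision is lost to float rounding.
    """
    return _EPOCH + timedelta(microseconds=timestamp_micros)


stamp = micros_to_datetime(1_500_000_000_123_000)
print(stamp.isoformat())
```

Keeping the value as an integer until the final conversion preserves the full microsecond precision of the stored timestamp.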

classmethod from_pb(cell_pb)

Create a new cell from a Cell protobuf.

  • Parameters

    cell_pb (_generated.data_pb2.Cell) – The protobuf to convert.

  • Return type

    Cell

  • Returns

    The cell corresponding to the protobuf.

google.cloud.bigtable.row_data.DEFAULT_RETRY_READ_ROWS = <google.api_core.retry.Retry object>

The default retry strategy to be used on retry-able errors.

Used by _read_next_response().

exception google.cloud.bigtable.row_data.InvalidChunk()

Bases: RuntimeError

Exception raised due to invalid chunk data from the back-end.

exception google.cloud.bigtable.row_data.InvalidReadRowsResponse()

Bases: RuntimeError

Exception raised due to invalid response data from the back-end.

class google.cloud.bigtable.row_data.PartialCellData(row_key, family_name, qualifier, timestamp_micros, labels=(), value=b'')

Bases: object

Representation of partial cell in a Google Cloud Bigtable Table.

These are expected to be updated directly from a _generated.bigtable_service_messages_pb2.ReadRowsResponse.

  • Parameters

    • row_key (bytes) – The key for the row holding the (partial) cell.

    • family_name (str) – The family name of the (partial) cell.

    • qualifier (bytes) – The column qualifier of the (partial) cell.

    • timestamp_micros (int) – The timestamp (in microseconds) of the (partial) cell.

    • labels (list of str) – Labels assigned to the (partial) cell.

    • value (bytes) – The (accumulated) value of the (partial) cell.

append_value(value)

Append bytes from a new chunk to value.

  • Parameters

    value (bytes) – bytes to append
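
The accumulation described above can be sketched with a small stand-in class. PartialCellSketch is hypothetical and only mirrors the documented behaviour of append_value(); it is not the library's PartialCellData.

```python
class PartialCellSketch:
    """Stand-in for illustration only; holds an accumulated bytes value."""

    def __init__(self, value=b""):
        self.value = value

    def append_value(self, value):
        # Concatenate the bytes of the new chunk onto the accumulated value.
        self.value += value


cell = PartialCellSketch()
# A large cell value may arrive split across several chunks.
for chunk in (b"hel", b"lo ", b"world"):
    cell.append_value(chunk)

print(cell.value)
```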

class google.cloud.bigtable.row_data.PartialRowData(row_key)

Bases: object

Representation of partial row in a Google Cloud Bigtable Table.

These are expected to be updated directly from a _generated.bigtable_service_messages_pb2.ReadRowsResponse.

  • Parameters

    row_key (bytes) – The key for the row holding the (partial) data.

cell_value(column_family_id, column, index=0)

Get a single cell value stored on this instance.

For example:

from google.cloud.bigtable import Client

client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)
row_key = "row_key_1"
row_data = table.read_row(row_key)

cell_value = row_data.cell_value(COLUMN_FAMILY_ID, COL_NAME1)

  • Parameters

    • column_family_id (str) – The ID of the column family. Must be of the form [_a-zA-Z0-9][-_.a-zA-Z0-9]\*.

    • column (bytes) – The column within the column family where the cell is located.

    • index (Optional[int]) – The offset within the series of values. If not specified, will return the first cell.

  • Returns

    The cell value stored in the specified column and specified index.

  • Return type

    Cell value

  • Raises

    • KeyError – If column_family_id is not among the cells stored in this row.

    • KeyError – If column is not among the cells stored in this row for the given column_family_id.

    • IndexError – If index cannot be found within the cells stored in this row for the given column_family_id, column pair.
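
The lookup and its three failure modes can be sketched with plain dictionaries. This is a simplified model under stated assumptions, not the library's implementation: CellSketch and cell_value_sketch are hypothetical names, and the errors here simply propagate from the underlying dict and list lookups.

```python
from collections import namedtuple

# Stand-in for the library's Cell; only the two fields used here.
CellSketch = namedtuple("CellSketch", ["value", "timestamp_micros"])

# Two-level mapping: column family ID -> column qualifier -> list of cells.
cells = {
    "cf1": {
        b"greeting": [
            CellSketch(b"hello", 2000),
            CellSketch(b"hi", 1000),
        ]
    }
}


def cell_value_sketch(cells, column_family_id, column, index=0):
    """Return the value at `index`; KeyError and IndexError propagate."""
    return cells[column_family_id][column][index].value


first = cell_value_sketch(cells, "cf1", b"greeting")
second = cell_value_sketch(cells, "cf1", b"greeting", index=1)

# Collect the exception type raised by each kind of bad lookup.
raised = []
bad_lookups = [
    ("missing", b"greeting", 0),  # unknown column family
    ("cf1", b"missing", 0),       # unknown column in a known family
    ("cf1", b"greeting", 5),      # index past the end of the series
]
for family, column, index in bad_lookups:
    try:
        cell_value_sketch(cells, family, column, index)
    except (KeyError, IndexError) as exc:
        raised.append(type(exc).__name__)

print(first, second, raised)
```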

cell_values(column_family_id, column, max_count=None)

Get a time series of cells stored on this instance.

For example:

from google.cloud.bigtable import Client

client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)
row_key = "row_key_1"
row_data = table.read_row(row_key)

cell_values = row_data.cell_values(COLUMN_FAMILY_ID, COL_NAME1)

  • Parameters

    • column_family_id (str) – The ID of the column family. Must be of the form [_a-zA-Z0-9][-_.a-zA-Z0-9]\*.

    • column (bytes) – The column within the column family where the cells are located.

    • max_count (int) – The maximum number of cells to use.

  • Returns

    A generator that yields cell.value, cell.timestamp_micros for each cell in the list of cells.

  • Return type

    generator

  • Raises

    • KeyError – If column_family_id is not among the cells stored in this row.

    • KeyError – If column is not among the cells stored in this row for the given column_family_id.
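
The generator behaviour and the effect of max_count can be sketched as follows. This is a simplified model, not the library's code: CellSketch and cell_values_sketch are hypothetical, and the sketch takes the list of cells directly instead of looking it up by column family and column.

```python
from collections import namedtuple
from itertools import islice

# Stand-in for the library's Cell; not the real class.
CellSketch = namedtuple("CellSketch", ["value", "timestamp_micros"])


def cell_values_sketch(cells, max_count=None):
    """Yield (value, timestamp_micros) pairs for at most max_count cells."""
    # islice(cells, None) walks the whole list, so None means "no limit".
    for cell in islice(cells, max_count):
        yield cell.value, cell.timestamp_micros


series = [
    CellSketch(b"v3", 3000),
    CellSketch(b"v2", 2000),
    CellSketch(b"v1", 1000),
]

first_two = list(cell_values_sketch(series, max_count=2))
everything = list(cell_values_sketch(series))
print(first_two)
```

Because the result is a generator, nothing is produced until it is iterated, and each pair can be unpacked directly in a for loop.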

property cells()

Property returning all the cells accumulated on this partial row.

For example:

from google.cloud.bigtable import Client

client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)
row_key = "row_key_1"
row_data = table.read_row(row_key)

cells = row_data.cells

  • Return type

    dict

  • Returns

    Dictionary of the Cell objects accumulated. This dictionary has two levels of keys (first for column families and second for column names/qualifiers within a family). For a given column, a list of Cell objects is stored.

find_cells(column_family_id, column)

Get a time series of cells stored on this instance.

For example:

from google.cloud.bigtable import Client

client = Client(admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)
row_key = "row_key_1"
row = table.read_row(row_key)

cells = row.find_cells(COLUMN_FAMILY_ID, COL_NAME2)

  • Parameters

    • column_family_id (str) – The ID of the column family. Must be of the form [_a-zA-Z0-9][-_.a-zA-Z0-9]\*.

    • column (bytes) – The column within the column family where the cells are located.

  • Returns

    The cells stored in the specified column.

  • Return type

    List[Cell]

  • Raises

    • KeyError – If column_family_id is not among the cells stored in this row.

    • KeyError – If column is not among the cells stored in this row for the given column_family_id.

property row_key()

Getter for the current (partial) row’s key.

  • Return type

    bytes

  • Returns

    The current (partial) row’s key.

to_dict()

Convert the cells to a dictionary.

This is intended to be used with HappyBase, so the column family and column qualifiers are combined (with :).

  • Return type

    dict

  • Returns

    Dictionary containing all the data in the cells of this row.
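
The flattening described above can be sketched with plain dictionaries. This is an approximation under stated assumptions: the function name to_dict_sketch is hypothetical, family IDs are taken to be str and qualifiers bytes, and the combined keys are built as bytes. The exact key type used by the library may differ.

```python
def to_dict_sketch(cells):
    """Flatten {family: {qualifier: [cells]}} into {b"family:qualifier": [cells]}."""
    result = {}
    for family_id, columns in cells.items():
        for qualifier, cell_list in columns.items():
            # Join the family ID and the qualifier with a colon, HappyBase style.
            key = family_id.encode("utf-8") + b":" + qualifier
            result[key] = cell_list
    return result


# Cell values are shown as plain bytes here to keep the sketch short.
nested = {
    "cf1": {b"name": [b"alice"], b"email": [b"alice@example.com"]},
    "cf2": {b"age": [b"30"]},
}

flat = to_dict_sketch(nested)
print(sorted(flat))
```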

class google.cloud.bigtable.row_data.PartialRowsData(read_method, request, retry=<google.api_core.retry.Retry object>)

Bases: object

Convenience wrapper for consuming a ReadRows streaming response.

  • Parameters

    • read_method (client._table_data_client.read_rows) – ReadRows method.

    • request (data_messages_v2_pb2.ReadRowsRequest) – The ReadRowsRequest message used to create a ReadRowsResponse iterator. If the iterator fails, a new iterator is created, allowing the scan to continue from the point just beyond the last successfully read row, identified by self.last_scanned_row_key. The retry happens inside of the Retry class, using a predicate for the expected exceptions during iteration.

    • retry (Retry) – (Optional) Retry delay and deadline arguments. To override the defaults, start from DEFAULT_RETRY_READ_ROWS and modify it with the with_delay() method or the with_deadline() method.
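
The resume-on-failure behaviour described for the request parameter can be sketched in plain Python. Everything here is hypothetical and simplified: TransientError, flaky_read, and read_with_resume are illustrative names, the stream is an in-memory list of row keys, and there is no backoff delay or deadline such as a real Retry object would apply.

```python
class TransientError(Exception):
    """Hypothetical stand-in for a retry-able stream error."""


ROWS = [b"row-1", b"row-2", b"row-3", b"row-4"]  # sorted by row key


def flaky_read(start_after=None, fail_after=None):
    """Yield row keys greater than start_after; optionally fail mid-stream."""
    remaining = [key for key in ROWS if start_after is None or key > start_after]
    for count, key in enumerate(remaining):
        if fail_after is not None and count == fail_after:
            raise TransientError("stream dropped")
        yield key


def read_with_resume(max_attempts=3):
    """Yield every row once, restarting just beyond the last row read."""
    last_key = None
    for attempt in range(max_attempts):
        # Only the first attempt is made to fail in this sketch.
        fail_after = 2 if attempt == 0 else None
        try:
            for key in flaky_read(start_after=last_key, fail_after=fail_after):
                last_key = key
                yield key
            return  # stream finished cleanly
        except TransientError:
            continue  # open a new stream starting beyond last_key
    raise TransientError("retries exhausted")


rows = list(read_with_resume())
print(rows)
```

Tracking the last successfully read key is what lets the second stream skip rows that were already delivered, so the caller sees each row exactly once and still in key order.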

__iter__()

Consume the ReadRowsResponse messages from the stream. Read the rows and yield each to the reader.

Parse the response and its chunks into a new/existing row in _rows. Rows are returned in order by row key.

cancel()

Cancels the iterator, closing the stream.

consume_all(max_loops=None)

Consume the streamed responses until there are no more.

WARNING: This method will be removed in future releases. Please use this class as a generator instead.

  • Parameters

    max_loops (int) – (Optional) Maximum number of times to try to consume an additional ReadRowsResponse. You can use this to avoid long wait times.

property state()

State machine state.

  • Return type

    str

  • Returns

    Name of the state corresponding to the current row / chunk processing.