Cloud Bigtable Client - Class Filter (1.23.0)

Reference documentation and code samples for the Cloud Bigtable Client class Filter.

This class houses static factory methods which can be used to create a hierarchy of filters for use with Google\Cloud\Bigtable\Table::checkAndMutateRow() or Google\Cloud\Bigtable\Table::readRows().

Filters are used to take an input row and produce an alternate view of the row based on the specified rules. For example, a filter might trim down a row to include just the cells from columns matching a given regular expression, or might return all the cells of a row but not their values. More complicated filters can be composed out of these components to express requests such as, "within every column of a particular family, give just the two most recent cells which are older than timestamp X."

There are two broad categories of filters (true filters and transformers), as well as two ways to compose simple filters into more complex ones (chains and interleaves). They work as follows:

True filters alter the input row by excluding some of its cells wholesale from the output row. An example of a true filter is Google\Cloud\Bigtable\Filter\Builder\ValueFilter::regex(), which excludes cells whose values don't match the specified pattern. All regex true filters use RE2 syntax in raw byte mode (RE2::Latin1), and are evaluated as full matches. An important point to keep in mind is that RE2(.) is equivalent by default to RE2([^\n]), meaning that it does not match newlines.
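
The full-match and newline behavior described above can be approximated locally. The sketch below is only an illustration: it uses PHP's PCRE functions rather than RE2, and fullMatch() is a hypothetical helper, not part of the client library. It assumes the pattern does not contain the '#' character used as the delimiter.

```php
<?php
// Hypothetical helper: approximates full-match semantics with PCRE (not RE2).
// Limitation: the pattern must not contain the '#' delimiter.
function fullMatch(string $pattern, string $value): bool
{
    // \A and \z anchor to the absolute start and end of the value, so the
    // whole value has to match. Without the 'u' modifier, matching is
    // byte-oriented, and without the 's' modifier, '.' skips newlines.
    return preg_match('#\A(?:' . $pattern . ')\z#', $value) === 1;
}

var_dump(fullMatch('prefix.*', 'prefix-123')); // bool(true)
var_dump(fullMatch('prefix', 'prefix-123'));   // bool(false): not a full match
var_dump(fullMatch('a.c', "a\nc"));            // bool(false): '.' skips "\n"
```

In practice this means a pattern such as 'prefix' needs a trailing '.*' to behave like a prefix match.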

Transformers alter the input row by changing the values of some of its cells in the output, without excluding them completely. An example of such a transformer is Google\Cloud\Bigtable\Filter\Builder\ValueFilter::strip().
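
To make the difference between the two categories concrete, here is a toy model in plain PHP. It does not use the client library: a row is represented as a list of cell arrays, and valueRegexFilter() and stripValues() are hypothetical stand-ins for a true filter and a transformer.

```php
<?php
// Toy model, not the client library: a row is a list of cells.
$row = [
    ['family' => 'cf', 'qualifier' => 'name', 'timestamp' => 10, 'value' => 'alice'],
    ['family' => 'cf', 'qualifier' => 'note', 'timestamp' => 11, 'value' => 'bob'],
];

// True filter: drops whole cells whose value does not fully match the pattern.
function valueRegexFilter(array $row, string $pattern): array
{
    return array_values(array_filter(
        $row,
        fn (array $cell): bool =>
            preg_match('#\A(?:' . $pattern . ')\z#', $cell['value']) === 1
    ));
}

// Transformer: keeps every cell but replaces its value with an empty string.
function stripValues(array $row): array
{
    return array_map(
        fn (array $cell): array => array_replace($cell, ['value' => '']),
        $row
    );
}

print_r(array_column(valueRegexFilter($row, 'a.*'), 'qualifier')); // only "name" is left
print_r(array_column(stripValues($row), 'value'));                 // two empty strings
```

The true filter changes how many cells come out; the transformer keeps the cell count and changes only the cell contents.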

The total serialized size of a filter message must not exceed 4096 bytes, and filters may not be nested within each other (in Chains or Interleaves) to a depth of more than 20.

Example:

use Google\Cloud\Bigtable\BigtableClient;
use Google\Cloud\Bigtable\Filter;

$bigtable = new BigtableClient();
$table = $bigtable->table('my-instance', 'my-table');
$rowFilter = Filter::chain()
    ->addFilter(Filter::qualifier()->regex('prefix.*'))
    ->addFilter(Filter::limit()->cellsPerRow(10));

$rows = $table->readRows([
    'filter' => $rowFilter
]);

foreach ($rows as $row) {
    print_r($row);
}

Methods

chain

Creates an empty chain filter.

Filters can be added to the chain by invoking Google\Cloud\Bigtable\Filter\ChainFilter::addFilter().

The filters are applied in sequence, progressively narrowing the results. The full chain is executed atomically.

Conceptually, the process looks like the following: in row -> filter0 -> intermediate row -> filter1 -> ... -> filterN -> out row.

Example:

use Google\Cloud\Bigtable\Filter;

$rowFilter = Filter::chain()
    ->addFilter(Filter::qualifier()->regex('prefix.*'))
    ->addFilter(Filter::limit()->cellsPerRow(10));

Returns
Type: Google\Cloud\Bigtable\Filter\ChainFilter
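
The "in row -> filter0 -> ... -> out row" pipeline can be sketched in plain PHP. This is a toy model rather than the client library: applyChain() and the two closures are hypothetical, and each filter is assumed to be a callable that takes a list of cells and returns a list of cells.

```php
<?php
// Toy model, not the client library: a filter is a callable array -> array.
function applyChain(array $row, callable ...$filters): array
{
    foreach ($filters as $filter) {
        $row = $filter($row); // each filter sees the previous filter's output
    }
    return $row;
}

$row = [
    ['qualifier' => 'prefix-a', 'value' => '1'],
    ['qualifier' => 'other',    'value' => '2'],
    ['qualifier' => 'prefix-b', 'value' => '3'],
    ['qualifier' => 'prefix-c', 'value' => '4'],
];

// Keeps only cells whose qualifier starts with "prefix".
$qualifierPrefix = fn (array $cells): array => array_values(array_filter(
    $cells,
    fn (array $cell): bool => strncmp($cell['qualifier'], 'prefix', 6) === 0
));
// Keeps at most the first two cells of the row.
$firstTwoCells = fn (array $cells): array => array_slice($cells, 0, 2);

$out = applyChain($row, $qualifierPrefix, $firstTwoCells);
print_r(array_column($out, 'qualifier')); // prefix-a, prefix-b
```

Because each stage consumes the previous stage's output, order matters: in this model, running $firstTwoCells before $qualifierPrefix leaves only prefix-a.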

interleave

Creates an empty interleave filter.

Filters can be added to the interleave by invoking Google\Cloud\Bigtable\Filter\InterleaveFilter::addFilter().

The supplied filters all process a copy of the input row, and the results are pooled, sorted, and combined into a single output row. If multiple cells are produced with the same column and timestamp, they will all appear in the output row in an unspecified mutual order. All interleaved filters are executed atomically.

Consider the following example, with three filters:

                                 input row
                                     |
           -----------------------------------------------------
           |                         |                         |
        filter1                   filter2                   filter3
           |                         |                         |
    1: foo,bar,10,x             foo,bar,10,z              far,bar,7,a
    2: foo,blah,11,z            far,blah,5,x              far,blah,5,x
           |                         |                         |
           -----------------------------------------------------
                                     |
    1:                      foo,bar,10,z   // could have switched with #2
    2:                      foo,bar,10,x   // could have switched with #1
    3:                      foo,blah,11,z
    4:                      far,bar,7,a
    5:                      far,blah,5,x   // identical to #6
    6:                      far,blah,5,x   // identical to #5

Example:

use Google\Cloud\Bigtable\Filter;

$rowFilter = Filter::interleave()
    ->addFilter(Filter::key()->regex('prefix.*'))
    ->addFilter(Filter::sink());

Returns
Type: Google\Cloud\Bigtable\Filter\InterleaveFilter
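
The pooling step from the three-filter diagram above can be sketched in plain PHP. This is a toy model rather than the client library: applyInterleave() is hypothetical, each filter is a callable that simply returns the cells shown in the diagram, and the sorting performed by the service is deliberately left out.

```php
<?php
// Toy model, not the client library: a filter is a callable array -> array.
function applyInterleave(array $row, callable ...$filters): array
{
    $pooled = [];
    foreach ($filters as $filter) {
        // PHP passes arrays by value, so every filter sees the same input row.
        foreach ($filter($row) as $cell) {
            $pooled[] = $cell; // duplicates are kept, not merged
        }
    }
    return $pooled; // unsorted here; the service also sorts the pooled cells
}

// Each cell is [family, qualifier, timestamp, value], as in the diagram.
$filter1 = fn (array $row): array => [['foo', 'bar', 10, 'x'], ['foo', 'blah', 11, 'z']];
$filter2 = fn (array $row): array => [['foo', 'bar', 10, 'z'], ['far', 'blah', 5, 'x']];
$filter3 = fn (array $row): array => [['far', 'bar', 7, 'a'], ['far', 'blah', 5, 'x']];

$out = applyInterleave([], $filter1, $filter2, $filter3);
$duplicates = array_filter($out, fn (array $cell): bool => $cell === ['far', 'blah', 5, 'x']);

echo count($out), PHP_EOL;        // 6
echo count($duplicates), PHP_EOL; // 2
```

All six cells survive, including the two identical far,blah,5,x cells, which matches rows 5 and 6 of the diagram.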

condition

Creates a condition filter.

If the predicate filter outputs any cells, the filter configured by Google\Cloud\Bigtable\Filter\ConditionFilter::then() is applied. Conversely, if the predicate outputs no cells, the filter configured by Google\Cloud\Bigtable\Filter\ConditionFilter::otherwise() is applied instead.

IMPORTANT NOTE: The predicate filter does not execute atomically with the Google\Cloud\Bigtable\Filter\ConditionFilter::then() and Google\Cloud\Bigtable\Filter\ConditionFilter::otherwise() filters, which may lead to inconsistent or unexpected results. Additionally, Google\Cloud\Bigtable\Filter\ConditionFilter may perform poorly, especially when a filter is set for Google\Cloud\Bigtable\Filter\ConditionFilter::otherwise().

Example:

use Google\Cloud\Bigtable\Filter;

$rowFilter = Filter::condition(Filter::key()->regex('prefix.*'))
    ->then(Filter::label('hasPrefix'))
    ->otherwise(Filter::value()->strip());

Parameter
Name: predicateFilter
Type: Google\Cloud\Bigtable\Filter\FilterInterface
Description: A predicate filter.

Returns
Type: Google\Cloud\Bigtable\Filter\ConditionFilter
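
The branching behavior can be sketched in plain PHP. This is a toy model rather than the client library: applyCondition() and the three closures are hypothetical, a row is represented as an array with 'key' and 'cells' entries, and the model assumes the chosen branch runs on the original input row rather than on the predicate's output. The non-atomic execution mentioned in the note above is not modeled.

```php
<?php
// Toy model, not the client library: a filter is a callable row -> cells.
function applyCondition(
    array $row,
    callable $predicate,
    callable $then,
    callable $otherwise
): array {
    // The predicate's output is only checked for emptiness; the chosen branch
    // runs on the original input row (an assumption of this sketch).
    return count($predicate($row)) > 0 ? $then($row) : $otherwise($row);
}

// Predicate: passes the cells through only if the row key starts with "prefix".
$keyHasPrefix = fn (array $row): array =>
    strncmp($row['key'], 'prefix', 6) === 0 ? $row['cells'] : [];
// "then" branch: attaches a label to every cell.
$labelCells = fn (array $row): array => array_map(
    fn (array $cell): array => array_replace($cell, ['labels' => ['hasPrefix']]),
    $row['cells']
);
// "otherwise" branch: blanks every cell value.
$stripValues = fn (array $row): array => array_map(
    fn (array $cell): array => array_replace($cell, ['value' => '']),
    $row['cells']
);

$cells = [['qualifier' => 'q', 'value' => 'v']];
$matched = applyCondition(['key' => 'prefix-1', 'cells' => $cells], $keyHasPrefix, $labelCells, $stripValues);
$unmatched = applyCondition(['key' => 'other', 'cells' => $cells], $keyHasPrefix, $labelCells, $stripValues);

print_r($matched);   // value "v" kept, labels ["hasPrefix"] added
print_r($unmatched); // value replaced by "", no labels
```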

key

Returns a builder used to configure row key filters.

Example:

use Google\Cloud\Bigtable\Filter;

$rowFilter = Filter::key()
    ->regex('prefix.*');

Returns
Type: Google\Cloud\Bigtable\Filter\Builder\KeyFilter

family

Returns a builder used to configure column family filters.

Example:

use Google\Cloud\Bigtable\Filter;

$rowFilter = Filter::family()
    ->regex('prefix.*');

Returns
Type: Google\Cloud\Bigtable\Filter\Builder\FamilyFilter

qualifier

Returns a builder used to configure column qualifier filters.

Example:

use Google\Cloud\Bigtable\Filter;

$rowFilter = Filter::qualifier()
    ->regex('prefix.*');

Returns
Type: Google\Cloud\Bigtable\Filter\Builder\QualifierFilter

timestamp

Returns a builder used to configure timestamp related filters.

Example:

use Google\Cloud\Bigtable\Filter;

$rowFilter = Filter::timestamp()
    ->range()
    ->of(1536766964380000, 1536766964383000);

Returns
Type: Google\Cloud\Bigtable\Filter\Builder\TimestampFilter

value

Returns a builder used to configure value related filters.

Example:

use Google\Cloud\Bigtable\Filter;

$rowFilter = Filter::value()
    ->range()
    ->of('a', 'z');

Returns
Type: Google\Cloud\Bigtable\Filter\Builder\ValueFilter

offset

Returns a builder used to configure offset related filters.

Example:

use Google\Cloud\Bigtable\Filter;

$rowFilter = Filter::offset()
    ->cellsPerRow(1);

Returns
Type: Google\Cloud\Bigtable\Filter\Builder\OffsetFilter

limit

Returns a builder used to configure limit related filters.

Example:

use Google\Cloud\Bigtable\Filter;

$rowFilter = Filter::limit()
    ->cellsPerRow(1);

Returns
Type: Google\Cloud\Bigtable\Filter\Builder\LimitFilter

pass

Matches all cells, regardless of input. Functionally equivalent to having no filter.

Example:

use Google\Cloud\Bigtable\Filter;

$rowFilter = Filter::pass();

Returns
Type: Google\Cloud\Bigtable\Filter\SimpleFilter

block

Does not match any cells, regardless of input. Useful for temporarily disabling just part of a filter.

Example:

use Google\Cloud\Bigtable\Filter;

$rowFilter = Filter::block();

Returns
Type: Google\Cloud\Bigtable\Filter\SimpleFilter

sink

Outputs all cells directly to the output of the read rather than to any parent filter. This filter is intended for advanced usage.

Example:

use Google\Cloud\Bigtable\Filter;

$rowFilter = Filter::sink();

Returns
Type: Google\Cloud\Bigtable\Filter\SimpleFilter

label

Applies the given label to all cells in the output row. This allows the caller to determine which results were produced from which part of the filter.

Due to a technical limitation, it is not currently possible to apply multiple labels to a cell. As a result, a Google\Cloud\Bigtable\Filter\ChainFilter may have no more than one sub-filter which contains a label. It is okay for a Google\Cloud\Bigtable\Filter\InterleaveFilter to contain multiple labels, as they will be applied to separate copies of the input. This restriction may be relaxed in the future.

Example:

use Google\Cloud\Bigtable\Filter;

$rowFilter = Filter::label('my-label');

Parameter
Name: value
Type: string
Description: The label to apply.

Returns
Type: Google\Cloud\Bigtable\Filter\SimpleFilter