ByteKeyRange (Google Cloud Dataflow SDK 1.9.1 API)

Google Cloud Dataflow SDK for Java, version 1.9.1

com.google.cloud.dataflow.sdk.io.range

Class ByteKeyRange

  • All Implemented Interfaces:
    Serializable


    public final class ByteKeyRange
    extends Object
    implements Serializable
    A class representing a range of ByteKeys.

    Instances of ByteKeyRange are immutable.

    A ByteKeyRange enforces the restriction that its start and end keys must form a valid, non-empty range [startKey, endKey) that is inclusive of the start key and exclusive of the end key.

    When the end key is empty, it is treated as the largest possible key.
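    The half-open range semantics described above can be illustrated with a minimal standalone sketch. It does not use the SDK: the class HalfOpenRangeSketch and its methods are hypothetical names, keys are modeled as plain byte[] values, and lexicographic comparison on unsigned bytes is assumed as the key ordering.

```java
// Hypothetical helper for illustration only; not part of the Dataflow SDK.
final class HalfOpenRangeSketch {

    // Lexicographic comparison treating each byte as unsigned (assumed key ordering).
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        // A key that is a strict prefix of another sorts first.
        return Integer.compare(a.length, b.length);
    }

    // True if key lies in [start, end). An empty end means "largest possible key".
    static boolean contains(byte[] start, byte[] end, byte[] key) {
        boolean atOrAfterStart = compareUnsigned(key, start) >= 0;
        boolean beforeEnd = end.length == 0 || compareUnsigned(key, end) < 0;
        return atOrAfterStart && beforeEnd;
    }
}
```

    In this sketch the start key itself is contained in the range, the end key is not, and an empty end key admits every key at or after the start.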

    Interpreting ByteKey in a ByteKeyRange

    The primary role of ByteKeyRange is to provide estimateFractionForKey(ByteKey), interpolateKey(double), and split(int), which support Google Cloud Dataflow's Autoscaling and Dynamic Work Rebalancing features.

    ByteKeyRange implements these features by treating a ByteKey's underlying byte[] as the binary expansion of a floating-point number in the range [0.0, 1.0]. For example, the keys ByteKey.of(0x80), ByteKey.of(0xc0), and ByteKey.of(0xe0) are interpreted as 0.5, 0.75, and 0.875 respectively. The empty key ByteKey.EMPTY is interpreted as 0.0 when used as the start key of a range and as 1.0 when used as the end key.
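    As a rough illustration of this interpretation, the sketch below reads a byte[] as the digits after the binary point. It is a standalone approximation, not the SDK's implementation: ByteKeyFractionSketch and toFraction are hypothetical names, and an empty key is treated as a start key (0.0).

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper for illustration only; not part of the Dataflow SDK.
final class ByteKeyFractionSketch {

    // Interprets key as the binary fraction 0.b1b2b3..., i.e. value / 2^(8 * length).
    static double toFraction(byte[] key) {
        if (key.length == 0) {
            return 0.0; // empty key in the start-key role
        }
        // Signum 1 makes BigInteger read the bytes as an unsigned big-endian integer.
        BigInteger numerator = new BigInteger(1, key);
        BigInteger denominator = BigInteger.ONE.shiftLeft(8 * key.length);
        // Dividing by a power of two always terminates, so exact division is safe here.
        return new BigDecimal(numerator).divide(new BigDecimal(denominator)).doubleValue();
    }
}
```

    With this sketch, the single-byte keys 0x80, 0xc0, and 0xe0 map to 128/256, 192/256, and 224/256, which are 0.5, 0.75, and 0.875.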

    Key interpolation, fraction estimation, and range splitting all use these floating-point semantics. See the respective methods for further details. Note: the underlying implementations of these functions use BigInteger and BigDecimal, so they can be slow and should not be called in hot loops. Dataflow's dynamic work rebalancing invokes these functions only during periodic control operations, so they are not on the critical path.
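    To make the fraction estimate concrete, here is a minimal sketch of how a key's position within a range could be computed with BigInteger and BigDecimal. It is an assumption-laden approximation rather than the SDK's code: RangeFractionSketch, padded, and estimateFraction are hypothetical names, keys are plain byte[] values, and validation (for example, that the key lies inside a non-empty range) is omitted.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the Dataflow SDK.
final class RangeFractionSketch {

    // Right-pads key with zero bytes to the given length and reads it as an unsigned integer.
    static BigInteger padded(byte[] key, int length) {
        return new BigInteger(1, Arrays.copyOf(key, length));
    }

    // Estimates (key - start) / (end - start), assuming start <= key < end.
    static double estimateFraction(byte[] start, byte[] end, byte[] key) {
        // Compare all keys at a common length so they share the denominator 2^(8 * n).
        int n = Math.max(1, Math.max(start.length, Math.max(end.length, key.length)));
        BigInteger s = padded(start, n);
        // An empty end key stands for 1.0, which is 2^(8 * n) at this scale.
        BigInteger e = end.length == 0 ? BigInteger.ONE.shiftLeft(8 * n) : padded(end, n);
        BigInteger k = padded(key, n);
        BigDecimal numerator = new BigDecimal(k.subtract(s));
        BigDecimal denominator = new BigDecimal(e.subtract(s));
        // A bounded precision avoids ArithmeticException on non-terminating quotients.
        return numerator.divide(denominator, MathContext.DECIMAL64).doubleValue();
    }
}
```

    For example, with start 0x40 and end 0xc0, the key 0x80 sits at (128 - 64) / (192 - 64), halfway through the range. The arbitrary-precision arithmetic is what makes such calls comparatively expensive, which is why the note above advises keeping them out of hot loops.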

    See Also:
    ByteKey, Serialized Form

