BlockBasedSource.BlockBasedReader (Google Cloud Dataflow SDK 1.9.1 API)

Google Cloud Dataflow SDK for Java, version 1.9.1

Class BlockBasedSource.BlockBasedReader<T>

    • Method Detail

      • readNextBlock

        public abstract boolean readNextBlock()
                                       throws IOException
        Read the next block from the input.
      • getCurrentBlock

        public abstract BlockBasedSource.Block<T> getCurrentBlock()
        Returns the current block (the block that was read by the last successful call to readNextBlock()). May return null initially, or if no block has been successfully read.
      • getCurrentBlockSize

        public abstract long getCurrentBlockSize()
        Returns the size of the current block in bytes as it is represented in the underlying file, if possible. This method may return 0 if the size of the current block is unknown.

        The size returned by this method must be such that for two successive blocks A and B, offset(A) + size(A) <= offset(B). If this is not satisfied, the progress reported by the BlockBasedReader will be non-monotonic and will interfere with the quality (but not correctness) of dynamic work rebalancing.

        This method and BlockBasedSource.Block.getFractionOfBlockConsumed() are used to provide an estimate of progress within a block (getCurrentBlock().getFractionOfBlockConsumed() * getCurrentBlockSize()). It is acceptable for the result of this computation to be 0, but progress estimation will be inaccurate.

      • getCurrentBlockOffset

        public abstract long getCurrentBlockOffset()
        Returns the largest offset such that starting to read from that offset includes the current block.
      • isAtSplitPoint

        protected boolean isAtSplitPoint()
        Returns true if the reader is at a split point. A BlockBasedReader is at a split point if the current record is the first record in a block. In other words, split points are block boundaries.
        isAtSplitPoint in class OffsetBasedSource.OffsetBasedReader<T>
      • readNextRecord

        protected final boolean readNextRecord()
                                        throws IOException
        Reads the next record from the current block if possible. Will call readNextBlock() to advance to the next block if not.

        The first record read from a block is treated as a split point.

        Specified by:
        readNextRecord in class FileBasedSource.FileBasedReader<T>
        true if a record was successfully read, false if the end of the channel was reached before successfully reading a new record.