Count.PerElement (Google Cloud Dataflow SDK 1.9.1 API)

Google Cloud Dataflow SDK for Java, version 1.9.1

Class Count.PerElement<T>

  • Type Parameters:
    T - the type of the elements of the input PCollection, and the type of the keys of the output PCollection
    All Implemented Interfaces:
    HasDisplayData, Serializable
    Enclosing class:
    Count

    public static class Count.PerElement<T>
    extends PTransform<PCollection<T>,PCollection<KV<T,Long>>>
    Count.PerElement<T> takes a PCollection<T> and returns a PCollection<KV<T, Long>> representing a map from each distinct element of the input PCollection to the number of times that element occurs in the input. Each key in the output PCollection is unique.
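
    As a rough mental model of the output described above, the following plain-Java sketch counts occurrences per distinct element of an in-memory list. It does not use the Dataflow SDK: the names PerElementModel and countPerElement are hypothetical, it relies on equals()/hashCode() for element identity as a simplifying assumption, and a real pipeline computes the result in parallel over a PCollection rather than over a List.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the transform's output; this is not SDK code.
final class PerElementModel {

    // Returns one entry per distinct element, mapped to its number of occurrences.
    // Assumption: elements are identified with equals()/hashCode() for simplicity.
    static <T> Map<T, Long> countPerElement(List<T> input) {
        Map<T, Long> counts = new LinkedHashMap<>();
        for (T element : input) {
            // Start at 1 for a new key, otherwise add 1 to the existing count.
            counts.merge(element, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "a");
        // Each distinct word appears exactly once as a key.
        System.out.println(countPerElement(words));
    }
}
```

    Each key appears once in the resulting map, mirroring the guarantee that each key in the output PCollection is unique.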

    This transform compares two values of type T by first encoding each element using the input PCollection's Coder, then comparing the encoded bytes. Two elements therefore count as the same key only if they encode to identical bytes, so the input coder must be deterministic (see Coder.verifyDeterministic() for more detail). Performing the comparison in this manner admits efficient parallel evaluation.
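
    To illustrate why determinism matters when keys are compared by their encoded bytes, here is a minimal plain-Java sketch. It is not how the SDK is implemented: EncodedKeyCount, countByEncoding, and the two encoder methods are hypothetical stand-ins for a Coder, chosen only to show that an order-dependent encoding splits equal values into separate keys.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical illustration of counting by encoded bytes; this is not SDK code.
final class EncodedKeyCount {

    // Groups elements by their encoded form. Wrapping the byte[] in a ByteBuffer
    // gives content-based equals()/hashCode(), so identical bytes share one key.
    static <T> Map<ByteBuffer, Long> countByEncoding(List<T> input, Function<T, byte[]> encoder) {
        Map<ByteBuffer, Long> counts = new HashMap<>();
        for (T element : input) {
            counts.merge(ByteBuffer.wrap(encoder.apply(element)), 1L, Long::sum);
        }
        return counts;
    }

    // Non-deterministic stand-in: the bytes depend on the set's iteration order.
    static byte[] encodeInIterationOrder(Set<String> s) {
        return String.join(",", s).getBytes(StandardCharsets.UTF_8);
    }

    // Deterministic stand-in: sorting first makes equal sets encode identically.
    static byte[] encodeSorted(Set<String> s) {
        return String.join(",", new TreeSet<>(s)).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two sets that are equal as sets but iterate in different orders.
        Set<String> xy = new LinkedHashSet<>(Arrays.asList("x", "y"));
        Set<String> yx = new LinkedHashSet<>(Arrays.asList("y", "x"));
        List<Set<String>> input = Arrays.asList(xy, yx);

        // Order-dependent encoding: the equal sets become two distinct keys.
        System.out.println(countByEncoding(input, EncodedKeyCount::encodeInIterationOrder).size());
        // Sorted encoding: both sets collapse into a single key.
        System.out.println(countByEncoding(input, EncodedKeyCount::encodeSorted).size());
    }
}
```

    With the order-dependent encoder the two equal sets are counted under separate keys, whereas the sorted encoder yields a single key with a count of 2.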

    By default, the Coder of the keys of the output PCollection is the same as the Coder of the elements of the input PCollection.

    Example of use:

     PCollection<String> words = ...;
     PCollection<KV<String, Long>> wordCounts =
         words.apply(Count.<String>perElement());
    See Also:
    Serialized Form
