Flatten (Google Cloud Dataflow SDK 1.9.1 API)

Google Cloud Dataflow SDK for Java, version 1.9.1

com.google.cloud.dataflow.sdk.transforms

Class Flatten



  • public class Flatten
    extends Object
    Flatten<T> takes multiple PCollection<T>s bundled into a PCollectionList<T> and returns a single PCollection<T> containing all the elements of all the input PCollections. The name "Flatten" suggests taking a list of lists and flattening it into a single list.

    Example of use:

     
     PCollection<String> pc1 = ...;
     PCollection<String> pc2 = ...;
     PCollection<String> pc3 = ...;
     PCollectionList<String> pcs = PCollectionList.of(pc1).and(pc2).and(pc3);
     PCollection<String> merged = pcs.apply(Flatten.<String>pCollections());
      

    By default, the Coder of the output PCollection is the same as the Coder of the first PCollection in the input PCollectionList (if the PCollectionList is non-empty).
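
    The "list of lists" intuition behind the name can be sketched in plain Java, without the Dataflow SDK. This is only an analogy: List<T> stands in for PCollection<T>, List<List<T>> stands in for PCollectionList<T>, and the class FlattenListsSketch and its method flattenAll are hypothetical names introduced for illustration. The sketch makes no claim about element order, Coders, or windowing in a real pipeline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java analogy only; this is NOT the Dataflow SDK implementation.
// List<T> stands in for PCollection<T>, List<List<T>> for PCollectionList<T>.
final class FlattenListsSketch {

    // Hypothetical helper: returns one list holding every element of every input list.
    static <T> List<T> flattenAll(List<List<T>> inputs) {
        List<T> merged = new ArrayList<>();
        for (List<T> input : inputs) {
            merged.addAll(input);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> pc1 = Arrays.asList("a", "b");
        List<String> pc2 = Arrays.asList("c");
        List<String> pc3 = Arrays.asList("d", "e");
        List<List<String>> pcs = Arrays.asList(pc1, pc2, pc3);

        // prints [a, b, c, d, e]
        System.out.println(flattenAll(pcs));
    }
}
```

    As in the SDK example above, the three inputs are bundled first and merged in a single step; an empty input simply contributes no elements.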

    • Constructor Detail

      • Flatten

        public Flatten()
    • Method Detail

      • pCollections

        public static <T> Flatten.FlattenPCollectionList<T> pCollections()
        Returns a PTransform that flattens a PCollectionList into a PCollection containing all the elements of all the PCollections in its input.

        All inputs must have equal WindowFns. The output elements of Flatten<T> are in the same windows and have the same timestamps as their corresponding input elements. The output PCollection will have the same WindowFn as all of the inputs.

        Type Parameters:
        T - the type of the elements in the input and output PCollections.
      • iterables

        public static <T> Flatten.FlattenIterables<T> iterables()
        Returns a PTransform that takes a PCollection<Iterable<T>> and returns a PCollection<T> containing all the elements from all the Iterables.

        Example of use:

         
         PCollection<Iterable<Integer>> pcOfIterables = ...;
         PCollection<Integer> pc = pcOfIterables.apply(Flatten.<Integer>iterables());
          

        By default, the output PCollection encodes its elements using the same Coder that the input uses for the elements in its Iterable.

        Type Parameters:
        T - the type of the elements of the input Iterable and the output PCollection.
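
        The element-level effect of iterables() can be pictured with a plain-Java analogy that does not use the Dataflow SDK. Here a List<Iterable<T>> stands in for PCollection<Iterable<T>>, and the class FlattenIterablesSketch and its method flattenIterables are hypothetical names introduced for illustration. The sketch says nothing about Coders or element order in a real pipeline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Plain-Java analogy only; this is NOT the Dataflow SDK implementation.
// A List of Iterables stands in for PCollection<Iterable<T>>.
final class FlattenIterablesSketch {

    // Hypothetical helper: emits each element of each Iterable into one output list.
    static <T> List<T> flattenIterables(List<? extends Iterable<T>> input) {
        List<T> output = new ArrayList<>();
        for (Iterable<T> iterable : input) {
            for (T element : iterable) {
                output.add(element);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<List<Integer>> pcOfIterables = Arrays.asList(
                Arrays.asList(1, 2),
                Collections.<Integer>emptyList(),
                Arrays.asList(3));

        // prints [1, 2, 3]
        System.out.println(flattenIterables(pcOfIterables));
    }
}
```

        Each input element is itself a collection, and the output holds the inner elements directly; an empty Iterable contributes nothing to the result.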

