CoGroupByKey (Google Cloud Dataflow SDK 1.9.1 API)

Google Cloud Dataflow SDK for Java, version 1.9.1

com.google.cloud.dataflow.sdk.transforms.join

Class CoGroupByKey<K>

  • Type Parameters:
    K - the type of the keys in the input and output PCollections
    All Implemented Interfaces:
    HasDisplayData, Serializable


    public class CoGroupByKey<K>
    extends PTransform<KeyedPCollectionTuple<K>,PCollection<KV<K,CoGbkResult>>>
    A PTransform that performs a CoGroupByKey on a tuple of tables. A CoGroupByKey groups the values from all tables by like keys into CoGbkResults, one per key, from which the values for any specific table can be accessed using the TupleTag supplied with that table.
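    The grouping described above can be illustrated without the SDK. The following is a minimal in-memory sketch in plain Java, not the SDK's implementation: the names CoGroupSketch, Joined, coGroup, and entry are hypothetical, the two public lists of Joined stand in for the per-tag views of a CoGbkResult, and everything runs on a single machine over ordinary lists.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CoGroupSketch {
    /** Hypothetical stand-in for a CoGbkResult over two tables. */
    public static final class Joined<V1, V2> {
        public final List<V1> left = new ArrayList<>();   // values from the first table
        public final List<V2> right = new ArrayList<>();  // values from the second table
    }

    /** Hypothetical helper that builds one key/value element. */
    public static <K, V> Map.Entry<K, V> entry(K key, V value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    /** Groups the values of both tables by key; every key seen in either table appears once. */
    public static <K, V1, V2> Map<K, Joined<V1, V2>> coGroup(
            List<Map.Entry<K, V1>> table1, List<Map.Entry<K, V2>> table2) {
        Map<K, Joined<V1, V2>> out = new LinkedHashMap<>();
        for (Map.Entry<K, V1> e : table1) {
            out.computeIfAbsent(e.getKey(), k -> new Joined<V1, V2>()).left.add(e.getValue());
        }
        for (Map.Entry<K, V2> e : table2) {
            out.computeIfAbsent(e.getKey(), k -> new Joined<V1, V2>()).right.add(e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> emails = Arrays.asList(
                entry("amy", "amy@example.com"), entry("carl", "carl@example.com"));
        List<Map.Entry<String, Integer>> orders = Arrays.asList(
                entry("amy", 3), entry("amy", 7), entry("bob", 1));
        // Print each key with the values it has in each table.
        coGroup(emails, orders).forEach(
                (key, joined) -> System.out.println(key + " -> " + joined.left + " / " + joined.right));
    }
}
```

    In this sketch a key that is missing from one table still appears in the output, with an empty list for that table.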

    Example of performing a CoGroupByKey followed by a ParDo that consumes the results:

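     PCollection<KV<K, V1>> pt1 = ...;
     PCollection<KV<K, V2>> pt2 = ...;

     final TupleTag<V1> t1 = new TupleTag<>();
     final TupleTag<V2> t2 = new TupleTag<>();
     PCollection<KV<K, CoGbkResult>> coGbkResultCollection =
       KeyedPCollectionTuple.of(t1, pt1)
                            .and(t2, pt2)
                            .apply(CoGroupByKey.<K>create());

     PCollection<T> finalResultCollection =
       coGbkResultCollection.apply(ParDo.of(
         new DoFn<KV<K, CoGbkResult>, T>() {
           @Override
           public void processElement(ProcessContext c) {
             KV<K, CoGbkResult> e = c.element();
             // All values for this key from the first table.
             Iterable<V1> pt1Vals = e.getValue().getAll(t1);
             // The single value for this key from the second table.
             V2 pt2Val = e.getValue().getOnly(t2);
             ... Do Something ....
             c.output(...some T...);
           }
         }));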
    See Also:
    Serialized Form

