CoGroupByKey (Google Cloud Dataflow SDK 1.9.1 API)

Google Cloud Dataflow SDK for Java, version 1.9.1

Class CoGroupByKey<K>

  • Type Parameters:
    K - the type of the keys in the input and output PCollections
    All Implemented Interfaces:
    HasDisplayData, Serializable

    public class CoGroupByKey<K>
    extends PTransform<KeyedPCollectionTuple<K>,PCollection<KV<K,CoGbkResult>>>
    A PTransform that performs a CoGroupByKey on a tuple of tables. A CoGroupByKey groups results from all tables by like keys into CoGbkResults, from which the results for any specific table can be accessed by the TupleTag supplied with the initial table.

    Example of performing a CoGroupByKey followed by a ParDo that consumes the results:

     PCollection<KV<K, V1>> pt1 = ...;
     PCollection<KV<K, V2>> pt2 = ...;
     final TupleTag<V1> t1 = new TupleTag<>();
     final TupleTag<V2> t2 = new TupleTag<>();
     PCollection<KV<K, CoGbkResult>> coGbkResultCollection =
       KeyedPCollectionTuple.of(t1, pt1)
                            .and(t2, pt2)
     PCollection<T> finalResultCollection =
         new DoFn<KV<K, CoGbkResult>, T>() {
           public void processElement(ProcessContext c) {
             KV<K, CoGbkResult> e = c.element();
             Iterable<V1> pt1Vals = e.getValue().getAll(t1);
             V2 pt2Val = e.getValue().getOnly(t2);
              ... Do Something ....
             c.output(...some T...);
    See Also:
    Serialized Form

이 페이지가 도움이 되었나요? 평가를 부탁드립니다.

다음에 대한 의견 보내기...

도움이 필요하시나요? 지원 페이지를 방문하세요.