WithKeys (Google Cloud Dataflow SDK 1.9.1 API)

Google Cloud Dataflow SDK for Java, version 1.9.1

com.google.cloud.dataflow.sdk.transforms

Class WithKeys<K,V>

  • Type Parameters:
    K - the type of the keys in the output PCollection
    V - the type of the elements in the input PCollection and the values in the output PCollection
  • All Implemented Interfaces:
    HasDisplayData, Serializable


    public class WithKeys<K,V>
    extends PTransform<PCollection<V>,PCollection<KV<K,V>>>

    WithKeys<K, V> takes a PCollection<V> and either a constant key of type K or a function from V to K, and returns a PCollection<KV<K, V>> in which each input value is paired with the constant key or with a key computed from that value.

    Example of use:

     
     PCollection<String> words = ...;
     // Key each word by its length.
     PCollection<KV<Integer, String>> lengthsToWords =
         words.apply(WithKeys.of(new SerializableFunction<String, Integer>() {
             @Override
             public Integer apply(String s) { return s.length(); }
         }));
      

    Each output element has the same timestamp and is in the same windows as its corresponding input element, and the output PCollection has the same WindowFn associated with it as the input.
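
The pairing behavior described above can be modeled in plain Java without the SDK. The sketch below is only an illustration of the semantics, not the SDK's implementation: the class WithKeysModel and its methods are hypothetical, List stands in for PCollection, Map.Entry stands in for KV, and java.util.function.Function stands in for SerializableFunction. Timestamps and windowing are not modeled.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical plain-Java model of the pairing semantics; it does not use the Dataflow SDK.
// List stands in for PCollection and Map.Entry stands in for KV.
class WithKeysModel {

    // Models WithKeys.of(fn): pair each value with a key computed from that value.
    static <K, V> List<Map.Entry<K, V>> withComputedKeys(List<V> values, Function<V, K> fn) {
        List<Map.Entry<K, V>> out = new ArrayList<>();
        for (V v : values) {
            out.add(new SimpleImmutableEntry<>(fn.apply(v), v));
        }
        return out;
    }

    // Models WithKeys.of(key): pair every value with the same constant key.
    static <K, V> List<Map.Entry<K, V>> withConstantKey(List<V> values, K key) {
        return withComputedKeys(values, v -> key);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "ccc");
        System.out.println(withComputedKeys(words, String::length)); // prints [1=a, 2=bb, 3=ccc]
        System.out.println(withConstantKey(words, 0));               // prints [0=a, 0=bb, 0=ccc]
    }
}
```

In both variants the output has exactly one pair per input value, and the value itself is carried through unchanged.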

    See Also:
    Serialized Form
    • Method Detail

      • of

        public static <K,V> WithKeys<K,V> of(SerializableFunction<V,K> fn)
        Returns a PTransform that takes a PCollection<V> and returns a PCollection<KV<K, V>>, where each of the values in the input PCollection has been paired with a key computed from the value by invoking the given SerializableFunction.

        When fn is a Java 8 lambda, withKeyType(TypeDescriptor) must be called on the resulting PTransform.
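
A likely reason for the lambda requirement is that a lambda's runtime class does not record its generic type arguments, whereas an anonymous class does, so the key type K cannot be recovered from a lambda by reflection. The plain-Java sketch below illustrates that difference under stated assumptions: the class KeyTypeVisibility and its members are hypothetical, java.util.function.Function stands in for SerializableFunction, and nothing here is taken from the SDK's own source.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.function.Function;

// Hypothetical demo, independent of the Dataflow SDK.
class KeyTypeVisibility {

    // True if the object's class records type arguments for the first interface it implements.
    static boolean recordsTypeArguments(Object fn) {
        Type[] interfaces = fn.getClass().getGenericInterfaces();
        return interfaces.length > 0 && interfaces[0] instanceof ParameterizedType;
    }

    // Anonymous class: the class file keeps Function<String, Integer> as a generic signature.
    static final Function<String, Integer> ANONYMOUS = new Function<String, Integer>() {
        @Override
        public Integer apply(String s) { return s.length(); }
    };

    // Lambda: the generated class implements the raw Function interface only.
    static final Function<String, Integer> LAMBDA = s -> s.length();

    public static void main(String[] args) {
        System.out.println("anonymous class: " + recordsTypeArguments(ANONYMOUS));
        System.out.println("lambda:          " + recordsTypeArguments(LAMBDA));
    }
}
```

On a typical JVM the anonymous class reports true and the lambda reports false. Supplying a TypeDescriptor through withKeyType gives the transform the key type explicitly in the lambda case.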

      • of

        public static <K,V> WithKeys<K,V> of(K key)
        Returns a PTransform that takes a PCollection<V> and returns a PCollection<KV<K, V>>, where each of the values in the input PCollection has been paired with the given key.

