Google Cloud Dataflow SDK for Java, version 1.9.1
Class Create<T>
- java.lang.Object
  - com.google.cloud.dataflow.sdk.transforms.Create<T>
Type Parameters:
T - the type of the elements of the resulting PCollection
public class Create<T> extends Object
Create<T> takes a collection of elements of type T known when the pipeline is constructed and returns a PCollection<T> containing the elements.
Example of use:
    Pipeline p = ...;
    PCollection<Integer> pc =
        p.apply(Create.of(3, 4, 5).withCoder(BigEndianIntegerCoder.of()));
    Map<String, Integer> map = ...;
    PCollection<KV<String, Integer>> pt =
        p.apply(Create.of(map)
            .withCoder(KvCoder.of(StringUtf8Coder.of(), BigEndianIntegerCoder.of())));
Create can automatically determine the Coder to use if all elements have the same run-time class and a default coder is registered for that class. See CoderRegistry for details on how defaults are determined.
If a coder cannot be inferred, Create.Values.withCoder(com.google.cloud.dataflow.sdk.coders.Coder<T>) must be called explicitly to set the encoding of the resulting PCollection.
A good use for Create is when a PCollection needs to be created without dependencies on files or other external entities. This is especially useful during testing.
Caveat: Create only supports small in-memory datasets, particularly when submitting jobs to the Google Cloud Dataflow service.
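The condition that all elements share one run-time class can be illustrated without the SDK. The following is a minimal sketch, not the SDK's implementation: the class RuntimeClassCheck and the method haveSameRuntimeClass are hypothetical names, the elements are assumed to be non-null, and treating an empty input as "cannot infer" is a choice made for this sketch only.

```java
import java.util.Arrays;
import java.util.Iterator;

/** Hypothetical helper for illustration; not part of the Dataflow SDK. */
final class RuntimeClassCheck {
    private RuntimeClassCheck() {}

    /**
     * Returns true if the input is non-empty and every element has exactly
     * the same run-time class as the first element. Assumes non-null elements.
     */
    static boolean haveSameRuntimeClass(Iterable<?> elems) {
        Iterator<?> it = elems.iterator();
        if (!it.hasNext()) {
            return false; // assumption of this sketch: nothing to inspect
        }
        Class<?> first = it.next().getClass();
        while (it.hasNext()) {
            if (it.next().getClass() != first) {
                return false; // mixed classes: an explicit coder would be needed
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(haveSameRuntimeClass(Arrays.asList(3, 4, 5)));            // prints true
        System.out.println(haveSameRuntimeClass(Arrays.<Object>asList(3, "four"))); // prints false
    }
}
```

When a check like this fails for your data, setting the coder explicitly with withCoder, as in the example of use above, avoids relying on inference.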
-
Nested Class Summary
Modifier and Type, Class and Description:
- static class Create.TimestampedValues<T>
  A PTransform that creates a PCollection whose elements have associated timestamps.
- static class Create.Values<T>
  A PTransform that creates a PCollection from a set of in-memory objects.
-
Constructor Summary
Constructor and Description:
- Create()
-
Method Summary
Modifier and Type, Method and Description:
- static <T> Create.Values<T> of(Iterable<T> elems)
  Returns a new Create.Values transform that produces a PCollection containing elements of the provided Iterable.
- static <K,V> Create.Values<KV<K,V>> of(Map<K,V> elems)
  Returns a new Create.Values transform that produces a PCollection of KVs corresponding to the keys and values of the specified Map.
- static <T> Create.Values<T> of(T... elems)
  Returns a new Create.Values transform that produces a PCollection containing the specified elements.
- static <T> Create.TimestampedValues<T> timestamped(Iterable<T> values, Iterable<Long> timestamps)
  Returns a new root transform that produces a PCollection containing the specified elements with the specified timestamps.
- static <T> Create.TimestampedValues<T> timestamped(Iterable<TimestampedValue<T>> elems)
  Returns a new Create.TimestampedValues transform that produces a PCollection containing the elements of the provided Iterable with the specified timestamps.
- static <T> Create.TimestampedValues<T> timestamped(TimestampedValue<T>... elems)
  Returns a new Create.TimestampedValues transform that produces a PCollection containing the specified elements with the specified timestamps.
-
Method Detail
-
of
public static <T> Create.Values<T> of(Iterable<T> elems)
Returns a new Create.Values transform that produces a PCollection containing elements of the provided Iterable.
The argument should not be modified after this is called.
The elements of the output PCollection will have a timestamp of negative infinity; see timestamped(java.lang.Iterable<com.google.cloud.dataflow.sdk.values.TimestampedValue<T>>) for a way of creating a PCollection with timestamped elements.
By default, Create.Values can automatically determine the Coder to use if all elements have the same non-parameterized run-time class and a default coder is registered for that class. See CoderRegistry for details on how defaults are determined. Otherwise, use Create.Values.withCoder(com.google.cloud.dataflow.sdk.coders.Coder<T>) to set the coder explicitly.
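Since the argument should not be modified after the call, one cautious pattern on the caller's side is to pass a private, unmodifiable copy instead of a list that other code may still change. The sketch below shows that pattern with the standard library only; Snapshots and snapshot are hypothetical names, and the copy is shallow, so the elements themselves are not protected.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Dataflow SDK. */
final class Snapshots {
    private Snapshots() {}

    /** Copies the elements into a new list and wraps it as unmodifiable (shallow copy). */
    static <T> List<T> snapshot(Iterable<? extends T> elems) {
        List<T> copy = new ArrayList<>();
        for (T e : elems) {
            copy.add(e);
        }
        return Collections.unmodifiableList(copy);
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>();
        words.add("a");
        words.add("b");
        List<String> frozen = snapshot(words); // this copy is what would be passed to Create.of
        words.add("c");                        // later changes do not reach the copy
        System.out.println(frozen);            // prints [a, b]
    }
}
```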
-
of
@SafeVarargs public static <T> Create.Values<T> of(T... elems)
Returns a new Create.Values transform that produces a PCollection containing the specified elements.
The elements will have a timestamp of negative infinity; see timestamped(java.lang.Iterable<com.google.cloud.dataflow.sdk.values.TimestampedValue<T>>) for a way of creating a PCollection with timestamped elements.
The arguments should not be modified after this is called.
By default, Create.Values can automatically determine the Coder to use if all elements have the same non-parameterized run-time class and a default coder is registered for that class. See CoderRegistry for details on how defaults are determined. Otherwise, use Create.Values.withCoder(com.google.cloud.dataflow.sdk.coders.Coder<T>) to set the coder explicitly.
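The phrase "non-parameterized run-time class" can be made concrete with reflection. The sketch below takes one plausible reading, namely that the element's class declares no type parameters; this reading, and the names ParameterizedCheck and isNonParameterized, are assumptions for illustration rather than the SDK's own check.

```java
import java.util.ArrayList;

/** Hypothetical helper for illustration; not part of the Dataflow SDK. */
final class ParameterizedCheck {
    private ParameterizedCheck() {}

    /** Returns true if the element's run-time class declares no type parameters. */
    static boolean isNonParameterized(Object element) {
        return element.getClass().getTypeParameters().length == 0;
    }

    public static void main(String[] args) {
        System.out.println(isNonParameterized(3));                       // Integer: prints true
        System.out.println(isNonParameterized("text"));                  // String: prints true
        System.out.println(isNonParameterized(new ArrayList<String>())); // ArrayList<E>: prints false
    }
}
```

Under this reading, elements such as lists carry no element-type information at run time, which is the kind of case where setting the coder explicitly with withCoder is the safer route.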
-
of
public static <K,V> Create.Values<KV<K,V>> of(Map<K,V> elems)
Returns a new Create.Values transform that produces a PCollection of KVs corresponding to the keys and values of the specified Map.
The elements will have a timestamp of negative infinity; see timestamped(java.lang.Iterable<com.google.cloud.dataflow.sdk.values.TimestampedValue<T>>) for a way of creating a PCollection with timestamped elements.
By default, Create.Values can automatically determine the Coder to use if all elements have the same non-parameterized run-time class and a default coder is registered for that class. See CoderRegistry for details on how defaults are determined. Otherwise, use Create.Values.withCoder(com.google.cloud.dataflow.sdk.coders.Coder<T>) to set the coder explicitly.
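The correspondence between a Map and a collection of key-value pairs can be sketched with the standard library. In this illustration java.util.Map.Entry stands in for the SDK's KV type, and MapPairs and toPairs are hypothetical names; it shows the shape of the resulting elements, not how the SDK builds them.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; Map.Entry stands in for the SDK's KV. */
final class MapPairs {
    private MapPairs() {}

    /** Produces one immutable (key, value) pair per map entry, in the map's iteration order. */
    static <K, V> List<Map.Entry<K, V>> toPairs(Map<K, V> map) {
        List<Map.Entry<K, V>> pairs = new ArrayList<Map.Entry<K, V>>(map.size());
        for (Map.Entry<K, V> e : map.entrySet()) {
            pairs.add(new AbstractMap.SimpleImmutableEntry<K, V>(e.getKey(), e.getValue()));
        }
        return pairs;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        counts.put("a", 1);
        counts.put("b", 2);
        System.out.println(toPairs(counts)); // prints [a=1, b=2]
    }
}
```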
-
timestamped
public static <T> Create.TimestampedValues<T> timestamped(Iterable<TimestampedValue<T>> elems)
Returns a new Create.TimestampedValues transform that produces a PCollection containing the elements of the provided Iterable with the specified timestamps.
The argument should not be modified after this is called.
By default, Create.TimestampedValues can automatically determine the Coder to use if all elements have the same non-parameterized run-time class and a default coder is registered for that class. See CoderRegistry for details on how defaults are determined. Otherwise, use Create.TimestampedValues.withCoder(com.google.cloud.dataflow.sdk.coders.Coder<T>) to set the coder explicitly.
-
timestamped
@SafeVarargs public static <T> Create.TimestampedValues<T> timestamped(TimestampedValue<T>... elems)
Returns a new Create.TimestampedValues transform that produces a PCollection containing the specified elements with the specified timestamps.
The arguments should not be modified after this is called.
-
timestamped
public static <T> Create.TimestampedValues<T> timestamped(Iterable<T> values, Iterable<Long> timestamps)
Returns a new root transform that produces a PCollection containing the specified elements with the specified timestamps.
The arguments should not be modified after this is called.
By default, Create.TimestampedValues can automatically determine the Coder to use if all elements have the same non-parameterized run-time class and a default coder is registered for that class. See CoderRegistry for details on how defaults are determined. Otherwise, use Create.TimestampedValues.withCoder(com.google.cloud.dataflow.sdk.coders.Coder<T>) to set the coder explicitly.
Throws:
IllegalArgumentException - if the number of values differs from the number of timestamps
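The contract of this overload, pairing each value with the timestamp at the same position and rejecting inputs of different lengths, can be sketched as follows. This is an illustration under stated assumptions, not the SDK's code: TimestampPairing and pair are hypothetical names, Map.Entry stands in for TimestampedValue, and the exception message is invented for the sketch.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; Map.Entry stands in for TimestampedValue. */
final class TimestampPairing {
    private TimestampPairing() {}

    /**
     * Pairs values and timestamps by position.
     *
     * @throws IllegalArgumentException if the two inputs have different sizes
     */
    static <T> List<Map.Entry<T, Long>> pair(Iterable<T> values, Iterable<Long> timestamps) {
        List<Map.Entry<T, Long>> out = new ArrayList<Map.Entry<T, Long>>();
        Iterator<T> v = values.iterator();
        Iterator<Long> t = timestamps.iterator();
        while (v.hasNext() && t.hasNext()) {
            out.add(new AbstractMap.SimpleImmutableEntry<T, Long>(v.next(), t.next()));
        }
        // If either iterator still has elements, the inputs were of different sizes.
        if (v.hasNext() || t.hasNext()) {
            throw new IllegalArgumentException(
                "Expected the same number of values and timestamps");
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(pair(Arrays.asList("a", "b"), Arrays.asList(10L, 20L))); // prints [a=10, b=20]
    }
}
```

If values and timestamps are built separately, checking their sizes before calling timestamped(values, timestamps) surfaces a mismatch closer to where it was introduced.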