Google Cloud Dataflow SDK for Java, version 1.9.1
Class WithTimestamps<T>
- java.lang.Object
  - com.google.cloud.dataflow.sdk.transforms.PTransform<PCollection<T>,PCollection<T>>
    - com.google.cloud.dataflow.sdk.transforms.WithTimestamps<T>
All Implemented Interfaces:
HasDisplayData, Serializable
public class WithTimestamps<T> extends PTransform<PCollection<T>,PCollection<T>>
A PTransform for assigning timestamps to all the elements of a PCollection.
Timestamps are used to assign Windows to elements within the Window.into(com.google.cloud.dataflow.sdk.transforms.windowing.WindowFn) PTransform. Assigning timestamps is useful when the input data set comes from a Source without implicit timestamps (such as TextIO).
See Also:
Serialized Form
Field Summary
Fields inherited from class com.google.cloud.dataflow.sdk.transforms.PTransform
name
Method Summary
Modifier and Type | Method and Description
PCollection<T> | apply(PCollection<T> input): Applies this PTransform on the given InputT, and returns its Output.
Duration | getAllowedTimestampSkew(): Returns the allowed timestamp skew duration, which is the maximum duration that timestamps can be shifted backwards from the timestamp of the input element.
static <T> WithTimestamps<T> | of(SerializableFunction<T,Instant> fn): For a SerializableFunction fn from T to Instant, outputs a PTransform that takes an input PCollection<T> and outputs a PCollection<T> containing every element v in the input where each element is output with a timestamp obtained as the result of fn.apply(v).
WithTimestamps<T> | withAllowedTimestampSkew(Duration allowedTimestampSkew): Return a new WithTimestamps like this one with updated allowed timestamp skew, which is the maximum duration that timestamps can be shifted backward.
Methods inherited from class com.google.cloud.dataflow.sdk.transforms.PTransform
getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, populateDisplayData, toString, validate
Method Detail
-
of
public static <T> WithTimestamps<T> of(SerializableFunction<T,Instant> fn)
For a SerializableFunction fn from T to Instant, outputs a PTransform that takes an input PCollection<T> and outputs a PCollection<T> containing every element v in the input where each element is output with a timestamp obtained as the result of fn.apply(v).
If the input PCollection elements have timestamps, the output timestamp for each element must not be before the input element's timestamp minus the value of getAllowedTimestampSkew(). If an output timestamp is before this time, the transform will throw an IllegalArgumentException when executed. Use withAllowedTimestampSkew(Duration) to update the allowed skew.
Each output element will be in the same windows as the input element. If a new window based on the new output timestamp is desired, apply a new instance of Window.into(WindowFn).
This transform will fail at execution time with a NullPointerException if for any input element the result of fn.apply(v) is null.
Example of use in Java 8:
PCollection<Record> timestampedRecords = records.apply(WithTimestamps.of((Record rec) -> rec.getInstant()));
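To make the per-element behaviour of of(fn) concrete, the following is a minimal plain-Java sketch of the idea, not the SDK's implementation. It pairs each element v with fn.apply(v) and fails with a NullPointerException when the function returns null, as described above. The names TimestampAssignmentSketch, Timestamped, and assignTimestamps are hypothetical, a java.util.List stands in for a PCollection, and java.time.Instant stands in for the SDK's Instant type.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

// Plain-Java model of timestamp assignment. All names are hypothetical
// stand-ins; this is not the Dataflow SDK implementation.
final class TimestampAssignmentSketch {
    private TimestampAssignmentSketch() {}

    /** A value paired with its assigned timestamp. */
    static final class Timestamped<T> {
        final T value;
        final Instant timestamp;

        Timestamped(T value, Instant timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Pairs every element v with fn.apply(v); a null result raises NullPointerException. */
    static <T> List<Timestamped<T>> assignTimestamps(List<T> input, Function<T, Instant> fn) {
        List<Timestamped<T>> output = new ArrayList<>();
        for (T v : input) {
            Instant ts = Objects.requireNonNull(fn.apply(v), "fn.apply(v) returned null");
            output.add(new Timestamped<>(v, ts));
        }
        return output;
    }

    public static void main(String[] args) {
        // Hypothetical input: epoch-millisecond strings standing in for parsed text lines.
        List<String> lines = Arrays.asList("1000", "2500");
        List<Timestamped<String>> result =
            assignTimestamps(lines, s -> Instant.ofEpochMilli(Long.parseLong(s)));
        for (Timestamped<String> t : result) {
            System.out.println(t.value + " @ " + t.timestamp);
        }
    }
}
```

The sketch ignores windows and any pre-existing input timestamps; it only shows that the output holds the same elements, each carrying the timestamp computed by the function.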
-
withAllowedTimestampSkew
public WithTimestamps<T> withAllowedTimestampSkew(Duration allowedTimestampSkew)
Return a new WithTimestamps like this one with updated allowed timestamp skew, which is the maximum duration that timestamps can be shifted backward. Does not modify this object.
The default value is Duration.ZERO, allowing timestamps to only be shifted into the future. For infinite skew, use new Duration(Long.MAX_VALUE).
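The effect of the allowed skew can be illustrated with a small plain-Java sketch of the documented rule: an output timestamp is accepted only if it is not before the input timestamp minus the allowed skew. This is a model under stated assumptions, not the SDK's code. The names TimestampSkewSketch, isWithinSkew, and checkOutputTimestamp are hypothetical, and java.time.Instant and java.time.Duration stand in for the SDK's own Instant and Duration classes.

```java
import java.time.Duration;
import java.time.Instant;

// Plain-Java model of the documented skew rule. Class and method names are
// hypothetical; java.time types stand in for the SDK's Instant and Duration.
final class TimestampSkewSketch {
    private TimestampSkewSketch() {}

    /** Returns true if output is not before (input minus allowedSkew). */
    static boolean isWithinSkew(Instant input, Instant output, Duration allowedSkew) {
        Instant earliestAllowed = input.minus(allowedSkew);
        return !output.isBefore(earliestAllowed);
    }

    /** Returns output if it is within the skew, otherwise throws IllegalArgumentException. */
    static Instant checkOutputTimestamp(Instant input, Instant output, Duration allowedSkew) {
        if (!isWithinSkew(input, output, allowedSkew)) {
            throw new IllegalArgumentException(
                "Output timestamp " + output + " is more than " + allowedSkew
                    + " before input timestamp " + input);
        }
        return output;
    }

    public static void main(String[] args) {
        Instant input = Instant.parse("2016-01-01T00:00:10Z");
        Instant fiveSecondsEarlier = input.minusSeconds(5);

        // With zero skew (the documented default), a backward shift is rejected.
        System.out.println(isWithinSkew(input, fiveSecondsEarlier, Duration.ZERO));
        // With ten seconds of skew, the same backward shift is accepted.
        System.out.println(isWithinSkew(input, fiveSecondsEarlier, Duration.ofSeconds(10)));
        // Forward shifts are accepted regardless of skew.
        System.out.println(isWithinSkew(input, input.plusSeconds(60), Duration.ZERO));
    }
}
```

A timestamp exactly at the boundary (input minus skew) passes in this sketch, following the "must not be before" wording above.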
-
getAllowedTimestampSkew
public Duration getAllowedTimestampSkew()
Returns the allowed timestamp skew duration, which is the maximum duration that timestamps can be shifted backwards from the timestamp of the input element.
See Also:
DoFn.getAllowedTimestampSkew()
-
apply
public PCollection<T> apply(PCollection<T> input)
Description copied from class: PTransform
Applies this PTransform on the given InputT, and returns its Output.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
The default implementation throws an exception. A derived class must either implement apply, or else each runner must supply a custom implementation via PipelineRunner.apply(com.google.cloud.dataflow.sdk.transforms.PTransform<InputT, OutputT>, InputT).
Overrides:
apply in class PTransform<PCollection<T>,PCollection<T>>