Google Cloud Dataflow SDK for Java, version 1.9.1
Class DoFnTester<InputT,OutputT>
- java.lang.Object
-
- com.google.cloud.dataflow.sdk.transforms.DoFnTester<InputT,OutputT>
-
- Type Parameters:
InputT
- the type of theDoFn
's (main) input elementsOutputT
- the type of theDoFn
's (main) output elements
public class DoFnTester<InputT,OutputT> extends Object
A harness for unit-testing aDoFn
.For example:
DoFn<InputT, OutputT> fn = ...; DoFnTester<InputT, OutputT> fnTester = DoFnTester.of(fn); // Set arguments shared across all batches: fnTester.setSideInputs(...); // If fn takes side inputs. fnTester.setSideOutputTags(...); // If fn writes to side outputs. // Process a batch containing a single input element: Input testInput = ...; List<OutputT> testOutputs = fnTester.processBatch(testInput); Assert.assertThat(testOutputs, Matchers.hasItems(...)); // Process a bigger batch: Assert.assertThat(fnTester.processBatch(i1, i2, ...), Matchers.hasItems(...));
-
-
Nested Class Summary
Nested Classes Modifier and Type Class and Description static class
DoFnTester.CloningBehavior
Whether or not aDoFnTester
should clone theDoFn
under test.static class
DoFnTester.OutputElementWithTimestamp<OutputT>
Holder for an OutputElement along with its associated timestamp.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method and Description void
clearOutputElements()
Clears the record of the elements output so far to the main output.<T> void
clearSideOutputElements(TupleTag<T> tag)
Clears the record of the elements output so far to the side output with the given tag.void
finishBundle()
CallsDoFn.finishBundle(com.google.cloud.dataflow.sdk.transforms.DoFn<InputT, OutputT>.Context)
of theDoFn
under test.<AggregateT>
AggregateTgetAggregatorValue(Aggregator<?,AggregateT> agg)
Returns the value of the providedAggregator
.DoFnTester.CloningBehavior
getCloningBehavior()
Indicates whether thisDoFnTester
will clone theDoFn
under test.static <InputT,OutputT>
DoFnTester<InputT,OutputT>of(DoFn<InputT,OutputT> fn)
Returns aDoFnTester
supporting unit-testing of the givenDoFn
.static <InputT,OutputT>
DoFnTester<InputT,OutputT>of(DoFnWithContext<InputT,OutputT> fn)
Returns aDoFnTester
supporting unit-testing of the givenDoFn
.List<OutputT>
peekOutputElements()
Returns the elements output so far to the main output.List<DoFnTester.OutputElementWithTimestamp<OutputT>>
peekOutputElementsWithTimestamp()
Returns the elements output so far to the main output with associated timestamps.<T> List<T>
peekSideOutputElements(TupleTag<T> tag)
Returns the elements output so far to the side output with the given tag.List<OutputT>
processBatch(InputT... inputElements)
A convenience method for testingDoFns
with bundles of elements.List<OutputT>
processBatch(Iterable<? extends InputT> inputElements)
A convenience operation that first callsstartBundle()
, then callsprocessElement(InputT)
on each of the input elements, then callsfinishBundle()
, then returns the result oftakeOutputElements()
.void
processElement(InputT element)
CallsDoFn.processElement(com.google.cloud.dataflow.sdk.transforms.DoFn<InputT, OutputT>.ProcessContext)
on theDoFn
under test, in a context whereDoFn.ProcessContext.element()
returns the given element.void
setCloningBehavior(DoFnTester.CloningBehavior newValue)
Instruct thisDoFnTester
whether or not to clone theDoFn
under test.void
setSideInput(PCollectionView<?> sideInput, Iterable<com.google.cloud.dataflow.sdk.util.WindowedValue<?>> value)
Registers the values of a side inputPCollectionView
to pass to theDoFn
under test.void
setSideInputInGlobalWindow(PCollectionView<?> sideInput, Iterable<?> value)
Registers the values for a side inputPCollectionView
to pass to theDoFn
under test.void
setSideInputs(Map<PCollectionView<?>,Iterable<com.google.cloud.dataflow.sdk.util.WindowedValue<?>>> sideInputs)
Registers the tuple of values of the side inputPCollectionView
s to pass to theDoFn
under test.void
setSideOutputTags(TupleTagList sideOutputTags)
Registers the list ofTupleTag
s that can be used by theDoFn
under test to output to side outputPCollection
s.void
startBundle()
CallsDoFn.startBundle(com.google.cloud.dataflow.sdk.transforms.DoFn<InputT, OutputT>.Context)
on theDoFn
under test.List<OutputT>
takeOutputElements()
Returns the elements output so far to the main output.List<DoFnTester.OutputElementWithTimestamp<OutputT>>
takeOutputElementsWithTimestamp()
Returns the elements output so far to the main output with associated timestamps.<T> List<T>
takeSideOutputElements(TupleTag<T> tag)
Returns the elements output so far to the side output with the given tag.
-
-
-
Method Detail
-
of
public static <InputT,OutputT> DoFnTester<InputT,OutputT> of(DoFn<InputT,OutputT> fn)
Returns aDoFnTester
supporting unit-testing of the givenDoFn
.
-
of
public static <InputT,OutputT> DoFnTester<InputT,OutputT> of(DoFnWithContext<InputT,OutputT> fn)
Returns aDoFnTester
supporting unit-testing of the givenDoFn
.
-
setSideInputs
public void setSideInputs(Map<PCollectionView<?>,Iterable<com.google.cloud.dataflow.sdk.util.WindowedValue<?>>> sideInputs)
Registers the tuple of values of the side inputPCollectionView
s to pass to theDoFn
under test.If needed, first creates a fresh instance of the
DoFn
under test.If this isn't called,
DoFnTester
assumes theDoFn
takes no side inputs.
-
setSideInput
public void setSideInput(PCollectionView<?> sideInput, Iterable<com.google.cloud.dataflow.sdk.util.WindowedValue<?>> value)
Registers the values of a side inputPCollectionView
to pass to theDoFn
under test.If needed, first creates a fresh instance of the
DoFn
under test.If this isn't called,
DoFnTester
assumes theDoFn
takes no side inputs.
-
setSideInputInGlobalWindow
public void setSideInputInGlobalWindow(PCollectionView<?> sideInput, Iterable<?> value)
Registers the values for a side inputPCollectionView
to pass to theDoFn
under test. All values are placed in the global window.
-
setSideOutputTags
public void setSideOutputTags(TupleTagList sideOutputTags)
Registers the list ofTupleTag
s that can be used by theDoFn
under test to output to side outputPCollection
s.If needed, first creates a fresh instance of the DoFn under test.
If this isn't called,
DoFnTester
assumes theDoFn
doesn't emit to any side outputs.
-
setCloningBehavior
public void setCloningBehavior(DoFnTester.CloningBehavior newValue)
Instruct thisDoFnTester
whether or not to clone theDoFn
under test.
-
getCloningBehavior
public DoFnTester.CloningBehavior getCloningBehavior()
Indicates whether thisDoFnTester
will clone theDoFn
under test.
-
processBatch
public List<OutputT> processBatch(Iterable<? extends InputT> inputElements)
A convenience operation that first callsstartBundle()
, then callsprocessElement(InputT)
on each of the input elements, then callsfinishBundle()
, then returns the result oftakeOutputElements()
.
-
processBatch
@SafeVarargs public final List<OutputT> processBatch(InputT... inputElements)
A convenience method for testingDoFns
with bundles of elements. Logic proceeds as follows:- Calls
startBundle()
. - Calls
processElement(InputT)
on each of the arguments. - Calls
finishBundle()
. - Returns the result of
takeOutputElements()
.
- Calls
-
startBundle
public void startBundle()
CallsDoFn.startBundle(com.google.cloud.dataflow.sdk.transforms.DoFn<InputT, OutputT>.Context)
on theDoFn
under test.If needed, first creates a fresh instance of the DoFn under test.
-
processElement
public void processElement(InputT element)
CallsDoFn.processElement(com.google.cloud.dataflow.sdk.transforms.DoFn<InputT, OutputT>.ProcessContext)
on theDoFn
under test, in a context whereDoFn.ProcessContext.element()
returns the given element.Will call
startBundle()
automatically, if it hasn't already been called.- Throws:
IllegalStateException
- if theDoFn
under test has already been finished
-
finishBundle
public void finishBundle()
CallsDoFn.finishBundle(com.google.cloud.dataflow.sdk.transforms.DoFn<InputT, OutputT>.Context)
of theDoFn
under test.Will call
startBundle()
automatically, if it hasn't already been called.- Throws:
IllegalStateException
- if theDoFn
under test has already been finished
-
peekOutputElements
public List<OutputT> peekOutputElements()
Returns the elements output so far to the main output. Does not clear them, so subsequent calls will continue to include these elements.- See Also:
takeOutputElements()
,clearOutputElements()
-
peekOutputElementsWithTimestamp
@Experimental public List<DoFnTester.OutputElementWithTimestamp<OutputT>> peekOutputElementsWithTimestamp()
Returns the elements output so far to the main output with associated timestamps. Does not clear them, so subsequent calls will continue to include these. elements.
-
clearOutputElements
public void clearOutputElements()
Clears the record of the elements output so far to the main output.- See Also:
peekOutputElements()
-
takeOutputElements
public List<OutputT> takeOutputElements()
Returns the elements output so far to the main output. Clears the list so these elements don't appear in future calls.- See Also:
peekOutputElements()
-
takeOutputElementsWithTimestamp
@Experimental public List<DoFnTester.OutputElementWithTimestamp<OutputT>> takeOutputElementsWithTimestamp()
Returns the elements output so far to the main output with associated timestamps. Clears the list so these elements don't appear in future calls.
-
peekSideOutputElements
public <T> List<T> peekSideOutputElements(TupleTag<T> tag)
Returns the elements output so far to the side output with the given tag. Does not clear them, so subsequent calls will continue to include these elements.
-
clearSideOutputElements
public <T> void clearSideOutputElements(TupleTag<T> tag)
Clears the record of the elements output so far to the side output with the given tag.
-
takeSideOutputElements
public <T> List<T> takeSideOutputElements(TupleTag<T> tag)
Returns the elements output so far to the side output with the given tag. Clears the list so these elements don't appear in future calls.
-
getAggregatorValue
public <AggregateT> AggregateT getAggregatorValue(Aggregator<?,AggregateT> agg)
Returns the value of the providedAggregator
.
-
-