Google Cloud Dataflow SDK for Java, version 1.9.1
Class Combine.Globally<InputT,OutputT>
- java.lang.Object
  - com.google.cloud.dataflow.sdk.transforms.PTransform<PCollection<InputT>,PCollection<OutputT>>
    - com.google.cloud.dataflow.sdk.transforms.Combine.Globally<InputT,OutputT>
- Type Parameters:
- InputT - type of input values
- OutputT - type of output values
- All Implemented Interfaces:
- HasDisplayData, Serializable
- Enclosing class:
- Combine
public static class Combine.Globally<InputT,OutputT> extends PTransform<PCollection<InputT>,PCollection<OutputT>>
Combine.Globally<InputT, OutputT> takes a PCollection<InputT> and returns a PCollection<OutputT> whose elements are the result of combining all the elements in each window of the input PCollection, using a specified CombineFn<InputT, AccumT, OutputT>. It is common for InputT == OutputT, but not required. Common combining functions include sums, mins, maxes, and averages of numbers, conjunctions and disjunctions of booleans, statistical aggregations, etc.
Example of use:
PCollection<Integer> pc = ...;
PCollection<Integer> sum = pc.apply(
    Combine.globally(new Sum.SumIntegerFn()));
Combining can happen in parallel, with different subsets of the input PCollection being combined separately, and their intermediate results combined further, in an arbitrary tree reduction pattern, until a single result value is produced.
If the input PCollection is windowed into GlobalWindows, a default value in the GlobalWindow will be output if the input PCollection is empty. To use this with inputs with other windowing, either withoutDefaults() or asSingletonView() must be called, as the default value cannot be automatically assigned to any single window.
By default, the Coder of the output PValue<OutputT> is inferred from the concrete type of the CombineFn<InputT, AccumT, OutputT>'s output type OutputT.
See also Combine.perKey(com.google.cloud.dataflow.sdk.transforms.SerializableFunction<java.lang.Iterable<V>, V>) / Combine.PerKey and Combine.groupedValues(com.google.cloud.dataflow.sdk.transforms.SerializableFunction<java.lang.Iterable<V>, V>) / Combine.GroupedValues, which are useful for combining values associated with each key in a PCollection of KVs.
- See Also:
- Serialized Form
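The tree reduction described above can be illustrated with a small plain-Java model. This is only a sketch: SketchCombineFn, MeanFn, and TreeReductionSketch are hypothetical names that do not exist in the SDK, and the interface merely mirrors the accumulator-style shape of a combining function (create, add, merge, extract). It shows a two-level reduction in which each subset is combined separately and the partial accumulators are then merged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical plain-Java model of a combining function; NOT the SDK's CombineFn class. */
interface SketchCombineFn<InputT, AccumT, OutputT> {
  AccumT createAccumulator();
  AccumT addInput(AccumT accumulator, InputT input);
  AccumT mergeAccumulators(List<AccumT> accumulators);
  OutputT extractOutput(AccumT accumulator);
}

/** Averages integers; the accumulator is a {sum, count} pair. */
final class MeanFn implements SketchCombineFn<Integer, long[], Double> {
  @Override public long[] createAccumulator() {
    return new long[] {0L, 0L};
  }

  @Override public long[] addInput(long[] acc, Integer input) {
    return new long[] {acc[0] + input, acc[1] + 1};
  }

  @Override public long[] mergeAccumulators(List<long[]> accumulators) {
    long[] merged = createAccumulator();
    for (long[] acc : accumulators) {
      merged[0] += acc[0];
      merged[1] += acc[1];
    }
    return merged;
  }

  @Override public Double extractOutput(long[] acc) {
    // The mean of zero elements is undefined; NaN is this sketch's choice.
    return acc[1] == 0 ? Double.NaN : (double) acc[0] / acc[1];
  }
}

final class TreeReductionSketch {
  /** Combines each subset separately, then merges the partial accumulators. */
  static <I, A, O> O combineInSubsets(SketchCombineFn<I, A, O> fn, List<List<I>> subsets) {
    List<A> partials = new ArrayList<>();
    for (List<I> subset : subsets) {
      A acc = fn.createAccumulator();
      for (I input : subset) {
        acc = fn.addInput(acc, input);
      }
      partials.add(acc);
    }
    return fn.extractOutput(fn.mergeAccumulators(partials));
  }

  public static void main(String[] args) {
    List<List<Integer>> subsets = Arrays.asList(
        Arrays.asList(1, 2), Arrays.asList(3), Arrays.asList(4, 5, 6));
    System.out.println(combineInSubsets(new MeanFn(), subsets)); // prints 3.5
  }
}
```

Because the accumulator carries both sum and count, the result does not depend on how the input is split into subsets, which is what makes an arbitrary reduction tree safe for this function.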
Field Summary
-
Fields inherited from class com.google.cloud.dataflow.sdk.transforms.PTransform
name
Method Summary
Modifier and Type, Method and Description:
- PCollection<OutputT> apply(PCollection<InputT> input)
  Applies this PTransform on the given InputT, and returns its Output.
- Combine.GloballyAsSingletonView<InputT,OutputT> asSingletonView()
  Returns a PTransform that produces a PCollectionView whose elements are the result of combining elements per-window in the input PCollection.
- Combine.Globally<InputT,OutputT> named(String name)
  Return a new Globally transform that's like this transform but with the specified name.
- void populateDisplayData(DisplayData.Builder builder)
  Register display data for the given transform or component.
- Combine.Globally<InputT,OutputT> withFanout(int fanout)
  Returns a PTransform identical to this, but that uses an intermediate node to combine parts of the data to reduce load on the final global combine step.
- Combine.Globally<InputT,OutputT> withoutDefaults()
  Returns a PTransform identical to this, but that does not attempt to provide a default value in the case of empty input.
- Combine.Globally<InputT,OutputT> withSideInputs(Iterable<? extends PCollectionView<?>> sideInputs)
  Returns a PTransform identical to this, but with the specified side inputs to use in CombineWithContext.CombineFnWithContext.
Methods inherited from class com.google.cloud.dataflow.sdk.transforms.PTransform
getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, toString, validate
Method Detail
-
named
public Combine.Globally<InputT,OutputT> named(String name)
Return a new Globally transform that's like this transform but with the specified name. Does not modify this transform.
-
asSingletonView
public Combine.GloballyAsSingletonView<InputT,OutputT> asSingletonView()
Returns a PTransform that produces a PCollectionView whose elements are the result of combining elements per-window in the input PCollection. If a value is requested from the view for a window that is not present, the result of applying the CombineFn to an empty input set will be returned.
-
withoutDefaults
public Combine.Globally<InputT,OutputT> withoutDefaults()
Returns a PTransform identical to this, but that does not attempt to provide a default value in the case of empty input. Required when the input is not globally windowed and the output is not being used as a side input.
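The difference between the default behavior and withoutDefaults() on empty input can be modeled in plain Java. The following is a sketch under stated assumptions, not SDK code: DefaultValueSketch and its two methods are hypothetical names, the combining function is reduced to an identity value plus a binary operator, and "no output element" is represented as an empty Optional.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.function.BinaryOperator;

/** Hypothetical model of empty-input handling; not part of the SDK. */
final class DefaultValueSketch {
  /** With defaults: an empty input yields the result of combining zero elements. */
  static <T> T combineWithDefault(List<T> inputs, T identity, BinaryOperator<T> op) {
    T result = identity;
    for (T input : inputs) {
      result = op.apply(result, input);
    }
    return result;
  }

  /** Without defaults: an empty input yields no output element at all. */
  static <T> Optional<T> combineWithoutDefault(
      List<T> inputs, T identity, BinaryOperator<T> op) {
    if (inputs.isEmpty()) {
      return Optional.empty();
    }
    return Optional.of(combineWithDefault(inputs, identity, op));
  }

  public static void main(String[] args) {
    List<Integer> empty = Collections.emptyList();
    System.out.println(combineWithDefault(empty, 0, Integer::sum));    // prints 0
    System.out.println(combineWithoutDefault(empty, 0, Integer::sum)); // prints Optional.empty
    System.out.println(
        combineWithoutDefault(Arrays.asList(1, 2, 3), 0, Integer::sum)); // prints Optional[6]
  }
}
```

In the SDK itself, the corresponding call on the class-level example would presumably be Combine.globally(new Sum.SumIntegerFn()).withoutDefaults(), following the method signature shown above.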
-
withFanout
public Combine.Globally<InputT,OutputT> withFanout(int fanout)
Returns a PTransform identical to this, but that uses an intermediate node to combine parts of the data to reduce load on the final global combine step.
The fanout parameter determines the number of intermediate keys that will be used.
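The idea of intermediate keys can be sketched in plain Java as follows. FanoutSketch and combineWithFanout are hypothetical names, and the round-robin assignment of elements to keys is an assumption made only for illustration; the description above does not say how the SDK distributes elements among the intermediate keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

/** Hypothetical model of a fanned-out global combine; not part of the SDK. */
final class FanoutSketch {
  static <T> T combineWithFanout(
      List<T> inputs, int fanout, T identity, BinaryOperator<T> op) {
    if (fanout < 1) {
      throw new IllegalArgumentException("fanout must be at least 1");
    }
    // One partial result per intermediate key, each starting at the identity.
    List<T> partials = new ArrayList<>();
    for (int key = 0; key < fanout; key++) {
      partials.add(identity);
    }
    // Assumption: elements are spread over the keys round-robin.
    for (int i = 0; i < inputs.size(); i++) {
      int key = i % fanout;
      partials.set(key, op.apply(partials.get(key), inputs.get(i)));
    }
    // The final step only has to combine `fanout` partial results.
    T result = identity;
    for (T partial : partials) {
      result = op.apply(result, partial);
    }
    return result;
  }

  public static void main(String[] args) {
    List<Integer> inputs = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    System.out.println(combineWithFanout(inputs, 3, 0, Integer::sum)); // prints 55
  }
}
```

The final loop touches only fanout values rather than every input element, which is the load reduction the method description refers to. The model is only valid for associative and commutative operators such as sum.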
-
withSideInputs
public Combine.Globally<InputT,OutputT> withSideInputs(Iterable<? extends PCollectionView<?>> sideInputs)
Returns a PTransform identical to this, but with the specified side inputs to use in CombineWithContext.CombineFnWithContext.
-
apply
public PCollection<OutputT> apply(PCollection<InputT> input)
Description copied from class: PTransform
Applies this PTransform on the given InputT, and returns its Output.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
The default implementation throws an exception. A derived class must either implement apply, or else each runner must supply a custom implementation via PipelineRunner.apply(com.google.cloud.dataflow.sdk.transforms.PTransform<InputT, OutputT>, InputT).
- Overrides:
apply in class PTransform<PCollection<InputT>,PCollection<OutputT>>
-
populateDisplayData
public void populateDisplayData(DisplayData.Builder builder)
Description copied from class: PTransform
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
- Specified by:
populateDisplayData in interface HasDisplayData
- Overrides:
populateDisplayData in class PTransform<PCollection<InputT>,PCollection<OutputT>>
- Parameters:
builder - The builder to populate with display data.
- See Also:
HasDisplayData