Google Cloud Dataflow SDK for Java, version 1.9.1
Class Top
java.lang.Object
  com.google.cloud.dataflow.sdk.transforms.Top
public class Top extends Object
PTransforms for finding the largest (or smallest) set of elements in a PCollection, or the largest (or smallest) set of values associated with each key in a PCollection of KVs.
Nested Class Summary
Nested Classes (Modifier and Type, Class and Description)

static class Top.Largest<T extends Comparable<? super T>>
    A Serializable Comparator that uses the compared elements' natural ordering.
static class Top.Smallest<T extends Comparable<? super T>>
    A Serializable Comparator that uses the reverse of the compared elements' natural ordering.
static class Top.TopCombineFn<T,ComparatorT extends Comparator<T> & Serializable>
    CombineFn for Top transforms that combines a bunch of Ts into a single count-long List<T>, using compareFn to choose the largest Ts.
Method Summary
Static Methods (Modifier and Type, Method and Description)

static <T extends Comparable<T>> Combine.Globally<T,List<T>> largest(int count)
    Returns a PTransform that takes an input PCollection<T> and returns a PCollection<List<T>> with a single element containing the largest count elements of the input PCollection<T>, in decreasing order, sorted according to their natural order.
static <K,V extends Comparable<V>> Combine.PerKey<K,V,List<V>> largestPerKey(int count)
    Returns a PTransform that takes an input PCollection<KV<K, V>> and returns a PCollection<KV<K, List<V>>> that contains an output element mapping each distinct key in the input PCollection to the largest count values associated with that key in the input PCollection<KV<K, V>>, in decreasing order, sorted according to their natural order.
static <T,ComparatorT extends Comparator<T> & Serializable> Combine.Globally<T,List<T>> of(int count, ComparatorT compareFn)
    Returns a PTransform that takes an input PCollection<T> and returns a PCollection<List<T>> with a single element containing the largest count elements of the input PCollection<T>, in decreasing order, sorted using the given Comparator<T>.
static <K,V,ComparatorT extends Comparator<V> & Serializable> PTransform<PCollection<KV<K,V>>,PCollection<KV<K,List<V>>>> perKey(int count, ComparatorT compareFn)
    Returns a PTransform that takes an input PCollection<KV<K, V>> and returns a PCollection<KV<K, List<V>>> that contains an output element mapping each distinct key in the input PCollection to the largest count values associated with that key in the input PCollection<KV<K, V>>, in decreasing order, sorted using the given Comparator<V>.
static <T extends Comparable<T>> Combine.Globally<T,List<T>> smallest(int count)
    Returns a PTransform that takes an input PCollection<T> and returns a PCollection<List<T>> with a single element containing the smallest count elements of the input PCollection<T>, in increasing order, sorted according to their natural order.
static <K,V extends Comparable<V>> PTransform<PCollection<KV<K,V>>,PCollection<KV<K,List<V>>>> smallestPerKey(int count)
    Returns a PTransform that takes an input PCollection<KV<K, V>> and returns a PCollection<KV<K, List<V>>> that contains an output element mapping each distinct key in the input PCollection to the smallest count values associated with that key in the input PCollection<KV<K, V>>, in increasing order, sorted according to their natural order.
Method Detail
-
of
public static <T,ComparatorT extends Comparator<T> & Serializable> Combine.Globally<T,List<T>> of(int count, ComparatorT compareFn)
Returns a PTransform that takes an input PCollection<T> and returns a PCollection<List<T>> with a single element containing the largest count elements of the input PCollection<T>, in decreasing order, sorted using the given Comparator<T>. The Comparator<T> must also be Serializable.

If count is greater than the number of elements in the input PCollection, then all the elements of the input PCollection will be in the resulting List, albeit in sorted order.

All the elements of the result's List must fit into the memory of a single machine.

Example of use:

    PCollection<Student> students = ...;
    PCollection<List<Student>> top10Students =
        students.apply(Top.of(10, new CompareStudentsByAvgGrade()));
By default, the Coder of the output PCollection is a ListCoder of the Coder of the elements of the input PCollection.

If the input PCollection is windowed into GlobalWindows, an empty List<T> in the GlobalWindow will be output if the input PCollection is empty. To use this with inputs with other windowing, either withoutDefaults or asSingletonView must be called.

See also smallest(int) and largest(int), which sort Comparable elements using their natural ordering.

See also perKey(int, ComparatorT), smallestPerKey(int), and largestPerKey(int), which take a PCollection of KVs and return the top values associated with each key.
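
The example above refers to a CompareStudentsByAvgGrade comparator without showing it. The following is a minimal sketch of what such a comparator might look like, using only the Java standard library. The Student type, its fields, and the topOf helper are hypothetical names introduced for illustration. topOf is a local stand-in for the documented result of Top.of (the largest count elements, in decreasing order), not the SDK's implementation.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class TopOfSketch {
    /** Hypothetical element type; this page does not define Student. */
    static final class Student implements Serializable {
        final String name;
        final double avgGrade;

        Student(String name, double avgGrade) {
            this.name = name;
            this.avgGrade = avgGrade;
        }
    }

    /** Orders students by average grade. Also Serializable, as Top.of requires. */
    static final class CompareStudentsByAvgGrade
            implements Comparator<Student>, Serializable {
        @Override
        public int compare(Student a, Student b) {
            return Double.compare(a.avgGrade, b.avgGrade);
        }
    }

    /** In-memory stand-in: largest `count` elements, in decreasing compareFn order. */
    static <T> List<T> topOf(int count, Comparator<T> compareFn, List<T> input) {
        List<T> sorted = new ArrayList<>(input);
        sorted.sort(compareFn.reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(count, sorted.size())));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("Ada", 3.9), new Student("Bo", 3.1), new Student("Cy", 3.5));
        for (Student s : topOf(2, new CompareStudentsByAvgGrade(), students)) {
            System.out.println(s.name + " " + s.avgGrade); // Ada 3.9, then Cy 3.5
        }
    }
}
```

In a pipeline, an instance of the comparator would be passed as the compareFn argument, as in the example of use above.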
-
smallest
public static <T extends Comparable<T>> Combine.Globally<T,List<T>> smallest(int count)
Returns a PTransform that takes an input PCollection<T> and returns a PCollection<List<T>> with a single element containing the smallest count elements of the input PCollection<T>, in increasing order, sorted according to their natural order.

If count is greater than the number of elements in the input PCollection, then all the elements of the input PCollection will be in the resulting PCollection's List, albeit in sorted order.

All the elements of the result's List must fit into the memory of a single machine.

Example of use:

    PCollection<Integer> values = ...;
    PCollection<List<Integer>> smallest10Values = values.apply(Top.smallest(10));
By default, the Coder of the output PCollection is a ListCoder of the Coder of the elements of the input PCollection.

If the input PCollection is windowed into GlobalWindows, an empty List<T> in the GlobalWindow will be output if the input PCollection is empty. To use this with inputs with other windowing, either withoutDefaults or asSingletonView must be called.

See also largest(int).

See also of(int, ComparatorT), which sorts using a user-specified Comparator function.

See also perKey(int, ComparatorT), smallestPerKey(int), and largestPerKey(int), which take a PCollection of KVs and return the top values associated with each key.
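
The edge cases described for smallest(int) can be easier to see on a small in-memory list. The sketch below mirrors the documented result only: it is plain Java over a List, not a pipeline, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SmallestSketch {
    /** In-memory stand-in for the single output List described above. */
    static <T extends Comparable<T>> List<T> smallest(int count, List<T> input) {
        List<T> sorted = new ArrayList<>(input);
        Collections.sort(sorted);                 // natural order, increasing
        int n = Math.min(count, sorted.size());   // never more than the input holds
        return new ArrayList<>(sorted.subList(0, n));
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(7, 2, 9, 4);
        System.out.println(smallest(2, values));   // [2, 4]
        // count exceeds the input size: every element is returned, sorted.
        System.out.println(smallest(10, values));  // [2, 4, 7, 9]
        // Empty input yields an empty list.
        System.out.println(smallest(10, Collections.<Integer>emptyList())); // []
    }
}
```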
-
largest
public static <T extends Comparable<T>> Combine.Globally<T,List<T>> largest(int count)
Returns a PTransform that takes an input PCollection<T> and returns a PCollection<List<T>> with a single element containing the largest count elements of the input PCollection<T>, in decreasing order, sorted according to their natural order.

If count is greater than the number of elements in the input PCollection, then all the elements of the input PCollection will be in the resulting PCollection's List, albeit in sorted order.

All the elements of the result's List must fit into the memory of a single machine.

Example of use:

    PCollection<Integer> values = ...;
    PCollection<List<Integer>> largest10Values = values.apply(Top.largest(10));
By default, the Coder of the output PCollection is a ListCoder of the Coder of the elements of the input PCollection.

If the input PCollection is windowed into GlobalWindows, an empty List<T> in the GlobalWindow will be output if the input PCollection is empty. To use this with inputs with other windowing, either withoutDefaults or asSingletonView must be called.

See also smallest(int).

See also of(int, ComparatorT), which sorts using a user-specified Comparator function.

See also perKey(int, ComparatorT), smallestPerKey(int), and largestPerKey(int), which take a PCollection of KVs and return the top values associated with each key.
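
The memory note above concerns the result List, whose length is at most count. One common way to compute such a result while holding only count elements at a time is a bounded min-heap. The sketch below illustrates that general technique in plain Java. It is an assumption for illustration, not a description of how the SDK computes the result, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class LargestSketch {
    /** Keeps at most `count` elements in memory while scanning the input once. */
    static <T extends Comparable<T>> List<T> largest(int count, Iterable<T> input) {
        if (count <= 0) {
            return new ArrayList<>();
        }
        // Min-heap: the smallest element kept so far sits on top.
        PriorityQueue<T> heap = new PriorityQueue<>();
        for (T value : input) {
            if (heap.size() < count) {
                heap.add(value);
            } else if (value.compareTo(heap.peek()) > 0) {
                heap.poll();       // evict the current smallest
                heap.add(value);   // admit the larger newcomer
            }
        }
        List<T> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder()); // decreasing natural order
        return result;
    }

    public static void main(String[] args) {
        System.out.println(largest(3, Arrays.asList(5, 1, 9, 3, 7))); // [9, 7, 5]
        System.out.println(largest(10, Arrays.asList(2, 1)));         // [2, 1]
    }
}
```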
-
perKey
public static <K,V,ComparatorT extends Comparator<V> & Serializable> PTransform<PCollection<KV<K,V>>,PCollection<KV<K,List<V>>>> perKey(int count, ComparatorT compareFn)
Returns a PTransform that takes an input PCollection<KV<K, V>> and returns a PCollection<KV<K, List<V>>> that contains an output element mapping each distinct key in the input PCollection to the largest count values associated with that key in the input PCollection<KV<K, V>>, in decreasing order, sorted using the given Comparator<V>. The Comparator<V> must also be Serializable.

If there are fewer than count values associated with a particular key, then all those values will be in the result mapping for that key, albeit in sorted order.

All the values associated with a single key must fit into the memory of a single machine, but there can be many more KVs in the resulting PCollection than can fit into the memory of a single machine.

Example of use:

    PCollection<KV<School, Student>> studentsBySchool = ...;
    PCollection<KV<School, List<Student>>> top10StudentsBySchool =
        studentsBySchool.apply(Top.perKey(10, new CompareStudentsByAvgGrade()));
By default, the Coder of the keys of the output PCollection is the same as that of the keys of the input PCollection, and the Coder of the values of the output PCollection is a ListCoder of the Coder of the values of the input PCollection.

See also smallestPerKey(int) and largestPerKey(int), which sort Comparable<V> values using their natural ordering.

See also of(int, ComparatorT), smallest(int), and largest(int), which take a PCollection and return the top elements.
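
The per-key result shape can be sketched with a Map standing in for the output PCollection<KV<K, List<V>>>. This is plain Java over an in-memory list of entries, with hypothetical class and method names. It mirrors the documented output (one entry per distinct key, holding up to count values in decreasing comparator order) and makes no claim about the SDK's internals.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PerKeySketch {
    /** Groups values by key, then keeps the largest `count` values per key. */
    static <K, V> Map<K, List<V>> topPerKey(
            int count, Comparator<V> compareFn, List<Map.Entry<K, V>> input) {
        Map<K, List<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> kv : input) {
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        for (Map.Entry<K, List<V>> e : grouped.entrySet()) {
            List<V> values = e.getValue();
            values.sort(compareFn.reversed()); // decreasing order under compareFn
            e.setValue(new ArrayList<>(values.subList(0, Math.min(count, values.size()))));
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> input = new ArrayList<>();
        input.add(new SimpleEntry<>("a", 3));
        input.add(new SimpleEntry<>("b", 10));
        input.add(new SimpleEntry<>("a", 8));
        input.add(new SimpleEntry<>("a", 5));
        // Key "b" has fewer than 2 values, so its single value is returned.
        System.out.println(topPerKey(2, Comparator.<Integer>naturalOrder(), input));
        // {a=[8, 5], b=[10]}
    }
}
```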
-
smallestPerKey
public static <K,V extends Comparable<V>> PTransform<PCollection<KV<K,V>>,PCollection<KV<K,List<V>>>> smallestPerKey(int count)
Returns a PTransform that takes an input PCollection<KV<K, V>> and returns a PCollection<KV<K, List<V>>> that contains an output element mapping each distinct key in the input PCollection to the smallest count values associated with that key in the input PCollection<KV<K, V>>, in increasing order, sorted according to their natural order.

If there are fewer than count values associated with a particular key, then all those values will be in the result mapping for that key, albeit in sorted order.

All the values associated with a single key must fit into the memory of a single machine, but there can be many more KVs in the resulting PCollection than can fit into the memory of a single machine.

Example of use:

    PCollection<KV<String, Integer>> keyedValues = ...;
    PCollection<KV<String, List<Integer>>> smallest10ValuesPerKey =
        keyedValues.apply(Top.smallestPerKey(10));
By default, the Coder of the keys of the output PCollection is the same as that of the keys of the input PCollection, and the Coder of the values of the output PCollection is a ListCoder of the Coder of the values of the input PCollection.

See also largestPerKey(int).

See also perKey(int, ComparatorT), which sorts values using a user-specified Comparator function.

See also of(int, ComparatorT), smallest(int), and largest(int), which take a PCollection and return the top elements.
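
The nested class summary describes Top.Smallest as a Comparator that uses the reverse of the natural ordering. One reading of this is that the smallest count values are simply the "top" count values under that reversed ordering, which also explains why they come out in increasing order. The sketch below checks that reading on the values of a single key, in plain Java. The class and method names are hypothetical and the code is not taken from the SDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SmallestViaReverseSketch {
    /** Largest `count` values under compareFn, in decreasing compareFn order. */
    static <T> List<T> top(int count, Comparator<T> compareFn, List<T> values) {
        List<T> sorted = new ArrayList<>(values);
        sorted.sort(compareFn.reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(count, sorted.size())));
    }

    /** Smallest `count` values, expressed as top-count under reversed natural order. */
    static <T extends Comparable<T>> List<T> smallest(int count, List<T> values) {
        return top(count, Comparator.<T>reverseOrder(), values);
    }

    public static void main(String[] args) {
        List<Integer> valuesForOneKey = Arrays.asList(7, 2, 9, 4);
        System.out.println(top(2, Comparator.<Integer>naturalOrder(), valuesForOneKey)); // [9, 7]
        System.out.println(smallest(2, valuesForOneKey));                                // [2, 4]
    }
}
```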
-
largestPerKey
public static <K,V extends Comparable<V>> Combine.PerKey<K,V,List<V>> largestPerKey(int count)
Returns a PTransform that takes an input PCollection<KV<K, V>> and returns a PCollection<KV<K, List<V>>> that contains an output element mapping each distinct key in the input PCollection to the largest count values associated with that key in the input PCollection<KV<K, V>>, in decreasing order, sorted according to their natural order.

If there are fewer than count values associated with a particular key, then all those values will be in the result mapping for that key, albeit in sorted order.

All the values associated with a single key must fit into the memory of a single machine, but there can be many more KVs in the resulting PCollection than can fit into the memory of a single machine.

Example of use:

    PCollection<KV<String, Integer>> keyedValues = ...;
    PCollection<KV<String, List<Integer>>> largest10ValuesPerKey =
        keyedValues.apply(Top.largestPerKey(10));
By default, the Coder of the keys of the output PCollection is the same as that of the keys of the input PCollection, and the Coder of the values of the output PCollection is a ListCoder of the Coder of the values of the input PCollection.

See also smallestPerKey(int).

See also perKey(int, ComparatorT), which sorts values using a user-specified Comparator function.

See also of(int, ComparatorT), smallest(int), and largest(int), which take a PCollection and return the top elements.