ROLLINGLIST Function

Computes the rolling list of values before or after the current row within the specified column and returns these values as an array.
  • If an input value is missing or null, it is not factored into the computation. For example, for the first row in the dataset, there are no previous values to include in the rolling list.
  • The row from which to extract a value is determined by the order in which the rows are organized based on the order parameter.

    If you are working on a randomly generated sample of your dataset, the values that you see for this function might not correspond to the values that are generated on the full dataset during job execution.

  • The function takes a column name and three optional integer parameters that determine the maximum number of values and the window backward and forward of the current row.
    • By default, the list is limited to 1000 values. To change the maximum number of values, specify a value for the limit parameter.
    • For the window parameters, the default values are -1 and 0, which compute the rolling function from the current row back to the first row of the dataset.
  • This function works with the following transforms:

For more information on a non-rolling version of this function, see LIST Function.
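
The behavior described above can be approximated in Python. This is only a sketch, not the product's implementation: the function name `rolling_list` and its keyword arguments are hypothetical, the rows are assumed to be already sorted by the order parameter, and the choice to keep the earliest values in the window when the limit is exceeded is an assumption that the source does not confirm.

```python
def rolling_list(values, limit=1000, rows_before=-1, rows_after=0):
    """Sketch of a rolling list over an already-ordered column.

    rows_before=-1 is treated as "all rows back to the first row".
    rows_after is assumed to be zero or positive.
    """
    n = len(values)
    result = []
    for i in range(n):
        # First row of the window: the start of the data when rows_before is -1.
        start = 0 if rows_before < 0 else max(0, i - rows_before)
        # Last row of the window, clipped to the end of the data.
        end = min(n - 1, i + rows_after)
        # Missing (None) values are not factored into the list.
        window = [v for v in values[start:end + 1] if v is not None]
        # Assumption: when the window exceeds the limit, keep its earliest values.
        result.append(window[:limit])
    return result


print(rolling_list(["a", None, "b", "c"]))
print(rolling_list([1, 2, 3, 4], rows_before=1, rows_after=1))
```

With the defaults, each output list grows from the first row to the current one, which matches the default window of -1 and 0 described above.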

Basic Usage

Column example:

derive type:single value:ROLLINGLIST(myCol)

Output: Generates a new column containing the rolling list of values in the myCol column from the first row of the dataset to the current one.

Rows before example:

window value:ROLLINGLIST(myNumber, 5, 20)

Output: Generates the new column, which contains the rolling list of values of the current row and the twenty previous row values in the myNumber column, with a limit of 5 total values.

Rows before and after example:

window value:ROLLINGLIST(myNumber, 20, 299, 200)

Output: Generates the new column, which contains the rolling list of values from the previous 299 rows, the current row value, and the 200 rows after the current one in the myNumber column, with a limit of 20 total values.
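
To make the window sizes in the two windowed examples above concrete, the following Python sketch computes which rows fall inside the window and how long the resulting list can be. The helper names `window_rows` and `list_length` are hypothetical, row positions are zero-based, and the sketch assumes that the column contains no missing values.

```python
def window_rows(row_index, row_count, rows_before=-1, rows_after=0):
    """Return the (first, last) zero-based row positions of the window, inclusive."""
    first = 0 if rows_before < 0 else max(0, row_index - rows_before)
    last = min(row_count - 1, row_index + rows_after)
    return first, last


def list_length(row_index, row_count, limit=1000, rows_before=-1, rows_after=0):
    """Number of values in the list: the window size, capped by the limit."""
    first, last = window_rows(row_index, row_count, rows_before, rows_after)
    return min(limit, last - first + 1)


# ROLLINGLIST(myNumber, 5, 20) evaluated at row position 500 of 1000 rows:
print(window_rows(500, 1000, rows_before=20))          # 21 rows in the window
print(list_length(500, 1000, limit=5, rows_before=20))  # capped by the limit

# ROLLINGLIST(myNumber, 20, 299, 200) at the same row:
print(window_rows(500, 1000, rows_before=299, rows_after=200))
print(list_length(500, 1000, limit=20, rows_before=299, rows_after=200))
```

Near the start or end of the dataset the window is clipped, so early rows produce shorter lists than rows in the middle of the dataset.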

Syntax

window value:ROLLINGLIST(col_ref, limit_int, rowsBefore_integer, rowsAfter_integer) order: order_col [group: group_col]

| Argument | Required? | Data Type | Description |
| --- | --- | --- | --- |
| col_ref | Y | string | Name of column whose values are applied to the function |
| limit_int | N | integer | Maximum number of values to extract into the list array. From 1 to 1000. |
| rowsBefore_integer | N | integer | Number of rows before the current one to include in the computation |
| rowsAfter_integer | N | integer | Number of rows after the current one to include in the computation |

For more information on the order and group parameters, see Window Transform.

For more information on syntax standards, see Language Documentation Syntax Notes.

col_ref

Name of the column whose values are used to compute the function.

  • Multiple columns and wildcards are not supported.

Usage Notes:

| Required? | Data Type | Example Value |
| --- | --- | --- |
| Yes | String (column reference to Integer or Decimal values) | myColumn |

limit_int

Positive integer that defines the maximum number of values to extract into the list array.

NOTE: If specified, this value must be between 1 and 1000, inclusive.

NOTE: Do not use the limiting argument in a LIST function call on a flat aggregate, in which all values in a column have been inserted into a single cell. In this case, you might be able to use the limit argument if you also specify a group parameter. Misuse of the LIST function can cause the application to crash.

Usage Notes:

| Required? | Data Type | Example Value |
| --- | --- | --- |
| No | Integer | 50 |

rowsBefore_integer, rowsAfter_integer

Integers representing the number of rows before or after the current one from which to compute the rolling function. The current row is always included. For example, if rowsBefore is 4, the current row and the four rows before it are used in the computation.

  • rowsBefore=0 generates the current row value only.
  • rowsBefore=-1 uses all rows preceding the current one.
  • If rowsAfter is not specified, then the value 0 is applied.
  • If a group parameter is applied, then these parameter values should be no more than the maximum number of rows in the groups.

Usage Notes:

| Required? | Data Type | Example Value |
| --- | --- | --- |
| No | Integer | 4 |
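
When a group parameter is applied, the window can hold no more rows than the group contains. The Python sketch below illustrates this under the assumption that windows never cross group boundaries; that assumption, the function name `rolling_list_by_group`, and the dictionary-based row format are all hypothetical and are not taken from the product.

```python
from collections import defaultdict


def rolling_list_by_group(rows, value_key, group_key, rows_before=-1, rows_after=0):
    """Sketch: rolling list computed separately inside each group.

    rows is a list of dicts that is already sorted by the order column.
    """
    # Collect the row positions that belong to each group, preserving order.
    positions = defaultdict(list)
    for idx, row in enumerate(rows):
        positions[row[group_key]].append(idx)

    out = [None] * len(rows)
    for indices in positions.values():
        for pos, idx in enumerate(indices):
            # Window bounds are positions within the group, not the dataset.
            start = 0 if rows_before < 0 else max(0, pos - rows_before)
            end = min(len(indices) - 1, pos + rows_after)
            out[idx] = [rows[j][value_key] for j in indices[start:end + 1]]
    return out


rows = [
    {"g": "A", "v": 1},
    {"g": "B", "v": 2},
    {"g": "A", "v": 3},
    {"g": "A", "v": 4},
    {"g": "B", "v": 5},
]
print(rolling_list_by_group(rows, "v", "g", rows_before=1))
```

In this sketch, a rows_before value larger than the group size has no additional effect, because the window is clipped at the first row of the group.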

Examples

Example - Recent Finishers by Boat Type

The following dataset includes the finishing times for each boat in a race. As part of the race, each boat may be assigned one or more penalties, measured in seconds. The total time for the race is therefore computed by adding the raceTime and racePenalties columns.

For each finisher, you are interested in the boat types of the most recent finishers, ordered by total finishing time.

Source:

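| id | pilotName | boatType | raceTime | racePenalties |
| --- | --- | --- | --- | --- |
| 1 | Schmidt | Sunfish | 4573.8 | 53 |
| 2 | Bolt | Laser | 4934.21 | 11 |
| 3 | Masters | Force 5 | 4446.89 | 70 |
| 4 | Jamison | Force 5 | 4355.79 | 31 |
| 5 | Williams | Sunfish | 4675.86 | 15 |
| 6 | Hobart | Laser | 5077.5 | 50 |
| 7 | Millingham | Laser | 4940.09 | 54 |
| 8 | Nelson | Force 5 | 5116.14 | 56 |
| 9 | Greene | Sunfish | 5105.94 | 5 |
| 10 | Danielson | Laser | 4964.03 | 18 |
| 11 | Cooper | Force 5 | 5281.55 | 13 |
| 12 | Stevens | Laser | 5176.35 | 0 |
| 13 | Young | Sunfish | 5038.11 | 16 |
| 14 | Thompson | Force 5 | 5252.9 | 62 |
| 15 | McDonald | Laser | 5052.24 | 20 |
| 16 | O'Roarke | Sunfish | 5080.76 | 45 |
| 17 | Collins | Sunfish | 5176.09 | 10 |
| 18 | Wright | Laser | 5391.61 | 34 |
| 19 | Black | Sunfish | 5023.32 | 32 |
| 20 | Bush | Force 5 | 5200.37 | 28 |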

Transform:

Compute the total time for each racer by summing the raceTime column and the racePenalties column:

derive type: single value: raceTime + racePenalties as: 'totalRaceTime'
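
For readers who want to check the arithmetic outside the application, the same sum can be sketched in Python for the first three source rows. The helper name `add_total` and the dictionary layout are hypothetical; rounding to two decimals is added here only to avoid floating-point noise.

```python
rows = [
    {"pilotName": "Schmidt", "raceTime": 4573.8, "racePenalties": 53},
    {"pilotName": "Bolt", "raceTime": 4934.21, "racePenalties": 11},
    {"pilotName": "Masters", "raceTime": 4446.89, "racePenalties": 70},
]


def add_total(rows):
    """Return copies of the rows with a totalRaceTime field added."""
    return [
        dict(r, totalRaceTime=round(r["raceTime"] + r["racePenalties"], 2))
        for r in rows
    ]


with_totals = add_total(rows)
for r in with_totals:
    print(r["pilotName"], r["totalRaceTime"])
```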

You can then compute the rolling list of boat types for the current finisher and the four preceding finishers, with the rows ordered by the totalRaceTime column:

derive type: multiple value: ROLLINGLIST(boatType, 10, 4, 0) order: totalRaceTime as: 'last5FinisherBoatTypes'

Results:

| id | pilotName | boatType | last5FinisherBoatTypes | raceTime | racePenalties | totalRaceTime |
| --- | --- | --- | --- | --- | --- | --- |
| 4 | Jamison | Force 5 | ["Force 5"] | 4355.79 | 31 | 4386.79 |
| 3 | Masters | Force 5 | ["Force 5","Force 5"] | 4446.89 | 70 | 4516.89 |
| 1 | Schmidt | Sunfish | ["Force 5","Force 5","Sunfish"] | 4573.8 | 53 | 4626.8 |
| 5 | Williams | Sunfish | ["Force 5","Force 5","Sunfish","Sunfish"] | 4675.86 | 15 | 4690.86 |
| 2 | Bolt | Laser | ["Force 5","Force 5","Sunfish","Sunfish","Laser"] | 4934.21 | 11 | 4945.21 |
| 10 | Danielson | Laser | ["Force 5","Sunfish","Sunfish","Laser","Laser"] | 4964.03 | 18 | 4982.03 |
| 7 | Millingham | Laser | ["Sunfish","Sunfish","Laser","Laser","Laser"] | 4940.09 | 54 | 4994.09 |
| 13 | Young | Sunfish | ["Sunfish","Laser","Laser","Laser","Sunfish"] | 5038.11 | 16 | 5054.11 |
| 19 | Black | Sunfish | ["Laser","Laser","Laser","Sunfish","Sunfish"] | 5023.32 | 32 | 5055.32 |
| 15 | McDonald | Laser | ["Laser","Laser","Sunfish","Sunfish","Laser"] | 5052.24 | 20 | 5072.24 |
| 9 | Greene | Sunfish | ["Laser","Sunfish","Sunfish","Laser","Sunfish"] | 5105.94 | 5 | 5110.94 |
| 16 | O'Roarke | Sunfish | ["Sunfish","Sunfish","Laser","Sunfish","Sunfish"] | 5080.76 | 45 | 5125.76 |
| 6 | Hobart | Laser | ["Sunfish","Laser","Sunfish","Sunfish","Laser"] | 5077.5 | 50 | 5127.5 |
| 8 | Nelson | Force 5 | ["Laser","Sunfish","Sunfish","Laser","Force 5"] | 5116.14 | 56 | 5172.14 |
| 12 | Stevens | Laser | ["Sunfish","Sunfish","Laser","Force 5","Laser"] | 5176.35 | 0 | 5176.35 |
| 17 | Collins | Sunfish | ["Sunfish","Laser","Force 5","Laser","Sunfish"] | 5176.09 | 10 | 5186.09 |
| 20 | Bush | Force 5 | ["Laser","Force 5","Laser","Sunfish","Force 5"] | 5200.37 | 28 | 5228.37 |
| 11 | Cooper | Force 5 | ["Force 5","Laser","Sunfish","Force 5","Force 5"] | 5281.55 | 13 | 5294.55 |
| 14 | Thompson | Force 5 | ["Laser","Sunfish","Force 5","Force 5","Force 5"] | 5252.9 | 62 | 5314.9 |
| 18 | Wright | Laser | ["Sunfish","Force 5","Force 5","Force 5","Laser"] | 5391.61 | 34 | 5425.61 |
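
The results above can be cross-checked with a short Python sketch that sorts the source rows by total time and collects the boat type of the current finisher plus the four finishers before it. The function name `recent_boat_types` is hypothetical, and the sketch mirrors only the arguments used in this example (a limit of 10, four rows before, zero rows after); it is not the application's implementation.

```python
# (id, pilotName, boatType, raceTime, racePenalties) from the source table
races = [
    (1, "Schmidt", "Sunfish", 4573.8, 53),
    (2, "Bolt", "Laser", 4934.21, 11),
    (3, "Masters", "Force 5", 4446.89, 70),
    (4, "Jamison", "Force 5", 4355.79, 31),
    (5, "Williams", "Sunfish", 4675.86, 15),
    (6, "Hobart", "Laser", 5077.5, 50),
    (7, "Millingham", "Laser", 4940.09, 54),
    (8, "Nelson", "Force 5", 5116.14, 56),
    (9, "Greene", "Sunfish", 5105.94, 5),
    (10, "Danielson", "Laser", 4964.03, 18),
    (11, "Cooper", "Force 5", 5281.55, 13),
    (12, "Stevens", "Laser", 5176.35, 0),
    (13, "Young", "Sunfish", 5038.11, 16),
    (14, "Thompson", "Force 5", 5252.9, 62),
    (15, "McDonald", "Laser", 5052.24, 20),
    (16, "O'Roarke", "Sunfish", 5080.76, 45),
    (17, "Collins", "Sunfish", 5176.09, 10),
    (18, "Wright", "Laser", 5391.61, 34),
    (19, "Black", "Sunfish", 5023.32, 32),
    (20, "Bush", "Force 5", 5200.37, 28),
]


def recent_boat_types(rows, rows_before=4, limit=10):
    """Map each pilot to the boat types of the current and preceding finishers."""
    # Order rows by total race time (raceTime + racePenalties).
    ordered = sorted(rows, key=lambda r: r[3] + r[4])
    boat_types = [r[2] for r in ordered]
    result = {}
    for i, row in enumerate(ordered):
        start = max(0, i - rows_before)
        # The window ends at the current row because rowsAfter is 0.
        result[row[1]] = boat_types[start:i + 1][:limit]
    return result


last5 = recent_boat_types(races)
print(last5["Jamison"])
print(last5["Danielson"])
print(last5["Wright"])
```

Because the window never holds more than five rows here, the limit of 10 does not truncate any list in this example.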
