LISTIF Function

Returns a list of all values in a column for rows that match a specified condition.

NOTE: When added to a transform, this function is applied to the current sample. If you change your sample or run the job, the computed values for this function are updated. Transforms that change the number of rows in subsequent recipe steps do not affect the values computed for this step.

To perform a simple extraction of values without conditionals, use the LIST function. See LIST Function.

Basic Usage

pivot value: LISTIF(hotChocolateLiters, 500, temperature < 0) group:date limit:1

Output: Generates a two-column table containing the unique values for date and, for each date, the values from the hotChocolateLiters column in rows where the temperature value is less than 0. At most 500 values are listed per date. The limit parameter defines the maximum number of output columns.
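
The grouping and filtering described above can be approximated in plain Python. This is only a sketch of the semantics, not how the product evaluates the function: the helper name `listif`, the sample rows, and the choice to keep duplicates in input order are all assumptions made for illustration.

```python
def listif(rows, col, limit, test, group):
    """Per value of `group`, collect up to `limit` values of `col` from rows passing `test`."""
    out = {}
    for row in rows:
        bucket = out.setdefault(row[group], [])  # every group appears, even with no matches
        if test(row) and len(bucket) < limit:
            bucket.append(row[col])
    return out

# Hypothetical sample rows (not from the documentation)
rows = [
    {"date": "2017-01-01", "temperature": -3, "hotChocolateLiters": 12},
    {"date": "2017-01-01", "temperature": 2, "hotChocolateLiters": 4},
    {"date": "2017-01-01", "temperature": -1, "hotChocolateLiters": 9},
    {"date": "2017-01-02", "temperature": 5, "hotChocolateLiters": 3},
]

result = listif(rows, "hotChocolateLiters", 500,
                lambda r: r["temperature"] < 0, "date")
print(result)  # {'2017-01-01': [12, 9], '2017-01-02': []}
```

A date with no matching rows still produces an output row, here with an empty list.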

Syntax

pivot value:LISTIF(col_ref, limit_int, test_expression) [group:group_col_ref] [limit:limit_count]

| Argument | Required? | Data Type | Description |
|---|---|---|---|
| col_ref | Y | string | Reference to the column you wish to evaluate. |
| limit_int | N | integer | Maximum number of values to extract into the list array. From 1 to 1000. |
| test_expression | Y | string | Expression that is evaluated. Must resolve to true or false. |

For more information on syntax standards, see Language Documentation Syntax Notes.

For more information on the group and limit parameters, see Pivot Transform.

col_ref

Name of the column whose values you wish to use in the calculation. Column must be a numeric (Integer or Decimal) type.

Usage Notes:

| Required? | Data Type | Example Value |
|---|---|---|
| Yes | String that corresponds to the name of the column | myValues |

limit_int

Positive integer that defines the maximum number of values to extract into the list array.

NOTE: If specified, this value must be between 1 and 1000, inclusive.

NOTE: Do not use the limiting argument in a LISTIF function call on a flat aggregate, in which all values in a column have been inserted into a single cell. In this case, you might be able to use the limit argument if you also specify a group parameter. Misuse of the LISTIF function can cause the application to crash.
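
As a rough illustration of the range rule for limit_int, the check below rejects values outside 1 to 1000 and then truncates a list to the limit. The function name `check_limit` and the sample values are hypothetical, and raising an error is an assumption; the documentation does not say how the product reacts to an out-of-range value.

```python
def check_limit(limit_int):
    """Return limit_int if it is an integer from 1 to 1000 inclusive, else raise."""
    if not isinstance(limit_int, int) or not 1 <= limit_int <= 1000:
        raise ValueError("limit_int must be an integer between 1 and 1000, inclusive")
    return limit_int

values = [25, 40, 48, 81, 11]          # hypothetical column values
limited = values[:check_limit(3)]      # keep at most 3 values
print(limited)  # [25, 40, 48]
```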

test_expression

This parameter contains the expression to evaluate. This expression must resolve to a Boolean (true or false) value.

Usage Notes:

| Required? | Data Type | Example Value |
|---|---|---|
| Yes | String expression that evaluates to true or false | (LastName == 'Mouse' && FirstName == 'Mickey') |

Examples

Example - ANYIF and LISTIF Functions

This section provides simple examples for how to use the ANYIF and LISTIF functions. These functions include the following:

  • ANYIF - Identifies a single value from a group that meets a specific condition. See ANYIF Function.
  • LISTIF - Lists all values within a group that meet a specified condition. See LISTIF Function.

Source:

The following data identifies sales figures by salespeople for a week:

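| EmployeeId | Date | Sales |
|---|---|---|
| S001 | 1/23/17 | 25 |
| S002 | 1/23/17 | 40 |
| S003 | 1/23/17 | 48 |
| S001 | 1/24/17 | 81 |
| S002 | 1/24/17 | 11 |
| S003 | 1/24/17 | 25 |
| S001 | 1/25/17 | 9 |
| S002 | 1/25/17 | 40 |
| S003 | 1/25/17 | |
| S001 | 1/26/17 | 77 |
| S002 | 1/26/17 | 83 |
| S003 | 1/26/17 | |
| S001 | 1/27/17 | 17 |
| S002 | 1/27/17 | 71 |
| S003 | 1/27/17 | 29 |
| S001 | 1/28/17 | |
| S002 | 1/28/17 | |
| S003 | 1/28/17 | 14 |
| S001 | 1/29/17 | 2 |
| S002 | 1/29/17 | 7 |
| S003 | 1/29/17 | 99 |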

Transform:

In this example, you are interested in the high performers. A good day in sales is one in which an individual sells more than 80 units. First, you want to identify the day of week:

derive type:single value:WEEKDAY(Date) as:'DayOfWeek'
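
For readers who want to check the day numbers by hand, the derive step above can be mimicked in Python. This assumes the numbering runs from 1 (Monday) to 7 (Sunday), which matches Python's `isoweekday()` and is consistent with this example treating values above 5 as weekend days; the helper name `weekday` is illustrative.

```python
from datetime import datetime

def weekday(date_text):
    """Return 1 (Monday) through 7 (Sunday) for a date such as '1/23/17'."""
    return datetime.strptime(date_text, "%m/%d/%y").isoweekday()

dates = ["1/23/17", "1/27/17", "1/28/17", "1/29/17"]
day_of_week = {d: weekday(d) for d in dates}
print(day_of_week)  # {'1/23/17': 1, '1/27/17': 5, '1/28/17': 6, '1/29/17': 7}
```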

Values greater than 5 in DayOfWeek are weekend dates. You can use the following to identify whether anyone reached this high-water mark during the workweek (non-weekend):

pivot value:ANYIF(Sales, (Sales > 80 && DayOfWeek < 6)) group:EmployeeId,Date limit:1

Before adding the step to the recipe, you take note of the individuals who reached this mark in the anyif_Sales column for special recognition.
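
The ANYIF step can be sketched in Python on a few rows taken from the source table. This is an approximation under assumptions: the helper names are illustrative, the weekday numbering is taken to be 1 (Monday) through 7 (Sunday), and returning the first matching value is a guess, since the page does not say which value ANYIF picks when several match.

```python
from datetime import datetime

# A few rows copied from the source table: (EmployeeId, Date, Sales)
rows = [
    ("S001", "1/23/17", 25),
    ("S001", "1/24/17", 81),
    ("S002", "1/26/17", 83),
    ("S003", "1/29/17", 99),  # a Sunday, so excluded by DayOfWeek < 6
]

def weekday(date_text):
    return datetime.strptime(date_text, "%m/%d/%y").isoweekday()

def anyif(values, test):
    """Return one value that passes `test`, or None when nothing matches."""
    return next((v for v in values if test(v)), None)

anyif_sales = {}
for emp, date_text, sales in rows:
    is_workday = weekday(date_text) < 6
    # Each (EmployeeId, Date) group holds a single Sales value in this data
    anyif_sales[(emp, date_text)] = anyif([sales], lambda s: s > 80 and is_workday)

high_performers = {k: v for k, v in anyif_sales.items() if v is not None}
print(high_performers)  # {('S001', '1/24/17'): 81, ('S002', '1/26/17'): 83}
```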

Now, you want to find out sales for individuals during the week. You can use the following to filter the data to show sales for weekdays only:

pivot value:LISTIF(Sales, 1000, (DayOfWeek < 6)) group:EmployeeId,Date limit:1
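
A Python approximation of this step, using three rows from the source table, is shown below. The array-of-strings rendering (for example `["17"]`) is inferred from the cleanup values listed next and is an assumption, as are the helper name and the weekday numbering of 1 (Monday) through 7 (Sunday).

```python
import json
from datetime import datetime

# Rows copied from the source table: (EmployeeId, Date, Sales)
rows = [
    ("S001", "1/27/17", 17),  # Friday
    ("S003", "1/28/17", 14),  # Saturday
    ("S003", "1/29/17", 99),  # Sunday
]

def weekday(date_text):
    return datetime.strptime(date_text, "%m/%d/%y").isoweekday()

listif_sales = {}
for emp, date_text, sales in rows:
    bucket = listif_sales.setdefault((emp, date_text), [])
    if weekday(date_text) < 6 and len(bucket) < 1000:
        bucket.append(str(sales))

# Render each list as an array string (assumed output format)
rendered = {key: json.dumps(values) for key, values in listif_sales.items()}
print(rendered)
```

Weekend rows keep their place in the output but carry an empty array.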

To clean up, you might select and replace the following values in the listif_Sales column with empty strings:

["
"]
[]
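
The effect of these three replacements can be sketched in Python. The sample cell values are assumptions about how the raw listif_Sales column looks, inferred from the patterns above; the function name `clean` is illustrative.

```python
def clean(cell):
    """Strip the array wrapper from a single-value list cell."""
    for token in ('["', '"]', '[]'):
        cell = cell.replace(token, "")
    return cell

raw_cells = ['["25"]', '["81"]', '[]']   # hypothetical raw values
cleaned = [clean(c) for c in raw_cells]
print(cleaned)  # ['25', '81', '']
```

This works here because each group holds at most one value; a cell with several values would keep its inner quotes and commas.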

Results:

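| EmployeeId | Date | listif_Sales |
|---|---|---|
| S001 | 1/23/17 | 25 |
| S002 | 1/23/17 | 40 |
| S003 | 1/23/17 | 48 |
| S001 | 1/24/17 | 81 |
| S002 | 1/24/17 | 11 |
| S003 | 1/24/17 | 25 |
| S001 | 1/25/17 | 40 |
| S002 | 1/25/17 | |
| S003 | 1/25/17 | 66 |
| S001 | 1/26/17 | 77 |
| S002 | 1/26/17 | 83 |
| S003 | 1/26/17 | |
| S001 | 1/27/17 | 17 |
| S002 | 1/27/17 | 71 |
| S003 | 1/27/17 | 29 |
| S001 | 1/28/17 | |
| S002 | 1/28/17 | |
| S003 | 1/28/17 | |
| S001 | 1/29/17 | |
| S002 | 1/29/17 | |
| S003 | 1/29/17 | |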
