MODEIF Function

Computes the mode (most frequent value) from all row values in a column, according to their grouping. Input column can be of Integer or Decimal type.
  • If a row contains a missing or null value, it is not factored into the calculation. If the entire column contains no values, the function returns a null value.
  • If there is a tie in which the most occurrences of a value is shared between values, then no value is returned from the function.
  • When used in a pivot transform, the function is computed for each instance of the value specified in the group parameter. See Pivot Transform.

For a non-conditional version of this function, see MODE Function.

For a version of this function computed over a rolling window of rows, see ROLLINGMODE Function.

Basic Usage

pivot value:MODEIF(count_visits, health_status == 'sick') group:postal_code limit:1

Output: Generates a two-column table containing the unique values from the postal_code column and the mode of the values in the count_visits column as long as health_status is set to sick, for the postal_code value. The limit parameter defines the maximum number of output columns.

Syntax

pivot value:MODEIF(function_col_ref, test_expression) [group:group_col_ref] [limit:limit_count]

ArgumentRequired?Data TypeDescription
function_col_refYstringName of column to which to apply the function
test_expressionYstring

Expression that is evaluated. Must resolve to true or false

For more information on the group and limit parameters, see Pivot Transform.

For more information on syntax standards, see Language Documentation Syntax Notes.

function_col_ref

Name of the column the values of which you want to calculate the function. Column must contain Integer or Decimal values.

  • Literal values are not supported as inputs.
  • Multiple columns and wildcards are not supported.

Usage Notes:

Required?Data TypeExample Value
YesString (column reference)myValues

test_expression

This parameter contains the expression to evaluate. This expression must resolve to a Boolean (true or false) value.

Usage Notes:

Required? Data Type Example Value
YesString expression that evaluates to true or false(LastName == 'Mouse' && FirstName == 'Mickey')

Examples

Example - MODEIF function

The following data contains a list of weekly orders for 2017 across two regions (r01 and r02). You are interested in calculating the most common order count for the second half of the year, by region.

Source:

NOTE: For simplicity, only the first few rows are displayed.

DateRegionOrderCount
1/6/2017r0178
1/6/2017r0297
1/13/2017r0192
1/13/2017r0290
1/20/2017r0197
1/20/2017r0284

Transform:

To assist, you can first calculate the week number for each row:

derive type: single value: WEEKNUM(Date) as: 'weekNumber'

Then, you can use the following aggregation to determine the most common order value for each region during the second half of the year:

pivot group: Region value: MODEIF(OrderCount, weekNumber > 26) limit: 50

Results:

Regionmodeif_OrderCount
r0185
r02100

Was this page helpful? Let us know how we did:

Send feedback about...

Google Cloud Dataprep Documentation