COALESCE Function

Function returns the first non-missing value found in an array of columns.

The order of the columns listed in the function determines the order in which they are searched.

Basic Usage

derive type:single value: COALESCE([col1,col2,col3]) as: 'firstValue'

Output: Generates the firstValue column, which contains the first non-missing detected in col1, col2, or col3 in that order.

Syntax

derive type:single value: COALESCE([col_ref1,col_ref2, col_ref3])

A reference to a single column does not require brackets. References to multiple columns must be passed to the function as an array of column names.

ArgumentRequired?Data TypeDescription
col_ref1YstringName of the first column to find the first non-missing value
col_ref2NstringName of the second column to find the first non-missing value
col_ref3NstringName of the third column to find the first non-missing value

For more information on syntax standards, see Language Documentation Syntax Notes.

col_ref1, col_ref2, col_ref3

Name of the column(s) searched for the first non-missing value.

Usage Notes:

Required?Data TypeExample Value
YesString (column reference)[myColumn1, myColumn2]

Examples

Example - Find first time

You are tracking multiple racers across multiple heats. Racers might sit out heats for various reasons.

Source:

Here's the race data.

RacerHeat1Heat2Heat3
Racer X 38.2237.61
Racer Y41.33 38.04
Racer Z39.2739.0438.85

Transform:

Use the following transform to grab the first non-missing value from the Heat columns:

derive type:single value:COALESCE([Heat1, Heat2, Heat3]) as:'firstTime'

Results:

RacerHeat1Heat2Heat3firstTime
Racer X 38.2237.6138.22
Racer Y41.33 38.0441.33
Racer Z39.2739.0438.8539.27

Was this page helpful? Let us know how we did:

Send feedback about...

Google Cloud Dataprep Documentation
Need help? Visit our support page.