Aggregate Functions

Aggregate functions perform a computation against a set of values to generate a single result. For example, you could use an aggregate function to compute the average (mean) order over a period of time. Aggregations can be applied as standard functions or used as part of an aggregate transform to reshape the data.

Aggregate across an entire column:

derive value:AVERAGE(Scores)

Output: Generates a new column containing the average of all values in the Scores column.

aggregate value:AVERAGE(Scores)

Output: Generates a single-column table with a single value, which contains the average of all values in the Scores column.

NOTE: When aggregate functions are applied as part of an aggregate transform, they typically involve multiple parameters as part of an operation to reshape the dataset. See below.

Aggregate across groups of values within a column:

Aggregate functions can be used with the aggregate transform to change the structure of your data. Example:

aggregate value:AVERAGE(Scores) group:studentId

In the above instance, the resulting dataset contains two columns:

  • studentId - one row for each distinct student ID value
  • average_Scores - average score by each student (studentId)

An aggregate transform can include multiple aggregate functions and group columns from the pre-aggregate dataset.

For more information on the transform, see Aggregate Transform.

These aggregate functions are available:


Send feedback about...

Google Cloud Dataprep Documentation