Aggregate function calls in GoogleSQL

An aggregate function is a function that summarizes the rows of a group into a single value. When an aggregate function is used with the OVER clause, it becomes a window function, which computes values over a group of rows and then returns a single result for each row.

Aggregate function call syntax

function_name(
  [ DISTINCT ]
  function_arguments
  [ { IGNORE | RESPECT } NULLS ]
  [ HAVING { MAX | MIN } having_expression ]
)

Description

Each aggregate function supports all or a subset of the aggregate function call syntax. You can use the following syntax to build an aggregate function:

  • DISTINCT: Each distinct value of expression is aggregated only once into the result.
  • IGNORE NULLS or RESPECT NULLS: If IGNORE NULLS is specified, the NULL values are excluded from the result. If RESPECT NULLS is specified, the NULL values are included in the result. If neither is specified, the NULL values are included in the result.
  • HAVING MAX or HAVING MIN: Restricts the set of rows that the function aggregates by a maximum or minimum value. For details, see HAVING MAX and HAVING MIN clause.

    • If DISTINCT is also specified, then the sort key must be the same as expression.

Details

The clauses in an aggregate function call are applied in the following order:

  • DISTINCT
  • IGNORE NULLS or RESPECT NULLS
  • HAVING MAX or HAVING MIN

When used in conjunction with a GROUP BY clause, the groups summarized typically have at least one row. When the associated SELECT statement has no GROUP BY clause or when certain aggregate function modifiers filter rows from the group to be summarized, it is possible that the aggregate function needs to summarize an empty group.

Restrict aggregation by a maximal or minimal value

Some aggregate functions support two optional clauses that are called HAVING MAX and HAVING MIN. These clauses restrict the set of rows that a function aggregates to rows that have a maximal or minimal value in a particular column.

HAVING MAX clause

HAVING MAX having_expression

HAVING MAX restricts the set of rows that the function aggregates to those having a value for having_expression equal to the maximum value for having_expression within the group. The maximum value is equal to the result of MAX(having_expression).

This clause ignores NULL values when computing the maximum value unless having_expression evaluates to NULL for all rows.

This clause supports all orderable data types, except for ARRAY.

Examples

In the following query, rows with the most inches of precipitation, 4, are added to a group, and then the year for one of these rows is produced. Which row is produced is nondeterministic, not random.

WITH
  Precipitation AS (
    SELECT 2009 AS year, 'spring' AS season, 3 AS inches
    UNION ALL
    SELECT 2001, 'winter', 4
    UNION ALL
    SELECT 2003, 'fall', 1
    UNION ALL
    SELECT 2002, 'spring', 4
    UNION ALL
    SELECT 2005, 'summer', 1
  )
SELECT ANY_VALUE(year HAVING MAX inches) AS any_year_with_max_inches FROM Precipitation;

/*--------------------------*
 | any_year_with_max_inches |
 +--------------------------+
 | 2001                     |
 *--------------------------*/

In this example, the average rainfall is returned for 2001, the most recent year specified in the query.

WITH
  Precipitation AS (
    SELECT 2001 AS year, 'spring' AS season, 9 AS inches
    UNION ALL
    SELECT 2001, 'winter', 1
    UNION ALL
    SELECT 2000, 'fall', 3
    UNION ALL
    SELECT 2000, 'summer', 5
    UNION ALL
    SELECT 2000, 'spring', 7
    UNION ALL
    SELECT 2000, 'winter', 2
  )
SELECT AVG(inches HAVING MAX year) AS average FROM Precipitation;

/*---------*
 | average |
 +---------+
 | 5       |
 *---------*/

First, the query gets the rows with the maximum value in the year column:

/*------+--------+--------*
 | year | season | inches |
 +------+--------+--------+
 | 2001 | spring | 9      |
 | 2001 | winter | 1      |
 *------+--------+--------*/

Finally, the query averages the values in the inches column (9 and 1) with this result:

/*---------*
 | average |
 +---------+
 | 5       |
 *---------*/

HAVING MIN clause

HAVING MIN having_expression

HAVING MIN restricts the set of rows that the function aggregates to those having a value for having_expression equal to the minimum value for having_expression within the group. The minimum value is equal to the result of MIN(having_expression).

This clause ignores NULL values when computing the minimum value unless having_expression evaluates to NULL for all rows.

This clause supports all orderable data types, except for ARRAY.

Examples

In the following query, rows with the fewest inches of precipitation, 1, are added to a group, and then the year for one of these rows is produced. Which row is produced is nondeterministic, not random.

WITH
  Precipitation AS (
    SELECT 2009 AS year, 'spring' AS season, 3 AS inches
    UNION ALL
    SELECT 2001, 'winter', 4
    UNION ALL
    SELECT 2003, 'fall', 1
    UNION ALL
    SELECT 2002, 'spring', 4
    UNION ALL
    SELECT 2005, 'summer', 1
  )
SELECT ANY_VALUE(year HAVING MIN inches) AS any_year_with_min_inches FROM Precipitation;

/*--------------------------*
 | any_year_with_min_inches |
 +--------------------------+
 | 2003                     |
 *--------------------------*/

In this example, the average rainfall is returned for 2000, the earliest year specified in the query.

WITH
  Precipitation AS (
    SELECT 2001 AS year, 'spring' AS season, 9 AS inches
    UNION ALL
    SELECT 2001, 'winter', 1
    UNION ALL
    SELECT 2000, 'fall', 3
    UNION ALL
    SELECT 2000, 'summer', 5
    UNION ALL
    SELECT 2000, 'spring', 7
    UNION ALL
    SELECT 2000, 'winter', 2
  )
SELECT AVG(inches HAVING MIN year) AS average FROM Precipitation;

/*---------*
 | average |
 +---------+
 | 4.25    |
 *---------*/

First, the query gets the rows with the minimum value in the year column:

/*------+--------+--------*
 | year | season | inches |
 +------+--------+--------+
 | 2000 | fall   | 3      |
 | 2000 | summer | 5      |
 | 2000 | spring | 7      |
 | 2000 | winter | 2      |
 *------+--------+--------*/

Finally, the query averages the values in the inches column with this result:

/*---------*
 | average |
 +---------+
 | 4.25    |
 *---------*/

Aggregate function examples

A simple aggregate function call for COUNT, MIN, and MAX looks like this:

SELECT
  COUNT(*) AS total_count,
  COUNT(fruit) AS non_null_count,
  MIN(fruit) AS min,
  MAX(fruit) AS max
FROM
  (
    SELECT NULL AS fruit
    UNION ALL
    SELECT 'apple' AS fruit
    UNION ALL
    SELECT 'pear' AS fruit
    UNION ALL
    SELECT 'orange' AS fruit
  )

/*-------------+----------------+-------+------*
 | total_count | non_null_count | min   | max  |
 +-------------+----------------+-------+------+
 | 4           | 3              | apple | pear |
 *-------------+----------------+-------+------*/