An aggregate function summarizes the rows of a group into a single value. When
an aggregate function is used with the OVER
clause, it becomes a window
function, which computes values over a group of rows and then returns a single
result for each row.
Aggregate function call syntax
function_name(
[ DISTINCT ]
function_arguments
[ { IGNORE | RESPECT } NULLS ]
[ HAVING { MAX | MIN } having_expression ]
)
Description
Each aggregate function supports all or a subset of the aggregate function call syntax. You can use the following syntax to build an aggregate function:
DISTINCT
: Each distinct value ofexpression
is aggregated only once into the result.IGNORE NULLS
orRESPECT NULLS
: IfIGNORE NULLS
is specified, theNULL
values are excluded from the result. IfRESPECT NULLS
is specified, theNULL
values are included in the result. If neither is specified, theNULL
values are included in the result.HAVING MAX
orHAVING MIN
: Restricts the set of rows that the function aggregates by a maximum or minimum value. For details, see HAVING MAX and HAVING MIN clause.- If
DISTINCT
is also specified, then the sort key must be the same asexpression
.
- If
Details
The clauses in an aggregate function call are applied in the following order:
DISTINCT
IGNORE NULLS
orRESPECT NULLS
HAVING MAX
orHAVING MIN
When used in conjunction with a GROUP BY
clause, the groups summarized
typically have at least one row. When the associated SELECT
statement has
no GROUP BY
clause or when certain aggregate function modifiers filter rows
from the group to be summarized, it's possible that the aggregate function
needs to summarize an empty group.
Restrict aggregation by a maximum or minimum value
Some aggregate functions support two optional clauses that are called
HAVING MAX
and HAVING MIN
. These clauses restrict the set of rows that a
function aggregates to rows that have a maximum or minimum value in a particular
column.
HAVING MAX clause
HAVING MAX having_expression
HAVING MAX
restricts the set of rows that the function aggregates to those
having a value for having_expression
equal to the maximum value for
having_expression
within the group. The maximum value is equal to the result
of MAX(having_expression)
.
This clause ignores NULL
values when computing the maximum value unless
having_expression
evaluates to NULL
for all rows.
This clause supports all orderable data types,
except for ARRAY
.
Examples
In the following query, rows with the most inches of precipitation, 4
, are
added to a group, and then the year
for one of these rows is produced.
Which row is produced is nondeterministic, not random.
WITH
Precipitation AS (
SELECT 2009 AS year, 'spring' AS season, 3 AS inches
UNION ALL
SELECT 2001, 'winter', 4
UNION ALL
SELECT 2003, 'fall', 1
UNION ALL
SELECT 2002, 'spring', 4
UNION ALL
SELECT 2005, 'summer', 1
)
SELECT ANY_VALUE(year HAVING MAX inches) AS any_year_with_max_inches FROM Precipitation;
/*--------------------------*
| any_year_with_max_inches |
+--------------------------+
| 2001 |
*--------------------------*/
In the following example, the average rainfall is returned for 2001, the most
recent year specified in the query. First, the query gets the rows with the
maximum value in the year
column. Finally, the query averages the values in
the inches
column (9
and 1
):
WITH
Precipitation AS (
SELECT 2001 AS year, 'spring' AS season, 9 AS inches
UNION ALL
SELECT 2001, 'winter', 1
UNION ALL
SELECT 2000, 'fall', 3
UNION ALL
SELECT 2000, 'summer', 5
UNION ALL
SELECT 2000, 'spring', 7
UNION ALL
SELECT 2000, 'winter', 2
)
SELECT AVG(inches HAVING MAX year) AS average FROM Precipitation;
/*---------*
| average |
+---------+
| 5 |
*---------*/
HAVING MIN clause
HAVING MIN having_expression
HAVING MIN
restricts the set of rows that the function aggregates to those
having a value for having_expression
equal to the minimum value for
having_expression
within the group. The minimum value is equal to the result
of MIN(having_expression)
.
This clause ignores NULL
values when computing the minimum
value unless having_expression
evaluates to NULL
for all rows.
This clause supports all orderable data types,
except for ARRAY
.
Examples
In the following query, rows with the fewest inches of precipitation, 1
,
are added to a group, and then the year
for one of these rows is produced.
Which row is produced is nondeterministic, not random.
WITH
Precipitation AS (
SELECT 2009 AS year, 'spring' AS season, 3 AS inches
UNION ALL
SELECT 2001, 'winter', 4
UNION ALL
SELECT 2003, 'fall', 1
UNION ALL
SELECT 2002, 'spring', 4
UNION ALL
SELECT 2005, 'summer', 1
)
SELECT ANY_VALUE(year HAVING MIN inches) AS any_year_with_min_inches FROM Precipitation;
/*--------------------------*
| any_year_with_min_inches |
+--------------------------+
| 2003 |
*--------------------------*/
In the following example, the average rainfall is returned for 2000, the
earliest year specified in the query. First, the query gets the rows with
the minimum value in the year
column, and finally, the query averages the
values in the inches
column:
WITH
Precipitation AS (
SELECT 2001 AS year, 'spring' AS season, 9 AS inches
UNION ALL
SELECT 2001, 'winter', 1
UNION ALL
SELECT 2000, 'fall', 3
UNION ALL
SELECT 2000, 'summer', 5
UNION ALL
SELECT 2000, 'spring', 7
UNION ALL
SELECT 2000, 'winter', 2
)
SELECT AVG(inches HAVING MIN year) AS average FROM Precipitation;
/*---------*
| average |
+---------+
| 4.25 |
*---------*/
Aggregate function examples
A simple aggregate function call for COUNT
, MIN
, and MAX
looks like this:
SELECT
COUNT(*) AS total_count,
COUNT(fruit) AS non_null_count,
MIN(fruit) AS min,
MAX(fruit) AS max
FROM
(
SELECT NULL AS fruit
UNION ALL
SELECT 'apple' AS fruit
UNION ALL
SELECT 'pear' AS fruit
UNION ALL
SELECT 'orange' AS fruit
)
/*-------------+----------------+-------+------*
| total_count | non_null_count | min | max |
+-------------+----------------+-------+------+
| 4 | 3 | apple | pear |
*-------------+----------------+-------+------*/