Approximate Aggregate Functions in Standard SQL

Approximate aggregate functions are scalable in terms of memory usage and time, but produce approximate results instead of exact results. For more background, see Approximate Aggregation.

APPROX_COUNT_DISTINCT

APPROX_COUNT_DISTINCT(expression)

Description

Returns the approximate result for COUNT(DISTINCT expression). The value returned is a statistical estimate—not necessarily the actual value.

This function is less accurate than COUNT(DISTINCT expression), but performs better on huge input.

Supported Argument Types

All data types except:

  • ARRAY
  • STRUCT

Returned Data Types

INT64

Examples

SELECT APPROX_COUNT_DISTINCT(x) as approx_distinct
FROM UNNEST([0, 1, 1, 2, 3, 5]) as x;

+-----------------+
| approx_distinct |
+-----------------+
| 5               |
+-----------------+

APPROX_QUANTILES

APPROX_QUANTILES([DISTINCT] expression, number [{IGNORE|RESPECT} NULLS])

Description

Returns the approximate boundaries for a group of expression values, where number represents the number of quantiles to create. This function returns an array of number + 1 elements, where the first element is the approximate minimum and the last element is the approximate maximum.

Supported Argument Types

expression can be any supported data type except:

  • ARRAY
  • STRUCT

number must be INT64.

Optional Clauses

The clauses are applied in the following order:

  1. DISTINCT: Each distinct value of expression is aggregated only once into the result.

  2. IGNORE NULLS or RESPECT NULLS: If IGNORE NULLS is specified or if neither is specified, the NULL values are excluded from the result. If RESPECT NULLS is specified, the NULL values are included in the result (an error is raised, however, if an array in the final query result contains a NULL element).

Returned Data Types

An ARRAY of the type specified by the expression parameter.

Returns NULL if there are zero input rows or expression evaluates to NULL for all rows.

Examples

SELECT APPROX_QUANTILES(x, 2) AS approx_quantiles
FROM UNNEST([NULL, NULL, 1, 1, 1, 4, 5, 6, 7, 8, 9, 10]) AS x;

+------------------+
| approx_quantiles |
+------------------+
| [1, 5, 10]       |
+------------------+

SELECT APPROX_QUANTILES(DISTINCT x, 2) AS approx_quantiles
FROM UNNEST([NULL, NULL, 1, 1, 1, 4, 5, 6, 7, 8, 9, 10]) AS x;

+------------------+
| approx_quantiles |
+------------------+
| [1, 6, 10]       |
+------------------+

SELECT FORMAT("%T", APPROX_QUANTILES(x, 2 RESPECT NULLS)) AS approx_quantiles
FROM UNNEST([NULL, NULL, 1, 1, 1, 4, 5, 6, 7, 8, 9, 10]) AS x;

+------------------+
| approx_quantiles |
+------------------+
| [NULL, 4, 10]    |
+------------------+

SELECT FORMAT("%T", APPROX_QUANTILES(DISTINCT x, 2 RESPECT NULLS)) AS approx_quantiles
FROM UNNEST([NULL, NULL, 1, 1, 1, 4, 5, 6, 7, 8, 9, 10]) AS x;

+------------------+
| approx_quantiles |
+------------------+
| [NULL, 6, 10]    |
+------------------+

APPROX_TOP_COUNT

APPROX_TOP_COUNT(expression, number)

Description

Returns the approximate top elements of expression. The number parameter specifies the number of elements returned.

Supported Argument Types

expression can be of any data type that the GROUP BY clause supports.

number must be INT64.

Returned Data Types

An ARRAY of type STRUCT. The STRUCT contains two fields. The first field contains an input value. The second field contains an INT64 specifying the number of times the input was returned.

Returns NULL if there are zero input rows.

Examples

SELECT APPROX_TOP_COUNT(x, 2) as approx_top_count
FROM UNNEST(["apple", "apple", "pear", "pear", "pear", "banana"]) as x;

+-------------------------+
| approx_top_count        |
+-------------------------+
| [{pear, 3}, {apple, 2}] |
+-------------------------+

NULL handling

APPROX_TOP_COUNT does not ignore NULLs in the input. For example:

SELECT APPROX_TOP_COUNT(x, 2) as approx_top_count
FROM UNNEST([NULL, "pear", "pear", "pear", "apple", NULL]) as x;

+------------------------+
| approx_top_count       |
+------------------------+
| [{pear, 3}, {NULL, 2}] |
+------------------------+

APPROX_TOP_SUM

APPROX_TOP_SUM(expression, weight, number)

Description

Returns the approximate top elements of expression, based on the sum of an assigned weight. The number parameter specifies the number of elements returned.

If the weight input is negative or NaN, this function returns an error.

Supported Argument Types

expression can be of any data type that the GROUP BY clause supports.

weight must be one of the following:

  • INT64
  • FLOAT64

number must be INT64.

Returned Data Types

An ARRAY of type STRUCT. The STRUCT contains two fields: value and sum. The value field contains the value of the input expression. The sum field is the same type asweight, and is the approximate sum of the input weight associated with the value field.

Returns NULL if there are zero input rows.

Examples

SELECT APPROX_TOP_SUM(x, weight, 2) AS approx_top_sum FROM
UNNEST([
  STRUCT("apple" AS x, 3 AS weight),
  ("pear", 2),
  ("apple", 0),
  ("banana", 5),
  ("pear", 4)
]);

+--------------------------+
| approx_top_sum           |
+--------------------------+
| [{pear, 6}, {banana, 5}] |
+--------------------------+

NULL handling

APPROX_TOP_SUM does not ignore NULL values for the expression and weight parameters.

SELECT APPROX_TOP_SUM(x, weight, 2) AS approx_top_sum FROM
UNNEST([STRUCT("apple" AS x, NULL AS weight), ("pear", 0), ("pear", NULL)]);

+----------------------------+
| approx_top_sum             |
+----------------------------+
| [{pear, 0}, {apple, NULL}] |
+----------------------------+
SELECT APPROX_TOP_SUM(x, weight, 2) AS approx_top_sum FROM
UNNEST([STRUCT("apple" AS x, 0 AS weight), (NULL, 2)]);

+-------------------------+
| approx_top_sum          |
+-------------------------+
| [{NULL, 2}, {apple, 0}] |
+-------------------------+
SELECT APPROX_TOP_SUM(x, weight, 2) AS approx_top_sum FROM
UNNEST([STRUCT("apple" AS x, 0 AS weight), (NULL, NULL)]);

+----------------------------+
| approx_top_sum             |
+----------------------------+
| [{apple, 0}, {NULL, NULL}] |
+----------------------------+
Was this page helpful? Let us know how we did:

Send feedback about...