The APPROX_COUNT_DISTINCT function counts the approximate number of unique items in a field.
APPROX_COUNT_DISTINCT is only available when your data comes from a BigQuery data source. For other data source types, use COUNT_DISTINCT.
Syntax
APPROX_COUNT_DISTINCT(X)
Parameters
- X - a field or expression that contains the items to be counted.
How the APPROX_COUNT_DISTINCT function works
The APPROX_COUNT_DISTINCT function takes one parameter, which can be the name of a metric, dimension, or expression of any type. APPROX_COUNT_DISTINCT returns the approximate number of unique items in that field or expression.
APPROX_COUNT_DISTINCT is more efficient in terms of query processing than COUNT_DISTINCT, but returns less exact results. If your data set is very large, or if the performance of your report is more important than exact counts, consider using APPROX_COUNT_DISTINCT. Using APPROX_COUNT_DISTINCT instead of COUNT_DISTINCT can also help reduce query costs when using BigQuery data sources.
For an in-depth explanation of how approximate aggregation works, see the BigQuery documentation.
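To make the trade-off between exact and approximate counting concrete, here is a minimal Python sketch. It is not BigQuery's implementation: it uses a K-minimum-values estimator chosen purely for illustration, and the function names and the k parameter are hypothetical. The exact count must remember every unique item, while the approximate count keeps only the k smallest hash values and estimates the total from them.

```python
import hashlib
import heapq


def exact_count_distinct(items):
    """Exact count: memory grows with the number of unique items."""
    return len(set(items))


def _hash01(item):
    """Map an item to a deterministic pseudo-random float in [0, 1)."""
    digest = hashlib.sha1(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def approx_count_distinct(items, k=1024):
    """K-minimum-values estimate (illustrative, not BigQuery's algorithm).

    Keeps only the k smallest distinct hash values, so memory is bounded
    by k regardless of how many unique items are seen.
    """
    heap = []        # max-heap (stored negated) of the k smallest hashes
    members = set()  # hash values currently held in the heap
    for item in items:
        h = _hash01(item)
        if h in members:
            continue  # duplicate of a value we already track
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # New hash is smaller than the largest kept one: swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        return len(heap)  # fewer than k distinct hashes: count is exact
    kth_smallest = -heap[0]
    return int((k - 1) / kth_smallest)


if __name__ == "__main__":
    pages = [f"page-{i % 5000}" for i in range(20000)]
    print("exact: ", exact_count_distinct(pages))
    print("approx:", approx_count_distinct(pages))
```

In this sketch a larger k gives a tighter estimate at the cost of more memory, which mirrors the accuracy versus efficiency choice between COUNT_DISTINCT and APPROX_COUNT_DISTINCT described above.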
Examples
APPROX_COUNT_DISTINCT(Page) - counts the approximate number of unique values in the Page dimension.
Limits of APPROX_COUNT_DISTINCT
- The APPROX_COUNT_DISTINCT function is only available when used with BigQuery data sources.
- For data sources that do not support APPROX_COUNT_DISTINCT, the function behaves like COUNT_DISTINCT.
- You can't apply this function to a pre-aggregated metric (Aggregation type of Auto), or to an expression that is the result of another aggregation function. For example, a formula such as APPROX_COUNT_DISTINCT(Sessions) in a Google Analytics data source will produce an error.