Differentially private aggregate functions

GoogleSQL for BigQuery supports differentially private aggregate functions. For an explanation of how aggregate functions work, see Aggregate function calls.

You can only use differentially private aggregate functions with differentially private queries in a differential privacy clause.

Function list

Name Summary
AVG DIFFERENTIAL_PRIVACY-supported AVG.

Gets the differentially-private average of non-NULL, non-NaN values in a query with a DIFFERENTIAL_PRIVACY clause.
COUNT DIFFERENTIAL_PRIVACY-supported COUNT.

Signature 1: Gets the differentially-private count of rows in a query with a DIFFERENTIAL_PRIVACY clause.

Signature 2: Gets the differentially-private count of rows with a non-NULL expression in a query with a DIFFERENTIAL_PRIVACY clause.
PERCENTILE_CONT DIFFERENTIAL_PRIVACY-supported PERCENTILE_CONT.

Computes a differentially-private percentile across privacy unit columns in a query with a DIFFERENTIAL_PRIVACY clause.
SUM DIFFERENTIAL_PRIVACY-supported SUM.

Gets the differentially-private sum of non-NULL, non-NaN values in a query with a DIFFERENTIAL_PRIVACY clause.

AVG (DIFFERENTIAL_PRIVACY)

WITH DIFFERENTIAL_PRIVACY ...
  AVG(
    expression,
    [contribution_bounds_per_group => (lower_bound, upper_bound)]
  )

Description

Returns the average of non-NULL, non-NaN values in the expression. This function first computes the average per privacy unit column, and then computes the final result by averaging these averages.

This function must be used with the DIFFERENTIAL_PRIVACY clause and can support the following arguments:

  • expression: The input expression. This can be any numeric input type, such as INT64.
  • contribution_bounds_per_group: The contribution bounds named argument. Perform clamping per each group separately before performing intermediate grouping on the privacy unit column.

Return type

FLOAT64

Examples

The following differentially private query gets the average number of each item requested per professor. Smaller aggregations might not be included. This query references a table called professors.

-- With noise, using the epsilon parameter.
SELECT
  WITH DIFFERENTIAL_PRIVACY
    OPTIONS(epsilon=10, delta=.01, max_groups_contributed=1, privacy_unit_column=id)
    item,
    AVG(quantity, contribution_bounds_per_group => (0,100)) average_quantity
FROM professors
GROUP BY item;

-- These results will change each time you run the query.
-- Smaller aggregations might be removed.
/*----------+------------------*
 | item     | average_quantity |
 +----------+------------------+
 | pencil   | 38.5038356810269 |
 | pen      | 13.4725028762032 |
 *----------+------------------*/
-- Without noise, using the epsilon parameter.
-- (this un-noised version is for demonstration only)
SELECT
  WITH DIFFERENTIAL_PRIVACY
    OPTIONS(epsilon=1e20, delta=.01, max_groups_contributed=1, privacy_unit_column=id)
    item,
    AVG(quantity) average_quantity
FROM professors
GROUP BY item;

-- These results will not change when you run the query.
/*----------+------------------*
 | item     | average_quantity |
 +----------+------------------+
 | scissors | 8                |
 | pencil   | 40               |
 | pen      | 18.5             |
 *----------+------------------*/

COUNT (DIFFERENTIAL_PRIVACY)

  • Signature 1: Returns the number of rows in a differentially private FROM clause.
  • Signature 2: Returns the number of non-NULL values in an expression.

Signature 1

WITH DIFFERENTIAL_PRIVACY ...
  COUNT(
    *,
    [contribution_bounds_per_group => (lower_bound, upper_bound)]
  )

Description

Returns the number of rows in the differentially private FROM clause. The final result is an aggregation across a privacy unit column.

This function must be used with the DIFFERENTIAL_PRIVACY clause and can support the following argument:

  • contribution_bounds_per_group: The contribution bounds named argument. Perform clamping per each group separately before performing intermediate grouping on the privacy unit column.

Return type

INT64

Examples

The following differentially private query counts the number of requests for each item. This query references a table called professors.

-- With noise, using the epsilon parameter.
SELECT
  WITH DIFFERENTIAL_PRIVACY
    OPTIONS(epsilon=10, delta=.01, max_groups_contributed=1, privacy_unit_column=id)
    item,
    COUNT(*, contribution_bounds_per_group=>(0, 100)) times_requested
FROM professors
GROUP BY item;

-- These results will change each time you run the query.
-- Smaller aggregations might be removed.
/*----------+-----------------*
 | item     | times_requested |
 +----------+-----------------+
 | pencil   | 5               |
 | pen      | 2               |
 *----------+-----------------*/
-- Without noise, using the epsilon parameter.
-- (this un-noised version is for demonstration only)
SELECT
  WITH DIFFERENTIAL_PRIVACY
    OPTIONS(epsilon=1e20, delta=.01, max_groups_contributed=1, privacy_unit_column=id)
    item,
    COUNT(*, contribution_bounds_per_group=>(0, 100)) times_requested
FROM professors
GROUP BY item;

-- These results will not change when you run the query.
/*----------+-----------------*
 | item     | times_requested |
 +----------+-----------------+
 | scissors | 1               |
 | pencil   | 4               |
 | pen      | 3               |
 *----------+-----------------*/

Signature 2

WITH DIFFERENTIAL_PRIVACY ...
  COUNT(
    expression,
    [contribution_bounds_per_group => (lower_bound, upper_bound)]
  )

Description

Returns the number of non-NULL expression values. The final result is an aggregation across a privacy unit column.

This function must be used with the DIFFERENTIAL_PRIVACY clause and can support these arguments:

  • expression: The input expression. This expression can b