The ML.ROBUST_SCALER function
This document describes the ML.ROBUST_SCALER function, which lets you scale a numerical expression by using statistics that are robust to outliers. The function performs the scaling by removing the median and scaling the data according to the quantile range.
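Read literally, this description corresponds to the transformation below. It is a sketch inferred from the description, not a formula given by the source. The symbols are introduced here for illustration: x is an input value, and Q_lo and Q_hi are the values at the lower and upper boundaries of the quantile range. The source does not state how the median and quantiles are estimated.

```latex
x_{\text{scaled}} = \frac{x - \operatorname{median}(x)}{Q_{\text{hi}}(x) - Q_{\text{lo}}(x)}
```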
When used in the TRANSFORM clause, the median and quantile range calculated during training are automatically used in prediction.
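As a minimal sketch of this usage, the following statement applies the function inside a TRANSFORM clause. The dataset, model, table, and column names (mydataset.sample_model, mydataset.training_data, f, label) and the choice of a linear regression model are hypothetical and serve only as illustration.

```sql
-- Hypothetical names: mydataset.sample_model, mydataset.training_data, f, label.
CREATE OR REPLACE MODEL `mydataset.sample_model`
  TRANSFORM(
    -- The median and quantile range of f are calculated during training.
    ML.ROBUST_SCALER(f) OVER () AS f_scaled,
    label
  )
  OPTIONS (
    model_type = 'LINEAR_REG',
    input_label_cols = ['label']
  )
AS
SELECT f, label
FROM `mydataset.training_data`;
```

Because the scaling is part of the model definition, prediction queries against such a model would pass the raw f column rather than a pre-scaled value.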
Syntax
ML.ROBUST_SCALER(numerical_expression [, quantile_range] [, with_median] [, with_quantile_range]) OVER()
Arguments
ML.ROBUST_SCALER takes the following arguments:
- numerical_expression: the numerical expression to scale.
- quantile_range: an array of two INT64 elements that specifies the quantile range. The first element provides the lower boundary of the range. It must be greater than 0. The second element provides the upper boundary of the range. It must be greater than the first element but less than 100. The default value is [25, 75].
- with_median: a BOOL value that specifies whether the data is centered. If TRUE, the function centers the data by removing the median before scaling. The default value is TRUE.
- with_quantile_range: a BOOL value that specifies whether the data is scaled to the quantile range. If TRUE, the data is scaled. The default value is TRUE.
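To illustrate how the optional arguments are passed positionally, the following sketch writes every argument out explicitly. The input array, the column aliases, and the values [10, 90] and FALSE are arbitrary choices for illustration.

```sql
SELECT
  f,
  -- Defaults written out explicitly: quantile_range [25, 75], both flags TRUE.
  ML.ROBUST_SCALER(f, [25, 75], TRUE, TRUE) OVER () AS default_scaled,
  -- Wider quantile range (10 to 90), still centered and scaled.
  ML.ROBUST_SCALER(f, [10, 90], TRUE, TRUE) OVER () AS wide_range_scaled,
  -- Centering only: the median is removed, but the data is not scaled.
  ML.ROBUST_SCALER(f, [25, 75], TRUE, FALSE) OVER () AS centered_only
FROM UNNEST([NULL, -3, 1, 2, 3, 4, 5]) AS f
ORDER BY f;
```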
Output
ML.ROBUST_SCALER returns a FLOAT64 value that represents the scaled numerical expression.
Example
The following example centers a set of numerical values by removing the median and then scales them by using the default quantile range [25, 75]:
SELECT f, ML.ROBUST_SCALER(f) OVER () AS output
FROM UNNEST([NULL, -3, 1, 2, 3, 4, 5]) AS f
ORDER BY f;
The output looks similar to the following:
+------+---------------------+
| f    | output              |
+------+---------------------+
| NULL | NULL                |
| -3   | -1.6666666666666667 |
| 1    | -0.3333333333333333 |
| 2    | 0.0                 |
| 3    | 0.3333333333333333  |
| 4    | 0.6666666666666666  |
| 5    | 1.0                 |
+------+---------------------+
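The non-NULL values in this output are consistent with subtracting a median of 2 and dividing by a quantile range of 3, that is, a lower quantile value of 1 and an upper quantile value of 4. These statistics are inferred from the output rather than stated by the source, so treat them as an assumption. Three rows worked by hand:

```latex
\frac{-3 - 2}{4 - 1} = -\tfrac{5}{3} \approx -1.667, \qquad
\frac{1 - 2}{4 - 1} = -\tfrac{1}{3} \approx -0.333, \qquad
\frac{5 - 2}{4 - 1} = 1
```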
What's next
- For information about feature preprocessing, see Feature preprocessing overview.
- For information about the supported SQL statements and functions for each model type, see End-to-end user journey for each model.