The ML.EVALUATE function
Use the ML.EVALUATE function to evaluate model metrics.
For information about model evaluation in BigQuery ML, see Model evaluation overview.
For information about supported model types of each SQL statement and function, and all supported SQL statements and functions for each model type, read End-to-end user journey for each model.
ML.EVALUATE syntax
```
ML.EVALUATE(MODEL model_name
            [, {TABLE table_name | (query_statement)}]
            [, STRUCT<threshold FLOAT64,
                      perform_aggregation BOOL,
                      horizon INT64,
                      confidence_level FLOAT64> settings])
```
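For example, the following sketch shows the simplest form of the call, with no input table or query supplied; the model name is the placeholder used elsewhere on this page:

```sql
-- A minimal sketch: evaluate a model without supplying input data.
-- The model name is a placeholder.
SELECT
  *
FROM
  ML.EVALUATE(MODEL `myproject.mydataset.mymodel`);
```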
model_name

model_name is the name of the model that you're evaluating. If you do not have a default project configured, prepend the project ID to the model name in the following format: `[PROJECT_ID].[DATASET].[MODEL]` (including the backticks); for example, `myproject.mydataset.mymodel`.
table_name

(Optional) table_name is the name of the input table that contains the evaluation data. If you do not have a default project configured, prepend the project ID to the table name in the following format: `[PROJECT_ID].[DATASET].[TABLE]` (including the backticks); for example, `myproject.mydataset.mytable`.

If table_name is specified, the input column names in the table must match the column names in the model, and their types should be compatible according to BigQuery implicit coercion rules. The input must have a column that matches the label column name provided during training. This value is provided using the input_label_cols option. If input_label_cols is unspecified, the column named "label" in the training data is used.
query_statement

(Optional) The query_statement clause specifies the GoogleSQL query that is used to generate the evaluation data. See the GoogleSQL Query Syntax page for the supported SQL syntax of the query_statement clause.

If query_statement is specified, the input column names from the query must match the column names in the model, and their types should be compatible according to BigQuery implicit coercion rules. The input must have a column that matches the label column name provided during training. This value is provided using the input_label_cols option. If input_label_cols is unspecified, the column named "label" in the training data is used. Extra columns are ignored.

If the TRANSFORM clause was present in the CREATE MODEL statement that created model_name, then only the input columns present in the TRANSFORM clause must appear in query_statement.
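As a sketch, the evaluation data can also come from a query; the column names feature1, feature2, and label, and the data_split filter, are hypothetical placeholders for the model's actual inputs:

```sql
-- A sketch using a query statement as the evaluation input. The column names
-- and the data_split filter are hypothetical placeholders.
SELECT
  *
FROM
  ML.EVALUATE(MODEL `myproject.mydataset.mymodel`,
    (
      SELECT
        feature1,
        feature2,
        label
      FROM
        `myproject.mydataset.mytable`
      WHERE
        data_split = 'eval'
    ));
```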
threshold

(Optional) threshold is a custom threshold for the binary classification model to use for evaluation. The default value is 0.5. The threshold value is type FLOAT64 and is part of the settings STRUCT.

A zero value for precision or recall means that the selected threshold produced no true positive labels. A NaN value for precision means that the selected threshold produced no positive labels, neither true positives nor false positives.

If both table_name and query_statement are unspecified, you cannot use a threshold. Also, threshold can only be used with binary classification models.
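As a sketch, a custom threshold is passed through the settings STRUCT; the value 0.55 here is only illustrative:

```sql
-- A sketch for a binary classification model, using a custom classification
-- threshold of 0.55 instead of the default 0.5.
SELECT
  *
FROM
  ML.EVALUATE(MODEL `myproject.mydataset.mymodel`,
    TABLE `myproject.mydataset.mytable`,
    STRUCT(0.55 AS threshold));
```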
perform_aggregation

(Optional) perform_aggregation identifies the level of evaluation for forecasting accuracy. If the value is TRUE, the forecasting accuracy is computed at the time series level. If the value is FALSE, the forecasting accuracy is computed at the timestamp level. The default value is TRUE. The perform_aggregation value is type BOOL and is part of the settings STRUCT.
horizon

(Optional) horizon is the number of forecasted time points against which the evaluation metrics are computed. The horizon value is type INT64 and is part of the settings STRUCT. The default value is the horizon value specified in the CREATE MODEL statement for time-series models, or 1000 if unspecified. When evaluating multiple time series at the same time, this parameter applies to each time series.
confidence_level

(Optional) confidence_level is the percentage of the future values that fall in the prediction interval. The confidence_level value is type FLOAT64 and is part of the settings STRUCT. The default value is 0.95. The valid input range is [0, 1).
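As a sketch, the time-series settings can be combined in a single settings STRUCT; the model name my_arima_model and the 30-point horizon are illustrative assumptions:

```sql
-- A sketch for a time-series model: compute forecasting accuracy aggregated
-- over each time series for a 30-point horizon. The names are placeholders.
SELECT
  *
FROM
  ML.EVALUATE(MODEL `myproject.mydataset.my_arima_model`,
    TABLE `myproject.mydataset.mytable`,
    STRUCT(TRUE AS perform_aggregation, 30 AS horizon));
```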
ML.EVALUATE output
The output of the ML.EVALUATE function is a single row containing common metrics applicable to the type of model supplied.
ML.EVALUATE returns the following columns for a regression model, which includes linear regression, boosted tree regressor, random forest regressor, DNN regressor, wide-and-deep regressor, and AutoML Tables regressor models:

- mean_absolute_error
- mean_squared_error
- mean_squared_log_error
- median_absolute_error
- r2_score
- explained_variance
ML.EVALUATE returns the following columns for a classification model, which includes logistic regression, boosted tree classifier, random forest classifier, DNN classifier, wide-and-deep classifier, and AutoML Tables classifier models (see the example after this list):

- precision
- recall
- accuracy
- f1_score
- log_loss
- roc_auc
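Because the output is a single row, individual metrics can be selected by name. As a sketch, with placeholder model and table names:

```sql
-- A sketch: select individual classification metrics from the single
-- output row. Model and table names are placeholders.
SELECT
  roc_auc,
  f1_score,
  log_loss
FROM
  ML.EVALUATE(MODEL `myproject.mydataset.mymodel`,
    TABLE `myproject.mydataset.mytable`);
```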
ML.EVALUATE returns the following columns for a k-means model:

- davies_bouldin_index
- mean_squared_distance
ML.EVALUATE returns the following columns for a matrix factorization model with implicit feedback:

- mean_average_precision
- mean_squared_error
- normalized_discounted_cumulative_gain
- average_rank
ML.EVALUATE returns the following columns for a matrix factorization model with explicit feedback:

- mean_absolute_error
- mean_squared_error
- mean_squared_log_error
- median_absolute_error
- r2_score
- explained_variance
ML.EVALUATE returns the following column for a PCA model:

- total_explained_variance_ratio: the percentage of the cumulative variance explained by all the returned principal components. For more information, see the ML.PRINCIPAL_COMPONENT_INFO function.
The ML.EVALUATE function returns the following columns for a trained time-series ARIMA_PLUS or ARIMA_PLUS_XREG model when the input data is provided and perform_aggregation is FALSE (see the example after this list):
- time_series_id_col or time_series_id_cols: the identifiers of a time series. Only present when forecasting multiple time series at once. The column names and types are inherited from the TIME_SERIES_ID_COL option as specified in the model creation query. The ARIMA_PLUS_XREG model does not support these columns.
- time_series_timestamp_col: the column name and type are inherited from the TIME_SERIES_TIMESTAMP_COL option as specified in the model creation query.
- time_series_data_col: the column name and type are inherited from the TIME_SERIES_DATA_COL option as specified in the model creation query.
- forecasted_time_series_data_col: the same as time_series_data_col, but with "forecasted_" prefixed to the column name.
- lower_bound: the lower bound of the prediction interval.
- upper_bound: the upper bound of the prediction interval.
- absolute_error: the absolute value of the difference between the forecasted value and the actual data value.
- absolute_percentage_error: the absolute value of the absolute error divided by the actual value.
The ML.EVALUATE function returns the following columns for a trained time-series ARIMA_PLUS or ARIMA_PLUS_XREG model when the input data is provided and perform_aggregation is TRUE:
- time_series_id_col or time_series_id_cols: the identifiers of a time series. Only present when forecasting multiple time series at once. The column names and types are inherited from the TIME_SERIES_ID_COL option as specified in the model creation query. The ARIMA_PLUS_XREG model does not support these columns.
- mean_absolute_error
- mean_squared_error
- root_mean_squared_error
- mean_absolute_percentage_error
- symmetric_mean_absolute_percentage_error