Hyperparameter tuning for CREATE MODEL statements
BigQuery ML supports hyperparameter tuning when you train ML models by using
CREATE MODEL statements. Hyperparameter tuning is commonly used to improve
model performance by searching for the optimal hyperparameter values. The
supported model types, along with their tunable hyperparameters and
objectives, are listed in the Hyperparameters and objectives section.
For information about enabling new features and for answers to common questions about fine-tuning machine learning models with BigQuery ML hyperparameter tuning, see the Q&A section below.
For information about BigQuery ML hyperparameter tuning, see Hyperparameter tuning overview.
For information about the model types supported by each SQL statement and function, and the SQL statements and functions supported by each model type, see End-to-end user journey for each model.
For the locations where BigQuery ML hyperparameter tuning is available, see BigQuery ML locations.
CREATE MODEL syntax
To run hyperparameter tuning, add the num_trials training option to the
CREATE MODEL statement to specify the maximum number of submodels to train.
{CREATE MODEL | CREATE MODEL IF NOT EXISTS | CREATE OR REPLACE MODEL} model_name
OPTIONS(Existing Training Options,
  NUM_TRIALS = int64_value
  [, MAX_PARALLEL_TRIALS = int64_value ]
  [, HPARAM_TUNING_ALGORITHM = { 'VIZIER_DEFAULT' | 'RANDOM_SEARCH' | 'GRID_SEARCH' } ]
  [, hyperparameter = { HPARAM_RANGE(min, max) | HPARAM_CANDIDATES([candidates]) } ... ]
  [, HPARAM_TUNING_OBJECTIVES = { 'R2_SCORE' | 'ROC_AUC' | ... } ]
  [, DATA_SPLIT_METHOD = { 'AUTO_SPLIT' | 'RANDOM' | 'CUSTOM' | 'SEQ' | 'NO_SPLIT' } ]
  [, DATA_SPLIT_COL = string_value ]
  [, DATA_SPLIT_EVAL_FRACTION = float64_value ]
  [, DATA_SPLIT_TEST_FRACTION = float64_value ]
)
AS query_statement
Existing training options
Hyperparameter tuning supports most training options with the limitation that once a training option is explicitly set, it can’t be treated as a tunable hyperparameter. For example, the combination below is not valid:
l1_reg=0.1, l1_reg=hparam_range(0, 10)
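A valid combination, by contrast, fixes one option explicitly and marks a different one as tunable. The following sketch uses placeholder dataset, table, and column names (`mydataset.mytable`, `label_col` are hypothetical):

```sql
-- Hypothetical example: l2_reg is explicitly set (not tunable),
-- while l1_reg is declared as a tunable search range.
CREATE OR REPLACE MODEL `mydataset.tuned_linear_reg`
OPTIONS(
  MODEL_TYPE = 'LINEAR_REG',
  INPUT_LABEL_COLS = ['label_col'],
  L2_REG = 0.1,                      -- fixed training option
  L1_REG = HPARAM_RANGE(0, 10),      -- tunable hyperparameter
  NUM_TRIALS = 20,
  MAX_PARALLEL_TRIALS = 2
) AS
SELECT * FROM `mydataset.mytable`;
```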
NUM_TRIALS
Syntax
NUM_TRIALS = int64_value
Description
The maximum number of submodels to train. The tuning will stop when num_trials
submodels are trained, or when the hyperparameter search space is exhausted.
The maximum value is 100.
Arguments
int64_value is an INT64 value. Allowed values are 1 to 100.
MAX_PARALLEL_TRIALS
Syntax
MAX_PARALLEL_TRIALS = int64_value
Description
The maximum number of trials to run at the same time. The default value is 1 and the maximum value is 5.
Arguments
int64_value is an INT64 value. Allowed values are 1 to 5.
HPARAM_TUNING_ALGORITHM
Syntax
HPARAM_TUNING_ALGORITHM = { 'VIZIER_DEFAULT' | 'RANDOM_SEARCH' | 'GRID_SEARCH' }
Description
The algorithm used to tune the hyperparameters.
Arguments
HPARAM_TUNING_ALGORITHM accepts the following values:
'VIZIER_DEFAULT' (default and recommended): Uses the default algorithm in Vertex AI Vizier to tune hyperparameters. This algorithm is the most powerful tuning option and performs a mixture of advanced search algorithms, including Bayesian optimization with Gaussian processes. It also uses transfer learning to take advantage of previously tuned models.
'RANDOM_SEARCH': Uses random search to explore the search space.
'GRID_SEARCH': Uses grid search to explore the search space. This algorithm is only available when every hyperparameter's search space is discrete.
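As a sketch of the grid-search constraint (table and column names are hypothetical), every tunable hyperparameter must use a discrete HPARAM_CANDIDATES search space:

```sql
-- Hypothetical example: GRID_SEARCH works here because both tunable
-- hyperparameters have discrete candidate lists (a 4 x 4 grid).
CREATE OR REPLACE MODEL `mydataset.grid_logistic_reg`
OPTIONS(
  MODEL_TYPE = 'LOGISTIC_REG',
  INPUT_LABEL_COLS = ['label_col'],
  HPARAM_TUNING_ALGORITHM = 'GRID_SEARCH',
  L1_REG = HPARAM_CANDIDATES([0, 0.1, 1, 10]),
  L2_REG = HPARAM_CANDIDATES([0, 0.1, 1, 10]),
  NUM_TRIALS = 16
) AS
SELECT * FROM `mydataset.mytable`;
```

Replacing either candidate list with an HPARAM_RANGE would make the search space continuous and the GRID_SEARCH algorithm invalid for this statement.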
HYPERPARAMETER
Syntax
hyperparameter={HPARAM_RANGE(min, max) | HPARAM_CANDIDATES([candidates]) }...
Description
The configuration of a hyperparameter's search space. See Hyperparameters and objectives for the tunable hyperparameters supported by each model type.
Arguments
Accepts one of the following arguments:
HPARAM_RANGE(min, max): Specifies a continuous search space for a hyperparameter, for example learn_rate = HPARAM_RANGE(0.0001, 1.0).
HPARAM_CANDIDATES([candidates]): Specifies a discrete set of values for a hyperparameter, for example OPTIMIZER = HPARAM_CANDIDATES(['adagrad', 'sgd', 'ftrl']).
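The two forms can be mixed in one statement. The following sketch (placeholder names, not a definitive recipe) tunes a continuous range and a discrete set together:

```sql
-- Hypothetical example: a continuous range for learn_rate combined
-- with a discrete candidate set for the optimizer.
CREATE OR REPLACE MODEL `mydataset.tuned_dnn`
OPTIONS(
  MODEL_TYPE = 'DNN_CLASSIFIER',
  INPUT_LABEL_COLS = ['label_col'],
  LEARN_RATE = HPARAM_RANGE(0.0001, 1.0),                     -- continuous
  OPTIMIZER = HPARAM_CANDIDATES(['adagrad', 'sgd', 'ftrl']),  -- discrete
  NUM_TRIALS = 10
) AS
SELECT * FROM `mydataset.mytable`;
```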
HPARAM_TUNING_OBJECTIVES
Syntax
HPARAM_TUNING_OBJECTIVES = { 'R2_SCORE' | 'ROC_AUC' | ... }
Description
The objective metrics for the model. The candidates are a subset of the model evaluation metrics. Currently only one objective is supported.
Arguments
See Hyperparameters and objectives for the supported arguments and defaults for each model type.
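As a sketch (placeholder names; the array form shown for the objective follows common CREATE MODEL examples and is an assumption here), a statement can override the model type's default objective:

```sql
-- Hypothetical example: LOGISTIC_REG defaults to roc_auc; this
-- statement optimizes log_loss instead. Only one objective is allowed.
CREATE OR REPLACE MODEL `mydataset.logreg_logloss`
OPTIONS(
  MODEL_TYPE = 'LOGISTIC_REG',
  INPUT_LABEL_COLS = ['label_col'],
  L1_REG = HPARAM_RANGE(0, 10),
  NUM_TRIALS = 10,
  HPARAM_TUNING_OBJECTIVES = ['LOG_LOSS']
) AS
SELECT * FROM `mydataset.mytable`;
```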
DATA_SPLIT_TEST_FRACTION
Syntax
DATA_SPLIT_TEST_FRACTION = float64_value
Description
This option is used with 'RANDOM' and 'SEQ' splits. It specifies the fraction of the data used as test data for the final evaluation metrics reporting. The fraction is accurate to two decimal places. See the Data split section for more details.
Arguments
float64_value is a FLOAT64 value that specifies the fraction of the data used as test data for the final evaluation metrics reporting. The default value is 0.0.
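The split options compose with the tuning options. This sketch (placeholder names) reserves 20% of a randomly split dataset for evaluation and 10% for the final test metrics:

```sql
-- Hypothetical example: RANDOM split with explicit eval and test
-- fractions alongside hyperparameter tuning.
CREATE OR REPLACE MODEL `mydataset.tuned_with_split`
OPTIONS(
  MODEL_TYPE = 'LINEAR_REG',
  INPUT_LABEL_COLS = ['label_col'],
  L1_REG = HPARAM_RANGE(0, 10),
  NUM_TRIALS = 10,
  DATA_SPLIT_METHOD = 'RANDOM',
  DATA_SPLIT_EVAL_FRACTION = 0.2,   -- 20% for per-trial evaluation
  DATA_SPLIT_TEST_FRACTION = 0.1    -- 10% for final test metrics
) AS
SELECT * FROM `mydataset.mytable`;
```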
Hyperparameters and objectives
The following table lists the supported hyperparameters and their objectives for each model type:
| Model type | Hyperparameter objectives | Hyperparameter | Valid range | Default range | Scale type |
|---|---|---|---|---|---|
| LINEAR_REG | mean_absolute_error, mean_squared_error, mean_squared_log_error, median_absolute_error, r2_score (default), explained_variance | l1_reg | (0, ∞] | (0, 10] | LOG |
| | | l2_reg | (0, ∞] | (0, 10] | LOG |
| LOGISTIC_REG | precision, recall, accuracy, f1_score, log_loss, roc_auc (default) | l1_reg | (0, ∞] | (0, 10] | LOG |
| | | l2_reg | (0, ∞] | (0, 10] | LOG |
| KMEANS | davies_bouldin_index | num_clusters | [2, 100] | [2, 10] | LINEAR |
| MATRIX_FACTORIZATION (explicit) | mean_squared_error | num_factors | [2, 200] | [2, 20] | LINEAR |
| | | l2_reg | (0, ∞) | (0, 10] | LOG |
| MATRIX_FACTORIZATION (implicit) | mean_average_precision (default), mean_squared_error, normalized_discounted_cumulative_gain, average_rank | num_factors | [2, 200] | [2, 20] | LINEAR |
| | | l2_reg | (0, ∞) | (0, 10] | LOG |
| | | wals_alpha | [0, ∞) | [0, 100] | LINEAR |
| AUTOENCODER | mean_absolute_error, mean_squared_error, mean_squared_log_error | learn_rate | [0, 1] | [0, 1] | LOG |
| | | batch_size | (0, ∞) | [16, 1024] | LOG |
| | | l1_reg | (0, ∞) | (0, 10] | LOG |
| | | l2_reg | (0, ∞) | (0, 10] | LOG |
| | | l1_reg_activation | (0, ∞) | (0, 10] | LOG |
| | | dropout | [0, 1) | [0, 0.8] | LINEAR |
| | | hidden_units | Array of [1, ∞) | N/A | N/A |
| | | optimizer | { adam, adagrad, ftrl, rmsprop, sgd } | { adam, adagrad, ftrl, rmsprop, sgd } | N/A |
| | | activation_fn | { relu, relu6, crelu, elu, selu, sigmoid, tanh } | N/A | N/A |
| DNN_CLASSIFIER | precision, recall, accuracy, f1_score, log_loss, roc_auc (default) | batch_size | (0, ∞) | | |
| | | dropout | [0, 1) | | |
| | | hidden_units | Array of [1 | | |
| | | learn_rate | | | |
| | | optimizer | | | |
| | | l1_reg | | | |
| | | l2_reg | | | |
| | | activation_fn | | | |