End-to-end user journeys for ML models

This document describes the user journeys for machine learning (ML) models that are trained in BigQuery ML, including the statements and functions that you can use to work with them. BigQuery ML models fall into three categories: supervised learning, unsupervised learning, and transform-only.

Model creation user journeys

The following table describes the statements and functions you can use to create and tune models:

| Model category | Model type | Model creation | Feature preprocessing | Hyperparameter tuning [1] | Model weights | Feature & training info | Tutorials |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Supervised learning | Linear & logistic regression | CREATE MODEL | Automatic preprocessing; Manual preprocessing | Hyperparameter tuning; ML.TRIAL_INFO | ML.WEIGHTS | ML.FEATURE_INFO; ML.TRAINING_INFO | Use linear regression to predict penguin weight; Perform classification with a logistic regression model |
| Supervised learning | Deep neural networks (DNN) | CREATE MODEL | Automatic preprocessing; Manual preprocessing | Hyperparameter tuning; ML.TRIAL_INFO | N/A [2] | ML.FEATURE_INFO; ML.TRAINING_INFO | N/A |
| Supervised learning | Wide & Deep networks | CREATE MODEL | Automatic preprocessing; Manual preprocessing | Hyperparameter tuning; ML.TRIAL_INFO | N/A [2] | ML.FEATURE_INFO; ML.TRAINING_INFO | N/A |
| Supervised learning | Boosted trees | CREATE MODEL | Automatic preprocessing; Manual preprocessing | Hyperparameter tuning; ML.TRIAL_INFO | N/A [2] | ML.FEATURE_INFO; ML.TRAINING_INFO | Perform classification with a boosted trees model |
| Supervised learning | Random forest | CREATE MODEL | Automatic preprocessing; Manual preprocessing | Hyperparameter tuning; ML.TRIAL_INFO | N/A [2] | ML.FEATURE_INFO; ML.TRAINING_INFO | N/A |
| Supervised learning | AutoML classification & regression | CREATE MODEL | AutoML automatically performs feature engineering | AutoML automatically performs hyperparameter tuning | N/A [2] | ML.FEATURE_INFO; ML.TRAINING_INFO | N/A |
| Unsupervised learning | K-means | CREATE MODEL | Automatic preprocessing; Manual preprocessing | Hyperparameter tuning; ML.TRIAL_INFO | ML.CENTROIDS | ML.FEATURE_INFO; ML.TRAINING_INFO | Find clusters in bike station data |
| Unsupervised learning | Matrix factorization | CREATE MODEL | N/A | Hyperparameter tuning; ML.TRIAL_INFO | ML.WEIGHTS | ML.FEATURE_INFO; ML.TRAINING_INFO | Generate movie recommendations using explicit feedback; Generate content recommendations using implicit feedback |
| Unsupervised learning | Principal component analysis (PCA) | CREATE MODEL | Automatic preprocessing; Manual preprocessing | N/A | ML.PRINCIPAL_COMPONENTS; ML.PRINCIPAL_COMPONENT_INFO | ML.FEATURE_INFO; ML.TRAINING_INFO | N/A |
| Unsupervised learning | Autoencoder | CREATE MODEL | Automatic preprocessing; Manual preprocessing | Hyperparameter tuning; ML.TRIAL_INFO | N/A [2] | ML.FEATURE_INFO; ML.TRAINING_INFO | N/A |
| Transform-only | Transform-only | CREATE MODEL | Manual preprocessing | N/A | N/A | ML.FEATURE_INFO | N/A |

[1] For a step-by-step example of hyperparameter tuning, see Improve model performance with hyperparameter tuning.

[2] BigQuery ML doesn't offer a function to retrieve the weights for this model. To see the weights of the model, you can export the model from BigQuery ML to Cloud Storage and then use the XGBoost library or the TensorFlow library to visualize the tree structure for tree models or the graph structure for neural networks. For more information, see EXPORT MODEL and Export a BigQuery ML model for online prediction.
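
As a rough illustration of how the statements in the model creation table fit together, the following Python sketch composes the SQL text for creating a model, inspecting it, and exporting it. It only builds strings and does not call BigQuery. The helper names (`create_model_sql`, `model_info_sql`, `export_model_sql`) and every dataset, table, column, and bucket name are hypothetical placeholders, not part of BigQuery ML.

```python
"""Sketch: compose BigQuery ML statements as strings (no BigQuery client needed)."""
from typing import Optional

# Inspection functions from the model creation table. Each takes a MODEL argument.
MODEL_INFO_FUNCTIONS = {
    "ML.TRIAL_INFO",  # only meaningful for models trained with hyperparameter tuning
    "ML.WEIGHTS",
    "ML.CENTROIDS",
    "ML.PRINCIPAL_COMPONENTS",
    "ML.PRINCIPAL_COMPONENT_INFO",
    "ML.FEATURE_INFO",
    "ML.TRAINING_INFO",
}


def create_model_sql(
    model_name: str,
    model_type: str,
    training_query: str,
    label_col: Optional[str] = None,
) -> str:
    """Return a CREATE MODEL statement; label_col is omitted for unsupervised models."""
    options = [f"model_type = '{model_type}'"]
    if label_col is not None:
        options.append(f"input_label_cols = ['{label_col}']")
    return (
        f"CREATE OR REPLACE MODEL `{model_name}`\n"
        f"OPTIONS ({', '.join(options)}) AS\n"
        f"{training_query}"
    )


def model_info_sql(function_name: str, model_name: str) -> str:
    """Return a query that reads one of the inspection functions for a model."""
    if function_name not in MODEL_INFO_FUNCTIONS:
        raise ValueError(f"{function_name} is not a model inspection function")
    return f"SELECT * FROM {function_name}(MODEL `{model_name}`)"


def export_model_sql(model_name: str, gcs_uri: str) -> str:
    """Return an EXPORT MODEL statement that writes the model to Cloud Storage."""
    return f"EXPORT MODEL `{model_name}` OPTIONS (URI = '{gcs_uri}')"


if __name__ == "__main__":
    # Placeholder names: replace with your own dataset, table, and bucket.
    print(create_model_sql(
        "mydataset.weight_model",
        "LINEAR_REG",
        "SELECT * FROM `mydataset.training_data`",
        label_col="body_mass_g",
    ))
    print(model_info_sql("ML.WEIGHTS", "mydataset.weight_model"))
    print(export_model_sql("mydataset.weight_model", "gs://my-bucket/weight_model"))
```

Which inspection function applies depends on the model type, as the table shows. For example, `ML.WEIGHTS` applies to linear and logistic regression and to matrix factorization, while tree and neural network models go through the export route instead.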

Model use user journeys

The following table describes the statements and functions you can use to evaluate, explain, and get predictions from models:

| Model category | Model type | Evaluation | Inference | AI explanation | Model monitoring |
| --- | --- | --- | --- | --- | --- |
| Supervised learning | Linear & logistic regression | ML.EVALUATE; ML.CONFUSION_MATRIX [1]; ML.ROC_CURVE [2] | ML.PREDICT; ML.TRANSFORM | ML.EXPLAIN_PREDICT [3]; ML.GLOBAL_EXPLAIN; ML.ADVANCED_WEIGHTS [5] | ML.DESCRIBE_DATA; ML.VALIDATE_DATA_DRIFT; ML.VALIDATE_DATA_SKEW; ML.TFDV_DESCRIBE; ML.TFDV_VALIDATE |
| Supervised learning | Deep neural networks (DNN) | ML.EVALUATE; ML.CONFUSION_MATRIX [1]; ML.ROC_CURVE [2] | ML.PREDICT; ML.TRANSFORM | ML.EXPLAIN_PREDICT [3]; ML.GLOBAL_EXPLAIN; ML.ADVANCED_WEIGHTS [5] | ML.DESCRIBE_DATA; ML.VALIDATE_DATA_DRIFT; ML.VALIDATE_DATA_SKEW; ML.TFDV_DESCRIBE; ML.TFDV_VALIDATE |
| Supervised learning | Wide & Deep networks | ML.EVALUATE; ML.CONFUSION_MATRIX [1]; ML.ROC_CURVE [2] | ML.PREDICT; ML.TRANSFORM | ML.EXPLAIN_PREDICT [3]; ML.GLOBAL_EXPLAIN; ML.ADVANCED_WEIGHTS [5] | ML.DESCRIBE_DATA; ML.VALIDATE_DATA_DRIFT; ML.VALIDATE_DATA_SKEW; ML.TFDV_DESCRIBE; ML.TFDV_VALIDATE |
| Supervised learning | Boosted trees | ML.EVALUATE; ML.CONFUSION_MATRIX [1]; ML.ROC_CURVE [2] | ML.PREDICT; ML.TRANSFORM | ML.EXPLAIN_PREDICT [3]; ML.GLOBAL_EXPLAIN; ML.FEATURE_IMPORTANCE [4] | ML.DESCRIBE_DATA; ML.VALIDATE_DATA_DRIFT; ML.VALIDATE_DATA_SKEW; ML.TFDV_DESCRIBE; ML.TFDV_VALIDATE |
| Supervised learning | Random forest | ML.EVALUATE; ML.CONFUSION_MATRIX [1]; ML.ROC_CURVE [2] | ML.PREDICT; ML.TRANSFORM | ML.EXPLAIN_PREDICT [3]; ML.GLOBAL_EXPLAIN; ML.FEATURE_IMPORTANCE [4] | ML.DESCRIBE_DATA; ML.VALIDATE_DATA_DRIFT; ML.VALIDATE_DATA_SKEW; ML.TFDV_DESCRIBE; ML.TFDV_VALIDATE |
| Supervised learning | AutoML classification & regression | ML.EVALUATE; ML.CONFUSION_MATRIX [1]; ML.ROC_CURVE [2] | ML.PREDICT | ML.GLOBAL_EXPLAIN | ML.DESCRIBE_DATA; ML.VALIDATE_DATA_DRIFT; ML.VALIDATE_DATA_SKEW; ML.TFDV_DESCRIBE; ML.TFDV_VALIDATE |
| Unsupervised learning | K-means | ML.EVALUATE | ML.PREDICT; ML.DETECT_ANOMALIES; ML.TRANSFORM | N/A | ML.DESCRIBE_DATA; ML.VALIDATE_DATA_DRIFT; ML.VALIDATE_DATA_SKEW; ML.TFDV_DESCRIBE; ML.TFDV_VALIDATE |
| Unsupervised learning | Matrix factorization | ML.EVALUATE | ML.RECOMMEND; ML.GENERATE_EMBEDDING | N/A | N/A |
| Unsupervised learning | Principal component analysis (PCA) | ML.EVALUATE | ML.PREDICT; ML.GENERATE_EMBEDDING; ML.DETECT_ANOMALIES; ML.TRANSFORM | N/A | ML.DESCRIBE_DATA; ML.VALIDATE_DATA_DRIFT; ML.VALIDATE_DATA_SKEW; ML.TFDV_DESCRIBE; ML.TFDV_VALIDATE |
| Unsupervised learning | Autoencoder | ML.EVALUATE | ML.PREDICT; ML.GENERATE_EMBEDDING; ML.DETECT_ANOMALIES; ML.RECONSTRUCTION_LOSS; ML.TRANSFORM | N/A | ML.DESCRIBE_DATA; ML.VALIDATE_DATA_DRIFT; ML.VALIDATE_DATA_SKEW; ML.TFDV_DESCRIBE; ML.TFDV_VALIDATE |
| Transform-only | Transform-only | N/A | ML.TRANSFORM | N/A | N/A |

[1] ML.CONFUSION_MATRIX is only applicable to classification models.

[2] ML.ROC_CURVE is only applicable to binary classification models.

[3] The ML.EXPLAIN_PREDICT function encompasses the ML.PREDICT function because its output is a superset of the results of ML.PREDICT.

[4] To understand the difference between ML.GLOBAL_EXPLAIN and ML.FEATURE_IMPORTANCE, see the Explainable AI overview.

[5] The ML.ADVANCED_WEIGHTS function encompasses the ML.WEIGHTS function because its output is a superset of the results of ML.WEIGHTS.
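
The evaluation and inference functions in the model use table are called in the FROM clause of a query. The following Python sketch shows one way to compose such queries as strings, including the switch from ML.PREDICT to ML.EXPLAIN_PREDICT described in the third footnote. The helper names (`evaluate_sql`, `predict_sql`) and all dataset and table names are hypothetical, and the sketch only builds SQL text without running it.

```python
"""Sketch: compose BigQuery ML evaluation and prediction queries as strings."""
from typing import Optional


def evaluate_sql(model_name: str, eval_query: Optional[str] = None) -> str:
    """Return an ML.EVALUATE query.

    With no eval_query, the statement passes only the model and leaves the
    choice of evaluation data to BigQuery ML.
    """
    if eval_query is None:
        return f"SELECT * FROM ML.EVALUATE(MODEL `{model_name}`)"
    return f"SELECT * FROM ML.EVALUATE(MODEL `{model_name}`, ({eval_query}))"


def predict_sql(
    model_name: str,
    input_query: str,
    explain: bool = False,
    top_k_features: int = 3,
) -> str:
    """Return an ML.PREDICT query, or ML.EXPLAIN_PREDICT when explain is True.

    ML.EXPLAIN_PREDICT returns a superset of the ML.PREDICT output, so the
    caller can switch between them without changing the input query.
    """
    if explain:
        return (
            f"SELECT * FROM ML.EXPLAIN_PREDICT(MODEL `{model_name}`, "
            f"({input_query}), STRUCT({top_k_features} AS top_k_features))"
        )
    return f"SELECT * FROM ML.PREDICT(MODEL `{model_name}`, ({input_query}))"


if __name__ == "__main__":
    # Placeholder names: replace with your own model and input table.
    new_rows = "SELECT * FROM `mydataset.new_data`"
    print(evaluate_sql("mydataset.weight_model"))
    print(predict_sql("mydataset.weight_model", new_rows))
    print(predict_sql("mydataset.weight_model", new_rows, explain=True))
```

Check the model use table before choosing a function: for example, AutoML models list ML.PREDICT but not ML.EXPLAIN_PREDICT, and matrix factorization models use ML.RECOMMEND instead.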