BigQuery ML model inference overview


Machine learning inference is the process of running data points through a machine learning model to calculate an output, such as a single numerical score. This process is also referred to as “operationalizing a machine learning model” or “putting a machine learning model into production.”

Machine learning inference in BigQuery ML includes not only machine learning tasks such as label prediction, but also high-level application domains such as forecasting, recommendation, and anomaly detection.

For the model types that each SQL statement and function supports, and the SQL statements and functions available for each model type, see the End-to-end user journey for each model.


Prediction

Prediction in BigQuery ML applies to both supervised and unsupervised learning models. The term is used only for models trained on independent and identically distributed (IID) data. For time series data, which is not IID, the term forecasting is used instead; see the Forecasting section below.

BigQuery ML supports prediction functionality through the ML.PREDICT function for the following models:

| Model Category | Model Types | What ML.PREDICT does |
|---|---|---|
| Supervised Learning | Linear & Logistic Regression; Boosted Trees; Deep Neural Networks; AutoML Tables | Predict the label, either a numerical value for regression tasks or a categorical value for classification tasks. |
| Unsupervised Learning | Kmeans | Assign the cluster to the entity. |
| | PCA | Apply dimensionality reduction to the entity by transforming it into the space spanned by the eigenvectors. |
| | Autoencoder | Transform the entity into the embedded space. |
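
To make "assign the cluster to the entity" concrete, here is a minimal Python sketch of nearest-centroid assignment. It is not BigQuery ML code: the centroid values, the `assign_cluster` helper, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import math

# Hypothetical centroids of an already trained k-means model (illustrative values).
centroids = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]


def assign_cluster(point, centroids):
    """Return the index of the centroid nearest to `point` (Euclidean distance assumed)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


# Each entity is assigned to the cluster whose centroid is closest.
entities = [(1.0, 0.5), (4.0, 6.0), (9.0, 1.0)]
assignments = [assign_cluster(e, centroids) for e in entities]
print(assignments)  # [0, 1, 2]
```

The sketch uses zero-based list indexes as cluster labels purely for convenience.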

For Matrix Factorization model inference, see recommendation.


Forecasting

Forecasting is a technique that uses historical data as input to make informed estimates about the future. In BigQuery ML, forecasting is applied to time series data. For IID data, see prediction.

BigQuery ML supports forecasting functionality through the ML.FORECAST function with the ARIMA_PLUS model. ARIMA_PLUS is not a single model but a time series modeling pipeline that includes multiple models and algorithms. For more details, see the time series modeling pipeline.
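
As a rough illustration of how forecasting differs from IID prediction, the Python sketch below takes an ordered history and produces a fixed number of future values. It uses a seasonal-naive rule (repeat the last observed season), which is a deliberately simple stand-in and not the ARIMA_PLUS pipeline; the function name, `season_length`, and the sample data are hypothetical.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast `horizon` future values by repeating the last observed season.

    This is a simple stand-in for illustration only, not ARIMA_PLUS.
    """
    if season_length <= 0 or len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = history[-season_length:]
    return [last_season[step % season_length] for step in range(horizon)]


# Hypothetical time-ordered observations with a repeating pattern of length 3.
daily_sales = [10, 20, 30, 12, 22, 32]
forecast = seasonal_naive_forecast(daily_sales, season_length=3, horizon=4)
print(forecast)  # [12, 22, 32, 12]
```

Unlike the IID case, the order of the input matters here: shuffling `daily_sales` would change the forecast.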


Recommendation

Recommender systems are one of the most successful and widespread applications of machine learning technologies for businesses. A recommendation system helps users find compelling content in a large body of work. For example, Google Play Store provides millions of apps, while YouTube provides billions of videos, with more apps and videos added every day. How can users find compelling new content? They can use search, but a recommendation engine can display content that users might not have thought to search for on their own. See the Recommendation Systems Overview for more information.

Machine learning algorithms in recommender systems are typically classified into two categories: content-based and collaborative filtering methods.

Content-based filtering: Uses similarity between items to recommend items similar to what the user likes. If user A watches two cute cat videos, then the system can recommend cute animal videos to that user.
Collaborative filtering: Uses similarities between queries and items simultaneously to provide recommendations. If user A is similar to user B, and user B likes video 1, then the system can recommend video 1 to user A (even if user A hasn’t seen any videos similar to video 1).
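
The two approaches can be contrasted with a small Python sketch. Everything in it is an illustrative assumption: the tag and like data, the helper names, and the use of Jaccard similarity as the similarity measure. The collaborative part uses a simple nearest-user rule that mirrors the user A / user B example.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical item descriptions and user histories.
item_tags = {
    "cat_video_1": {"cute", "cat"},
    "cat_video_2": {"cute", "cat", "kitten"},
    "puppy_video": {"cute", "dog"},
    "car_review": {"cars", "review"},
}
likes = {
    "user_a": {"cat_video_1", "cat_video_2"},
    "user_b": {"cat_video_1", "cat_video_2", "car_review"},
    "user_c": {"puppy_video"},
}


def content_based(user):
    """Recommend unseen items whose tags resemble the tags of items the user liked."""
    liked = likes[user]
    scores = {
        item: max((jaccard(tags, item_tags[seen]) for seen in liked), default=0.0)
        for item, tags in item_tags.items()
        if item not in liked
    }
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return [item for item in ranked if scores[item] > 0]


def collaborative(user):
    """Recommend items liked by the most similar other user, ignoring item content."""
    liked = likes[user]
    others = [other for other in likes if other != user]
    if not others:
        return []
    nearest = max(others, key=lambda other: (jaccard(liked, likes[other]), other))
    if jaccard(liked, likes[nearest]) == 0:
        return []
    return sorted(likes[nearest] - liked)


print(content_based("user_a"))  # ['puppy_video']
print(collaborative("user_a"))  # ['car_review']
```

Content-based filtering suggests the puppy video because it shares a tag with the cat videos, while collaborative filtering suggests the car review because a similar user liked it, even though its content is unrelated.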

The Matrix Factorization model is widely used as a collaborative filtering method for recommendation systems. BigQuery ML provides the ML.RECOMMEND function for generating recommendations from a Matrix Factorization model. For more information on applying Matrix Factorization to recommendation, see Matrix Factorization.
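
The idea behind scoring with a factorized model can be sketched in a few lines of Python. This is a simplified illustration, not the ML.RECOMMEND implementation: the factor values and helper names are hypothetical, and the predicted rating is assumed to be a plain dot product of a user's and an item's latent factors, with no bias terms.

```python
# Hypothetical latent factors, as if already learned by a matrix factorization model.
user_factors = {"user_a": [1.0, 0.5], "user_b": [0.0, 2.0]}
item_factors = {
    "video_1": [2.0, 0.0],
    "video_2": [0.0, 2.0],
    "video_3": [1.0, 1.0],
}


def predicted_rating(user, item):
    """Score a user-item pair as the dot product of their latent factors."""
    return sum(u * v for u, v in zip(user_factors[user], item_factors[item]))


def recommend(user, top_n):
    """Return the `top_n` items with the highest predicted rating for `user`."""
    ranked = sorted(item_factors, key=lambda item: (-predicted_rating(user, item), item))
    return ranked[:top_n]


print(recommend("user_a", 2))  # ['video_1', 'video_3']
print(recommend("user_b", 1))  # ['video_2']
```

Because every user and item has a factor vector, a score can be computed for any user-item pair, including pairs with no prior interaction.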

In modern recommendation engines, deep neural network (DNN) models, including Wide-and-Deep models, are widely used. They can be viewed as an extension of Matrix Factorization-based collaborative filtering, and they can easily incorporate query features and item features to improve the relevance of recommendations. For more background, read Recommendation using Deep Neural Network Models, Deep Neural Networks for YouTube Recommendations, or Wide & Deep Learning for Recommender Systems. Note also that any supervised learning model can be used for recommendation tasks.

Anomaly Detection

Anomaly detection is a step in data mining that identifies data points, events, and observations that deviate from a dataset's normal behavior. Anomalous data can indicate critical incidents such as technical issues or opportunities like changes in consumer behavior.

One challenge with anomaly detection is identifying and defining the anomaly. If you have labeled data with known anomalies, you can choose among the supervised machine learning model types that BigQuery ML already supports. Without a known anomaly type or labeled data, you can still use unsupervised machine learning to help detect anomalies. Depending on whether your training data is time series, you can detect anomalies in training data or in new input data by using the ML.DETECT_ANOMALIES function with the following models:

| Data Type | Model Types | What ML.DETECT_ANOMALIES does |
|---|---|---|
| Time series | ARIMA_PLUS | Detect the anomalies in the time series. |
| Independent and identically distributed random variables (IID) | Kmeans | Detect anomalies based on the shortest distance among the normalized distances from the input data to each cluster centroid. For a definition of normalized distances, see ML.DETECT_ANOMALIES. |
| | Autoencoder | Detect anomalies based upon the reconstruction loss in terms of mean squared error. For more information, see ML.RECONSTRUCTION_LOSS. ML.RECONSTRUCTION_LOSS can retrieve all types of reconstruction loss. |
| | PCA | Detect anomalies based upon the reconstruction loss in terms of mean squared error. |
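
The two scoring ideas used for IID data can be sketched in Python as follows. This is a hedged illustration, not the ML.DETECT_ANOMALIES implementation: the sketch assumes that a normalized distance is the Euclidean distance to a centroid divided by that cluster's radius, and the centroids, radii, threshold, and function names are all hypothetical.

```python
import math


def shortest_normalized_distance(point, centroids, radii):
    """Smallest normalized distance from `point` to any centroid.

    Assumption: normalized distance = Euclidean distance / cluster radius.
    """
    return min(math.dist(point, c) / r for c, r in zip(centroids, radii))


def mean_squared_error(original, reconstructed):
    """Reconstruction loss between an input and its reconstruction, as MSE."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def is_anomaly(score, threshold):
    """Flag a data point whose anomaly score exceeds a chosen threshold."""
    return score > threshold


# Hypothetical k-means model: two centroids, each with a cluster radius.
centroids = [(0.0, 0.0), (10.0, 10.0)]
radii = [1.0, 2.0]

far_point = shortest_normalized_distance((0.0, 3.0), centroids, radii)
near_point = shortest_normalized_distance((10.0, 12.0), centroids, radii)
print(is_anomaly(far_point, threshold=2.0), is_anomaly(near_point, threshold=2.0))

# Hypothetical reconstruction of an input by a PCA or autoencoder model.
loss = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(is_anomaly(loss, threshold=1.0))
```

In both cases the model produces a single score per data point, and a point counts as anomalous when its score is large relative to a threshold.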