Model inference overview
This document describes the types of inference that BigQuery ML supports.
Machine learning inference is the process of running data points into a machine learning model to calculate an output such as a single numerical score. This process is also referred to as "operationalizing a machine learning model" or "putting a machine learning model into production."
Machine learning inference in BigQuery ML includes machine learning tasks such as the following:
- Prediction
- Forecasting
- Recommendation
- Anomaly detection
- Computer vision
- Machine translation
- Natural language processing
For information about supported model types and SQL functions for each type of inference, see the End-to-end user journey for each model.
Prediction
The following sections describe the available ways of performing prediction in BigQuery ML.
Inference using BigQuery ML trained models
Prediction in BigQuery ML is used not only for supervised learning models, but also for unsupervised learning models. It applies only to models trained with independent and identically distributed (IID) data. For time series data, which is non-IID, the term forecasting is used instead; see the Forecasting section below.
BigQuery ML supports prediction through the ML.PREDICT function, with the following models:
Model Category | Model Types | What ML.PREDICT does |
---|---|---|
Supervised Learning | Linear & logistic regression, boosted trees, random forest, deep neural networks, Wide-and-Deep, AutoML Tables | Predict the label, either a numerical value for regression tasks or a categorical value for classification tasks. |
Unsupervised Learning | K-means | Assign the cluster to the entity. |
Unsupervised Learning | PCA | Apply dimensionality reduction to the entity by transforming it into the space spanned by the eigenvectors. |
Unsupervised Learning | Autoencoder | Transform the entity into the embedded space. |
For Matrix Factorization model inference, see Recommendation.
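For example, the following query is a minimal sketch of batch prediction with a previously trained model; the `mydataset.my_classifier` model and `mydataset.new_customers` table are placeholder names, not values from this document.

```sql
-- Sketch: run batch prediction with a model trained in BigQuery ML.
-- The dataset, model, and table names are placeholders.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.my_classifier`,
  (SELECT * FROM `mydataset.new_customers`));
```

The output schema depends on the model type: supervised models return predicted label columns (plus class probabilities for classification), while a clustering model such as k-means returns the assigned cluster.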
Inference using imported models
With this approach, you create and train a model outside of BigQuery, import it by using the CREATE MODEL statement, and then run inference on it by using the ML.PREDICT function. All inference processing occurs in BigQuery, using data from BigQuery. Imported models can perform supervised or unsupervised learning.
BigQuery ML supports the following types of imported models:
- Open Neural Network Exchange (ONNX) for models trained in PyTorch, scikit-learn, and other popular ML frameworks.
- TensorFlow
- TensorFlow Lite
- XGBoost
Use this approach to work with custom models developed in a range of ML frameworks while taking advantage of BigQuery ML's inference speed and co-location with data.
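As a hedged sketch, the following statements show the general shape of this workflow for a TensorFlow SavedModel; the Cloud Storage path and the dataset, model, and table names are placeholder assumptions. For an ONNX or XGBoost model, you would change the MODEL_TYPE value accordingly.

```sql
-- Sketch: import a TensorFlow SavedModel stored in Cloud Storage.
-- The bucket path and dataset/model names are placeholders.
CREATE OR REPLACE MODEL `mydataset.imported_tf_model`
  OPTIONS (
    MODEL_TYPE = 'TENSORFLOW',
    MODEL_PATH = 'gs://my_bucket/saved_model/*');

-- Inference runs inside BigQuery with the same ML.PREDICT function
-- used for natively trained models.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.imported_tf_model`,
  (SELECT * FROM `mydataset.input_table`));
```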
To learn more, try one of the following tutorials:
- Make predictions with imported TensorFlow models
- Make predictions with scikit-learn models in ONNX format
- Make predictions with PyTorch models in ONNX format
Inference using remote models
With this approach, you can create a reference to a model hosted in Vertex AI Prediction by using the CREATE MODEL statement, and then run inference on it by using the ML.PREDICT function. All inference processing occurs in Vertex AI, using data from BigQuery. Remote models can perform supervised or unsupervised learning.
Use this approach to run inference against large models that require the GPU hardware support provided by Vertex AI. If most of your models are hosted by Vertex AI, this also lets you run inference against these models by using SQL, without having to manually build data pipelines to take data to Vertex AI and bring prediction results back to BigQuery.
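The following is a minimal sketch of that workflow; the connection ID, endpoint URL, and input and output columns are placeholder assumptions, not values from this document.

```sql
-- Sketch: reference a model deployed to a Vertex AI endpoint.
-- The connection, endpoint URL, and column names are placeholders.
CREATE OR REPLACE MODEL `mydataset.remote_vertex_model`
  INPUT (feature STRING)
  OUTPUT (score FLOAT64)
  REMOTE WITH CONNECTION `myproject.us.my_connection`
  OPTIONS (
    ENDPOINT = 'https://us-central1-aiplatform.googleapis.com/v1/projects/myproject/locations/us-central1/endpoints/1234567890');

-- The query looks like local inference, but the processing
-- happens on the Vertex AI endpoint.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.remote_vertex_model`,
  (SELECT feature FROM `mydataset.input_table`));
```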
For step-by-step instructions, see Make predictions with remote models on Vertex AI.
Forecasting
Forecasting is a technique that uses historical data as input to make informed estimates about future values. In BigQuery ML, forecasting is applied to time series data. For IID data, see Prediction.
BigQuery ML supports forecasting through the ML.FORECAST function, with the ARIMA_PLUS and ARIMA_PLUS_XREG models. The time series model is not a single model, but a time series modeling pipeline that includes multiple models and algorithms. See the time series modeling pipeline for more details.
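As an illustrative sketch, the following statements train an ARIMA_PLUS model on a single time series and forecast the next 30 points; the dataset, table, and column names are placeholders.

```sql
-- Sketch: train a univariate ARIMA_PLUS model.
-- The table and column names are placeholders.
CREATE OR REPLACE MODEL `mydataset.sales_forecast`
  OPTIONS (
    MODEL_TYPE = 'ARIMA_PLUS',
    TIME_SERIES_TIMESTAMP_COL = 'sale_date',
    TIME_SERIES_DATA_COL = 'total_sales') AS
SELECT sale_date, total_sales
FROM `mydataset.sales_history`;

-- Forecast the next 30 points with a 90% prediction interval.
SELECT *
FROM ML.FORECAST(
  MODEL `mydataset.sales_forecast`,
  STRUCT(30 AS horizon, 0.9 AS confidence_level));
```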
Recommendation
Recommender systems are one of the most successful and widespread applications of machine learning for businesses. A recommendation system helps users find compelling content in a large corpus. For example, the Google Play Store provides millions of apps and YouTube provides billions of videos, with more apps and videos added every day. Users can search to find new content, but a recommendation engine can display content that users might not have thought to search for on their own. See the Recommendation Systems Overview for more information.
Machine learning algorithms in recommender systems are typically classified into two categories: content-based and collaborative filtering methods.
Type | Definition | Example |
---|---|---|
Content-based filtering | Uses similarity between items to recommend items similar to what the user likes. | If user A watches two cute cat videos, then the system can recommend cute animal videos to that user. |
Collaborative filtering | Uses similarities between queries and items simultaneously to provide recommendations. | If user A is similar to user B, and user B likes video 1, then the system can recommend video 1 to user A (even if user A hasn't seen any videos similar to video 1). |
The Matrix Factorization model is widely used as a collaborative filtering method for recommendation systems. BigQuery ML supports the ML.RECOMMEND function to facilitate using Matrix Factorization for recommendation purposes. For more information about applying Matrix Factorization to recommendation, see Matrix Factorization.
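As a minimal sketch, the following query generates recommendations from a previously trained matrix factorization model; the model and table names and the user_id column are placeholder assumptions.

```sql
-- Sketch: score user-item pairs with a trained matrix factorization model.
-- Passing a table of users restricts the output to those users;
-- all names here are placeholders.
SELECT *
FROM ML.RECOMMEND(
  MODEL `mydataset.mf_model`,
  (SELECT user_id FROM `mydataset.active_users`));
```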
In modern recommendation engines, deep neural network (DNN) models, including Wide-and-Deep models, are widely used. These models can be viewed as an extension of Matrix Factorization based collaborative filtering: they can incorporate query features and item features to improve the relevance of recommendations. For more background, read Recommendation using Deep Neural Network Models, Deep Neural Networks for YouTube Recommendations, or Wide & Deep Learning for Recommender Systems. It is also worth noting that any supervised learning model can be used for recommendation tasks.
Anomaly detection
Anomaly detection is a step in data mining that identifies data points, events, and observations that deviate from a dataset's normal behavior. Anomalous data can indicate critical incidents such as technical issues or opportunities like changes in consumer behavior.
One challenge with anomaly detection is identifying and defining the anomaly. If you have labeled data with known anomalies, you can choose from the supervised machine learning model types that BigQuery ML already supports. If you don't have a known anomaly type or labeled data, you can still use unsupervised machine learning to help detect anomalies. Depending on whether your training data is time series data, you can detect anomalies in the training data or in new input data by using the ML.DETECT_ANOMALIES function with the following models:
Data Type | Model Types | What ML.DETECT_ANOMALIES does |
---|---|---|
Time series | ARIMA_PLUS | Detect the anomalies in the time series. |
Independent and identically distributed random variables (IID) | K-means | Detect anomalies based on the shortest distance among the normalized distances from the input data to each cluster centroid. For a definition of normalized distances, see ML.DETECT_ANOMALIES. |
Independent and identically distributed random variables (IID) | Autoencoder | Detect anomalies based on the reconstruction loss in terms of mean squared error. For more information, see ML.RECONSTRUCTION_LOSS, which can retrieve all types of reconstruction loss. |
Independent and identically distributed random variables (IID) | PCA | Detect anomalies based on the reconstruction loss in terms of mean squared error. |
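For illustration, the following queries sketch both cases from the preceding table; the model and table names, the probability threshold, and the contamination value are placeholder assumptions.

```sql
-- Sketch: detect anomalies in the training data of a time series model.
-- The model name and threshold are placeholders.
SELECT *
FROM ML.DETECT_ANOMALIES(
  MODEL `mydataset.sales_forecast`,
  STRUCT(0.95 AS anomaly_prob_threshold));

-- Sketch: detect anomalies in new IID data with a k-means model,
-- flagging roughly the most distant 2% of points as anomalies.
SELECT *
FROM ML.DETECT_ANOMALIES(
  MODEL `mydataset.kmeans_model`,
  STRUCT(0.02 AS contamination),
  (SELECT * FROM `mydataset.new_events`));
```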
Computer vision
You can create a reference to the Cloud Vision API by creating a remote model with a REMOTE_SERVICE_TYPE of CLOUD_AI_VISION_V1, and then use the ML.ANNOTATE_IMAGE function to annotate images by using that service. ML.ANNOTATE_IMAGE works with object tables. All inference processing occurs in Vertex AI, using data from BigQuery. The results are stored in BigQuery.
Use this approach to run inference against Google's vision models without having to learn Python or develop familiarity with the Vision API.
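The following is a minimal sketch of that workflow, assuming a placeholder Cloud resource connection, dataset, and object table; only the REMOTE_SERVICE_TYPE value comes from this document.

```sql
-- Sketch: create a remote model that references the Cloud Vision API.
-- The connection and dataset names are placeholders.
CREATE OR REPLACE MODEL `mydataset.vision_model`
  REMOTE WITH CONNECTION `myproject.us.my_connection`
  OPTIONS (REMOTE_SERVICE_TYPE = 'CLOUD_AI_VISION_V1');

-- Annotate every image in an object table with label detection.
SELECT *
FROM ML.ANNOTATE_IMAGE(
  MODEL `mydataset.vision_model`,
  TABLE `mydataset.images_object_table`,
  STRUCT(['LABEL_DETECTION'] AS vision_features));
```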
To learn more, try Annotate object table images with the ML.ANNOTATE_IMAGE function.
Machine translation
You can create a reference to the Cloud Translation API by creating a remote model with a REMOTE_SERVICE_TYPE of CLOUD_AI_TRANSLATE_V3, and then use the ML.TRANSLATE function to interact with that service. ML.TRANSLATE works with standard tables. All inference processing occurs in Vertex AI, using data from BigQuery. The results are stored in BigQuery.
Use this approach to run inference against Google's text translation models without having to learn Python or develop familiarity with the Cloud Translation API.
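As a hedged sketch, the following statements create the remote model and translate text to Spanish; the connection, dataset, and table names are placeholders, and the input text is assumed to be in a column named text_content.

```sql
-- Sketch: create a remote model that references the Cloud Translation API.
-- The connection and dataset names are placeholders.
CREATE OR REPLACE MODEL `mydataset.translate_model`
  REMOTE WITH CONNECTION `myproject.us.my_connection`
  OPTIONS (REMOTE_SERVICE_TYPE = 'CLOUD_AI_TRANSLATE_V3');

-- Translate text to Spanish; the text to translate is assumed to be
-- in a column named text_content in a standard table.
SELECT *
FROM ML.TRANSLATE(
  MODEL `mydataset.translate_model`,
  TABLE `mydataset.reviews`,
  STRUCT('TRANSLATE_TEXT' AS translate_mode, 'es' AS target_language_code));
```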
To learn more, try Translate text with the ML.TRANSLATE function.
Natural language processing
You can create a reference to the Cloud Natural Language API by creating a remote model with a REMOTE_SERVICE_TYPE of CLOUD_AI_NATURAL_LANGUAGE_V1, and then use the ML.UNDERSTAND_TEXT function to interact with that service. ML.UNDERSTAND_TEXT works with standard tables. All inference processing occurs in Vertex AI, using data from BigQuery. The results are stored in BigQuery.
Use this approach to run inference against Google's natural language models without having to learn Python or develop familiarity with the Natural Language API.
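The following sketch creates the remote model and runs sentiment analysis; the connection, dataset, and table names are placeholders, and the input text is assumed to be in a column named text_content.

```sql
-- Sketch: create a remote model that references the Cloud Natural Language API.
-- The connection and dataset names are placeholders.
CREATE OR REPLACE MODEL `mydataset.nl_model`
  REMOTE WITH CONNECTION `myproject.us.my_connection`
  OPTIONS (REMOTE_SERVICE_TYPE = 'CLOUD_AI_NATURAL_LANGUAGE_V1');

-- Run sentiment analysis; the text is assumed to be in a column
-- named text_content in a standard table.
SELECT *
FROM ML.UNDERSTAND_TEXT(
  MODEL `mydataset.nl_model`,
  TABLE `mydataset.reviews`,
  STRUCT('ANALYZE_SENTIMENT' AS nlu_option));
```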
To learn more, try Understand text with the ML.UNDERSTAND_TEXT function.