Model inference overview
This document describes the types of inference that BigQuery ML supports, which include batch prediction and online prediction.
Machine learning inference is the process of running data points into a machine learning model to calculate an output such as a single numerical score. This process is also referred to as "operationalizing a machine learning model" or "putting a machine learning model into production."
Batch prediction
The following sections describe the available ways of performing prediction in BigQuery ML.
Inference using BigQuery ML trained models
Prediction in BigQuery ML is used not only for supervised learning models, but also for unsupervised learning models.
BigQuery ML supports prediction through the ML.PREDICT function, with the following models:
Model Category | Model Types | What ML.PREDICT does |
---|---|---|
Supervised Learning | Linear & logistic regression, Boosted trees, Random forest, Deep Neural Networks, Wide-and-Deep, AutoML Tables | Predict the label, either a numerical value for regression tasks or a categorical value for classification tasks. |
Unsupervised Learning | K-means | Assign the cluster to the entity. |
Unsupervised Learning | PCA | Apply dimensionality reduction to the entity by transforming it into the space spanned by the eigenvectors. |
Unsupervised Learning | Autoencoder | Transform the entity into the embedded space. |
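As a rough illustration of the table above, the following Python sketch builds the text of an ML.PREDICT query over a table. The helper name `build_predict_query` and the model and table names are hypothetical; the snippet only produces the SQL string, and you would still need to submit it to BigQuery with whatever client you normally use.

```python
def build_predict_query(model: str, input_table: str) -> str:
    """Return a GoogleSQL query that runs ML.PREDICT over every row of a table."""
    return (
        "SELECT *\n"
        f"FROM ML.PREDICT(MODEL `{model}`, TABLE `{input_table}`)"
    )


# Hypothetical model and table names, used only for illustration.
query = build_predict_query("mydataset.mymodel", "mydataset.new_data")
print(query)
```

The same query shape applies to both supervised and unsupervised models; what differs is the meaning of the output columns, as described in the table.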
Inference using imported models
With this approach, you create and train a model outside of BigQuery, import it by using the CREATE MODEL statement, and then run inference on it by using the ML.PREDICT function. All inference processing occurs in BigQuery, using data from BigQuery. Imported models can perform supervised or unsupervised learning.
BigQuery ML supports the following types of imported models:
- Open Neural Network Exchange (ONNX) for models trained in PyTorch, scikit-learn, and other popular ML frameworks.
- TensorFlow
- TensorFlow Lite
- XGBoost
Use this approach to run custom models developed with a range of ML frameworks while taking advantage of BigQuery ML's inference speed and co-location with data.
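The import step can be sketched as follows. This Python snippet maps each imported format listed above to the MODEL_TYPE value used in the CREATE MODEL statement and assembles the statement text. The helper name, the model name, and the Cloud Storage path are hypothetical, and the exact form that MODEL_PATH must take differs by format, so check the CREATE MODEL reference for the format you use.

```python
# MODEL_TYPE values for the imported model formats listed above.
IMPORTED_MODEL_TYPES = {
    "onnx": "ONNX",
    "tensorflow": "TENSORFLOW",
    "tensorflow_lite": "TENSORFLOW_LITE",
    "xgboost": "XGBOOST",
}


def build_import_statement(model: str, model_format: str, model_path: str) -> str:
    """Return a CREATE MODEL statement that imports a model from Cloud Storage."""
    model_type = IMPORTED_MODEL_TYPES[model_format]
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (MODEL_TYPE = '{model_type}', MODEL_PATH = '{model_path}')"
    )


# Hypothetical model name and Cloud Storage path, used only for illustration.
statement = build_import_statement(
    "mydataset.imported_onnx_model", "onnx", "gs://my-bucket/models/model.onnx"
)
print(statement)
```

After the import succeeds, you run inference with ML.PREDICT in the same way as for a model trained in BigQuery ML.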
To learn more, try one of the following tutorials:
- Make predictions with imported TensorFlow models
- Make predictions with scikit-learn models in ONNX format
- Make predictions with PyTorch models in ONNX format
Inference using remote models
With this approach, you can create a reference to a model hosted in Vertex AI Prediction by using the CREATE MODEL statement, and then run inference on it by using the ML.PREDICT function. All inference processing occurs in Vertex AI, using data from BigQuery. Remote models can perform supervised or unsupervised learning.
Use this approach to run inference against large models that require the GPU hardware support provided by Vertex AI. If most of your models are hosted by Vertex AI, this also lets you run inference against these models by using SQL, without having to manually build data pipelines to take data to Vertex AI and bring prediction results back to BigQuery.
For step-by-step instructions, see Make predictions with remote models on Vertex AI.
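For orientation, the following Python sketch assembles a CREATE MODEL statement for a remote model over a Vertex AI endpoint. It assumes you already have a BigQuery connection and a deployed endpoint; the helper name, connection ID, endpoint URL, and the input and output columns are all hypothetical placeholders, not values from this document.

```python
def build_remote_model_statement(
    model: str,
    connection: str,
    endpoint: str,
    input_schema: str,
    output_schema: str,
) -> str:
    """Return a CREATE MODEL statement that references a Vertex AI endpoint."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"INPUT ({input_schema})\n"
        f"OUTPUT ({output_schema})\n"
        f"REMOTE WITH CONNECTION `{connection}`\n"
        f"OPTIONS (ENDPOINT = '{endpoint}')"
    )


# All identifiers below are hypothetical and exist only for illustration.
statement = build_remote_model_statement(
    model="mydataset.remote_model",
    connection="myproject.us.my_connection",
    endpoint=(
        "https://us-central1-aiplatform.googleapis.com/v1/"
        "projects/myproject/locations/us-central1/endpoints/1234"
    ),
    input_schema="text STRING",
    output_schema="scores ARRAY<FLOAT64>",
)
print(statement)
```

The INPUT and OUTPUT clauses describe the schema that the endpoint expects and returns, so they need to match the deployed model rather than the placeholder columns used here.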
Batch inference with BigQuery models in Vertex AI
BigQuery ML has built-in support for batch prediction, without the need to use Vertex AI. It is also possible to register a BigQuery ML model to Model Registry in order to perform batch prediction in Vertex AI using a BigQuery table as input. However, this can only be done by using the Vertex AI API and setting InstanceConfig.instanceType to object.
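To make the InstanceConfig.instanceType setting concrete, the following Python sketch builds a request body for a Vertex AI batch prediction job that reads from a BigQuery table. Only the instanceType value of object comes from this document; the other field names reflect one reading of the Vertex AI BatchPredictionJob resource and should be verified against the API reference, and every resource name and URI shown is hypothetical.

```python
import json


def build_batch_prediction_body(
    display_name: str,
    model_resource: str,
    input_table_uri: str,
    output_dataset_uri: str,
) -> dict:
    """Return a request body for a batch prediction job with BigQuery input."""
    return {
        "displayName": display_name,
        "model": model_resource,
        "inputConfig": {
            "instancesFormat": "bigquery",
            "bigquerySource": {"inputUri": input_table_uri},
        },
        # Required for registered BigQuery ML models, per the text above.
        "instanceConfig": {"instanceType": "object"},
        "outputConfig": {
            "predictionsFormat": "bigquery",
            "bigqueryDestination": {"outputUri": output_dataset_uri},
        },
    }


# Hypothetical resource names and URIs, used only for illustration.
body = build_batch_prediction_body(
    display_name="bqml-batch-prediction",
    model_resource="projects/myproject/locations/us-central1/models/1234",
    input_table_uri="bq://myproject.mydataset.new_data",
    output_dataset_uri="bq://myproject.mydataset",
)
print(json.dumps(body, indent=2))
```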
Online prediction
The built-in inference capability of BigQuery ML is optimized for large-scale use cases, such as batch prediction. While BigQuery ML delivers low-latency inference results when handling small input data, you can achieve faster online prediction through integration with Vertex AI.
You can manage BigQuery ML models within the Vertex AI environment, which eliminates the need to export models from BigQuery ML before deploying them as Vertex AI endpoints. By managing models within Vertex AI, you get access to all of the Vertex AI MLOps capabilities, and also to features such as Vertex AI Feature Store.
Additionally, you have the flexibility to export BigQuery ML models to Cloud Storage for availability on other model hosting platforms.
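The export path can be sketched with the EXPORT MODEL statement. The following Python snippet only assembles the statement text; the helper name, model name, and Cloud Storage URI are hypothetical.

```python
def build_export_statement(model: str, destination_uri: str) -> str:
    """Return an EXPORT MODEL statement that writes a model to Cloud Storage."""
    return (
        f"EXPORT MODEL `{model}`\n"
        f"OPTIONS (URI = '{destination_uri}')"
    )


# Hypothetical model name and destination, used only for illustration.
statement = build_export_statement(
    "mydataset.mymodel", "gs://my-bucket/exported_model"
)
print(statement)
```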
What's next
- For more information about using Vertex AI models to generate text and embeddings, see Generative AI overview.
- For more information about using Cloud AI APIs to perform AI tasks, see AI application overview.
- For information about supported model types and SQL functions for each type of inference, see the End-to-end user journey for each model.