Scaling machine learning with BigQuery ML inference engine
As enterprises race to extract value from structured, semi-structured, and unstructured data, they face a continuum of challenges related to data gravity, including data acquisition, data management, and data governance. These companies are simultaneously grappling with model gravity as they build and scale machine learning workflows for their predictive analytics needs.
BigQuery's core data warehousing capabilities address the challenges of data gravity. Through its integrated machine learning capabilities, BigQuery also addresses model gravity challenges, including feature engineering, model training, model evaluation, model serving, and scalable inference.
Today, we are announcing general availability (GA) for BigQuery ML inference engine, enabling users to run inference over custom models, remote models, and pretrained models within their machine learning workflow.
BigQuery ML inference engine provides a single API to integrate with:
- Models trained via BigQuery ML
- Models imported into BigQuery ML, e.g., TensorFlow, XGBoost, ONNX
- Models deployed on a Vertex AI endpoint, including models trained elsewhere or trained on Vertex AI via Custom Training or AutoML
- Vertex AI pretrained models, e.g., Vertex AI Vision, Vertex AI Translation, Vertex AI Natural Language, and Vertex AI foundation models
BigQuery ML inference engine eliminates integration complexity by enabling developers to bring their models directly to their BigQuery data, while also providing a secure and scalable mechanism to integrate and run inference over models managed remotely.
Let’s look at each of the integrations in more detail.
Inference with models trained with BigQuery ML
BigQuery ML natively supports a variety of machine learning models, including Linear Regression, DNN, Boosted Tree Classification, K-Means, Principal Component Analysis, Matrix Factorization, and Multivariate Time Series. Through common SQL-based templates, BigQuery ML provides an intuitive programming model to easily create, train, and serve machine learning models.
In the example below, CREATE MODEL is used to create a linear regression model. Then, the ML.PREDICT function is used to run inference over the entire table, which could be a small dataset or scale to petabytes in size.
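A minimal sketch of this pattern (the dataset, table, and column names below are illustrative placeholders, not from a real project):

```sql
-- Train a linear regression model with a SQL-based template.
-- `mydataset.sales_model`, the label, and the feature columns are hypothetical.
CREATE OR REPLACE MODEL `mydataset.sales_model`
OPTIONS (
  model_type = 'linear_reg',
  input_label_cols = ['total_sales']
) AS
SELECT total_sales, store_id, promo_flag, day_of_week
FROM `mydataset.sales_history`;

-- Run inference over an entire table with ML.PREDICT.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.sales_model`,
  TABLE `mydataset.new_sales_data`);
```

The same two-statement template applies to the other natively supported model types; only the `model_type` option and the feature/label columns change.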
The example above demonstrates the simplicity and efficiency of building a machine learning model with BigQuery ML, eliminating the need to export data to train the model and providing integrated storage and serving of the model.
Inference with models imported into BigQuery ML
Thousands of customers benefit from BigQuery ML native models; however, these same customers sometimes also have machine learning needs served by frameworks and model types not natively supported by BigQuery ML. Enterprises often use a variety of modeling frameworks, including TensorFlow, PyTorch, XGBoost, Caffe, and scikit-learn.
BigQuery ML inference engine now supports the ability to import and then run inference over models created outside of BigQuery ML, with support for TensorFlow, TensorFlow Lite, XGBoost, and Open Neural Network Exchange (ONNX) models. Models built with PyTorch, Caffe, scikit-learn, and many other frameworks can be converted to ONNX and served in BigQuery ML.
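As a sketch, importing a TensorFlow SavedModel from Cloud Storage might look like the following (the bucket path and model names are illustrative; the `model_type` value changes for TensorFlow Lite, XGBoost, or ONNX artifacts):

```sql
-- Import a TensorFlow SavedModel stored in Cloud Storage.
-- For other formats, use model_type = 'TENSORFLOW_LITE', 'XGBOOST', or 'ONNX'.
CREATE OR REPLACE MODEL `mydataset.imported_tf_model`
OPTIONS (
  model_type = 'TENSORFLOW',
  model_path = 'gs://my_bucket/models/saved_model/*');

-- Inference uses the same ML.PREDICT call as native models.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.imported_tf_model`,
  TABLE `mydataset.input_data`);
```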
The template in the example above is similar for importing TensorFlow, TensorFlow Lite, and ONNX models. This enables developers to seamlessly integrate and run inference with existing models within their BigQuery ML workflow and eliminates the need to export data in order to run inference at scale.
Inference with models managed in Vertex AI
While inference for native BigQuery ML models and imported models improves developer efficiency and reduces workflow complexity, there are clear use cases where inference from external models is more appropriate, including serving large language models, models that require GPU/TPU acceleration, and workflows that require a single serving point.
BigQuery ML inference engine now supports the ability to run inference over remote models with initial support for models managed by Vertex AI.
The example below shows the CREATE MODEL statement for a remote connection to a model hosted on a Vertex AI endpoint.
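A sketch of that statement follows; the project, connection, and endpoint IDs are placeholders, and the INPUT/OUTPUT schema must match what the deployed model expects:

```sql
-- Create a remote model backed by a Vertex AI endpoint,
-- reached through a BigQuery Cloud resource connection.
CREATE OR REPLACE MODEL `mydataset.remote_model`
INPUT (input_text STRING)
OUTPUT (score FLOAT64)
REMOTE WITH CONNECTION `myproject.us.my_connection`
OPTIONS (
  endpoint = 'https://us-central1-aiplatform.googleapis.com/v1/projects/myproject/locations/us-central1/endpoints/1234567890');

-- Inference again goes through ML.PREDICT.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.remote_model`,
  TABLE `mydataset.text_data`);
```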
BigQuery ML inference engine’s ability to integrate with remote models managed by Vertex AI provides a scalable and flexible serving environment, as these endpoints can be configured for a wide range of model types, with options for pre-built containers, custom containers, and custom prediction routines.
The inference engine provides auto-retry logic for failed inference operations, enabling large-scale inference jobs to complete successfully while providing detailed error output for the operations that failed.
Additionally, native BigQuery ML-developed models can be automatically registered with Vertex AI Model Registry, providing support for local and remote inference.
Inference with Vertex AI pretrained models
Today, in addition to taking remote model integration beyond Vertex AI endpoints, we are also announcing the availability of remote inference for Vertex AI pretrained APIs, including Vertex AI Vision, Vertex AI Translation, and Vertex AI Natural Language.
The example below shows the CREATE MODEL statement for a remote connection to a Vertex AI pretrained API.
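As an illustrative sketch for the Natural Language API (connection and table names are placeholders), a remote model over a pretrained API is declared with a service type rather than an endpoint URL, and text inference then runs through the ML.UNDERSTAND_TEXT function:

```sql
-- Remote model over the Vertex AI Natural Language API.
CREATE OR REPLACE MODEL `mydataset.nlp_model`
REMOTE WITH CONNECTION `myproject.us.my_connection`
OPTIONS (remote_service_type = 'CLOUD_AI_NATURAL_LANGUAGE_V1');

-- Run sentiment analysis over a table of text
-- (the table is assumed to expose the text in a `text_content` column).
SELECT *
FROM ML.UNDERSTAND_TEXT(
  MODEL `mydataset.nlp_model`,
  TABLE `mydataset.reviews`,
  STRUCT('analyze_sentiment' AS nlu_option));
```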
By expanding support to run inference on open-source and Vertex AI-hosted models, BigQuery ML inference engine makes it simple and cost-effective to integrate machine learning workflows within your BigQuery environment. Refer to the documentation to learn more about these new features.
If you’re attending Google Cloud Next '23, join us for the Generative AI powered use cases for data engineers (ANA111) and Generative AI powered use cases for data engineers (ANA211) breakout sessions to learn more about BigQuery ML and see live demos.