Feature serving
This document describes your options for making features available for BigQuery ML model training and inference. For all options, you must save the features in BigQuery tables as a prerequisite first step.
Point-in-time correctness
The data used to train a model often has time dependencies built into it. When you create a feature table for time sensitive features, include a timestamp column to represent the feature values as they existed at a given time for each row. You can then use point-in-time lookup functions when querying data from these feature tables in order to ensure that there is no data leakage between training and serving. This process enables point-in-time correctness.
Use the following functions to specify point-in-time cutoffs when retrieving time sensitive features:
Serve features in BigQuery ML
To train models and perform batch inference in BigQuery ML, you
can retrieve features using one of the point-in-time lookup functions described
in the Point-in-time correctness section. You can
include these functions in the
query_statement
clause of the CREATE MODEL
statement for
training, or in the query_statement
clause of the appropriate table-valued
function, such as
ML.PREDICT
,
for serving.
Serve features with Vertex AI Feature Store
To serve features to BigQuery ML models that are registered in Vertex AI, you can use Vertex AI Feature Store. Vertex AI Feature Store works on top of feature tables in BigQuery to manage and serve features with low latency. You can use online serving to retrieve features in real time for online prediction, and you can use offline serving to retrieve features for model training.
For more information about preparing BigQuery feature data to be used in Vertex AI Feature Store, see Prepare data source.