Introduction to feature management in Vertex AI

In machine learning (ML), features are characteristic attributes of an instance or entity that you can use to train models or to make online predictions. Features are generated by transforming raw ML data into measurable and shareable attributes using feature engineering techniques, generally referred to as feature transformations.

Feature management refers to the process of creating, maintaining, sharing, and serving ML features stored in a centralized location or repository. Feature management makes it easier to reuse features to train and retrain models, which shortens the life cycle of AI and ML deployments.

A product or service that includes feature management services to store, discover, share, and serve ML features is called a feature store. Vertex AI incorporates the following feature store services:

  • Vertex AI Feature Store

  • Vertex AI Feature Store (Legacy)

This page introduces and compares the two feature management services and provides an overview of their capabilities. It also describes how to migrate an existing feature store in Vertex AI Feature Store (Legacy) to the new Vertex AI Feature Store.

Vertex AI Feature Store

Vertex AI Feature Store offers a new approach to feature management by letting you maintain and serve your feature data from a BigQuery data source. In this approach, Vertex AI Feature Store acts as a metadata layer that provides online serving capabilities to your feature data source in BigQuery and lets you serve features online based on that data. You don't need to copy or import the data to a separate offline store in Vertex AI.

Vertex AI Feature Store is integrated with Dataplex to track feature metadata. It also supports embeddings and lets you perform vector similarity searches for nearest neighbors.

Vertex AI Feature Store is optimized for ultra-low latency serving and lets you do the following:

  • Store and maintain your offline feature data in BigQuery, taking advantage of the data management capabilities of BigQuery.

  • Share and reuse features by adding them to the feature registry.

  • Serve features for online predictions at low latencies using Bigtable online serving or at ultra-low latencies using Optimized online serving.

  • Store embeddings in your feature data and perform vector similarity searches.

  • Track feature metadata in Dataplex.

To learn more about Vertex AI Feature Store, see the Vertex AI Feature Store documentation.
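
To make the idea of a vector similarity search for nearest neighbors concrete, the following minimal sketch ranks stored embeddings by cosine similarity to a query vector. It is plain Python, not the Vertex AI Feature Store API: the entity IDs, the embedding values, and the helper names cosine_similarity and nearest_neighbors are all hypothetical, and the brute-force scan only stands in for whatever index the service uses internally.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as dissimilar to everything.
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, embeddings, k=2):
    """Brute-force search: score every stored embedding, keep the top k."""
    scored = [
        (entity_id, cosine_similarity(query, vector))
        for entity_id, vector in embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings keyed by entity ID.
embeddings = {
    "movie_1": [1.0, 0.0, 0.0],
    "movie_2": [0.9, 0.1, 0.0],
    "movie_3": [0.0, 1.0, 0.0],
}

neighbors = nearest_neighbors([1.0, 0.0, 0.0], embeddings, k=2)
print(neighbors)
```

The sketch returns the k entity IDs whose embeddings point in the direction most similar to the query, which is the behavior a nearest-neighbor search over feature embeddings is meant to provide.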

Vertex AI Feature Store (Legacy)

Vertex AI Feature Store (Legacy) provides a centralized repository to store, organize, and serve ML feature data. It provisions a resource hierarchy that encapsulates both an online store and an offline store within Vertex AI. The online store serves the most recent feature values for online predictions. The offline store stores and maintains feature data (including historical data) that you can batch serve for training ML models.

Vertex AI Feature Store (Legacy) is a fully functional feature management service that lets you do the following:

  • Batch or stream import feature data into the offline store from a data source, such as a Cloud Storage bucket or a BigQuery source.

  • Serve features online for predictions.

  • Batch serve or export features for ML model training or analysis.

  • Set Identity and Access Management (IAM) policies on EntityType and Featurestore resources.

  • Manage feature store resources from the Google Cloud console.

Vertex AI Feature Store (Legacy) doesn't include embeddings management or vector retrieval capabilities. If you need to manage embeddings in your feature data or perform vector similarity searches, consider switching to Vertex AI Feature Store. For information about migrating to Vertex AI Feature Store, see Migrate to Vertex AI Feature Store.

To learn more about Vertex AI Feature Store (Legacy), see the Vertex AI Feature Store (Legacy) documentation.

Comparison between Vertex AI Feature Store and Vertex AI Feature Store (Legacy)

The following comparison shows how the new Vertex AI Feature Store and Vertex AI Feature Store (Legacy) differ in each category:

Data models

  • Resource hierarchy (online and offline store)

    • Vertex AI Feature Store: The resource hierarchy in the online store is as follows: FeatureOnlineStore -> FeatureView. There are no offline store resources, since the data resides in BigQuery.

      • FeatureOnlineStore contains the configuration parameters for online storage and retrieval only. It can contain multiple FeatureView resources.

      • FeatureView is a logical grouping of features in an online serving request. It's a single resource that replaces entity types and features. Data in a feature view reflects the latest feature values in the BigQuery storage.

    • Vertex AI Feature Store (Legacy): The resource hierarchy is as follows: Featurestore -> EntityType -> Feature

      • Featurestore contains the configuration parameters for both online and offline stores. It can contain multiple EntityType resources.

      • EntityType is a collection of semantically related features. It can contain multiple Feature resources and can have several instances, called entities.

      • Feature is a property or attribute of an EntityType.

  • Resource hierarchy (feature registry)

    • Vertex AI Feature Store: The resource hierarchy in the feature registry is as follows: FeatureGroup -> Feature

      • FeatureGroup registers the location of the BigQuery data source. It can contain multiple Feature resources.

      • Feature corresponds to a column in the data source registered with the feature group.

    • Vertex AI Feature Store (Legacy): A feature registry doesn't exist in Vertex AI Feature Store (Legacy).

Feature management

  • Online and offline stores

    • Vertex AI Feature Store: You need to create an online store instance and define feature views. Vertex AI Feature Store doesn't require a separate offline store, because the BigQuery data source constitutes the offline store.

    • Vertex AI Feature Store (Legacy): When you provision a feature store, Vertex AI Feature Store (Legacy) creates separate online and offline stores.

  • Feature import

    • Vertex AI Feature Store: You don't need to import data to offline stores, because the data resides in BigQuery and you can use it directly for offline needs. For online-serving use cases, you can register a BigQuery table or view as a feature view, which copies feature data into the online store. Vertex AI Feature Store refreshes the data in the online store during data sync.

    • Vertex AI Feature Store (Legacy): You need to import feature data into offline and online stores by using batch or streaming import from an external source, such as a BigQuery table or a BigQuery view.

  • Data movement between online and offline stores

    • Vertex AI Feature Store: Vertex AI Feature Store uses BigQuery as its offline store and copies only the latest feature values to the online store. There is no separate offline store provisioned in Vertex AI.

    • Vertex AI Feature Store (Legacy): Feature values are copied to the offline storage and subsequently to the online storage.

Feature serving

  • Offline serving

    • Vertex AI Feature Store: To interact with the offline store, you need to use BigQuery APIs. The underlying capabilities are the same.

    • Vertex AI Feature Store (Legacy): To interact with the offline store, which is managed by Vertex AI Feature Store (Legacy), you need to use Vertex AI APIs. Examples of these interactions are point-in-time lookups and exporting features.

  • Online serving

    • Vertex AI Feature Store: Provides two types of online serving. Each online read request retrieves all the preset features in a feature view without additional processing, which results in lower latencies.

      • Bigtable online serving is similar to the online serving in Vertex AI Feature Store (Legacy), but provides improved caching to mitigate hotspotting. It's useful for large data volumes (terabytes of data).

      • Optimized online serving is suitable for ultra-low latency serving needs.

    • Vertex AI Feature Store (Legacy): Provides only one type of online serving. You can specify the entities and features to fetch the feature data.

Interfaces and APIs

  • Google Cloud console features

    • Vertex AI Feature Store: Use the Google Cloud console to create and manage resources, such as online store instances, feature view instances, feature groups, and features. You can also view the list of online stores and information about feature lineage.

    • Vertex AI Feature Store (Legacy): Use the Google Cloud console to perform most of the feature management tasks, including resource creation and monitoring.

  • Resource creation APIs

    • Vertex AI Feature Store: Includes APIs to create FeatureOnlineStore, FeatureView, FeatureGroup, and Feature resources. These resources let you set up your feature registry and online store. For the offline store, BigQuery is used.

    • Vertex AI Feature Store (Legacy): Includes APIs to create Featurestore, EntityType, and Feature resources that are used in the online and offline stores.

  • Batch import APIs (offline store)

    • Vertex AI Feature Store: Doesn't require APIs for batch import to the offline store, because a separate batch import step to the offline store isn't required.

    • Vertex AI Feature Store (Legacy): Uses Vertex AI APIs for batch import to the offline store.

  • Batch import APIs (online store)

    • Vertex AI Feature Store: Periodically copies data from BigQuery to the online store during data sync.

    • Vertex AI Feature Store (Legacy): Uses Vertex AI APIs for batch import to the online store.

  • Streaming import APIs (offline store)

    • Vertex AI Feature Store: Doesn't require APIs for streaming import to the offline store, because a separate streaming import step to the offline store isn't required.

    • Vertex AI Feature Store (Legacy): Uses Vertex AI APIs for streaming import to the offline store.

  • Streaming import APIs (online store)

    • Vertex AI Feature Store: Streaming import isn't supported.

    • Vertex AI Feature Store (Legacy): Uses Vertex AI APIs for streaming import to the online store.

  • Batch serving APIs

    • Vertex AI Feature Store: Uses BigQuery APIs to batch serve data directly from the BigQuery data sources defined in the feature views.

    • Vertex AI Feature Store (Legacy): Uses Vertex AI APIs to batch serve feature data.

  • Online serving APIs

    • Vertex AI Feature Store: Uses the FetchFeatureValues(FetchFeatureValuesRequest) API.

    • Vertex AI Feature Store (Legacy): Uses the ReadFeatureValues(ReadFeatureValuesRequest) API for online serving.
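
The data movement and online serving entries above can be illustrated with a small plain-Python model. It is a sketch under stated assumptions, not the service's implementation: the column names entity_id and feature_timestamp, the sample rows, and the local functions sync_latest, fetch_feature_values, and read_feature_values are hypothetical stand-ins that only mimic the described behavior (the last two borrow the API names for readability but make no network calls).

```python
from datetime import datetime

# Hypothetical rows in a BigQuery feature table: one row per entity per timestamp.
source_rows = [
    {"entity_id": "user_1", "feature_timestamp": datetime(2024, 1, 1), "age": 30, "clicks": 5},
    {"entity_id": "user_1", "feature_timestamp": datetime(2024, 1, 3), "age": 30, "clicks": 9},
    {"entity_id": "user_2", "feature_timestamp": datetime(2024, 1, 2), "age": 41, "clicks": 2},
]

KEY_COLUMNS = ("entity_id", "feature_timestamp")


def sync_latest(rows):
    """Model of data sync: keep only the newest row for each entity ID."""
    latest = {}
    for row in rows:
        current = latest.get(row["entity_id"])
        if current is None or row["feature_timestamp"] > current["feature_timestamp"]:
            latest[row["entity_id"]] = row
    return latest


def fetch_feature_values(online_view, entity_id):
    """Feature-view style read: return every preset feature for the entity."""
    row = online_view[entity_id]
    return {name: value for name, value in row.items() if name not in KEY_COLUMNS}


def read_feature_values(online_store, entity_id, feature_ids):
    """Legacy style read: the caller names the features to fetch."""
    row = online_store[entity_id]
    return {name: row[name] for name in feature_ids}


online = sync_latest(source_rows)
print(fetch_feature_values(online, "user_1"))
print(read_feature_values(online, "user_2", ["age"]))
```

In this model the historical row for user_1 never reaches the online copy, which mirrors the statement that only the latest feature values are copied, while the full history remains available in the source rows for offline use.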

Migrate to Vertex AI Feature Store

Vertex AI Feature Store (Legacy) resources and feature data aren't readily available in Vertex AI Feature Store. If you're an existing user of Vertex AI Feature Store (Legacy) and want to migrate your project to Vertex AI Feature Store, perform the following steps. Because the resource hierarchy in Vertex AI Feature Store differs from the resource hierarchy in Vertex AI Feature Store (Legacy), you need to manually create the resources after you migrate the feature data.

  1. If your feature data isn't available in BigQuery already, export the feature data to BigQuery, and create BigQuery tables and views. Follow the Data preparation guidelines when you export and prepare the data. For example:

    • Each feature corresponds to a column. Entity IDs can be a separate column, which you can identify as the ID column.

    • Vertex AI Feature Store doesn't have the EntityType and Entity resources. Provide the feature values for each entity in the row corresponding to the entity ID.

  2. Optional: Register your feature data source by adding feature groups and features. For more information, see Create a feature group and Create a feature.

  3. Set up online serving by creating online store and feature view instances based on the feature data.
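
The data layout described in step 1 (one column per feature, plus an ID column, with one row per entity) can be sketched in plain Python. This example assumes, purely for illustration, that the legacy data is available as one record per entity and feature pair; the actual export format may differ, and the record keys, sample values, and the to_wide_rows helper are hypothetical.

```python
def to_wide_rows(records, id_column="entity_id"):
    """Pivot (entity, feature, value) records into one row per entity.

    Each feature becomes a column; features missing for an entity are None.
    """
    feature_ids = sorted({rec["feature_id"] for rec in records})
    rows = {}
    for rec in records:
        entity_id = rec["entity_id"]
        if entity_id not in rows:
            # Start the row with the ID column and an empty slot per feature.
            rows[entity_id] = {id_column: entity_id}
            rows[entity_id].update({name: None for name in feature_ids})
        rows[entity_id][rec["feature_id"]] = rec["value"]
    return [rows[entity_id] for entity_id in sorted(rows)]


# Hypothetical records, one per entity and feature.
legacy_records = [
    {"entity_id": "user_1", "feature_id": "age", "value": 30},
    {"entity_id": "user_1", "feature_id": "country", "value": "DE"},
    {"entity_id": "user_2", "feature_id": "age", "value": 41},
]

for row in to_wide_rows(legacy_records):
    print(row)
```

Each resulting dictionary corresponds to one table row keyed by the entity ID, which is the shape a table or view needs before you register it as a feature group or use it as the source of a feature view.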
