Best practices for Vertex AI Feature Store

The following best practices will help you plan and use Vertex AI Feature Store in various scenarios. This guide is not intended to be exhaustive.

Modeling features that jointly describe multiple entities

Some features might apply to multiple entity types. For example, you might have a calculated value that records clicks per product by user. This feature jointly describes product-user pairs.

The best practice, in this case, is to create a separate entity type to group the shared features. For example, you can create an entity type, such as product-user, to contain the features that jointly describe product-user pairs.

For the entity IDs, concatenate the IDs of the individual entities, such as the entity IDs of the individual product and user. The only requirement is that the IDs must be strings. These combined entity types are referred to as composite entity types.
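As a minimal sketch, the concatenation might look like the following. The helper name and the separator character are illustrative choices, not Vertex AI requirements; the only hard requirement is that entity IDs are strings.

```python
def composite_entity_id(product_id: str, user_id: str, sep: str = "_") -> str:
    """Build a composite entity ID by concatenating the individual entity IDs.

    Entity IDs must be strings. The separator is an illustrative choice;
    pick one that cannot appear inside the individual IDs.
    """
    return f"{product_id}{sep}{user_id}"

# A "clicks per product by user" feature value would be keyed by:
composite_entity_id("product123", "user456")  # "product123_user456"
```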

For more information, see creating an entity type.

Use IAM policies to control access across multiple teams

Use IAM roles and policies to set different levels of access for different groups of users. For example, ML researchers, data scientists, DevOps, and site reliability engineers might all require access to the same featurestore, but at different levels: DevOps users might require permissions to manage a featurestore without needing access to its contents.

You can also restrict access to a particular featurestore or entity type by using resource-level IAM policies.

As an example, imagine that your organization includes the following personas. Because each persona requires a different level of access, each persona is assigned a different predefined IAM role. You can also create and use your own custom roles.

| Persona | Description | Predefined role |
| --- | --- | --- |
| ML researcher or business analyst | Users who only view data on specific entity types. | roles/aiplatform.featurestoreDataViewer (can be granted at the project or resource level) |
| Data scientist or data engineer | Users who work with specific entity type resources. For the resources they own, they can delegate access to other principals. | roles/aiplatform.entityTypeOwner (can be granted at the project or resource level) |
| IT or DevOps | Users who must maintain and tune the performance of specific featurestores but don't require access to the data. | roles/aiplatform.featurestoreInstanceCreator (can be granted at the project or resource level) |
| Automated data import pipeline | Applications that write data to specific entity types. | roles/aiplatform.featurestoreDataWriter (can be granted at the project or resource level) |
| Site reliability engineer | Users who manage particular featurestores or all featurestores in a project. | roles/aiplatform.featurestoreAdmin (can be granted at the project or resource level) |
| Global (any Vertex AI Feature Store user) | Users who can view and search for existing features. If they find a feature they want to work with, they can request access from the feature owners. For Google Cloud console users, this role is also required to view the Vertex AI Feature Store landing page, ingestion jobs page, and batch serving jobs page. | roles/aiplatform.featurestoreResourceViewer (grant at the project level) |
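As a sketch, the project-level binding for the global persona could be expressed as follows. The member email is a placeholder; in practice you would apply the binding with gcloud, the Google Cloud console, or the Resource Manager API.

```python
# Minimal sketch of an IAM policy binding granting project-level
# view-and-search access to features. "analyst@example.com" is a
# placeholder principal, not part of the Vertex AI API.
binding = {
    "role": "roles/aiplatform.featurestoreResourceViewer",
    "members": ["user:analyst@example.com"],
}
```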

Monitor and tune resources accordingly to optimize batch ingestion

Batch ingestion jobs require workers to process and write data, which can increase the CPU utilization of your featurestore and affect online serving performance. If preserving online serving performance is a priority, start with one worker for every ten online serving nodes. During ingestion, monitor the CPU usage of the online storage. If CPU usage is lower than expected, increase the number of workers for future batch ingestion jobs to increase throughput. If CPU usage is higher than expected, increase the number of online serving nodes to increase CPU capacity or lower the batch ingestion worker count, both of which can lower CPU usage.

If you do increase the number of online serving nodes, note that Vertex AI Feature Store takes roughly 15 minutes to reach optimal performance after you make the update.

For more information, see updating a featurestore and batch ingesting feature values.

For more information about featurestore monitoring, see Cloud Monitoring metrics.

Use the disableOnlineServing field when backfilling historical data

Backfilling is the process of ingesting historical feature values, which doesn't impact the most recent feature values. In this case, you can disable online serving, which skips any changes to the online store. For more information, see Backfill historical data.
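As a sketch, a backfill import request with disableOnlineServing set might look like the following. The project, featurestore, entity type, and Cloud Storage URI are placeholders; only the disableOnlineServing field is the point of the example.

```python
# Hedged sketch of an ImportFeatureValuesRequest body for a backfill.
# All resource names and URIs below are placeholders.
backfill_request = {
    "entityType": (
        "projects/example-project/locations/us-central1/"
        "featurestores/example_featurestore/entityTypes/product_user"
    ),
    "avroSource": {"gcsSource": {"uris": ["gs://example-bucket/history.avro"]}},
    "disableOnlineServing": True,  # skip writes to the online store
}
```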

What's next

Learn Vertex AI best practices for implementing custom-trained ML models.