Vertex Feature Store (Feature Store) provides a centralized repository for organizing, storing, and serving ML features. With a central featurestore, your organization can share, discover, and reuse ML features at scale, which can speed up the development and deployment of new ML applications. Feature Store is a fully managed solution: it manages and scales the underlying infrastructure, such as storage and compute resources, for you. As a result, your data scientists can focus on feature computation logic instead of the challenges of deploying features into production.
Feature Store is an integrated part of Vertex AI. You can use Feature Store independently or as part of Vertex AI workflows. For example, you can fetch data from Feature Store to train custom or AutoML models in Vertex AI.
Use Feature Store to create and manage resources, such as a featurestore. A featurestore is a top-level container for your features and their values. As soon as you set up a featurestore, permitted users can add and share their features without additional engineering support. Users can define features and then ingest (import) feature values from various data sources.
Any permitted user can search for and retrieve values from the featurestore. For example, users can find features and then do a batch export to get training data for ML model creation. Users can also retrieve feature values in real time to perform fast online predictions.
Before using Feature Store, you might have computed feature values and saved them in various locations, such as tables in BigQuery or files in Cloud Storage. You might also have built and managed separate solutions for storing and consuming feature values. In contrast, Feature Store provides a unified solution for batch and online storage as well as the serving of ML features. The following sections detail the benefits that Feature Store provides.
Share features across your organization
If you produce features in a featurestore, you can quickly share them with others for training or serving tasks. Teams don't need to re-engineer features for different projects or use cases. Also, because you manage and serve features from a central repository, you can maintain consistency across your organization and reduce duplicate effort, particularly for high-value features.
Feature Store provides search and filter capabilities so that others can easily discover and reuse existing features. For each feature, users can view relevant metadata to determine the quality and usage patterns of the feature. For example, users can view the fraction of entities that have a valid value for a feature (also known as feature coverage), the statistical distribution of feature values, and the frequency of feature updates.
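As a rough illustration of feature coverage as defined above, the following sketch computes the fraction of entities that have a valid value for one feature. It is not a Feature Store API call. The function name `feature_coverage`, the sample entity IDs, and the use of `None` to mark a missing value are all assumptions made for this example.

```python
from typing import Mapping, Optional


def feature_coverage(values: Mapping[str, Optional[float]]) -> float:
    """Return the fraction of entities with a non-missing value for one feature."""
    if not values:
        # No entities at all: report zero coverage rather than dividing by zero.
        return 0.0
    valid = sum(1 for value in values.values() if value is not None)
    return valid / len(values)


# Hypothetical feature values keyed by entity ID; None marks a missing value.
average_rating = {
    "movie_1": 4.5,
    "movie_2": None,
    "movie_3": 3.9,
    "movie_4": 4.1,
}

coverage = feature_coverage(average_rating)
print(f"coverage = {coverage:.2f}")
```

In this sample, three of the four entities have a value, so the coverage is 0.75. A low coverage value can signal that a feature is sparsely populated and might need attention before reuse.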
Managed solution for online serving at scale
Feature Store provides a managed solution for online feature serving (low-latency serving), which is critical for making timely online predictions. You don't need to build and operate low-latency data serving infrastructure; Feature Store does this for you and scales as needed. You code the logic to generate features but offload the task of serving them. This managed serving reduces the friction of building new features, so data scientists can do their work without worrying about deployment.
Mitigate training-serving skew
Training-serving skew occurs when the feature data distribution that you use in production differs from the feature data distribution that was used to train your model. This skew often results in discrepancies between a model's performance during training and its performance in production. The following examples describe how Feature Store can address potential sources of training-serving skew:
- Feature Store ensures that a feature value is ingested once into a featurestore and that same value is reused for both training and serving. Without a featurestore, you might have different code paths for generating features between training and serving. Consequently, feature values might differ for training and serving.
- Feature Store provides point-in-time lookups to fetch historical data for training. With these lookups, you can fetch only the feature values that were available prior to a prediction and not after, mitigating potential data leakage.
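The point-in-time lookup described above can be sketched conceptually as follows. This is a minimal illustration, not how Feature Store implements it and not a Feature Store API. The function name `point_in_time_lookup`, the integer timestamps, and the choice to include a value written exactly at the prediction time are assumptions for this example.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def point_in_time_lookup(
    history: List[Tuple[int, float]], as_of: int
) -> Optional[float]:
    """Return the latest value whose timestamp is at or before `as_of`.

    `history` must be sorted by timestamp in ascending order. Values written
    after `as_of` are never returned, which avoids leaking future data into
    a training example.
    """
    timestamps = [timestamp for timestamp, _ in history]
    # Index of the first entry strictly after `as_of`.
    index = bisect_right(timestamps, as_of)
    if index == 0:
        # No value existed yet at the requested time.
        return None
    return history[index - 1][1]


# Hypothetical history of one feature for one entity: (timestamp, value).
history = [(10, 1.0), (20, 2.0), (30, 3.0)]

value_at_25 = point_in_time_lookup(history, as_of=25)  # value written at t=20
value_at_5 = point_in_time_lookup(history, as_of=5)    # nothing available yet
```

Here a training example with a prediction time of 25 sees the value written at time 20 and ignores the later value written at time 30.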
Quotas and limits
Feature Store enforces quotas and limits to help you manage resources by setting your own usage limits and to protect the community of Google Cloud users by preventing unforeseen spikes in usage. To avoid hitting unplanned constraints, review Feature Store quotas on the Quotas and limits page. For example, Feature Store sets a quota on the number of online serving nodes and a quota on the number of online serving requests that you can make per minute.
What's next
- Learn about the Feature Store data model and its resources.
- Learn how to set up a project and set Identity and Access Management permissions for Feature Store.
- View Feature Store quotas on the Quotas and limits page.
- View Feature Store pricing on the Pricing page.