The following best practices will help you plan and use Vertex Feature Store in various scenarios. This guide is not intended to be exhaustive, but will provide you with an understanding of how you can use Feature Store in your organization.
Modeling features that jointly describe multiple entities
In some cases, a feature applies to more than one entity type. For example, you might have a calculated value such as clicks per product by a particular user. This feature jointly describes product-user pairs, even though you already have separate entity types for products and users. The best practice in this case is to create a separate entity type to group the shared features. For example, you can create an entity type named product-user to contain them. For the entity IDs, concatenate the IDs of the individual entities, such as the entity IDs of the product and the user. The only requirement is that the IDs must be strings. These combined entity types are referred to as composite entity types.
For more information, see creating an entity type.
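As a minimal sketch of the concatenation step, the helper below joins a product ID and a user ID into one composite entity ID. The function name, the "-" separator, and the sample IDs are hypothetical. The only requirement stated above is that the IDs are strings, so confirm that the separator you choose is valid for entity IDs in your featurestore.

```python
def composite_entity_id(product_id: str, user_id: str, sep: str = "-") -> str:
    """Join two entity IDs into a single composite entity ID string.

    The separator is an illustrative assumption, not a Feature Store rule.
    """
    for part in (product_id, user_id):
        if not isinstance(part, str) or not part:
            raise ValueError("each entity ID must be a non-empty string")
        # Reject IDs that contain the separator so that two different
        # (product, user) pairs can never produce the same composite ID.
        if sep in part:
            raise ValueError(f"entity ID {part!r} contains the separator {sep!r}")
    return f"{product_id}{sep}{user_id}"


# Hypothetical IDs for a product-user composite entity type.
example_id = composite_entity_id("p123", "u456")
```

Rejecting IDs that already contain the separator keeps the mapping from pairs to composite IDs unambiguous, which matters if you later need to split a composite ID back into its parts.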
Monitor and tune resources accordingly to optimize batch ingestion
Batch ingestion jobs require workers to process and write data, which can increase the CPU utilization of your featurestore and affect online serving performance. If preserving online serving performance is a priority, start with 1 worker for every 10 online serving nodes. During ingestion, monitor the CPU usage of the online storage. If CPU usage is lower than expected, increase the number of workers for future batch ingestion jobs to increase throughput. If CPU usage is higher than expected, either increase the number of online serving nodes to add CPU capacity or lower the batch ingestion worker count. Either change can lower CPU usage.
If you do increase the number of online serving nodes, note that Feature Store takes roughly 15 minutes to reach optimal performance after you make the update.
For more information about featurestore monitoring, see Cloud Monitoring metrics.
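The sizing guidance in this section can be sketched as two small helpers. This is an illustration only: the function names, the rounding down for node counts that are not multiples of 10, the one-worker step size, and the tolerance band are all assumptions, and the expected CPU level is a value you choose yourself from your monitoring data.

```python
def initial_worker_count(online_serving_nodes: int) -> int:
    """Starting point: 1 worker for every 10 online serving nodes.

    Rounding down (with a minimum of 1) is an assumption that favors
    online serving performance over ingestion throughput.
    """
    if online_serving_nodes < 1:
        raise ValueError("online_serving_nodes must be at least 1")
    return max(1, online_serving_nodes // 10)


def next_worker_count(current_workers: int, observed_cpu: float,
                      expected_cpu: float, tolerance: float = 0.05) -> int:
    """Suggest a worker count for the next batch ingestion job.

    observed_cpu and expected_cpu are fractions between 0 and 1.
    The step size of one worker and the tolerance are illustrative.
    """
    if observed_cpu < expected_cpu - tolerance:
        # CPU is lower than expected: add a worker to increase throughput.
        return current_workers + 1
    if observed_cpu > expected_cpu + tolerance:
        # CPU is higher than expected: use fewer workers, but never below 1.
        # The alternative is to add online serving nodes instead.
        return max(1, current_workers - 1)
    return current_workers
```

When the suggested count is already 1 and CPU usage is still higher than expected, the remaining option from the guidance above is to add online serving nodes.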
Use the disableOnlineServing field when backfilling historical data
Backfilling is the process of ingesting historical feature values, which doesn't affect the most recent feature values. When you backfill, you can disable online serving so that the ingestion job skips any changes to the online store. For more information, see Backfill historical data.
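As a rough sketch, the snippet below assembles a JSON request body for a batch ingestion job with disableOnlineServing set to true. Only disableOnlineServing comes from this guide. The other field names (bigquerySource, entityIdField, featureTimeField, featureSpecs, workerCount) reflect one reading of the import request format and should be checked against the API reference. The function name, table URI, column names, and feature IDs are hypothetical.

```python
import json


def build_backfill_request(bigquery_uri: str, entity_id_field: str,
                           feature_time_field: str, feature_ids: list[str],
                           worker_count: int = 1) -> dict:
    """Build an import request body that skips writes to the online store."""
    return {
        # Assumed field names; verify them against the API reference.
        "bigquerySource": {"inputUri": bigquery_uri},
        "entityIdField": entity_id_field,
        "featureTimeField": feature_time_field,
        "featureSpecs": [{"id": feature_id} for feature_id in feature_ids],
        "workerCount": worker_count,
        # Historical values don't change the latest values, so the
        # online store doesn't need to be updated during a backfill.
        "disableOnlineServing": True,
    }


# Hypothetical source table and feature IDs.
request_body = build_backfill_request(
    bigquery_uri="bq://my-project.my_dataset.historical_clicks",
    entity_id_field="product_user_id",
    feature_time_field="event_time",
    feature_ids=["clicks_per_product"],
)
request_json = json.dumps(request_body, indent=2)
```

Building the body as a plain dictionary keeps the sketch independent of any client library, and the resulting JSON can be sent with whichever HTTP client or SDK you already use.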
Learn Feature Store best practices for implementing custom-trained ML models on Vertex AI.