Storage-optimized Vector Search

The storage-optimized performance tier for Vector Search is designed for indexing and searching massive datasets. This tier uses a disk-based architecture rather than keeping the index in RAM, which significantly reduces your operational costs. When cost efficiency at scale matters more to you than the lowest possible query latency, the storage-optimized tier is your best choice.

When to use a storage-optimized index

Consider storage-optimized indexes if you have any of the following:

A very large dataset: You must index so many vectors that the cost of hosting the required number of performance-optimized shards is prohibitive.
A low-QPS workload: In low-query-volume applications, the cost savings from using fewer shards can be significant.
Flexible latency requirements: Your application can tolerate a minor increase in query latency, which is the time it takes to get a search result.

Performance trade-offs

Compared to the default performance-optimized index, a storage-optimized index has the following characteristics:

Increased query latency: Queries have a slightly higher latency at a given recall level.

How to configure a storage-optimized index

To create a storage-optimized index, set the shardSize parameter to SHARD_SIZE_SO_DYNAMIC in your index configuration.
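If you assemble index configurations programmatically, the setting can be applied with a small helper. The following is a minimal sketch in plain Python, not part of any client library: the function name make_storage_optimized and the sample dictionary are hypothetical, and it assumes the setting lives under metadata.config, as in the JSON example in this section.

```python
import copy
import json

# Value taken from this page; every other name in this sketch is hypothetical.
STORAGE_OPTIMIZED_SHARD_SIZE = "SHARD_SIZE_SO_DYNAMIC"


def make_storage_optimized(index_body: dict) -> dict:
    """Return a copy of an index body with shardSize set for the storage-optimized tier."""
    body = copy.deepcopy(index_body)  # leave the caller's dictionary untouched
    # Assumption: shardSize is nested under metadata.config, as in the JSON example.
    config = body.setdefault("metadata", {}).setdefault("config", {})
    config["shardSize"] = STORAGE_OPTIMIZED_SHARD_SIZE
    return body


# Illustrative input only; use your own display name and dimensions.
base_body = {
    "displayName": "my-index",
    "metadata": {"config": {"dimensions": 100}},
}

storage_body = make_storage_optimized(base_body)
print(json.dumps(storage_body, indent=2))
```

The helper copies its input so that an existing performance-optimized configuration can be reused as a template without being modified.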

Example: Creating a storage-optimized index

The following example shows the configuration for creating a new storage-optimized streaming index.

{
  "displayName": "my-storage-optimized-index",
  "description": "An index configured to prioritize storage over performance.",
  "metadata": {
    "contentsDeltaUri": "gs://your-bucket/source-data/",
    "config": {
      "dimensions": 100,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "shardSize": "SHARD_SIZE_SO_DYNAMIC"
    }
  },
  "indexUpdateMethod": "STREAM_UPDATE"
}

In the example, shardSize is set to SHARD_SIZE_SO_DYNAMIC, which instructs Vector Search to build a denser index. Each shard can then hold significantly more data points, which reduces the total number of shards needed for your dataset. Set the other fields, such as dimensions and distanceMeasureType, according to your needs.
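To see why denser shards lower the shard count, the relationship can be sketched as a ceiling division. The per-shard capacities below are made-up placeholders chosen only to illustrate the arithmetic; they are not published limits for either tier, and the function name shards_needed is hypothetical.

```python
def shards_needed(num_vectors: int, vectors_per_shard: int) -> int:
    """Smallest number of shards that can hold num_vectors (ceiling division)."""
    if num_vectors < 0 or vectors_per_shard <= 0:
        raise ValueError("num_vectors must be >= 0 and vectors_per_shard must be > 0")
    return (num_vectors + vectors_per_shard - 1) // vectors_per_shard


# Hypothetical numbers for illustration only; substitute figures measured
# for your own embeddings and index settings.
num_vectors = 1_000_000_000
baseline_capacity = 10_000_000  # placeholder: vectors per less dense shard
denser_capacity = 50_000_000    # placeholder: vectors per denser shard

baseline_shards = shards_needed(num_vectors, baseline_capacity)
denser_shards = shards_needed(num_vectors, denser_capacity)

print(baseline_shards)  # 100
print(denser_shards)    # 20
```

With these placeholder values, a fivefold increase in per-shard capacity reduces the shard count by the same factor; the actual ratio depends on your data.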