To configure indexes for similarity searches, you need to configure the following fields.
For instructions on how to configure an index, see Configure index parameters.
NearestNeighborSearch
| Fields | |
|---|---|
| contentsDeltaUri | Allows inserting, updating or deleting the contents of the Vector Search index. If you set this field when calling |
| isCompleteOverwrite | If this field is set together with |
| config | The configuration of the Vector Search index. |
NearestNeighborSearchConfig
| Fields | |
|---|---|
| dimensions | Required. The number of dimensions of the input vectors. Used for dense embeddings only. |
| approximateNeighborsCount | Required if the tree-AH algorithm is used. The default number of neighbors to find through approximate search before exact reordering is performed. Exact reordering is a procedure where results returned by an approximate search algorithm are reordered using a more expensive distance computation. |
| shardSize | ShardSize. The size of each shard. When an index is large, it is sharded based on the specified shard size. During serving, each shard is served on a separate node and scales independently. |
| distanceMeasureType | The distance measure used in nearest neighbor search. | 
| featureNormType | Type of normalization to be carried out on each vector. | 
| algorithmConfig | oneOf: The configuration for the algorithms that Vector Search uses for efficient search. Used for dense embeddings only. |
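
As a rough illustration of how the fields above fit together, the following Python sketch assembles an index metadata object as a plain dictionary and serializes it to JSON. The helper name `build_index_metadata`, the bucket path, and the values 128 and 150 are hypothetical. The nesting key `treeAhConfig` under `algorithmConfig` is an assumption derived from the TreeAhConfig type name, so verify it against Configure index parameters before relying on it.

```python
import json


def build_index_metadata(contents_delta_uri, dimensions, approximate_neighbors_count):
    """Assemble a metadata dict using the field names from the tables above."""
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    return {
        "contentsDeltaUri": contents_delta_uri,
        "config": {
            "dimensions": dimensions,
            # Required when the tree-AH algorithm is used.
            "approximateNeighborsCount": approximate_neighbors_count,
            "shardSize": "SHARD_SIZE_MEDIUM",
            "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
            "featureNormType": "UNIT_L2_NORM",
            # Assumed key name for the tree-AH branch of the oneOf.
            "algorithmConfig": {
                "treeAhConfig": {
                    "fractionLeafNodesToSearch": 0.05,
                    "leafNodeEmbeddingCount": 1000,
                }
            },
        },
    }


# Hypothetical bucket path and sizes, for illustration only.
metadata = build_index_metadata("gs://example-bucket/embeddings/", 128, 150)
print(json.dumps(metadata, indent=2))
```
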
DistanceMeasureType
| Enums | |
|---|---|
| SQUARED_L2_DISTANCE | Euclidean (L2) Distance | 
| L1_DISTANCE | Manhattan (L1) Distance | 
| DOT_PRODUCT_DISTANCE | Default value. Defined as the negative of the dot product. | 
| COSINE_DISTANCE | Cosine distance. We strongly suggest using DOT_PRODUCT_DISTANCE with UNIT_L2_NORM instead of COSINE_DISTANCE. Our algorithms are better optimized for DOT_PRODUCT_DISTANCE, and when it is combined with UNIT_L2_NORM, it gives the same ranking as COSINE_DISTANCE and is mathematically equivalent to it. | 
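
The equivalence noted for COSINE_DISTANCE can be checked with a small pure-Python sketch. It assumes cosine distance is defined as one minus cosine similarity, which this page does not state, and it uses made-up vectors. All function names are illustrative and are not part of any Vector Search client library.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit L2 length (the effect of UNIT_L2_NORM)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def dot_product_distance(a, b):
    """Negative of the dot product, as described for DOT_PRODUCT_DISTANCE."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """Assumed definition: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [1.0, 2.0, 3.0]
candidates = {
    "a": [3.0, 2.0, 1.0],
    "b": [2.0, 4.0, 6.5],
    "c": [-1.0, 0.5, 2.0],
}

# Rank candidates by cosine distance on the raw vectors.
cosine_ranking = sorted(
    candidates, key=lambda k: cosine_distance(query, candidates[k])
)

# Rank candidates by dot product distance on unit-normalized vectors.
unit_query = l2_normalize(query)
dot_ranking = sorted(
    candidates,
    key=lambda k: dot_product_distance(unit_query, l2_normalize(candidates[k])),
)

print(cosine_ranking, dot_ranking)
```

On unit-length vectors the dot product equals the cosine similarity, so both distances order the candidates identically.
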
ShardSize
| Enums | |
|---|---|
| SHARD_SIZE_SMALL | 2 GiB per shard | 
| SHARD_SIZE_MEDIUM | 20 GiB per shard | 
| SHARD_SIZE_LARGE | 50 GiB per shard | 
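
To get a feel for how the shard size affects the number of serving nodes, the sketch below divides an index size by the nominal shard sizes from the table. This is only back-of-the-envelope arithmetic: the page does not say how full each shard is packed, so the ceiling-division rule and the function name `estimate_shard_count` are assumptions, and the 45 GiB index is a made-up example.

```python
import math

GIB = 1024 ** 3

# Nominal shard sizes from the ShardSize table.
SHARD_SIZE_BYTES = {
    "SHARD_SIZE_SMALL": 2 * GIB,
    "SHARD_SIZE_MEDIUM": 20 * GIB,
    "SHARD_SIZE_LARGE": 50 * GIB,
}


def estimate_shard_count(index_size_bytes, shard_size):
    """Assumed estimate: index size divided by nominal shard size, rounded up."""
    if index_size_bytes <= 0:
        raise ValueError("index_size_bytes must be positive")
    return max(1, math.ceil(index_size_bytes / SHARD_SIZE_BYTES[shard_size]))


index_size = 45 * GIB  # hypothetical index size
estimates = {name: estimate_shard_count(index_size, name) for name in SHARD_SIZE_BYTES}
print(estimates)
```
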
FeatureNormType
| Enums | |
|---|---|
| UNIT_L2_NORM | Unit L2 normalization type. | 
| NONE | Default value. No normalization type is specified. | 
TreeAhConfig
These are the fields to select for the tree-AH algorithm.
| Fields | |
|---|---|
| fractionLeafNodesToSearch | double. The default fraction of leaf nodes that any query may search. Must be in the range 0.0 to 1.0, exclusive. The default value is 0.05 if not set. |
| leafNodeEmbeddingCount | int32. Number of embeddings on each leaf node. The default value is 1000 if not set. |
| leafNodesToSearchPercent | int32. Deprecated, use fractionLeafNodesToSearch. The default percentage of leaf nodes that any query may search. Must be in the range 1 to 100, inclusive. The default value is 10 (meaning 10%) if not set. |
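
The ranges and defaults above can be expressed as a small validation helper. This is a sketch, not service behavior: the rule that `fractionLeafNodesToSearch` wins when both fields are given, the percent-to-fraction conversion, and the estimate of leaf nodes searched (which assumes the number of leaf nodes is roughly the embedding count divided by `leafNodeEmbeddingCount`) are all assumptions, and both function names are hypothetical.

```python
import math


def resolve_fraction_leaf_nodes(fraction=None, percent=None):
    """Return the fraction of leaf nodes to search, validating the documented ranges."""
    if fraction is not None:
        if not 0.0 < fraction < 1.0:
            raise ValueError("fractionLeafNodesToSearch must be in (0.0, 1.0), exclusive")
        return fraction
    if percent is not None:
        # Deprecated field: an integer in the range 1 to 100, inclusive.
        if not 1 <= percent <= 100:
            raise ValueError("leafNodesToSearchPercent must be in [1, 100]")
        return percent / 100
    return 0.05  # documented default for fractionLeafNodesToSearch


def estimate_leaf_nodes_searched(total_embeddings, leaf_node_embedding_count=1000, fraction=0.05):
    """Assumed estimate of how many leaf nodes a query visits."""
    leaf_nodes = math.ceil(total_embeddings / leaf_node_embedding_count)
    return max(1, round(leaf_nodes * fraction))


print(resolve_fraction_leaf_nodes())            # default fraction
print(resolve_fraction_leaf_nodes(percent=10))  # deprecated percent form
print(estimate_leaf_nodes_searched(1_000_000))  # hypothetical index of one million embeddings
```
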
BruteForceConfig
This option implements the standard linear search in the database for
each query. There are no fields to configure for a brute force search.
To select this algorithm, pass an empty object for BruteForceConfig
to algorithmConfig.
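
A minimal sketch of what a linear search does, together with the empty configuration object described above. The key name `bruteForceConfig` is an assumption based on the BruteForceConfig type name, the datapoints are invented, and the search function is an illustration of the idea rather than the service's implementation.

```python
def squared_l2_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(query, datapoints, neighbor_count):
    """Linear scan: score every datapoint, then keep the closest ones."""
    scored = sorted(
        datapoints.items(),
        key=lambda item: squared_l2_distance(query, item[1]),
    )
    return [datapoint_id for datapoint_id, _ in scored[:neighbor_count]]


# Selecting the algorithm: an empty object, since there are no fields to set.
# The key name is assumed, not confirmed by this page.
algorithm_config = {"bruteForceConfig": {}}

datapoints = {"p1": [0.0, 0.0], "p2": [1.0, 1.0], "p3": [5.0, 5.0]}
nearest = brute_force_search([0.9, 1.2], datapoints, 2)
print(nearest)
```

Because every datapoint is compared with the query, the result is exact, at a cost that grows linearly with the number of datapoints.
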