To configure indexes for similarity searches, you need to configure the following fields.
For instructions on how to configure an index, see Configure index parameters.
NearestNeighborSearch
Fields | |
---|---|
contentsDeltaUri |
Allows inserting, updating, or deleting the contents of the
Vector Search index. If you set this field when calling
|
isCompleteOverwrite |
If this field is set together with
|
config |
The configuration of the Vector Search index.
|
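As a rough illustration of how the three top-level fields fit together, the sketch below assembles them into a plain dictionary and prints it as JSON. The helper name `build_index_metadata` and the `gs://example-bucket/embeddings/` value are hypothetical placeholders chosen for this example; only the field names `contentsDeltaUri`, `isCompleteOverwrite`, and `config` come from the table above.

```python
import json


def build_index_metadata(config, contents_delta_uri=None, is_complete_overwrite=None):
    """Assemble the top-level NearestNeighborSearch fields into a dict.

    Hypothetical helper for illustration only; it is not part of any SDK.
    Optional fields are left out entirely when they are not provided.
    """
    metadata = {"config": config}
    if contents_delta_uri is not None:
        metadata["contentsDeltaUri"] = contents_delta_uri
    if is_complete_overwrite is not None:
        metadata["isCompleteOverwrite"] = is_complete_overwrite
    return metadata


metadata = build_index_metadata(
    config={"dimensions": 128},
    contents_delta_uri="gs://example-bucket/embeddings/",  # placeholder value (assumption)
    is_complete_overwrite=False,
)
print(json.dumps(metadata, indent=2))
```

Keeping the optional fields out of the dictionary when they are unset avoids sending values that were never intended to be configured.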
NearestNeighborSearchConfig
Fields | |
---|---|
dimensions |
Required. The number of dimensions of the input vectors. Used for dense embeddings only. |
approximateNeighborsCount |
Required if the tree-AH algorithm is used. The default number of neighbors to find through approximate search before exact reordering is performed. Exact reordering is a procedure where results returned by an approximate search algorithm are reordered using a more expensive distance computation. |
shardSize |
ShardSize
The size of each shard. When an index is large, it is sharded based on the specified shard size. During serving, each shard is served on a separate node and scales independently. |
distanceMeasureType |
The distance measure used in nearest neighbor search. |
featureNormType |
Type of normalization to be carried out on each vector. |
algorithmConfig |
oneOf:
The configuration for the algorithms that Vector Search uses for efficient search. Used for dense embeddings only.
|
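The requirements stated in the table above can be checked before a configuration is submitted. The following is a minimal sketch, not an official validator: the function `validate_config` is hypothetical, the keys `treeAhConfig` and `bruteForceConfig` inside `algorithmConfig` are assumed to be the lowerCamelCase forms of the type names on this page, and the rule that `dimensions` must be a positive integer is also an assumption.

```python
def validate_config(config):
    """Return a list of problems found in a NearestNeighborSearchConfig-style dict.

    Hypothetical client-side check based only on the field descriptions above.
    """
    problems = []

    # dimensions is documented as required; "positive integer" is an assumption.
    dims = config.get("dimensions")
    if not isinstance(dims, int) or dims <= 0:
        problems.append("dimensions is required and must be a positive integer")

    # algorithmConfig is a oneOf, so exactly one algorithm should be selected.
    algorithm = config.get("algorithmConfig", {})
    if len(algorithm) != 1:
        problems.append("algorithmConfig must select exactly one algorithm")

    # approximateNeighborsCount is required when tree-AH is used.
    if "treeAhConfig" in algorithm and "approximateNeighborsCount" not in config:
        problems.append("approximateNeighborsCount is required for tree-AH")

    return problems


tree_ah = {
    "dimensions": 128,
    "approximateNeighborsCount": 150,
    "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
    "featureNormType": "UNIT_L2_NORM",
    "algorithmConfig": {"treeAhConfig": {"leafNodeEmbeddingCount": 1000}},
}
print(validate_config(tree_ah))

incomplete = {k: v for k, v in tree_ah.items() if k != "approximateNeighborsCount"}
print(validate_config(incomplete))
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue in one pass.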
DistanceMeasureType
Enums | |
---|---|
SQUARED_L2_DISTANCE |
Euclidean (L2) Distance |
L1_DISTANCE |
Manhattan (L1) Distance |
DOT_PRODUCT_DISTANCE |
Default value. Defined as a negative of the dot product. |
COSINE_DISTANCE |
Cosine Distance. We strongly suggest using DOT_PRODUCT_DISTANCE + UNIT_L2_NORM instead of COSINE_DISTANCE. The algorithms are more optimized for DOT_PRODUCT_DISTANCE, and when it is combined with UNIT_L2_NORM it is mathematically equivalent to COSINE_DISTANCE and produces the same ranking. |
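The equivalence between DOT_PRODUCT_DISTANCE with UNIT_L2_NORM and COSINE_DISTANCE can be checked on a small example. This sketch assumes the common definition of cosine distance as one minus cosine similarity, which this page does not spell out; the vectors and helper names are made up for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit L2 length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot_product_distance(a, b):
    """Negative of the dot product, as described for DOT_PRODUCT_DISTANCE."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """One minus cosine similarity (assumed definition)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank(distance, query, items):
    """Return item keys ordered from nearest to farthest."""
    return sorted(items, key=lambda key: distance(query, items[key]))


query = [1.0, 2.0, 3.0]
vectors = {"a": [2.0, 4.0, 6.5], "b": [-1.0, 0.5, 0.0], "c": [3.0, 0.0, 1.0]}

by_cosine = rank(cosine_distance, query, vectors)

# Normalize everything first, then rank with the negative dot product.
normalized = {key: l2_normalize(v) for key, v in vectors.items()}
by_dot = rank(dot_product_distance, l2_normalize(query), normalized)

print(by_cosine, by_dot)
```

On unit-length vectors the two distances differ only by a constant offset of 1, which is why the orderings match.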
ShardSize
Enums | |
---|---|
SHARD_SIZE_SMALL |
2 GiB per shard |
SHARD_SIZE_MEDIUM |
20 GiB per shard |
SHARD_SIZE_LARGE |
50 GiB per shard |
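To get a feel for how the shard size options compare, the sketch below maps each enum value to its size in bytes and estimates a shard count by ceiling division. The division is an assumption made for illustration; how the service actually splits an index is not described on this page, and the function name `estimate_shard_count` is hypothetical.

```python
GIB = 1024 ** 3

# Sizes taken from the ShardSize table above.
SHARD_SIZE_BYTES = {
    "SHARD_SIZE_SMALL": 2 * GIB,
    "SHARD_SIZE_MEDIUM": 20 * GIB,
    "SHARD_SIZE_LARGE": 50 * GIB,
}


def estimate_shard_count(index_size_bytes, shard_size):
    """Rough estimate: ceil(index size / shard size), with at least one shard.

    Assumption for illustration only; the real sharding logic may differ.
    """
    per_shard = SHARD_SIZE_BYTES[shard_size]
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return max(1, -(-index_size_bytes // per_shard))


for name in SHARD_SIZE_BYTES:
    print(name, estimate_shard_count(45 * GIB, name))
```

Because each shard is served on a separate node, a smaller shard size implies more nodes for the same index under this estimate.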
FeatureNormType
Enums | |
---|---|
UNIT_L2_NORM |
Unit L2 normalization type. |
NONE |
Default value. No normalization type is specified. |
TreeAhConfig
These are the fields to select for the tree-AH algorithm.
Fields | |
---|---|
fractionLeafNodesToSearch |
double |
The default fraction of leaf nodes that any query may search. Must be in the range 0.0 to 1.0, exclusive. The default value is 0.05 if not set. | |
leafNodeEmbeddingCount |
int32 |
Number of embeddings on each leaf node. The default value is 1000 if not set. | |
leafNodesToSearchPercent |
int32 |
Deprecated; use fractionLeafNodesToSearch instead. The default percentage of leaf nodes that any query may search. Must be in the range 1-100, inclusive. The default value is 10 (meaning 10%) if not set. |
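The defaults and ranges listed for the tree-AH fields can be applied on the client side before a configuration is sent. The helper below is a hypothetical sketch: converting the deprecated `leafNodesToSearchPercent` to a fraction by dividing by 100 is an assumption about how the two fields relate, not documented behavior.

```python
def resolve_tree_ah_config(config):
    """Fill in documented defaults and check the documented ranges.

    Hypothetical helper; returns a new dict and leaves the input untouched.
    """
    resolved = dict(config)
    resolved.setdefault("leafNodeEmbeddingCount", 1000)

    # Prefer the current field; fall back to the deprecated percentage.
    percent = resolved.pop("leafNodesToSearchPercent", None)
    if "fractionLeafNodesToSearch" not in resolved:
        if percent is None:
            resolved["fractionLeafNodesToSearch"] = 0.05
        else:
            if not 1 <= percent <= 100:
                raise ValueError("leafNodesToSearchPercent must be in 1-100, inclusive")
            # Assumed conversion: percent / 100.
            resolved["fractionLeafNodesToSearch"] = percent / 100

    fraction = resolved["fractionLeafNodesToSearch"]
    if not 0.0 < fraction < 1.0:
        raise ValueError("fractionLeafNodesToSearch must be in 0.0-1.0, exclusive")
    return resolved


print(resolve_tree_ah_config({}))
print(resolve_tree_ah_config({"leafNodesToSearchPercent": 10}))
```

Note that under this assumed conversion a deprecated value of 100 maps to 1.0, which falls outside the exclusive range of the newer field, so the helper rejects it.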
BruteForceConfig
This option implements the standard linear search in the database for
each query. There are no fields to configure for a brute force search.
To select this algorithm, pass an empty object for BruteForceConfig
to algorithmConfig.
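The linear search described above can be sketched in a few lines: every stored vector is compared with the query and the closest `k` are returned. This is an illustration of the general technique, not the service's implementation. The `bruteForceConfig` key in the sample configuration is assumed to be the lowerCamelCase form of the type name, and the helper names and data are made up.

```python
def squared_l2_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(query, vectors, k, distance=squared_l2_distance):
    """Compare the query against every vector and return the k nearest keys."""
    scored = [(distance(query, v), key) for key, v in vectors.items()]
    scored.sort()  # nearest first; ties are broken by key
    return [key for _, key in scored[:k]]


# Sample configuration selecting brute force with an empty object (key name assumed).
config = {"dimensions": 2, "algorithmConfig": {"bruteForceConfig": {}}}

vectors = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0]}
print(brute_force_search([0.9, 0.9], vectors, k=2))
```

Because every vector is scanned for each query, results are exact, and the cost grows linearly with the number of stored vectors.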