Index configuration parameters

To configure indexes for similarity searches, you need to configure the following fields.

For instructions on how to configure an index, see Configure index parameters.

`NearestNeighborSearch`

Fields
`contentsDeltaUri`	`string` Allows inserting, updating, or deleting the contents of the Vector Search `Index`. The string must be a valid Cloud Storage directory path, such as `gs://BUCKET_NAME/PATH_TO_INDEX_DIR/`. If you set this field when calling `IndexService.UpdateIndex`, no other `Index` field can be updated in the same call. Learn how to structure individual data files.
`isCompleteOverwrite`	`boolean` If this field is set together with `contentsDeltaUri` when calling `IndexService.UpdateIndex`, the existing content of the `Index` is replaced by the data from `contentsDeltaUri`. When this field is set to `true`, the entire index is overwritten with the new metadata file that you provide.
`config`	`NearestNeighborSearchConfig` The configuration of the Vector Search `Index`.
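
As an illustration of the first two fields, the following Python sketch builds the JSON object for an update that only replaces index contents. The helper name `build_update_metadata` is hypothetical and is not part of any client library, and the path check covers only the `gs://` prefix, not whether the directory exists.

```python
import json


def build_update_metadata(contents_delta_uri: str, is_complete_overwrite: bool = False) -> dict:
    """Build the NearestNeighborSearch fields for a contents-only update.

    Hypothetical helper for illustration. `config` is left out on purpose:
    when `contentsDeltaUri` is set for an UpdateIndex call, no other Index
    field can be updated in the same call.
    """
    prefix = "gs://"
    # Minimal sanity check only: the value must look like a Cloud Storage path.
    if not contents_delta_uri.startswith(prefix) or len(contents_delta_uri) == len(prefix):
        raise ValueError("contentsDeltaUri must be a Cloud Storage directory path (gs://...)")
    return {
        "contentsDeltaUri": contents_delta_uri,
        "isCompleteOverwrite": is_complete_overwrite,
    }


# Placeholder path copied from the field description above.
metadata = build_update_metadata("gs://BUCKET_NAME/PATH_TO_INDEX_DIR/", is_complete_overwrite=True)
print(json.dumps(metadata, indent=2))
```

With `is_complete_overwrite=False`, the same object describes an incremental insert, update, or delete instead of a full replacement.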

`NearestNeighborSearchConfig`

Fields
`dimensions`	`int32` Required. The number of dimensions of the input vectors. Used for dense embeddings only.
`approximateNeighborsCount`	`int32` Required if the tree-AH algorithm is used. The default number of neighbors to find through approximate search before exact reordering is performed. Exact reordering is a procedure in which results returned by an approximate search algorithm are reordered using a more expensive distance computation.
`shardSize`	`ShardSize` The size of each shard. When an index is large, it is sharded based on the specified shard size. During serving, each shard is served on a separate node and scales independently.
`distanceMeasureType`	`DistanceMeasureType` The distance measure used in nearest neighbor search.
`featureNormType`	`FeatureNormType` Type of normalization to be carried out on each vector.
`algorithmConfig`	`oneOf:` `TreeAhConfig` `BruteForceConfig` The configuration for the algorithms that Vector Search uses for efficient search. Used for dense embeddings only. `TreeAhConfig`: Configuration options for using the tree-AH algorithm. For more information, see the blog post Scaling deep retrieval with TensorFlow Recommenders and Vector Search. `BruteForceConfig`: This option implements the standard linear search in the database for each query. There are no fields to configure for a brute force search. To select this algorithm, pass an empty object for `BruteForceConfig`.
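
To show how these fields fit together, here is a minimal Python sketch that assembles a `NearestNeighborSearchConfig` as a plain dictionary. The helper `build_config` is hypothetical, and the JSON keys `treeAhConfig` and `bruteForceConfig` are an assumption (lowerCamelCase forms of the type names above); check them against the API reference before relying on them.

```python
import json
from typing import Optional


def build_config(dimensions: int,
                 approximate_neighbors_count: Optional[int] = None,
                 use_brute_force: bool = False) -> dict:
    """Assemble a NearestNeighborSearchConfig dictionary (illustrative only)."""
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")

    config = {
        "dimensions": dimensions,
        "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
        "featureNormType": "UNIT_L2_NORM",
    }

    if use_brute_force:
        # Brute force has no fields: an empty object selects the algorithm.
        config["algorithmConfig"] = {"bruteForceConfig": {}}  # key name assumed
    else:
        # approximateNeighborsCount is required when tree-AH is used.
        if approximate_neighbors_count is None or approximate_neighbors_count <= 0:
            raise ValueError("approximateNeighborsCount is required for tree-AH")
        config["approximateNeighborsCount"] = approximate_neighbors_count
        config["algorithmConfig"] = {"treeAhConfig": {}}  # key name assumed; defaults apply

    return config


tree_ah = build_config(dimensions=128, approximate_neighbors_count=150)
brute_force = build_config(dimensions=128, use_brute_force=True)
print(json.dumps({"config": tree_ah}, indent=2))
```

The values 128 and 150 are arbitrary example numbers, not recommendations.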

`DistanceMeasureType`

Enums
`SQUARED_L2_DISTANCE`	Euclidean (L₂) Distance
`L1_DISTANCE`	Manhattan (L₁) Distance
`DOT_PRODUCT_DISTANCE`	Default value. Defined as the negative of the dot product.
`COSINE_DISTANCE`	Cosine Distance. We strongly suggest using `DOT_PRODUCT_DISTANCE` with `UNIT_L2_NORM` instead of `COSINE_DISTANCE`. Our algorithms are more heavily optimized for `DOT_PRODUCT_DISTANCE`, and when it is combined with `UNIT_L2_NORM`, it produces the same ranking as `COSINE_DISTANCE` and is mathematically equivalent to it.
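
The equivalence noted for `COSINE_DISTANCE` can be checked numerically. The sketch below is plain Python with made-up vectors, and it assumes the common definition of cosine distance as one minus cosine similarity; it does not call Vector Search. It ranks the same candidates once by cosine distance on the raw vectors and once by negative dot product on unit-L2-normalized vectors.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot_product_distance(a, b):
    """Negative of the dot product, so smaller means closer."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """Assumed definition: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank(distance, query, items):
    """Return item ids ordered from nearest to farthest."""
    return sorted(items, key=lambda key: distance(query, items[key]))


# Made-up example data.
query = [1.0, 2.0, 3.0]
candidates = {
    "a": [3.0, 2.0, 1.0],
    "b": [1.0, 2.0, 2.5],
    "c": [-1.0, 0.5, 0.0],
}

by_cosine = rank(cosine_distance, query, candidates)
normalized = {key: l2_normalize(vec) for key, vec in candidates.items()}
by_dot = rank(dot_product_distance, l2_normalize(query), normalized)

print(by_cosine, by_dot)
```

Both orderings match because, for unit vectors, the dot product equals the cosine similarity, so the two distances differ only by a constant offset.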

`ShardSize`

Enums
`SHARD_SIZE_SMALL`	2 GiB per shard
`SHARD_SIZE_MEDIUM`	20 GiB per shard
`SHARD_SIZE_LARGE`	50 GiB per shard

`FeatureNormType`

Enums
`UNIT_L2_NORM`	Unit L2 normalization type.
`NONE`	Default value. No normalization type is specified.

`TreeAhConfig`

Set the following fields to configure the tree-AH algorithm.

Fields
`fractionLeafNodesToSearch`	`double` The default fraction of leaf nodes that may be searched for any query. Must be in the range 0.0 - 1.0, exclusive. The default value is 0.05 if not set.
`leafNodeEmbeddingCount`	`int32` Number of embeddings on each leaf node. The default value is 1000 if not set.
`leafNodesToSearchPercent`	`int32` Deprecated, use `fractionLeafNodesToSearch`. The default percentage of leaf nodes that may be searched for any query. Must be in the range 1-100, inclusive. The default value is 10 (meaning 10%) if not set.
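
The ranges and defaults above can be expressed as a small validation sketch in Python. Both helper names are hypothetical, and reading "exclusive" as excluding both 0.0 and 1.0 is an assumption.

```python
def build_tree_ah_config(fraction_leaf_nodes_to_search: float = 0.05,
                         leaf_node_embedding_count: int = 1000) -> dict:
    """Build a TreeAhConfig dictionary using the documented defaults."""
    # Assumption: the range 0.0 - 1.0 is exclusive at both ends.
    if not 0.0 < fraction_leaf_nodes_to_search < 1.0:
        raise ValueError("fractionLeafNodesToSearch must be between 0.0 and 1.0, exclusive")
    if leaf_node_embedding_count <= 0:
        raise ValueError("leafNodeEmbeddingCount must be a positive integer")
    return {
        "fractionLeafNodesToSearch": fraction_leaf_nodes_to_search,
        "leafNodeEmbeddingCount": leaf_node_embedding_count,
    }


def percent_to_fraction(leaf_nodes_to_search_percent: int) -> float:
    """Convert the deprecated leafNodesToSearchPercent value to a fraction."""
    if not 1 <= leaf_nodes_to_search_percent <= 100:
        raise ValueError("leafNodesToSearchPercent must be between 1 and 100, inclusive")
    # Note: 100 maps to 1.0, which the exclusive fraction range does not admit.
    return leaf_nodes_to_search_percent / 100


defaults = build_tree_ah_config()
migrated = build_tree_ah_config(fraction_leaf_nodes_to_search=percent_to_fraction(10))
print(defaults)
print(migrated)
```

Note that the two fields have different defaults (0.05 versus 10%), so converting the old default explicitly, as in `migrated`, keeps the previous behavior rather than adopting the new default.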

`BruteForceConfig`

This option implements the standard linear search in the database for each query. There are no fields to configure for a brute force search. To select this algorithm, pass an empty object for `BruteForceConfig` to `algorithmConfig`.