Configure indexes

To configure an index for similarity search, set the following fields.

Fields
contentsDeltaUri

string

Allows inserting, updating or deleting the contents of the Matching Engine Index. The string must be a valid Cloud Storage directory path, such as gs://BUCKET_NAME/PATH_TO_INDEX_DIR/.

If you set this field when calling IndexService.UpdateIndex, then no other Index field can be updated as part of the same call. Learn how to structure individual data files.

isCompleteOverwrite

boolean

If this field is set together with contentsDeltaUri when calling IndexService.UpdateIndex, then the existing content of the Index is replaced by the data from contentsDeltaUri.

config

NearestNeighborSearchConfig

The configuration of the Matching Engine Index.
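As a rough illustration of how the two update-related fields fit together, the following Python sketch assembles a metadata dictionary containing only contentsDeltaUri and isCompleteOverwrite. The helper name build_update_metadata is hypothetical, the check covers only the gs:// prefix and a non-empty bucket name, and nothing here calls a Matching Engine API.

```python
def build_update_metadata(contents_delta_uri, is_complete_overwrite=False):
    """Return a dict holding only the content-update fields of an index.

    Hypothetical helper for illustration; it performs a shallow check only.
    """
    prefix = "gs://"
    if not contents_delta_uri.startswith(prefix):
        raise ValueError("contentsDeltaUri must be a Cloud Storage path (gs://...)")
    bucket = contents_delta_uri[len(prefix):].split("/", 1)[0]
    if not bucket:
        raise ValueError("contentsDeltaUri is missing a bucket name")
    # Only these two fields are included, because no other Index field can be
    # updated in the same UpdateIndex call as contentsDeltaUri.
    return {
        "contentsDeltaUri": contents_delta_uri,
        "isCompleteOverwrite": bool(is_complete_overwrite),
    }


# Placeholder path taken from the field description above.
metadata = build_update_metadata("gs://BUCKET_NAME/PATH_TO_INDEX_DIR/", True)
```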

NearestNeighborSearchConfig

Fields
dimensions

int32

Required. The number of dimensions of the input vectors.

approximateNeighborsCount

int32

Required if the tree-AH algorithm is used.

The default number of neighbors to find through approximate search before exact reordering is performed. Exact reordering is a procedure where results returned by an approximate search algorithm are reordered via a more expensive distance computation.
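The two-stage procedure described above can be sketched in plain Python. This is a minimal sketch, not the tree-AH implementation: cheap_distance, which compares only the first coordinate, is a deliberately crude stand-in for an approximate search algorithm, and all function names are hypothetical.

```python
def squared_l2(a, b):
    """Exact (more expensive) distance used for the reordering step."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cheap_distance(a, b):
    """Crude stand-in for an approximate score: first coordinate only."""
    return (a[0] - b[0]) ** 2


def search_with_reordering(query, vectors, k, approximate_neighbors_count):
    """Return indices of the k best vectors after exact reordering."""
    # Stage 1: keep the best candidates according to the approximate score.
    candidates = sorted(
        range(len(vectors)),
        key=lambda i: cheap_distance(query, vectors[i]),
    )[:approximate_neighbors_count]
    # Stage 2: reorder only those candidates with the exact distance.
    reordered = sorted(candidates, key=lambda i: squared_l2(query, vectors[i]))
    return reordered[:k]


vectors = [[0.0, 5.0], [0.1, 0.0], [0.2, 0.1], [3.0, 0.0]]
query = [0.0, 0.0]
wide = search_with_reordering(query, vectors, k=1, approximate_neighbors_count=3)
narrow = search_with_reordering(query, vectors, k=1, approximate_neighbors_count=1)
```

In this toy data the approximate score ranks vector 0 first even though it is far from the query. With three candidates the exact reordering recovers vector 1 as the nearest neighbor; with only one candidate it cannot, which is the trade-off that approximateNeighborsCount controls.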

distanceMeasureType

DistanceMeasureType

The distance measure used in nearest neighbor search.

featureNormType

FeatureNormType

Type of normalization to be carried out on each vector.

algorithmConfig oneOf:

The configuration for the algorithms that Matching Engine uses for efficient search.

  • TreeAhConfig: Configuration options for using the tree-AH algorithm (Shallow tree + Asymmetric Hashing). Refer to this paper for more details: https://arxiv.org/abs/1908.10396
  • BruteForceConfig: This option implements the standard linear search in the database for each query. There are no fields to configure for a brute force search. To select this algorithm, pass an empty object for BruteForceConfig.
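The oneOf choice can be sketched as a small Python helper that emits exactly one algorithm option. The helper name build_search_config is hypothetical, and the JSON key names treeAhConfig and bruteForceConfig are assumed from the type names above rather than confirmed by this page.

```python
import json


def build_search_config(dimensions, approximate_neighbors_count=None,
                        tree_ah_config=None, brute_force=False):
    """Build a NearestNeighborSearchConfig-like dict with one algorithm set."""
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    if brute_force and tree_ah_config is not None:
        raise ValueError("algorithmConfig is a oneOf: choose a single algorithm")

    config = {"dimensions": dimensions}
    if brute_force:
        # Brute force has no fields, so an empty object selects it.
        config["algorithmConfig"] = {"bruteForceConfig": {}}
    else:
        # approximateNeighborsCount is required when tree-AH is used.
        if approximate_neighbors_count is None:
            raise ValueError("approximateNeighborsCount is required for tree-AH")
        config["approximateNeighborsCount"] = approximate_neighbors_count
        config["algorithmConfig"] = {"treeAhConfig": dict(tree_ah_config or {})}
    return config


brute_force_config = build_search_config(128, brute_force=True)
tree_ah_config = build_search_config(128, approximate_neighbors_count=150)
tree_ah_json = json.dumps(tree_ah_config, indent=2)
```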

DistanceMeasureType

Enums
SQUARED_L2_DISTANCE Squared Euclidean (L2) Distance
L1_DISTANCE Manhattan (L1) Distance
COSINE_DISTANCE Cosine Distance. Defined as 1 - cosine similarity.
DOT_PRODUCT_DISTANCE Default value. Defined as the negative of the dot product.
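For reference, the four measures can be written out in plain Python. This is a sketch of the mathematical definitions only; the function names are hypothetical, SQUARED_L2_DISTANCE is assumed to be the squared Euclidean distance as its name suggests, and cosine_distance assumes both vectors are non-zero.

```python
import math


def squared_l2_distance(a, b):
    """Sum of squared coordinate differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def l1_distance(a, b):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def dot_product_distance(a, b):
    """Negative of the dot product, so a larger dot product means closer."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


a, b = [1.0, 2.0], [3.0, 4.0]
distances = {
    "SQUARED_L2_DISTANCE": squared_l2_distance(a, b),
    "L1_DISTANCE": l1_distance(a, b),
    "COSINE_DISTANCE": cosine_distance(a, b),
    "DOT_PRODUCT_DISTANCE": dot_product_distance(a, b),
}
```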

FeatureNormType

Enums
UNIT_L2_NORM Unit L2 normalization type.
NONE Default value. No normalization type is specified.
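Unit L2 normalization rescales a vector so that its Euclidean length is 1. A minimal Python sketch follows; the function name is hypothetical, and returning a zero vector unchanged is a choice made for this sketch, not documented service behavior.

```python
import math


def unit_l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # Assumption for this sketch: leave an all-zero vector as it is.
        return list(vector)
    return [x / norm for x in vector]


normalized = unit_l2_normalize([3.0, 4.0])
```

After this normalization the dot product of two vectors equals their cosine similarity, which is why normalization and the choice of distance measure are usually considered together.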

TreeAhConfig

These are the fields to configure for the tree-AH algorithm (Shallow tree + Asymmetric Hashing).

Fields
leafNodeEmbeddingCount int32
Number of embeddings on each leaf node. The default value is 1000 if not set.
leafNodesToSearchPercent int32
The default percentage of leaf nodes searched for any query. Must be in the range 1-100, inclusive. The default value is 10 (meaning 10%) if not set.
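To get a feel for how the two fields interact, the following back-of-the-envelope Python sketch estimates how many leaf nodes a query would visit. It assumes that the number of leaf nodes is roughly the total embedding count divided by leafNodeEmbeddingCount, which is a simplification rather than documented behavior; the function name is hypothetical.

```python
def estimate_leaf_nodes_searched(total_embeddings,
                                 leaf_node_embedding_count=1000,
                                 leaf_nodes_to_search_percent=10):
    """Return (estimated leaf nodes, estimated leaf nodes searched per query)."""
    if leaf_node_embedding_count <= 0:
        raise ValueError("leafNodeEmbeddingCount must be positive")
    if not 1 <= leaf_nodes_to_search_percent <= 100:
        raise ValueError("leafNodesToSearchPercent must be in the range 1-100")
    # Ceiling division: assumed approximation of the number of leaf nodes.
    leaf_nodes = -(-total_embeddings // leaf_node_embedding_count)
    # Ceiling of the requested percentage of those leaf nodes.
    searched = -(-leaf_nodes * leaf_nodes_to_search_percent // 100)
    return leaf_nodes, searched


# One million embeddings with the default values of both fields.
leaf_nodes, searched = estimate_leaf_nodes_searched(1_000_000)
```

Under this assumption, one million embeddings with the defaults give about 1000 leaf nodes, of which about 100 are searched per query. Raising leafNodesToSearchPercent searches more leaf nodes per query.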

BruteForceConfig

This option implements the standard linear search in the database for each query. There are no fields to configure for a brute force search. To select this algorithm, pass an empty object for BruteForceConfig to algorithmConfig.
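A standard linear search simply scores every stored vector against the query and keeps the best k. The Python sketch below illustrates the idea with the default DOT_PRODUCT_DISTANCE measure; the function names are hypothetical and this is not the service's implementation.

```python
import heapq


def dot_product_distance(a, b):
    """Negative dot product, matching the DOT_PRODUCT_DISTANCE definition."""
    return -sum(x * y for x, y in zip(a, b))


def brute_force_search(query, vectors, k):
    """Return indices of the k nearest vectors by scanning all of them."""
    return heapq.nsmallest(
        k,
        range(len(vectors)),
        key=lambda i: dot_product_distance(query, vectors[i]),
    )


vectors = [[1.0, 0.0], [0.0, 3.0], [2.0, 2.0]]
nearest = brute_force_search([1.0, 1.0], vectors, k=2)
```

Every query touches every vector, so the result is exact but the cost grows linearly with the size of the index.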