After you've created the index, you can run queries to get its nearest neighbors. Each DeployedIndex has a DEPLOYED_INDEX_SERVER_IP, which you can retrieve by listing IndexEndpoints. To query a DeployedIndex, connect to its DEPLOYED_INDEX_SERVER_IP at port 10000 and call the Match or BatchMatch method.
The following examples use the open source tool grpc_cli to send gRPC requests to the deployed index server.
In the first example, you send a single query using the Match method:
./grpc_cli call ${DEPLOYED_INDEX_SERVER_IP}:10000 google.cloud.aiplatform.container.v1.MatchService.Match '{deployed_index_id: "${DEPLOYED_INDEX_ID}", float_val: [-0.1,..]}'
In the second example, you combine two separate queries into the same BatchMatch request:
./grpc_cli call ${DEPLOYED_INDEX_SERVER_IP}:10000 google.cloud.aiplatform.container.v1.MatchService.BatchMatch 'requests: [{deployed_index_id: "${DEPLOYED_INDEX_ID}", requests: [{deployed_index_id: "${DEPLOYED_INDEX_ID}", float_val: [-0.1,..]}, {deployed_index_id: "${DEPLOYED_INDEX_ID}", float_val: [-0.2,..]}]}]'
You must make calls to these APIs from a client running in the same VPC that the service was peered with.
To run these queries, you can also use the Python Cloud Client Library for Vertex AI. To learn more, see Client libraries explained.
Tuning the index
Tuning the index requires setting the configuration parameters that impact the performance of deployed indexes, especially recall and latency. These parameters are set when you first create the index. You can use brute force indexes to measure recall.
Configuration parameters that impact recall and latency
distanceMeasureType
The following values are supported:
SQUARED_L2_DISTANCE: Euclidean L2 distance
L1_DISTANCE: Manhattan L1 distance
COSINE_DISTANCE: Cosine distance, defined as '1 - cosine similarity'
DOT_PRODUCT_DISTANCE: Dot product distance, defined as the negative of the dot product. This is the default value.
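As a rough illustration of the four measures listed above, the following sketch computes each one for a pair of small vectors in plain Python. The function names are hypothetical and are not part of any Vertex AI API. The sketch also assumes, as the name suggests, that SQUARED_L2_DISTANCE omits the square root.

```python
import math

def squared_l2_distance(a, b):
    # Sum of squared coordinate differences (assumed: no square root).
    return sum((x - y) ** 2 for x, y in zip(a, b))

def l1_distance(a, b):
    # Manhattan distance: sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    # 1 - cosine similarity, as defined in the list above.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def dot_product_distance(a, b):
    # Negative of the dot product, so a larger dot product means "closer".
    return -sum(x * y for x, y in zip(a, b))

query = [1.0, 2.0]
candidate = [3.0, 4.0]
print(squared_l2_distance(query, candidate))   # 8.0
print(l1_distance(query, candidate))           # 4.0
print(cosine_distance(query, candidate))
print(dot_product_distance(query, candidate))  # -11.0
```

In every case a smaller value means a closer match, which is why the dot product is negated.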
In most cases, the embedding vectors used for similarity matching are computed by using metric learning models (also called Siamese networks or two-tower models). These models use a distance metric to compute the contrastive loss function. Ideally, the value of the distanceMeasureType parameter for the matching index matches the distance measure used by the model that produced the embedding vectors.
approximateNeighborsCount
The default number of neighbors to find by using approximate search before exact reordering is performed. Exact reordering is a procedure where results returned by an approximate search algorithm are reordered by a more expensive distance computation. Increasing this value increases recall, which can create a proportionate increase in latency.
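The two-stage idea described above (approximate search followed by exact reordering) can be sketched with a toy example. This is not how the service is implemented: the coarse() rounding below is only a stand-in for a cheaper, lossy distance computation, and every function name is hypothetical.

```python
import random

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def coarse(vector):
    # Stand-in for a lossy, cheap representation: round to one decimal place.
    return [round(x, 1) for x in vector]

def search(query, vectors, k, approximate_neighbors_count):
    # Stage 1: rank all vectors by the approximate distance and keep a shortlist.
    shortlist = sorted(
        range(len(vectors)),
        key=lambda i: squared_l2(query, coarse(vectors[i])),
    )[:approximate_neighbors_count]
    # Stage 2: exact reordering of the shortlist with the full-precision distance.
    shortlist.sort(key=lambda i: squared_l2(query, vectors[i]))
    return shortlist[:k]

rng = random.Random(0)
vectors = [[rng.random() for _ in range(8)] for _ in range(500)]
query = [rng.random() for _ in range(8)]

# Exact top 10, used as the reference for recall.
exact = sorted(range(len(vectors)), key=lambda i: squared_l2(query, vectors[i]))[:10]

for count in (10, 50, 500):
    found = search(query, vectors, 10, count)
    print(count, len(set(found) & set(exact)) / 10)
```

A larger shortlist gives the exact stage more chances to recover true neighbors that the approximate stage ranked poorly, at the cost of more exact distance computations per query.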
treeAhConfig.leafNodesToSearchPercent
The percentage of leaves to be searched for each query. Increasing this value increases recall, which can also create a proportionate increase in latency. The default value is 10, or 10% of the leaves.
treeAhConfig.leafNodeEmbeddingCount
The number of embeddings for each leaf node. By default, this number is set to 1000.
This parameter does not have a linear correlation to recall. Increasing or decreasing the value of the treeAhConfig.leafNodeEmbeddingCount parameter doesn't always increase or decrease recall. Experiment to find the optimal value. Changing the value of the treeAhConfig.leafNodeEmbeddingCount parameter generally has less effect than changing the value of the other parameters.
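To build intuition for the two treeAhConfig parameters, here is a toy partitioned search in plain Python. It is only a sketch under strong assumptions: leaves are formed by assigning each vector to the nearest of some randomly chosen centers, which is not the service's actual algorithm, and all names are hypothetical. In this toy, leaf_node_embedding_count only sets the average leaf size.

```python
import random

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_leaves(vectors, leaf_node_embedding_count, rng):
    # Choose enough leaves that each holds about leaf_node_embedding_count vectors.
    leaf_count = max(1, len(vectors) // leaf_node_embedding_count)
    centers = rng.sample(vectors, leaf_count)
    leaves = [[] for _ in range(leaf_count)]
    for idx, vector in enumerate(vectors):
        nearest = min(range(leaf_count), key=lambda c: squared_l2(vector, centers[c]))
        leaves[nearest].append(idx)
    return centers, leaves

def search(query, vectors, centers, leaves, k, leaf_nodes_to_search_percent):
    # Search only the given percentage of leaves, closest centers first.
    leaves_to_search = max(1, round(len(leaves) * leaf_nodes_to_search_percent / 100))
    ranked = sorted(range(len(leaves)), key=lambda c: squared_l2(query, centers[c]))
    candidates = [i for c in ranked[:leaves_to_search] for i in leaves[c]]
    candidates.sort(key=lambda i: squared_l2(query, vectors[i]))
    # Return the top k plus the number of vectors scanned (a proxy for latency).
    return candidates[:k], len(candidates)

rng = random.Random(0)
vectors = [[rng.random() for _ in range(8)] for _ in range(1000)]
query = [rng.random() for _ in range(8)]
centers, leaves = build_leaves(vectors, leaf_node_embedding_count=50, rng=rng)
exact = set(sorted(range(len(vectors)), key=lambda i: squared_l2(query, vectors[i]))[:10])

for percent in (10, 50, 100):
    found, scanned = search(query, vectors, centers, leaves, 10, percent)
    print(percent, scanned, len(set(found) & exact) / 10)
```

Searching more leaves scans more vectors, so recall can only stay the same or improve while the work per query grows. Changing the leaf size moves the partition boundaries, and its effect on recall is harder to predict.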
Using a brute force index to measure recall
To get the exact nearest neighbors, use indexes with the brute force algorithm. The brute force algorithm provides 100% recall at the expense of higher latency. A brute force index is usually not a good choice for production serving, but you might find it useful for evaluating the recall of various indexing options offline.
To create an index with the brute force algorithm, specify brute_force_config in the index metadata:
curl -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer `gcloud auth print-access-token`" \
  https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/indexes \
  -d '{
    "displayName": "'${DISPLAY_NAME}'",
    "description": "'${DESCRIPTION}'",
    "metadata": {
      "contentsDeltaUri": "'${INPUT_DIR}'",
      "config": {
        "dimensions": 100,
        "approximateNeighborsCount": 150,
        "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
        "featureNormType": "UNIT_L2_NORM",
        "algorithmConfig": {
          "bruteForceConfig": {}
        }
      }
    }
  }'
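Once the same queries have been run against both a brute force index and a tuned index, recall can be estimated offline by comparing the returned neighbor IDs. The following is a minimal sketch. It assumes you have already collected the neighbor IDs for each query into lists, and recall_at_k is a hypothetical helper, not part of the client library.

```python
def recall_at_k(approximate_ids, exact_ids, k):
    # approximate_ids[i] and exact_ids[i] are the neighbor IDs returned for
    # query i by the tuned index and by the brute force index, respectively.
    hits = 0
    for approx, exact in zip(approximate_ids, exact_ids):
        hits += len(set(approx[:k]) & set(exact[:k]))
    return hits / (k * len(exact_ids))

# Made-up IDs for two queries, for illustration only.
approximate = [["a", "b", "c"], ["d", "e", "f"]]
exact = [["a", "c", "x"], ["d", "e", "f"]]
print(recall_at_k(approximate, exact, k=3))  # 5 of 6 true neighbors were found
```

Averaging over many queries gives a more stable estimate than inspecting any single query.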