The following sections describe how to configure, create, list, and delete your indexes. For more information, see the API index documentation.
Configure index parameters
Before you create an index, you need to configure the parameters for your index.
For example, create a file named index_metadata.json:
{
  "contentsDeltaUri": "gs://BUCKET_NAME/path",
  "config": {
    "dimensions": 100,
    "approximateNeighborsCount": 150,
    "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
    "shardSize": "SHARD_SIZE_MEDIUM",
    "algorithmConfig": {
      "treeAhConfig": {
        "leafNodeEmbeddingCount": 5000,
        "leafNodesToSearchPercent": 3
      }
    }
  }
}
You can find the definition for each of these fields in Configuring indexes, or view the definitions within the following schema:
title: NearestNeighborSearch
type: object
properties:
  contentsDeltaUri:
    type: string
    description: >
      Allows inserting, updating or deleting the contents of the index.
      The string must be a valid Cloud Storage directory path. If this
      field is set when calling IndexService.UpdateIndex, then no other
      Index field can be also updated as part of the same call.
      The expected structure and format of the files this URI points to is
      described at
      cloud.google.com/vertex-ai/docs/vector-search/setup/format-structure
    writeOnly: true
  isCompleteOverwrite:
    type: boolean
    description: >
      If this field is set together with contentsDeltaUri when calling IndexService.UpdateIndex,
      then existing content of the Index will be replaced by the data from the contentsDeltaUri.
    default: false
  config:
    type: object
    description: >
      The configuration of the index.
    required:
    - dimensions
    - algorithmConfig
    properties:
      dimensions:
        type: integer
        format: int32
        description: >
          The number of dimensions of the input vectors.
      approximateNeighborsCount:
        type: integer
        format: int32
        description: >
          The default number of neighbors to find via approximate search before exact reordering is
          performed. Exact reordering is a procedure where results returned by an
          approximate search algorithm are reordered via a more expensive distance computation.
          Required if tree-AH algorithm is used.
      distanceMeasureType:
        description: >
          The distance measure used in nearest neighbor search.
        oneOf:
        - enum: [SQUARED_L2_DISTANCE]
          description: >
            Euclidean (L_2) Distance
        - enum: [L1_DISTANCE]
          description: >
            Manhattan (L_1) Distance
        - enum: [COSINE_DISTANCE]
          description: >
            Cosine Distance. Defined as 1 - cosine similarity.
        - enum: [DOT_PRODUCT_DISTANCE]
          description: >
            Dot Product Distance. Defined as a negative of the dot product
        default: DOT_PRODUCT_DISTANCE
      featureNormType:
        description: >
          Type of normalization to be carried out on each vector.
        oneOf:
        - enum: [UNIT_L2_NORM]
          description: >
            Unit L2 normalization type.
        - enum: [NONE]
          description: >
            No normalization type is specified.
        default: NONE
      algorithmConfig:
        description: >
          The configuration with regard to the algorithms used for efficient search.
        oneOf:
        - type: object
          description: >
            Configuration options for using the tree-AH algorithm (Shallow tree + Asymmetric Hashing).
            Refer to this paper for more details: https://arxiv.org/abs/1908.10396
          properties:
            type:
              type: string
              enum: [treeAhConfig]
            leafNodeEmbeddingCount:
              type: integer
              format: int64
              description: >
                Number of embeddings on each leaf node. The default value is 1000 if not set.
            leafNodesToSearchPercent:
              type: number
              format: int32
              description: >
                The default percentage of leaf nodes that any query may be searched. Must be in
                range 1-100, inclusive. The default value is 10 (means 10%) if not set.
        - type: object
          description: >
            Configuration options for using brute force search, which simply implements the
            standard linear search in the database for each query.
          properties:
            type:
              type: string
              enum: [bruteForceConfig]
        discriminator:
          propertyName: type
This metadata schema file is available to download from Cloud Storage.
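Some of the constraints stated in the schema can be checked locally before you create an index. The following is a minimal Python sketch, not part of any Google client library: the helper name validate_index_metadata is hypothetical, and it covers only the rules spelled out above (required fields, the distance measure enum, and the 1-100 range for leafNodesToSearchPercent).

```python
import json

# Distance measures named in the schema above.
DISTANCE_MEASURES = {
    "SQUARED_L2_DISTANCE",
    "L1_DISTANCE",
    "COSINE_DISTANCE",
    "DOT_PRODUCT_DISTANCE",
}


def validate_index_metadata(metadata):
    """Return a list of problems found in an index metadata dict.

    Hypothetical helper: it checks only the rules stated in the schema text.
    """
    problems = []
    config = metadata.get("config", {})

    # The schema marks dimensions and algorithmConfig as required.
    for field in ("dimensions", "algorithmConfig"):
        if field not in config:
            problems.append(f"config.{field} is required")

    # DOT_PRODUCT_DISTANCE is the documented default.
    measure = config.get("distanceMeasureType", "DOT_PRODUCT_DISTANCE")
    if measure not in DISTANCE_MEASURES:
        problems.append(f"unknown distanceMeasureType: {measure}")

    tree_ah = config.get("algorithmConfig", {}).get("treeAhConfig")
    if tree_ah is not None:
        # approximateNeighborsCount is required when tree-AH is used.
        if "approximateNeighborsCount" not in config:
            problems.append("config.approximateNeighborsCount is required for tree-AH")
        # The documented default is 10, and the allowed range is 1-100.
        percent = tree_ah.get("leafNodesToSearchPercent", 10)
        if not 1 <= percent <= 100:
            problems.append("leafNodesToSearchPercent must be in the range 1-100")
    return problems


metadata = {
    "contentsDeltaUri": "gs://BUCKET_NAME/path",
    "config": {
        "dimensions": 100,
        "approximateNeighborsCount": 150,
        "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
        "algorithmConfig": {
            "treeAhConfig": {
                "leafNodeEmbeddingCount": 5000,
                "leafNodesToSearchPercent": 3,
            }
        },
    },
}

print(validate_index_metadata(metadata))
print(json.dumps(metadata, indent=2))
```

A check like this catches simple mistakes early, but the service remains the authority on what it accepts.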
Create an index
Index data is split into equal parts to be processed. These are called "shards". When you create an index you must specify the shard size. Once you create the index, you can determine what machine type to use when you deploy your index. To learn more about the types of shard sizes available, and their corresponding prices, see the pricing page.
Shard Size | Default Machine type |
---|---|
Small (2GB) | e2-standard-2 |
Medium (20GB) | e2-standard-16 |
Large (50GB) | e2-highmem-16 |
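If you script deployments, the table above can be kept as a small lookup. This Python sketch is illustrative only: SHARD_SIZE_MEDIUM appears in the example metadata, while the names SHARD_SIZE_SMALL and SHARD_SIZE_LARGE are assumed here by analogy, and default_machine_type is a hypothetical helper.

```python
# Default machine type per shard size, copied from the table above.
# SHARD_SIZE_SMALL and SHARD_SIZE_LARGE are assumed names (see lead-in).
DEFAULT_MACHINE_TYPES = {
    "SHARD_SIZE_SMALL": "e2-standard-2",
    "SHARD_SIZE_MEDIUM": "e2-standard-16",
    "SHARD_SIZE_LARGE": "e2-highmem-16",
}


def default_machine_type(shard_size):
    """Look up the default machine type for a shard size value."""
    try:
        return DEFAULT_MACHINE_TYPES[shard_size]
    except KeyError:
        raise ValueError(f"unknown shard size: {shard_size!r}") from None


print(default_machine_type("SHARD_SIZE_MEDIUM"))
```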
Create an index for Batch Update
To create an index:
gcloud
gcloud ai indexes create \
--metadata-file=LOCAL_PATH_TO_METADATA_FILE \
--display-name=INDEX_NAME \
--project=PROJECT_ID \
--region=LOCATION
Replace the following:
- LOCAL_PATH_TO_METADATA_FILE: The local path to the metadata file.
- INDEX_NAME: Display name for the index.
- PROJECT_ID: The ID of the project.
- LOCATION: The region where you are using Vertex AI.
REST
Before using any of the request data, make the following replacements:
- LOCATION: Your region.
- PROJECT: Your project ID.
- INDEX_NAME: Display name for the index.
- INPUT_DIR: The Cloud Storage directory path of the index content.
- PROJECT_NUMBER: Project number for your project
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/indexes
Request JSON body:
{
  "displayName": "INDEX_NAME",
  "metadata": {
    "contentsDeltaUri": "INPUT_DIR",
    "config": {
      "dimensions": 100,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "algorithmConfig": {
        "treeAhConfig": {
          "leafNodeEmbeddingCount": 500,
          "leafNodesToSearchPercent": 7
        }
      }
    }
  }
}
You should receive a JSON response similar to the following:
{ "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID/operations/OPERATION_ID", "metadata": { "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateIndexOperationMetadata", "genericMetadata": { "createTime": "2022-01-08T01:21:10.147035Z", "updateTime": "2022-01-08T01:21:10.147035Z" } } }
Creating an index is a long-running operation. Poll the operation status until the response includes "done": true. Use the following example command to poll the status:
gcloud beta ai operations describe <operation-id> <index-id> --project=<project-id> --region=us-west1
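The polling step can also be scripted. The following Python sketch shows one way to wait until an operation reports "done": true. It is an illustration under assumptions: wait_for_operation and its fetch_operation callable are hypothetical names, and how you fetch the operation (REST call, gcloud, or client library) is left to you. The error check relies on the standard long-running operation shape, where a failed operation carries an error field.

```python
import time


def wait_for_operation(fetch_operation, interval_seconds=10.0,
                       timeout_seconds=3600.0, sleep=time.sleep):
    """Call fetch_operation() until the returned dict contains "done": true.

    fetch_operation is a caller-supplied function (hypothetical) that returns
    the operation resource as a dict, for example from a REST GET.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        operation = fetch_operation()
        if operation.get("done"):
            # A finished operation holds either "response" or "error".
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not finish before the timeout")
        sleep(interval_seconds)


# Usage with canned responses instead of real API calls.
responses = iter([
    {"name": "OPERATION_NAME"},
    {"name": "OPERATION_NAME", "done": True, "response": {}},
])
result = wait_for_operation(lambda: next(responses), sleep=lambda seconds: None)
print(result["done"])
```

Passing sleep as a parameter keeps the loop easy to test without waiting.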
Create an index for Streaming Updates
Creating an index for Streaming Updates requires steps similar to those for a Batch Update index, except that you need to set indexUpdateMethod to STREAM_UPDATE.
# ENDPOINT, PROJECT_ID, and REGION must also be set in your shell.
INPUT_GCS_DIR=
DIMENSIONS=
DISPLAY_NAME=
curl -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://${ENDPOINT}/v1/projects/${PROJECT_ID}/locations/${REGION}/indexes" \
  -d '{
    "displayName": "'"${DISPLAY_NAME}"'",
    "description": "'"${DISPLAY_NAME}"'",
    "metadata": {
      "contentsDeltaUri": "'"${INPUT_GCS_DIR}"'",
      "config": {
        "dimensions": '"${DIMENSIONS}"',
        "approximateNeighborsCount": 150,
        "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
        "algorithmConfig": {"treeAhConfig": {"leafNodeEmbeddingCount": 10000, "leafNodesToSearchPercent": 2}}
      }
    },
    "indexUpdateMethod": "STREAM_UPDATE"
  }'
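The same request can be assembled in Python, which avoids shell quoting issues. This is a sketch using only the standard library. The function name build_create_index_request is hypothetical, the project ID and access token below are placeholders, and the request is built but not sent.

```python
import json
import urllib.request


def build_create_index_request(endpoint, project_id, region, display_name,
                               input_gcs_dir, dimensions, access_token):
    """Build (but do not send) a create-index request for Streaming Updates."""
    body = {
        "displayName": display_name,
        "description": display_name,
        "metadata": {
            "contentsDeltaUri": input_gcs_dir,
            "config": {
                "dimensions": int(dimensions),
                "approximateNeighborsCount": 150,
                "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
                "algorithmConfig": {
                    "treeAhConfig": {
                        "leafNodeEmbeddingCount": 10000,
                        "leafNodesToSearchPercent": 2,
                    }
                },
            },
        },
        # This field selects Streaming Updates instead of Batch Update.
        "indexUpdateMethod": "STREAM_UPDATE",
    }
    url = f"https://{endpoint}/v1/projects/{project_id}/locations/{region}/indexes"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


# Placeholder values for illustration only.
request = build_create_index_request(
    endpoint="us-central1-aiplatform.googleapis.com",
    project_id="my-project",
    region="us-central1",
    display_name="my-streaming-index",
    input_gcs_dir="gs://BUCKET_NAME/path",
    dimensions=100,
    access_token="ACCESS_TOKEN",
)
print(request.get_method(), request.full_url)
```

To send the request, you could pass it to urllib.request.urlopen with a real access token.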
After you've created the index, you can verify the update method using the following command:
curl -H "Authorization: Bearer `gcloud auth print-access-token`" -H "Content-Type: application/json" https://${ENDPOINT}/v1/projects/${PROJECT_ID}/locations/${REGION}/indexes/${INDEX_ID}
{
"name": "projects/${PROJECT_NUMBER}/locations/${REGION}/indexes/${INDEX_ID}",
"displayName": "...",
"description": "...",
"metadataSchemaUri": "gs://google-cloud-aiplatform/schema/matchingengine/metadata/nearest_neighbor_search_1.0.0.yaml",
"metadata": {
"config": {
"dimensions": 100,
"approximateNeighborsCount": 150,
"distanceMeasureType": "DOT_PRODUCT_DISTANCE",
"algorithmConfig": {
"treeAhConfig": {
"leafNodeEmbeddingCount": "10000",
"leafNodesToSearchPercent": 2
}
}
}
},
"etag": "...",
"createTime": "2022-03-16T04:57:29.344329Z",
"updateTime": "2022-03-16T22:20:37.406393Z",
"indexUpdateMethod": "STREAM_UPDATE"
}
List indexes
gcloud
Use the gcloud ai indexes list command:
gcloud ai indexes list \
--project=PROJECT_ID \
--region=LOCATION
Replace the following:
- PROJECT_ID: The ID of the project.
- LOCATION: The region where you are using Vertex AI.
REST
Before using any of the request data, make the following replacements:
- LOCATION: Your region.
- PROJECT: Your project ID.
- INDEX_NAME: Display name for the index.
- PROJECT_NUMBER: Project number for your project
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/indexes
You should receive a JSON response similar to the following:
{ "indexes": [ { "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID", "displayName": "INDEX_NAME", "metadataSchemaUri": "gs://google-cloud-aiplatform/schema/matchingengine/metadata/nearest_neighbor_search_1.0.0.yaml", "metadata": { "config": { "dimensions": 100, "approximateNeighborsCount": 150, "distanceMeasureType": "DOT_PRODUCT_DISTANCE", "featureNormType": "NONE", "algorithmConfig": { "treeAhConfig": { "maxLeavesToSearch": 50, "leafNodeCount": 10000 } } } }, "etag": "AMEw9yNU8YX5IvwuINeBkVv3yNa7VGKk11GBQ8GkfRoVvO7LgRUeOo0qobYWuU9DiEc=", "createTime": "2020-11-08T21:56:30.558449Z", "updateTime": "2020-11-08T22:39:25.048623Z" } ] }
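A list response like the one above can be reduced to the index IDs you need for later commands. The following Python sketch is illustrative: index_ids_by_display_name is a hypothetical helper, the sample values are made up, and it assumes the index ID is the last segment of the name field and that display names may repeat.

```python
import json


def index_ids_by_display_name(list_response):
    """Map each display name to the index IDs found in a list response."""
    result = {}
    for index in list_response.get("indexes", []):
        # The resource name has the form projects/.../locations/.../indexes/INDEX_ID.
        index_id = index["name"].rsplit("/", 1)[-1]
        result.setdefault(index.get("displayName", ""), []).append(index_id)
    return result


# Made-up sample shaped like the response above.
sample = json.loads("""
{
  "indexes": [
    {"name": "projects/123/locations/us-central1/indexes/456", "displayName": "my-index"},
    {"name": "projects/123/locations/us-central1/indexes/789", "displayName": "my-index"}
  ]
}
""")
print(index_ids_by_display_name(sample))
```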
Tuning the index
Tuning the index requires setting the configuration parameters that impact the performance of deployed indexes, especially recall and latency. These parameters are set when you first create the index. You can use brute-force indexes to measure recall.
Configuration parameters that impact recall and latency
distanceMeasureType
The following values are supported:
- SQUARED_L2_DISTANCE: Euclidean L2 distance
- L1_DISTANCE: Manhattan L1 distance
- COSINE_DISTANCE: Cosine distance, defined as 1 - cosine similarity
- DOT_PRODUCT_DISTANCE: Dot product distance, defined as the negative of the dot product. This is the default value.
In most cases, the embedding vectors used for similarity matching are computed by using metric learning models (also called Siamese networks or two-tower models). These models use a distance metric to compute the contrastive loss function. Ideally, the value of the distanceMeasureType parameter for the matching index matches the distance measure used by the model that produced the embedding vectors.
approximateNeighborsCount
The default number of neighbors to find by using approximate search before exact reordering is performed. Exact reordering is a procedure where results returned by an approximate search algorithm are reordered by a more expensive distance computation. Increasing this value increases recall, but can also increase latency proportionately.
treeAhConfig.leafNodesToSearchPercent
The percentage of leaves to be searched for each query. Increasing this value increases recall, but can also increase latency proportionately. The default value is 10, or 10% of the leaves.
treeAhConfig.leafNodeEmbeddingCount
The number of embeddings for each leaf node. By default, this number is set to 1000.
This parameter does not have a linear correlation to recall. Increasing or decreasing the value of the treeAhConfig.leafNodeEmbeddingCount parameter doesn't always increase or decrease recall. Experiment to find the optimal value. Changing the value of the treeAhConfig.leafNodeEmbeddingCount parameter generally has less effect than changing the value of the other parameters.
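To make the distanceMeasureType definitions concrete, the following Python sketch computes each measure for a pair of small vectors. It follows the definitions given above and is not the service's implementation. The function names are illustrative, and SQUARED_L2_DISTANCE is assumed to be the squared form that its name implies.

```python
import math


def squared_l2_distance(a, b):
    """Sum of squared coordinate differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def l1_distance(a, b):
    """Sum of absolute coordinate differences (Manhattan distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; both vectors must be non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def dot_product_distance(a, b):
    """Negative of the dot product, so a larger dot product means a smaller distance."""
    return -sum(x * y for x, y in zip(a, b))


a = [1.0, 2.0]
b = [3.0, 4.0]
print(squared_l2_distance(a, b))   # 8.0
print(l1_distance(a, b))           # 4.0
print(dot_product_distance(a, b))  # -11.0
print(cosine_distance(a, b))
```

With every measure, a smaller value means the vectors are closer, which is why the dot product is negated.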
Using a brute-force index to measure recall
To get the exact nearest neighbors, use indexes with the brute-force algorithm. The brute-force algorithm provides 100% recall at the expense of higher latency. A brute-force index is usually not a good choice for production serving, but you might find it useful for evaluating the recall of various indexing options offline.
To create an index with the brute-force algorithm, specify bruteForceConfig in the index metadata:
curl -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/indexes" \
  -d '{
    "displayName": "'"${DISPLAY_NAME}"'",
    "description": "'"${DESCRIPTION}"'",
    "metadata": {
      "contentsDeltaUri": "'"${INPUT_DIR}"'",
      "config": {
        "dimensions": 100,
        "approximateNeighborsCount": 150,
        "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
        "featureNormType": "UNIT_L2_NORM",
        "algorithmConfig": {
          "bruteForceConfig": {}
        }
      }
    }
  }'
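Once you have results from both a brute-force index and an approximate index for the same query, recall can be computed offline. The following Python sketch assumes you have already collected the neighbor IDs from each index. The function recall_at_k and the sample IDs are illustrative and not part of any API.

```python
def recall_at_k(approximate_ids, exact_ids, k):
    """Fraction of the exact top-k neighbors that the approximate search returned.

    exact_ids comes from a brute-force index (the ground truth);
    approximate_ids comes from the index being evaluated.
    """
    exact_top_k = set(exact_ids[:k])
    if not exact_top_k:
        raise ValueError("exact_ids must contain at least one neighbor")
    hits = len(exact_top_k.intersection(approximate_ids[:k]))
    return hits / len(exact_top_k)


# Made-up neighbor IDs for a single query.
exact = ["a", "b", "c", "d"]
approximate = ["a", "c", "x", "b"]
print(recall_at_k(approximate, exact, k=4))  # 0.75
```

Averaging this value over a representative set of queries gives a recall figure you can compare across index configurations.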
Delete an index
You can't delete the Index until all its Index.deployed_indexes have been undeployed.
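Before you issue a delete, you can check whether an index still has deployments. This Python sketch inspects an index resource that has already been fetched as JSON. It assumes the REST representation exposes Index.deployed_indexes as a deployedIndexes list whose entries carry indexEndpoint and deployedIndexId fields. Treat those JSON field names, the helper blocking_deployments, and the sample values as assumptions to verify against the API reference.

```python
def blocking_deployments(index_resource):
    """Return (indexEndpoint, deployedIndexId) pairs that block deletion.

    An empty list means the index has no recorded deployments.
    The field names used here are assumptions (see the lead-in).
    """
    return [
        (ref.get("indexEndpoint", ""), ref.get("deployedIndexId", ""))
        for ref in index_resource.get("deployedIndexes", [])
    ]


# Made-up index resource with one remaining deployment.
index_resource = {
    "name": "projects/123/locations/us-central1/indexes/456",
    "deployedIndexes": [
        {
            "indexEndpoint": "projects/123/locations/us-central1/indexEndpoints/789",
            "deployedIndexId": "my_deployed_index",
        }
    ],
}

blockers = blocking_deployments(index_resource)
if blockers:
    print("Undeploy these first:", blockers)
else:
    print("Index can be deleted")
```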
gcloud
Use the gcloud ai indexes delete command:
gcloud ai indexes delete INDEX_ID \
--project=PROJECT_ID \
--region=LOCATION
Replace the following:
- INDEX_ID: The ID of the index.
- PROJECT_ID: The ID of the project.
- LOCATION: The region where you are using Vertex AI.
REST
Before using any of the request data, make the following replacements:
- LOCATION: Your region.
- PROJECT: Your project ID.
- INDEX_ID: The ID of the index.
- PROJECT_NUMBER: Project number for your project
HTTP method and URL:
DELETE https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/indexes/INDEX_ID
You should receive a JSON response similar to the following:
{ "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID/operations/OPERATION_ID", "metadata": { "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeleteOperationMetadata", "genericMetadata": { "createTime": "2022-01-08T02:35:56.364956Z", "updateTime": "2022-01-08T02:35:56.364956Z" } }, "done": true, "response": { "@type": "type.googleapis.com/google.protobuf.Empty" } }
What's next
- Learn how to Configure indexes
- Learn how to Deploy an index
- Learn how to Update and rebuild your index
- Learn how to Monitor an index