The following sections describe how to configure, create, list, and delete your indexes.
Index overview
An index is a file or files consisting of your embedding vectors. These vectors are made from large amounts of data you want to deploy and query with Vector Search. With Vector Search, you can create two types of indexes, depending on how you plan to update them with your data. You can create an index designed for batch updates, or an index designed for streaming your updates.
A batch index is for when you want to update your index in batches, with data that has been stored over a set amount of time, such as data from systems that are processed weekly or monthly. A streaming index is for when you want index data to be updated as new data is added to your datastore, for instance, if you have a bookstore and want to show new inventory online as soon as possible. Which type you choose is important, because the setup and requirements differ.
Configure index parameters
Before you create an index, configure the parameters for your index.
For example, create a file named index_metadata.json:
{
  "contentsDeltaUri": "gs://BUCKET_NAME/path",
  "config": {
    "dimensions": 100,
    "approximateNeighborsCount": 150,
    "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
    "shardSize": "SHARD_SIZE_MEDIUM",
    "algorithm_config": {
      "treeAhConfig": {
        "leafNodeEmbeddingCount": 5000,
        "fractionLeafNodesToSearch": 0.03
      }
    }
  }
}
You can find the definition for each of these fields in Index configuration parameters.
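As a minimal sketch, the same metadata file could be generated with Python's standard json module instead of being written by hand. The helper name write_index_metadata and its parameters are illustrative assumptions, not part of any Vertex AI tooling; the field names and values mirror the example above.

```python
import json
from pathlib import Path


def write_index_metadata(path, bucket_uri, dimensions=100):
    """Write an index metadata file shaped like the example above.

    Hypothetical helper: the name and signature are assumptions made
    for illustration; only the JSON fields come from the example.
    """
    metadata = {
        "contentsDeltaUri": bucket_uri,
        "config": {
            "dimensions": dimensions,
            "approximateNeighborsCount": 150,
            "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
            "shardSize": "SHARD_SIZE_MEDIUM",
            "algorithm_config": {
                "treeAhConfig": {
                    "leafNodeEmbeddingCount": 5000,
                    "fractionLeafNodesToSearch": 0.03,
                }
            },
        },
    }
    # Serialize with indentation so the file stays easy to review.
    Path(path).write_text(json.dumps(metadata, indent=2))
    return metadata
```

Calling write_index_metadata("index_metadata.json", "gs://BUCKET_NAME/path") would produce a file equivalent to the one shown above.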
Create an index
Index size
Index data is split into equal parts called shards for processing. When you create an index, you must specify the size of the shards to use. The supported sizes are as follows:
- SHARD_SIZE_SMALL: 2 GiB per shard.
- SHARD_SIZE_MEDIUM: 20 GiB per shard.
- SHARD_SIZE_LARGE: 50 GiB per shard.
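To get a rough feel for what these sizes mean, the sketch below estimates how many shards a dataset might occupy. It assumes the shard count is simply the dataset size divided by the per-shard size, rounded up; the service decides the actual sharding, so treat the result as a planning estimate only. The function name estimate_shard_count is hypothetical.

```python
import math

GIB = 1024 ** 3

# Per-shard sizes taken from the list of supported shard sizes.
SHARD_SIZE_BYTES = {
    "SHARD_SIZE_SMALL": 2 * GIB,
    "SHARD_SIZE_MEDIUM": 20 * GIB,
    "SHARD_SIZE_LARGE": 50 * GIB,
}


def estimate_shard_count(dataset_bytes, shard_size):
    """Estimate shards as ceil(dataset size / shard size), at least 1.

    This is an assumption for planning purposes, not the service's
    documented sharding rule.
    """
    per_shard = SHARD_SIZE_BYTES[shard_size]
    return max(1, math.ceil(dataset_bytes / per_shard))


# Compare the three shard sizes for a 45 GiB dataset.
for name in SHARD_SIZE_BYTES:
    print(name, estimate_shard_count(45 * GIB, name))
```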
The machine types that you can use to deploy your index (using public endpoints or using VPC endpoints) depend on the shard size of the index. The following table shows the shard sizes that each machine type supports:
Machine type | SHARD_SIZE_SMALL | SHARD_SIZE_MEDIUM | SHARD_SIZE_LARGE |
---|---|---|---|
n1-standard-16 |
|||
n1-standard-32 |
|||
e2-standard-2 |
(default) | ||
e2-standard-16 |
(default) | ||
e2-highmem-16 |
(default) | ||
n2d-standard-32 |
To learn how shard size and machine type affect pricing, see the Vertex AI pricing page.
Create an index for batch update
Use these instructions to create and deploy your index. If you don't have your embeddings ready yet, you can skip to Create an empty batch index. With this option, no embeddings data is required at index creation time.
To create an index:
gcloud
Before using any of the command data below, make the following replacements:
- LOCAL_PATH_TO_METADATA_FILE: The local path to the metadata file.
- INDEX_NAME: Display name for the index.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
Execute the following command:
Linux, macOS, or Cloud Shell
gcloud ai indexes create \
  --metadata-file=LOCAL_PATH_TO_METADATA_FILE \
  --display-name=INDEX_NAME \
  --region=LOCATION \
  --project=PROJECT_ID
Windows (PowerShell)
gcloud ai indexes create `
  --metadata-file=LOCAL_PATH_TO_METADATA_FILE `
  --display-name=INDEX_NAME `
  --region=LOCATION `
  --project=PROJECT_ID
Windows (cmd.exe)
gcloud ai indexes create ^
  --metadata-file=LOCAL_PATH_TO_METADATA_FILE ^
  --display-name=INDEX_NAME ^
  --region=LOCATION ^
  --project=PROJECT_ID
You should receive a response similar to the following:
You can poll the status of the operation until the response includes "done": true. Use the following example to poll the status:
$ gcloud ai operations describe 1234567890123456789 --project=my-test-project --region=us-central1
See gcloud ai operations to learn more about the describe command.
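The polling step can be sketched as a small loop. This is an illustrative helper, not part of gcloud or the Vertex AI SDK: poll_until_done and its parameters are assumed names, and fetch_operation stands for any callable that returns the operation as a dictionary, for example by running the describe command and parsing its output. The one-hour default timeout is an arbitrary choice.

```python
import time


def poll_until_done(fetch_operation, interval_s=10.0, timeout_s=3600.0):
    """Call fetch_operation() until its result contains "done": true.

    fetch_operation is any zero-argument callable returning the
    operation as a dict; how it obtains that dict is up to the caller.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        operation = fetch_operation()
        if operation.get("done"):
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not finish before the timeout")
        # Wait before asking again to avoid hammering the API.
        time.sleep(interval_s)
```

An operation that is still running typically has no "done" field at all, which is why the loop uses operation.get("done") instead of indexing the key directly.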
REST
Before using any of the request data, make the following replacements:
- INPUT_DIR: The Cloud Storage directory path of the index content.
- INDEX_NAME: Display name for the index.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
- PROJECT_NUMBER: Your project's automatically generated project number.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes
Request JSON body:
{
  "display_name": "INDEX_NAME",
  "metadata": {
    "contentsDeltaUri": "INPUT_DIR",
    "config": {
      "dimensions": 100,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "algorithm_config": {
        "treeAhConfig": {
          "leafNodeEmbeddingCount": 500,
          "leafNodesToSearchPercent": 7
        }
      }
    }
  }
}
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-08T01:21:10.147035Z",
      "updateTime": "2022-01-08T01:21:10.147035Z"
    }
  }
}
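One way to assemble this request from code is sketched below using only Python's standard library. The function name build_create_index_request is an assumption made for illustration; the URL pattern and body fields follow the REST example above, and the bearer token is assumed to come from a command such as gcloud auth print-access-token. The function only builds the request and doesn't send it.

```python
import json
import urllib.request


def build_create_index_request(project_id, location, index_name, input_dir, access_token):
    """Build (but do not send) the POST request that creates a batch index.

    Hypothetical helper: values in the body mirror the REST example.
    """
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/indexes"
    )
    body = {
        "display_name": index_name,
        "metadata": {
            "contentsDeltaUri": input_dir,
            "config": {
                "dimensions": 100,
                "approximateNeighborsCount": 150,
                "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
                "algorithm_config": {
                    "treeAhConfig": {
                        "leafNodeEmbeddingCount": 500,
                        "leafNodesToSearchPercent": 7,
                    }
                },
            },
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the request; the response body can then be parsed with json.loads.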
Terraform
The following sample uses the google_vertex_ai_index Terraform resource to create an index for batch updates.
To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
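A minimal sketch of creating a batch index with the SDK, assuming the google-cloud-aiplatform package is installed and credentials are configured. It relies on MatchingEngineIndex.create_tree_ah_index; the helper names batch_index_kwargs and create_batch_index are hypothetical, and the parameter values mirror the REST example rather than recommended settings. Check the API reference for the authoritative signature.

```python
def batch_index_kwargs(display_name, contents_delta_uri):
    """Collect keyword arguments that mirror the REST example's values."""
    return {
        "display_name": display_name,
        "contents_delta_uri": contents_delta_uri,
        "dimensions": 100,
        "approximate_neighbors_count": 150,
        "distance_measure_type": "DOT_PRODUCT_DISTANCE",
        "leaf_node_embedding_count": 500,
        "leaf_nodes_to_search_percent": 7,
        "index_update_method": "BATCH_UPDATE",
    }


def create_batch_index(project_id, location, display_name, contents_delta_uri):
    """Create a tree-AH index for batch updates (sketch, not verified here)."""
    # Imported lazily so batch_index_kwargs stays usable without the SDK.
    from google.cloud import aiplatform

    aiplatform.init(project=project_id, location=location)
    return aiplatform.MatchingEngineIndex.create_tree_ah_index(
        **batch_index_kwargs(display_name, contents_delta_uri)
    )
```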
Console
Use these instructions to create an index for batch updates.
- In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
- Click Create new index. The Create a new index pane appears.
- In the Display name field, provide a name to uniquely identify your index.
- In the Description field, provide a description for what the index is for.
- In the Region field, select a region from the drop-down.
- In the Cloud Storage field, search and select the Cloud Storage folder where your vector data is stored.
- In the Algorithm type drop-down, select the algorithm type that Vector Search uses for efficient search. If you select the treeAh algorithm, enter the approximate neighbors count.
- In the Dimensions field, enter the number of dimensions of your input vectors.
- In the Update method field, select Batch.
- In the Shard size field, select from the drop-down the shard size you want.
- Click Create. Your new index appears in your list of indexes once it's ready. Note: Build time can take up to an hour to complete.
Create an empty batch index
To create and deploy your index right away, you can create an empty batch index. With this option, no embeddings data is required at index creation time.
To create an empty index, the request is almost identical to creating an index for batch updates. The difference is you remove the contentsDeltaUri field, since you aren't linking a data location. Here's an empty batch index example:
Empty index request example
{
  "display_name": "INDEX_NAME",
  "indexUpdateMethod": "BATCH_UPDATE",
  "metadata": {
    "config": {
      "dimensions": 100,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "algorithm_config": {
        "treeAhConfig": {
          "leafNodeEmbeddingCount": 500,
          "leafNodesToSearchPercent": 7
        }
      }
    }
  }
}
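Because the only differences from a regular create request are the missing contentsDeltaUri field and the explicit update method, an empty-index request can be derived from an existing one. The sketch below shows one way to do that; to_empty_index_request is a hypothetical helper, and passing "STREAM_UPDATE" as the update method would produce the streaming variant.

```python
import copy


def to_empty_index_request(request, update_method="BATCH_UPDATE"):
    """Return a copy of a create-index request without contentsDeltaUri."""
    empty = copy.deepcopy(request)
    # Drop the data location; an empty index links to no Cloud Storage path.
    empty.get("metadata", {}).pop("contentsDeltaUri", None)
    empty["indexUpdateMethod"] = update_method
    return empty
```

The input dictionary is deep-copied, so the original request can still be used to create a populated index.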
Create an index for streaming updates
Use these instructions to create and deploy your streaming index. If you don't have your embeddings ready yet, skip to Create an empty index for streaming updates. With this option, no embeddings data is required at index creation time.
REST
Before using any of the request data, make the following replacements:
- INDEX_NAME: Display name for the index.
- DESCRIPTION: A description of the index.
- INPUT_DIR: The Cloud Storage directory path of the index content.
- DIMENSIONS: Number of dimensions of the embedding vector.
- PROJECT_ID: Your Google Cloud project ID.
- PROJECT_NUMBER: Your project's automatically generated project number.
- LOCATION: The region where you are using Vertex AI.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes
Request JSON body:
{
  "displayName": "INDEX_NAME",
  "description": "DESCRIPTION",
  "metadata": {
    "contentsDeltaUri": "INPUT_DIR",
    "config": {
      "dimensions": DIMENSIONS,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "algorithmConfig": {
        "treeAhConfig": {
          "leafNodeEmbeddingCount": 10000,
          "leafNodesToSearchPercent": 2
        }
      }
    }
  },
  "indexUpdateMethod": "STREAM_UPDATE"
}
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.ui.CreateIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2023-12-05T23:17:45.416117Z",
      "updateTime": "2023-12-05T23:17:45.416117Z",
      "state": "RUNNING",
      "worksOn": [
        "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID"
      ]
    }
  }
}
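The response carries two identifiers that later steps tend to need: the operation ID (for polling) and the index ID (for deployment or deletion). A minimal sketch for extracting them, assuming the response has the shape shown above; parse_create_index_operation is a hypothetical helper name.

```python
def parse_create_index_operation(operation):
    """Extract the operation ID and index ID from a create-index response.

    Assumes the last path segment of "name" is the operation ID and that
    the first "worksOn" entry, if present, is the index resource name.
    """
    operation_id = operation["name"].rsplit("/", 1)[-1]
    generic = operation.get("metadata", {}).get("genericMetadata", {})
    works_on = generic.get("worksOn", [])
    index_id = works_on[0].rsplit("/", 1)[-1] if works_on else None
    return {"operation_id": operation_id, "index_id": index_id}
```

If the response has no worksOn field, as in the batch example earlier on this page, index_id is returned as None.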
Terraform
The following sample uses the google_vertex_ai_index Terraform resource to create an index for streaming updates.
To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Console
Use these instructions to create an index for streaming updates in the Google Cloud console.
Creating an index for streaming updates requires steps similar to creating an index for batch updates, except that you set indexUpdateMethod to STREAM_UPDATE.
- In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
- Click Create new index. The Create a new index pane appears.
- In the Display name field, provide a name to uniquely identify your index.
- In the Description field, provide a description for what the index is for.
- In the Region field, select a region from the drop-down.
- In the Cloud Storage field, search and select the Cloud Storage folder where your vector data is stored.
- In the Algorithm type drop-down, select the algorithm type that Vector Search will use to perform your search. If you select the treeAh algorithm, enter the approximate neighbors count.
- In the Dimensions field, enter the number of dimensions of your input vectors.
- In the Update method field, select Stream.
- In the Shard size field, select from the drop-down the shard size you want.
- Click Create. Your new index appears in your list of indexes once it's ready. Note: Build time can take up to an hour to complete.
Create an empty index for streaming updates
To create and deploy your index right away, you can create an empty index for streaming. With this option, no embeddings data is required at index creation time.
To create an empty index, the request is almost identical to creating an index for streaming. The difference is you remove the contentsDeltaUri field, since you aren't linking a data location. Here's an empty streaming index example:
Empty index request example
{
  "display_name": "INDEX_NAME",
  "indexUpdateMethod": "STREAM_UPDATE",
  "metadata": {
    "config": {
      "dimensions": 100,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "algorithm_config": {
        "treeAhConfig": {
          "leafNodeEmbeddingCount": 500,
          "leafNodesToSearchPercent": 7
        }
      }
    }
  }
}
List indexes
gcloud
Before using any of the command data below, make the following replacements:
- INDEX_NAME: Display name for the index.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
Execute the following command:
Linux, macOS, or Cloud Shell
gcloud ai indexes list \
  --region=LOCATION \
  --project=PROJECT_ID
Windows (PowerShell)
gcloud ai indexes list `
  --region=LOCATION `
  --project=PROJECT_ID
Windows (cmd.exe)
gcloud ai indexes list ^
  --region=LOCATION ^
  --project=PROJECT_ID
You should receive a response similar to the following:
You can poll the status of the operation until the response includes "done": true. Use the following example to poll the status:
$ gcloud ai operations describe 1234567890123456789 --project=my-test-project --region=us-central1
See gcloud ai operations to learn more about the describe command.
REST
Before using any of the request data, make the following replacements:
- INDEX_NAME: Display name for the index.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
- PROJECT_NUMBER: Your project's automatically generated project number.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes
You should receive a JSON response similar to the following:
{
  "indexes": [
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID",
      "displayName": "INDEX_NAME",
      "metadataSchemaUri": "gs://google-cloud-aiplatform/schema/matchingengine/metadata/nearest_neighbor_search_1.0.0.yaml",
      "metadata": {
        "config": {
          "dimensions": 100,
          "approximateNeighborsCount": 150,
          "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
          "featureNormType": "NONE",
          "algorithmConfig": {
            "treeAhConfig": {
              "maxLeavesToSearch": 50,
              "leafNodeCount": 10000
            }
          }
        }
      },
      "etag": "AMEw9yNU8YX5IvwuINeBkVv3yNa7VGKk11GBQ8GkfRoVvO7LgRUeOo0qobYWuU9DiEc=",
      "createTime": "2020-11-08T21:56:30.558449Z",
      "updateTime": "2020-11-08T22:39:25.048623Z"
    }
  ]
}
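When a project has many indexes, the full list response is verbose. The following sketch reduces it to a few fields per index, assuming the response shape shown above; summarize_indexes is a hypothetical helper and the chosen fields are just one reasonable selection.

```python
def summarize_indexes(list_response):
    """Reduce a list-indexes response to a short summary per index."""
    summaries = []
    for index in list_response.get("indexes", []):
        config = index.get("metadata", {}).get("config", {})
        summaries.append(
            {
                # The index ID is the last segment of the resource name.
                "index_id": index["name"].rsplit("/", 1)[-1],
                "display_name": index.get("displayName"),
                "dimensions": config.get("dimensions"),
                "distance_measure_type": config.get("distanceMeasureType"),
            }
        )
    return summaries
```

A response with no indexes field yields an empty list instead of raising an error.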
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Console
Use these instructions to view a list of your indexes.
- In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
- A list of your active indexes is displayed.
Tuning the index
Tuning the index requires setting the configuration parameters that impact the performance of deployed indexes, especially recall and latency. These parameters are set when you first create the index. You can use brute-force indexes to measure recall.
Configuration parameters that impact performance
The following configuration parameters can be set at index creation time and can affect recall, latency, availability, and cost when using Vector Search. This guidance applies to most cases. However, always experiment with your configurations to make sure that they work for your use case.
For parameter definitions, see Index configuration parameters.
Parameter | About | Performance impact |
---|---|---|
shardSize
|
Controls the amount of data on each machine. When choosing a shard size, estimate how large your dataset will be in the future. If the size of your dataset has an upper bound, pick the appropriate shard size to accommodate it. If there is no upper bound or if your use case is extremely sensitive to latency variability, choosing a large shard size is recommended. |
If you configure for a larger number of smaller shards, a larger number of candidate results are processed during search. More shards can affect performance in the following ways:
If you configure for a smaller number of larger shards, fewer candidate results are processed during search. Fewer shards can affect performance in the following ways:
|
distanceMeasureType
|
Determines the algorithm used for distance calculation between data points and the query vector. |
The following
|
leafNodeEmbeddingCount
|
The number of embeddings for each leaf node. By default, this number is set to 1000.
Generally, changing the value of |
Increasing the number of embeddings for each leaf node can reduce latency but reduce recall quality. It can affect performance in the following ways:
Decreasing the number of embeddings for each leaf node can affect performance in the following ways:
|
Using a brute-force index to measure recall
To get the exact nearest neighbors, use indexes with the brute-force algorithm. The brute-force algorithm provides 100% recall at the expense of higher latency. A brute-force index is usually not a good choice for production serving, but you might find it useful for evaluating the recall of various indexing options offline.
To create an index with the brute-force algorithm, specify brute_force_config in the index metadata:
curl -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer `gcloud auth print-access-token`" \
  https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/indexes \
  -d '{
    "displayName": "'"${DISPLAY_NAME}"'",
    "description": "'"${DESCRIPTION}"'",
    "metadata": {
      "contentsDeltaUri": "'"${INPUT_DIR}"'",
      "config": {
        "dimensions": 100,
        "approximateNeighborsCount": 150,
        "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
        "featureNormType": "UNIT_L2_NORM",
        "algorithmConfig": {
          "bruteForceConfig": {}
        }
      }
    }
  }'
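Once the same queries have been run against both a brute-force index and an approximate index, recall can be computed offline by comparing the returned neighbor IDs. The sketch below uses one common definition, recall at k, which is the fraction of the exact top-k neighbors that also appear in the approximate top-k. The function names and the list-of-IDs input format are assumptions for illustration; retrieving the neighbors from deployed indexes is outside its scope.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k neighbors found in the approximate top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 0.0
    found = exact_top.intersection(approx_ids[:k])
    return len(found) / len(exact_top)


def mean_recall_at_k(approx_results, exact_results, k):
    """Average recall at k over paired per-query result lists."""
    scores = [recall_at_k(a, e, k) for a, e in zip(approx_results, exact_results)]
    return sum(scores) / len(scores) if scores else 0.0
```

Here exact_ids would come from the brute-force index and approx_ids from the index being evaluated, with both lists ordered from nearest to farthest.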
Delete an index
gcloud
Before using any of the command data below, make the following replacements:
- INDEX_ID: The ID of the index.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
Execute the following command:
Linux, macOS, or Cloud Shell
gcloud ai indexes delete INDEX_ID \
  --region=LOCATION \
  --project=PROJECT_ID
Windows (PowerShell)
gcloud ai indexes delete INDEX_ID `
  --region=LOCATION `
  --project=PROJECT_ID
Windows (cmd.exe)
gcloud ai indexes delete INDEX_ID ^
  --region=LOCATION ^
  --project=PROJECT_ID
REST
Before using any of the request data, make the following replacements:
- INDEX_ID: The ID of the index.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
- PROJECT_NUMBER: Your project's automatically generated project number.
HTTP method and URL:
DELETE https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeleteOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-08T02:35:56.364956Z",
      "updateTime": "2022-01-08T02:35:56.364956Z"
    }
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.protobuf.Empty"
  }
}
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Console
Use these instructions to delete one or more indexes.
- In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
- A list of your active indexes is displayed.
- To delete an index, go to the options menu that is in the same row as the index and select Delete.
What's next
- Learn about Index configuration parameters
- Learn how to Deploy and manage public index endpoints
- Learn how to Deploy and manage index endpoints in a VPC network
- Learn how to Update and rebuild your index
- Learn how to Monitor an index