Manage indexes

The following sections describe how to configure, create, list, tune, and delete your indexes. For more information, see the API index documentation.

Configure index parameters

Before you create an index, configure its parameters.

For example, create a file named index_metadata.json:

{
  "contentsDeltaUri": "gs://BUCKET_NAME/path",
  "config": {
    "dimensions": 100,
    "approximateNeighborsCount": 150,
    "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
    "shardSize": "SHARD_SIZE_MEDIUM",
    "algorithmConfig": {
      "treeAhConfig": {
        "leafNodeEmbeddingCount": 5000,
        "leafNodesToSearchPercent": 3
      }
    }
  }
}
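
If you generate the metadata file from a script instead of writing it by hand, a minimal Python sketch might look like the following. The helper names build_index_metadata and write_index_metadata and their default values are illustrative assumptions, not part of any Vertex AI library. The keys use the lowerCamelCase field names from the NearestNeighborSearch schema.

```python
import json


def build_index_metadata(contents_uri, dimensions,
                         approximate_neighbors_count=150,
                         distance_measure_type="DOT_PRODUCT_DISTANCE",
                         shard_size="SHARD_SIZE_MEDIUM",
                         leaf_node_embedding_count=5000,
                         leaf_nodes_to_search_percent=3):
    """Return tree-AH index metadata as a dict (hypothetical helper)."""
    return {
        "contentsDeltaUri": contents_uri,
        "config": {
            "dimensions": dimensions,
            "approximateNeighborsCount": approximate_neighbors_count,
            "distanceMeasureType": distance_measure_type,
            "shardSize": shard_size,
            "algorithmConfig": {
                "treeAhConfig": {
                    "leafNodeEmbeddingCount": leaf_node_embedding_count,
                    "leafNodesToSearchPercent": leaf_nodes_to_search_percent,
                }
            },
        },
    }


def write_index_metadata(path, metadata):
    """Write the metadata dict to `path` as indented JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=2)
        f.write("\n")
```

For example, write_index_metadata("index_metadata.json", build_index_metadata("gs://BUCKET_NAME/path", 100)) would produce a file equivalent to the one shown above.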

You can find the definition for each of these fields in Configuring indexes, or view the definitions within the following schema:

title: NearestNeighborSearch
type: object
properties:
  contentsDeltaUri:
    type: string
    description: >
      Allows inserting, updating, or deleting the contents of the index.
      The string must be a valid Cloud Storage directory path. If this
      field is set when calling IndexService.UpdateIndex, then no other
      Index field can also be updated as part of the same call.
      The expected structure and format of the files this URI points to is
      described at
      cloud.google.com/vertex-ai/docs/vector-search/setup/format-structure
    writeOnly: true
  isCompleteOverwrite:
    type: boolean
    description: >
      If this field is set together with contentsDeltaUri when calling IndexService.UpdateIndex,
      then existing content of the Index will be replaced by the data from the contentsDeltaUri.
    default: false
  config:
    type: object
    description: >
      The configuration of the index.
    required:
    - dimensions
    - algorithmConfig
    properties:
      dimensions:
        type: integer
        format: int32
        description: >
          The number of dimensions of the input vectors.
      approximateNeighborsCount:
        type: integer
        format: int32
        description: >
          The default number of neighbors to find via approximate search before exact reordering is
          performed. Exact reordering is a procedure where results returned by an
          approximate search algorithm are reordered via a more expensive distance computation.
          Required if tree-AH algorithm is used.
      distanceMeasureType:
        description: >
          The distance measure used in nearest neighbor search.
        oneOf:
        - enum: [SQUARED_L2_DISTANCE]
          description: >
            Euclidean (L_2) Distance
        - enum: [L1_DISTANCE]
          description: >
            Manhattan (L_1) Distance
        - enum: [COSINE_DISTANCE]
          description: >
            Cosine Distance. Defined as 1 - cosine similarity.
        - enum: [DOT_PRODUCT_DISTANCE]
          description: >
            Dot Product Distance. Defined as a negative of the dot product
        default: DOT_PRODUCT_DISTANCE
      featureNormType:
        description: >
          Type of normalization to be carried out on each vector.
        oneOf:
        - enum: [UNIT_L2_NORM]
          description: >
            Unit L2 normalization type.
        - enum: [NONE]
          description: >
            No normalization type is specified.
        default: NONE
      algorithmConfig:
        description: >
          The configuration with regard to the algorithms used for efficient search.
        oneOf:
        - type: object
          description: >
             Configuration options for using the tree-AH algorithm (Shallow tree + Asymmetric Hashing).
             Refer to this paper for more details: https://arxiv.org/abs/1908.10396
          properties:
            type:
              type: string
              enum: [treeAhConfig]
            leafNodeEmbeddingCount:
              type: integer
              format: int64
              description: >
                 Number of embeddings on each leaf node. The default value is 1000 if not set.
            leafNodesToSearchPercent:
              type: number
              format: int32
              description: >
                 The default percentage of leaf nodes that any query may search. Must be in the
                 range 1-100, inclusive. The default value is 10 (meaning 10%) if not set.
        - type: object
          description: >
             Configuration options for using brute force search, which simply implements the
             standard linear search in the database for each query.
          properties:
            type:
              type: string
              enum: [bruteForceConfig]
        discriminator:
          propertyName: type

This metadata schema file is available to download from Cloud Storage.
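
Some of the schema's rules can be checked locally before you send a request. The following Python sketch is one possible pre-flight check, not an official validator: the function name validate_index_config is hypothetical, and it assumes the nested algorithmConfig form used in the JSON examples on this page (a treeAhConfig or bruteForceConfig key) rather than the schema's type discriminator.

```python
DISTANCE_MEASURES = {
    "SQUARED_L2_DISTANCE", "L1_DISTANCE", "COSINE_DISTANCE", "DOT_PRODUCT_DISTANCE",
}
FEATURE_NORMS = {"UNIT_L2_NORM", "NONE"}


def validate_index_config(config):
    """Return a list of problems found in an index `config` dict (empty if none)."""
    problems = []

    # The schema marks these two fields as required.
    for field in ("dimensions", "algorithmConfig"):
        if field not in config:
            problems.append(f"missing required field: {field}")

    dims = config.get("dimensions")
    if dims is not None and (isinstance(dims, bool) or not isinstance(dims, int)):
        problems.append("dimensions must be an integer")

    # Both enums have defaults in the schema, so a missing value is fine.
    measure = config.get("distanceMeasureType", "DOT_PRODUCT_DISTANCE")
    if measure not in DISTANCE_MEASURES:
        problems.append(f"unknown distanceMeasureType: {measure}")
    norm = config.get("featureNormType", "NONE")
    if norm not in FEATURE_NORMS:
        problems.append(f"unknown featureNormType: {norm}")

    algo = config.get("algorithmConfig")
    if algo is not None:
        kinds = [k for k in ("treeAhConfig", "bruteForceConfig") if k in algo]
        if len(kinds) != 1:
            problems.append(
                "algorithmConfig must set exactly one of treeAhConfig or bruteForceConfig")
        elif kinds[0] == "treeAhConfig":
            # approximateNeighborsCount is required when tree-AH is used.
            if "approximateNeighborsCount" not in config:
                problems.append("approximateNeighborsCount is required with treeAhConfig")
            percent = algo["treeAhConfig"].get("leafNodesToSearchPercent", 10)
            if not 1 <= percent <= 100:
                problems.append("leafNodesToSearchPercent must be in the range 1-100")
    return problems
```

An empty list means that the checks implemented here passed. The service can still reject a request for reasons this sketch doesn't cover.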

Create an index

Index data is split into equal parts, called shards, for processing. When you create an index, you must specify the shard size. After you create the index, you can choose the machine type to use when you deploy it. To learn more about the available shard sizes and their corresponding prices, see the pricing page.

Shard size       Default machine type
Small (2 GB)     e2-standard-2
Medium (20 GB)   e2-standard-16
Large (50 GB)    e2-highmem-16
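
If a deployment script needs the default machine type for a given shard size, the table can be expressed as a lookup. This Python sketch is illustrative only: SHARD_SIZE_MEDIUM appears in the metadata example earlier on this page, while the names SHARD_SIZE_SMALL and SHARD_SIZE_LARGE are assumed by analogy.

```python
# Default machine types from the table above, keyed by shardSize value.
# SHARD_SIZE_SMALL and SHARD_SIZE_LARGE are assumed names (see the lead-in).
DEFAULT_MACHINE_TYPES = {
    "SHARD_SIZE_SMALL": "e2-standard-2",
    "SHARD_SIZE_MEDIUM": "e2-standard-16",
    "SHARD_SIZE_LARGE": "e2-highmem-16",
}


def default_machine_type(shard_size):
    """Return the default machine type for a shardSize value."""
    try:
        return DEFAULT_MACHINE_TYPES[shard_size]
    except KeyError:
        raise ValueError(f"unknown shard size: {shard_size}") from None
```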

Create an index for Batch Update

To create an index:

gcloud

  1. Define your index metadata.
  2. Use the gcloud ai indexes create command:
gcloud ai indexes create \
  --metadata-file=LOCAL_PATH_TO_METADATA_FILE \
  --display-name=INDEX_NAME \
  --project=PROJECT_ID \
  --region=LOCATION

Replace the following:

  • LOCAL_PATH_TO_METADATA_FILE: The local path to the metadata file.
  • INDEX_NAME: Display name for the index.
  • PROJECT_ID: The ID of the project.
  • LOCATION: The region where you are using Vertex AI.

REST

Before using any of the request data, make the following replacements:

  • LOCATION: Your region.
  • PROJECT: Your project ID.
  • INDEX_NAME: Display name for the index.
  • INPUT_DIR: The Cloud Storage directory path of the index content.
  • PROJECT_NUMBER: Project number for your project.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/indexes

Request JSON body:

{
  "displayName": "INDEX_NAME",
  "metadata": {
    "contentsDeltaUri": "INPUT_DIR",
    "config": {
      "dimensions": 100,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "algorithmConfig": {
        "treeAhConfig": {
          "leafNodeEmbeddingCount": 500,
          "leafNodesToSearchPercent": 7
        }
      }
    }
  }
}

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-08T01:21:10.147035Z",
      "updateTime": "2022-01-08T01:21:10.147035Z"
    }
  }
}

You can poll the status of the operation until the response includes "done": true. For example:

gcloud beta ai operations describe OPERATION_ID \
  --index=INDEX_ID \
  --project=PROJECT_ID \
  --region=LOCATION
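
If you poll from a script, the loop can be sketched in Python as follows. The function wait_for_operation is hypothetical. It assumes that you supply a get_operation callable that fetches the operation (for example, by running the command above or calling the REST API) and returns it as a dict. A finished long-running operation carries either a response or an error field.

```python
import time


def wait_for_operation(get_operation, poll_interval_s=10.0, timeout_s=3600.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until the operation dict contains "done": true, then return it.

    get_operation: caller-supplied callable returning the operation as a dict.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        operation = get_operation()
        if operation.get("done"):
            # A finished operation reports failure through an "error" field.
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        if clock() >= deadline:
            raise TimeoutError("operation did not finish before the timeout")
        sleep(poll_interval_s)
```

The default interval and timeout are arbitrary placeholders. Choose values that suit the size of your index.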

Create an index for Streaming Updates

Creating an index for Streaming Updates is similar to creating an index for Batch Update, except that you set indexUpdateMethod to STREAM_UPDATE.


# Set these variables before running the request.
# ENDPOINT is the regional API host, for example us-central1-aiplatform.googleapis.com.
ENDPOINT=
PROJECT_ID=
REGION=
INPUT_GCS_DIR=
DIMENSIONS=
DISPLAY_NAME=

curl -X POST -H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://${ENDPOINT}/v1/projects/${PROJECT_ID}/locations/${REGION}/indexes" \
-d '{
    "displayName": "'"${DISPLAY_NAME}"'",
    "description": "'"${DISPLAY_NAME}"'",
    "metadata": {
       "contentsDeltaUri": "'"${INPUT_GCS_DIR}"'",
       "config": {
          "dimensions": '"${DIMENSIONS}"',
          "approximateNeighborsCount": 150,
          "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
          "algorithmConfig": {"treeAhConfig": {"leafNodeEmbeddingCount": 10000, "leafNodesToSearchPercent": 2}}
       }
    },
    "indexUpdateMethod": "STREAM_UPDATE"
}'
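
The same request can be assembled in Python with only the standard library. The following is a sketch, not a client library: build_create_index_request is a hypothetical helper, the tree-AH values are copied from the curl example, and you must supply the access token yourself (for example, from gcloud auth print-access-token). The function only builds the request. Passing the result to urllib.request.urlopen would send it.

```python
import json
import urllib.request


def build_create_index_request(endpoint, project_id, region, access_token,
                               display_name, contents_uri, dimensions,
                               stream_update=False):
    """Build (but do not send) a POST request that creates an index."""
    url = f"https://{endpoint}/v1/projects/{project_id}/locations/{region}/indexes"
    body = {
        "displayName": display_name,
        "metadata": {
            "contentsDeltaUri": contents_uri,
            "config": {
                "dimensions": dimensions,
                "approximateNeighborsCount": 150,
                "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
                "algorithmConfig": {
                    "treeAhConfig": {
                        "leafNodeEmbeddingCount": 10000,
                        "leafNodesToSearchPercent": 2,
                    }
                },
            },
        },
    }
    # Streaming Updates differ from Batch Update only in this field.
    if stream_update:
        body["indexUpdateMethod"] = "STREAM_UPDATE"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
    )
```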

After you've created the index, you can verify the update method using the following command:


curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://${ENDPOINT}/v1/projects/${PROJECT_ID}/locations/${REGION}/indexes/${INDEX_ID}"

The response is similar to the following:

{
  "name": "projects/${PROJECT_NUMBER}/locations/${REGION}/indexes/${INDEX_ID}",
  "displayName": "...",
  "description": "...",
  "metadataSchemaUri": "gs://google-cloud-aiplatform/schema/matchingengine/metadata/nearest_neighbor_search_1.0.0.yaml",
  "metadata": {
    "config": {
      "dimensions": 100,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "algorithmConfig": {
        "treeAhConfig": {
          "leafNodeEmbeddingCount": "10000",
          "leafNodesToSearchPercent": 2
        }
      }
    }
  },
  "etag": "...",
  "createTime": "2022-03-16T04:57:29.344329Z",
  "updateTime": "2022-03-16T22:20:37.406393Z",
  "indexUpdateMethod": "STREAM_UPDATE"
}

List indexes

gcloud

Use the gcloud ai indexes list command:

gcloud ai indexes list \
  --project=PROJECT_ID \
  --region=LOCATION

Replace the following:

  • PROJECT_ID: The ID of the project.
  • LOCATION: The region where you are using Vertex AI.

REST

Before using any of the request data, make the following replacements:

  • LOCATION: Your region.
  • PROJECT: Your project ID.
  • INDEX_NAME: Display name for the index.
  • PROJECT_NUMBER: Project number for your project.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/indexes

You should receive a JSON response similar to the following:

{
  "indexes": [
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID",
      "displayName": "INDEX_NAME",
      "metadataSchemaUri": "gs://google-cloud-aiplatform/schema/matchingengine/metadata/nearest_neighbor_search_1.0.0.yaml",
      "metadata": {
        "config": {
          "dimensions": 100,
          "approximateNeighborsCount": 150,
          "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
          "featureNormType": "NONE",
          "algorithmConfig": {
            "treeAhConfig": {
              "maxLeavesToSearch": 50,
              "leafNodeCount": 10000
            }
          }
        }
      },
      "etag": "AMEw9yNU8YX5IvwuINeBkVv3yNa7VGKk11GBQ8GkfRoVvO7LgRUeOo0qobYWuU9DiEc=",
      "createTime": "2020-11-08T21:56:30.558449Z",
      "updateTime": "2020-11-08T22:39:25.048623Z"
    }
  ]
}
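
When a script consumes the list response, it usually needs only a few fields per index. The following Python sketch extracts them from a response shaped like the example above. The function name summarize_indexes and the keys of the returned dicts are illustrative choices, not part of the API.

```python
def summarize_indexes(list_response):
    """Return one small summary dict per index in a list-indexes response."""
    summaries = []
    for index in list_response.get("indexes", []):
        config = index.get("metadata", {}).get("config", {})
        summaries.append({
            # The index ID is the last segment of the resource name.
            "index_id": index["name"].rsplit("/", 1)[-1],
            "display_name": index.get("displayName", ""),
            "dimensions": config.get("dimensions"),
            "distance_measure": config.get("distanceMeasureType"),
        })
    return summaries
```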

Tuning the index

To tune an index, set the configuration parameters that affect the performance of deployed indexes, especially recall and latency. You set these parameters when you first create the index. You can use brute-force indexes to measure recall.

Configuration parameters that impact recall and latency

  1. distanceMeasureType

    The following values are supported:

    • SQUARED_L2_DISTANCE: Euclidean L2 distance
    • L1_DISTANCE: Manhattan L1 distance
    • COSINE_DISTANCE: Cosine distance defined as '1 - cosine similarity'
    • DOT_PRODUCT_DISTANCE: Dot product distance, defined as the negative of the dot product. This is the default value.

    In most cases, the embedding vectors used for similarity matching are computed by using metric learning models (also called Siamese networks or two-tower models). These models use a distance metric to compute the contrastive loss function. Ideally, the value of the distanceMeasureType parameter for the matching index matches the distance measure used by the model that produced the embedding vectors.

  2. approximateNeighborsCount

    The default number of neighbors to find by using approximate search before exact reordering is performed. Exact reordering is a procedure where results returned by an approximate search algorithm are reordered by a more expensive distance computation. Increasing this value increases recall but can increase latency proportionately.

  3. treeAhConfig.leafNodesToSearchPercent

    The percentage of leaves to be searched for each query. Increasing this value increases recall but can also increase latency proportionately. The default value is 10, meaning 10% of the leaves.

  4. treeAhConfig.leafNodeEmbeddingCount

    The number of embeddings for each leaf node. By default, this number is set to 1000.

    This parameter does not have a linear correlation to recall. Increasing or decreasing the value of the treeAhConfig.leafNodeEmbeddingCount parameter doesn't always increase or decrease recall. Experiment to find the optimal value. Changing the value of the treeAhConfig.leafNodeEmbeddingCount parameter generally has less effect than changing the value of the other parameters.
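
To make the four distanceMeasureType values concrete, the following Python sketch computes each measure for a pair of vectors, following the definitions listed above. The function names are illustrative and are not part of the Vertex AI API. For every measure, a smaller value means a closer match, which is why the dot product is negated.

```python
import math


def _check(a, b):
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")


def squared_l2_distance(a, b):
    """SQUARED_L2_DISTANCE: sum of squared coordinate differences."""
    _check(a, b)
    return sum((x - y) ** 2 for x, y in zip(a, b))


def l1_distance(a, b):
    """L1_DISTANCE: sum of absolute coordinate differences."""
    _check(a, b)
    return sum(abs(x - y) for x, y in zip(a, b))


def dot_product_distance(a, b):
    """DOT_PRODUCT_DISTANCE: the negative of the dot product."""
    _check(a, b)
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """COSINE_DISTANCE: 1 - cosine similarity."""
    _check(a, b)
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1 - sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
```

For the vectors [1, 2] and [3, 4], the squared L2 distance is 8, the L1 distance is 4, and the dot product distance is -11.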

Using a brute-force index to measure recall

To get the exact nearest neighbors, use indexes with the brute-force algorithm. The brute-force algorithm provides 100% recall at the expense of higher latency. A brute-force index is usually not a good choice for production serving, but you might find it useful for evaluating the recall of various indexing options offline.

To create an index with the brute-force algorithm, specify bruteForceConfig in the index metadata:

curl -X POST -H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/indexes" \
-d '{
    "displayName": "'"${DISPLAY_NAME}"'",
    "description": "'"${DESCRIPTION}"'",
    "metadata": {
       "contentsDeltaUri": "'"${INPUT_DIR}"'",
       "config": {
          "dimensions": 100,
          "approximateNeighborsCount": 150,
          "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
          "featureNormType": "UNIT_L2_NORM",
          "algorithmConfig": {
             "bruteForceConfig": {}
          }
       }
    }
}'
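
After you run the same queries against a tree-AH index and a brute-force index, you can compute recall offline by comparing the two result lists. The following Python sketch shows one common definition, recall at k: the fraction of the exact top-k neighbors that also appear in the approximate top-k results. The function names and the input format (lists of neighbor IDs, best match first) are assumptions for illustration.

```python
def recall_at_k(approximate_ids, exact_ids, k):
    """Fraction of the exact top-k neighbor IDs found in the approximate top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    exact_top = set(exact_ids[:k])
    if not exact_top:
        raise ValueError("exact_ids must not be empty")
    found = exact_top.intersection(approximate_ids[:k])
    return len(found) / len(exact_top)


def mean_recall_at_k(approximate_by_query, exact_by_query, k):
    """Average recall at k over all queries; both dicts map query ID to neighbor IDs."""
    recalls = [
        recall_at_k(approximate_by_query[query_id], exact_ids, k)
        for query_id, exact_ids in exact_by_query.items()
    ]
    return sum(recalls) / len(recalls)
```

Here exact_ids comes from the brute-force index and approximate_ids from the index that you are tuning.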

Delete an index

You can't delete an index until all of its deployed indexes (Index.deployed_indexes) have been undeployed.
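
A script can check this condition before it attempts the delete. The following Python sketch inspects an Index resource that has already been fetched as a dict. It assumes that the JSON form of Index.deployed_indexes is a deployedIndexes list whose entries carry indexEndpoint and deployedIndexId fields, so verify those names against the API reference. The function names are hypothetical.

```python
def blocking_deployments(index):
    """Return (indexEndpoint, deployedIndexId) pairs that block deletion.

    Assumes the Index JSON uses a "deployedIndexes" list (see the lead-in).
    """
    return [
        (ref.get("indexEndpoint", ""), ref.get("deployedIndexId", ""))
        for ref in index.get("deployedIndexes", [])
    ]


def can_delete(index):
    """True when the index has no remaining deployments."""
    return not blocking_deployments(index)
```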

gcloud

Use the gcloud ai indexes delete command:

gcloud ai indexes delete INDEX_ID \
  --project=PROJECT_ID \
  --region=LOCATION

Replace the following:

  • INDEX_ID: The ID of the index.
  • PROJECT_ID: The ID of the project.
  • LOCATION: The region where you are using Vertex AI.

REST

Before using any of the request data, make the following replacements:

  • LOCATION: Your region.
  • PROJECT: Your project ID.
  • INDEX_ID: The ID of the index.
  • PROJECT_NUMBER: Project number for your project.

HTTP method and URL:

DELETE https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/indexes/INDEX_ID

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeleteOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-08T02:35:56.364956Z",
      "updateTime": "2022-01-08T02:35:56.364956Z"
    }
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.protobuf.Empty"
  }
}

What's next