Vertex AI Matching Engine setup


This guide explains how to configure and use Vertex AI Matching Engine to perform vector similarity searches.

Set up a VPC Network Peering connection

To reduce network latency for vector matching online queries, call the Vertex AI service endpoints from your Virtual Private Cloud (VPC) network by using private services access. Each Google Cloud project can peer only one VPC network with Matching Engine. If you already have a VPC network with private services access configured, you can use that network to peer with Vertex AI Matching Engine.

Configuring a VPC Network Peering connection is an initial task required only one time per Google Cloud project. After this setup is done, you can make calls to the Matching Engine index from any client running inside your VPC.

The VPC Network Peering connection is required only for vector matching online queries. API calls to create, deploy, and delete indexes do not require a VPC Network Peering connection.

Your Cloud project administrator or network administrator must complete the following steps:

  1. Complete the "Before you begin" steps to set up your Cloud project, enable billing, and enable the required APIs.

  2. To avoid IP address collisions between your VPC network and the service producer's network, allocate an IP address range for the Matching Engine service, in which your Matching Engine indexes are deployed, and then create the peering connection. For more information, see Allocating IP address ranges.

    # Set PEERING_RANGE_NAME, NETWORK_NAME, and PROJECT_ID before running these
    # commands, and make sure that the Service Networking API is enabled.
    # `--prefix-length=16` reserves a CIDR block with a /16 mask for use by
    # Google services.
    gcloud compute addresses create $PEERING_RANGE_NAME \
        --global \
        --prefix-length=16 \
        --description="peering range for Matching Engine service" \
        --network=$NETWORK_NAME \
        --purpose=VPC_PEERING \
        --project=$PROJECT_ID
    
    # Create the VPC connection.
    gcloud services vpc-peerings connect \
        --service=servicenetworking.googleapis.com \
        --network=$NETWORK_NAME \
        --ranges=$PEERING_RANGE_NAME \
        --project=$PROJECT_ID
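
Before you allocate the range, it can help to confirm that a candidate CIDR block does not collide with subnets already in use in your VPC network. The following Python sketch uses only the standard `ipaddress` module and runs locally without calling any Google Cloud API. The function name `find_collisions` and the example ranges are hypothetical.

```python
import ipaddress


def find_collisions(candidate_cidr, existing_cidrs):
    """Return the existing CIDR blocks that overlap the candidate block."""
    candidate = ipaddress.ip_network(candidate_cidr)
    return [
        cidr
        for cidr in existing_cidrs
        if candidate.overlaps(ipaddress.ip_network(cidr))
    ]


# Hypothetical values: a /16 candidate for the peering range and the
# subnets already in use in the VPC network.
candidate = "10.8.0.0/16"
subnets = ["10.8.4.0/24", "10.128.0.0/20", "192.168.0.0/24"]

collisions = find_collisions(candidate, subnets)
address_count = ipaddress.ip_network(candidate).num_addresses
print(f"{candidate} reserves {address_count} addresses")
print("collisions:", collisions)
```

An empty collision list suggests that the candidate block is free with respect to the subnets you listed. It says nothing about ranges that you did not include in the list.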
    

After you create a private connection, you can make online calls to a Matching Engine index from any virtual machine (VM) instance running within the peered VPC.

Example notebook

After you complete the initial VPC Network Peering setup, you can create a user-managed notebooks instance within that VPC, and issue commands from the notebook.

Launch the example notebook in Vertex AI Workbench, or view the notebook in GitHub.

Access control with IAM

Vertex AI uses Identity and Access Management (IAM) to manage access to resources. To grant access to a resource, assign one or more roles to a user, group, or service account.

To use Matching Engine, grant the predefined Vertex AI roles, which provide varying levels of access to resources at the project level.

Input data format and structure

To build a new index or update an existing index, provide vectors to Matching Engine in the format and structure described in the following sections.

Input data storage

Store your input data in a Cloud Storage bucket in your Cloud project.

Input directory structure

Structure your input data directory as follows:

  • Batch root directory: Create a root directory for each batch of input data files. Use a single Cloud Storage directory as the root directory. In the following example, the root directory is named batch_root.
  • File naming: Place individual data files directly under batch_root and name them by using the suffix .csv, .json, or .avro, depending on which file format you use.

    • Matching Engine interprets each data file as a set of records.

      The format of the record is determined by the suffix of the filename and is described in one of the following sections.

    • Each record should have an ID and a feature vector, optionally with additional fields such as restricts and crowding.

  • Delete directory: You can create a delete subdirectory under batch_root. This directory is optional.

    • Each file directly under batch_root/delete is a text file of record IDs, with one ID in each line. Each ID must be a valid UTF-8 string.
  • All other directories and files are ignored.

  • All records from all data files, including those under delete, comprise a single batch of input. The relative ordering of records within a data file is immaterial.

  • A single ID can appear only once per batch.

    • Note: An ID cannot appear both in a regular data file and a delete data file.
  • All IDs from a data file under delete are removed from the next index version. Records from regular data files are included in the next version, potentially overwriting a value in an earlier index version.
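
The batch rules above can be checked locally before you upload files. The following Python sketch is a mental model only, not a description of how the service is implemented. It treats an index version as a dictionary that maps IDs to feature vectors. The function names `check_batch_ids` and `apply_batch` and the sample data are hypothetical.

```python
from collections import Counter


def check_batch_ids(data_ids, delete_ids):
    """Return (duplicates, conflicts) for one batch of input.

    duplicates: IDs that appear more than once across the whole batch.
    conflicts: IDs that appear in both a regular data file and a delete file.
    """
    counts = Counter(data_ids) + Counter(delete_ids)
    duplicates = sorted(record_id for record_id, n in counts.items() if n > 1)
    conflicts = sorted(set(data_ids) & set(delete_ids))
    return duplicates, conflicts


def apply_batch(previous_version, records, delete_ids):
    """Model the next index version as a dict of ID -> feature vector."""
    next_version = dict(previous_version)
    next_version.update(records)  # records overwrite values from earlier versions
    for record_id in delete_ids:
        next_version.pop(record_id, None)  # IDs listed under delete are removed
    return next_version


previous = {"a": [0.1, 0.2], "b": [0.3, 0.4]}
records = {"b": [0.5, 0.6], "c": [0.7, 0.8]}
delete_ids = ["a"]

duplicates, conflicts = check_batch_ids(list(records), delete_ids)
print("duplicates:", duplicates, "conflicts:", conflicts)
print(apply_batch(previous, records, delete_ids))
```

In this example the batch is valid, record b is overwritten, record c is added, and record a is removed from the next version.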

Data file formats

Data files can be in CSV, JSON, or Avro format.

CSV

  • Encode the file using UTF-8.
  • Make each line a valid CSV row; each line is interpreted as a single record.
  • Make the first value the ID, which must be a valid UTF-8 string.
  • Make the next N values the components of the feature vector, where N is the dimension that is configured when you create the index. Write each value as a floating-point literal as defined in the Java language specification.
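
As an illustration of the CSV layout, the following Python sketch formats one record per line with the standard `csv` module. The function name `to_csv_line`, the sample ID, and the three-dimensional vector are hypothetical. The sketch assumes that the decimal text produced by Python's `repr()` for finite floats is accepted as a floating-point value, and it rejects NaN and infinity.

```python
import csv
import io
import math


def to_csv_line(record_id, vector, dimension):
    """Format one record as a CSV line: the ID followed by the vector values."""
    if len(vector) != dimension:
        raise ValueError(f"expected {dimension} values, got {len(vector)}")
    if not all(math.isfinite(value) for value in vector):
        raise ValueError("vector values must be finite numbers")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # repr() yields decimal text such as 0.25 or 1e-05 for finite floats.
    writer.writerow([record_id] + [repr(float(value)) for value in vector])
    return buffer.getvalue()


line = to_csv_line("item-1", [0.25, -1.5, 3.0], dimension=3)
print(line, end="")  # item-1,0.25,-1.5,3.0
```

Checking the vector length against the index dimension before writing catches malformed rows early, before the file reaches Cloud Storage.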

JSON

  • Encode the file using UTF-8.
  • Make each line a valid JSON object; each line is interpreted as a single record.
  • Include in each record a field named id whose value is a valid UTF-8 string. This is the ID of the vector.
  • Include in each record a field named embedding whose value is an array of numbers. This is the feature vector.
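
As an illustration of the JSON layout, the following Python sketch serializes each record as one JSON object per line with the standard `json` module. The field names `id` and `embedding` come from the list above. The function name `to_json_line` and the sample records are hypothetical.

```python
import json


def to_json_line(record_id, embedding):
    """Serialize one record as a single-line JSON object."""
    record = {"id": record_id, "embedding": [float(x) for x in embedding]}
    # allow_nan=False rejects NaN and Infinity, which are not valid JSON.
    return json.dumps(record, allow_nan=False)


lines = [
    to_json_line("item-1", [0.25, -1.5, 3.0]),
    to_json_line("item-2", [1.0, 0.0, 0.5]),
]
print("\n".join(lines))
```

Because `json.dumps` emits no line breaks by default, each record stays on its own line when the results are joined with newline characters.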

Avro

  • Use a valid Avro file.
  • Make each record conform to the following schema:

    {
      "type": "record",
      "name": "FeatureVector",
      "fields": [
        {
          "name": "id",
          "type": "string"
        },
        {
          "name": "embedding",
          "type": {
            "type": "array",
            "items": "float"
          }
        },
        {
          "name": "restricts",
          "type": [
            "null",
            {
              "type": "array",
              "items": {
                "type": "record",
                "name": "Restrict",
                "fields": [
                  {
                    "name": "namespace",
                    "type": "string"
                  },
                  {
                    "name": "allow",
                    "type": [
                      "null",
                      {
                        "type": "array",
                        "items": "string"
                      }
                    ]
                  },
                  {
                    "name": "deny",
                    "type": [
                      "null",
                      {
                        "type": "array",
                        "items": "string"
                      }
                    ]
                  }
                ]
              }
            }
          ]
        },
        {
          "name": "crowding_tag",
          "type": [
            "null",
            "string"
          ]
        }
      ]
    }
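
To make the schema concrete, the following Python sketch checks whether a dictionary has the shape of a FeatureVector record. It is a hand-written structural check that uses only the standard library. It does not read or write Avro files, which requires an Avro library that is not shown here. The function names and the sample record are hypothetical. The sketch takes the strict view that all four fields must be present, because the schema declares no default values.

```python
def _is_optional_string_list(value):
    """True for None or a list of strings (the allow and deny unions)."""
    return value is None or (
        isinstance(value, list) and all(isinstance(v, str) for v in value)
    )


def conforms_to_feature_vector(record):
    """Check a dict against the shape of the FeatureVector schema."""
    required = {"id", "embedding", "restricts", "crowding_tag"}
    if not isinstance(record, dict) or set(record) != required:
        return False
    if not isinstance(record["id"], str):
        return False
    embedding = record["embedding"]
    # Avro "float" is single precision; Python floats are narrowed on write.
    if not isinstance(embedding, list) or not all(
        isinstance(x, float) for x in embedding
    ):
        return False
    restricts = record["restricts"]
    if restricts is not None:
        if not isinstance(restricts, list):
            return False
        for restrict in restricts:
            if not isinstance(restrict, dict):
                return False
            if set(restrict) != {"namespace", "allow", "deny"}:
                return False
            if not isinstance(restrict["namespace"], str):
                return False
            if not _is_optional_string_list(restrict["allow"]):
                return False
            if not _is_optional_string_list(restrict["deny"]):
                return False
    tag = record["crowding_tag"]
    return tag is None or isinstance(tag, str)


# Hypothetical record: one restrict in the "color" namespace, no crowding tag.
record = {
    "id": "item-1",
    "embedding": [0.25, -1.5, 3.0],
    "restricts": [{"namespace": "color", "allow": ["red"], "deny": None}],
    "crowding_tag": None,
}
print(conforms_to_feature_vector(record))  # True
```

The nullable unions in the schema map to `None` in this sketch, so a record without restricts or a crowding tag sets those fields to `None` rather than omitting them.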