Use RagManagedDb with Vertex AI RAG Engine

This guide shows you how to use RagManagedDb with Vertex AI RAG Engine, covering the following topics:

  • Manage your retrieval strategy: Learn about the k-Nearest Neighbors (KNN) and Approximate Nearest Neighbors (ANN) retrieval strategies and how to create a RAG corpus for each.
  • Manage your encryption: Understand how to use Customer-Managed Encryption Keys (CMEK) to protect your RAG corpus data.

The following diagram summarizes the workflow for choosing and configuring a retrieval strategy:

Manage your retrieval strategy

RagManagedDb offers the following retrieval strategies to support your RAG use cases:

k-Nearest Neighbors (KNN) (Default)

  • Description: Finds the exact nearest neighbors by comparing all data points in your RAG corpus. If you don't specify a strategy when you create your RAG corpus, KNN is the default.
  • Pros: Provides perfect recall (1.0).
  • Cons: Latency increases with the size of the corpus as it searches every data point.
  • Use case: Recall-sensitive applications and small to medium-sized corpora (less than 10,000 files).

Approximate Nearest Neighbors (ANN)

  • Description: Uses approximation techniques to find similar neighbors faster than the KNN technique.
  • Pros: Significantly reduces query latency on large corpora.
  • Cons: Slightly lower recall due to approximation.
  • Use case: Large-scale corpora (more than 10,000 files) where a small trade-off in recall for improved performance is acceptable.
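
The size guidance above can be expressed as a small helper. This is a minimal sketch, not part of the Vertex AI SDK: the function name choose_retrieval_strategy, the recall_sensitive flag, and the hard cutoff at exactly 10,000 files are illustrative assumptions based on the guidance above.

```python
# Hypothetical helper: the 10,000-file cutoff mirrors the guidance above,
# but treating it as a hard boundary is an assumption.
FILE_COUNT_THRESHOLD = 10_000


def choose_retrieval_strategy(num_rag_files: int, recall_sensitive: bool = False) -> str:
    """Return "knn" or "ann" for a corpus of the given size."""
    if num_rag_files < 0:
        raise ValueError("num_rag_files must be non-negative")
    # KNN gives exact results, so prefer it when recall matters most
    # or when the corpus is small enough that latency stays acceptable.
    if recall_sensitive or num_rag_files < FILE_COUNT_THRESHOLD:
        return "knn"
    return "ann"
```

In practice, the right cutoff depends on your latency budget, so treat the returned value as a starting point.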

Create a RAG corpus with KNN RagManagedDb

The following code sample shows how to create a RAG corpus that uses the KNN RagManagedDb retrieval strategy.

Python

from vertexai import rag
import vertexai

# Replace the placeholder strings with your own values.
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"
DISPLAY_NAME = "YOUR_RAG_CORPUS_DISPLAY_NAME"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

vector_db = rag.RagManagedDb(retrieval_strategy=rag.KNN())
rag_corpus = rag.create_corpus(
    display_name=DISPLAY_NAME, backend_config=rag.RagVectorDbConfig(vector_db=vector_db))

REST

Replace the following variables:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • CORPUS_DISPLAY_NAME: The display name of the RAG corpus.

PROJECT_ID=PROJECT_ID
LOCATION=LOCATION
CORPUS_DISPLAY_NAME=CORPUS_DISPLAY_NAME

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora \
-d '{
      "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
      "vector_db_config": {
        "ragManagedDb": {
          "knn": {}
        }
      }
    }'

Create a RAG corpus with ANN RagManagedDb

RagManagedDb uses a tree-based structure to partition data and provide faster searches for the ANN feature. For the best recall and latency, configure the tree's structure by experimenting with your data's size and distribution. RagManagedDb lets you configure the tree_depth and leaf_count of the tree.

  • tree_depth: Determines the number of layers or levels in the tree.
    • If you have approximately 10,000 RAG files in the RAG corpus, set the value to 2.
    • If you have more RAG files, set this to 3.
    • If tree_depth isn't specified, Vertex AI RAG Engine assigns a default value of 2.
  • leaf_count: Determines the number of leaf nodes in the tree-based structure. Each leaf node contains groups of closely related vectors along with their corresponding centroid.

    • The recommended value is 10 * sqrt(num of RAG files in your RAG corpus).
    • If not specified, Vertex AI RAG Engine assigns a default value of 500.
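
The recommendations above can be turned into starting values with a few lines of Python. This is a minimal sketch: the function name suggest_ann_parameters is hypothetical, and reading "approximately 10,000" as "10,000 or fewer" is an assumption.

```python
import math


def suggest_ann_parameters(num_rag_files: int) -> dict:
    """Suggest starting values for tree_depth and leaf_count."""
    if num_rag_files <= 0:
        raise ValueError("num_rag_files must be positive")
    # Assumption: "approximately 10,000 files" is read as "10,000 or fewer".
    tree_depth = 2 if num_rag_files <= 10_000 else 3
    # Recommended leaf count: 10 * sqrt(number of RAG files).
    leaf_count = round(10 * math.sqrt(num_rag_files))
    return {"tree_depth": tree_depth, "leaf_count": leaf_count}
```

For example, a corpus of 40,000 files yields a tree_depth of 3 and a leaf_count of 2,000. These values are only a starting point; tune them against your own data's size and distribution.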

Python

from vertexai import rag
import vertexai

# Replace the placeholder strings with your own values.
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"
DISPLAY_NAME = "YOUR_RAG_CORPUS_DISPLAY_NAME"
TREE_DEPTH = 2  # Optional: Acceptable values are 2 or 3. Default is 2.
LEAF_COUNT = 500  # Optional: Default is 500.

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

ann_config = rag.ANN(tree_depth=TREE_DEPTH, leaf_count=LEAF_COUNT)
vector_db = rag.RagManagedDb(retrieval_strategy=ann_config)
rag_corpus = rag.create_corpus(
    display_name=DISPLAY_NAME, backend_config=rag.RagVectorDbConfig(vector_db=vector_db))

REST

Replace the following variables:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • CORPUS_DISPLAY_NAME: The display name of the RAG corpus.
  • TREE_DEPTH: Your tree depth.
  • LEAF_COUNT: Your leaf count.

PROJECT_ID=PROJECT_ID
LOCATION=LOCATION
CORPUS_DISPLAY_NAME=CORPUS_DISPLAY_NAME
TREE_DEPTH=TREE_DEPTH
LEAF_COUNT=LEAF_COUNT

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora \
-d '{
      "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
      "vector_db_config": {
        "ragManagedDb": {
          "ann": {
            "tree_depth": '"${TREE_DEPTH}"',
            "leaf_count": '"${LEAF_COUNT}"'
          }
        }
      }
    }'
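
Shell quoting inside the -d argument is easy to get wrong, so it can help to build the request body in Python and serialize it with json.dumps. The following is a minimal sketch: the function name build_corpus_request_body is hypothetical, and the field names are copied from the REST samples in this guide.

```python
import json


def build_corpus_request_body(display_name, strategy="knn", tree_depth=None, leaf_count=None):
    """Return the JSON request body for creating a RAG corpus."""
    if strategy == "knn":
        strategy_config = {"knn": {}}
    elif strategy == "ann":
        ann = {}
        # Omitted values fall back to the service defaults described above.
        if tree_depth is not None:
            ann["tree_depth"] = tree_depth
        if leaf_count is not None:
            ann["leaf_count"] = leaf_count
        strategy_config = {"ann": ann}
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    body = {
        "display_name": display_name,
        "vector_db_config": {"ragManagedDb": strategy_config},
    }
    return json.dumps(body)
```

The returned string can be sent as the body of the POST request with any HTTP client.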

Import your data into ANN RagManagedDb

You can use either the ImportRagFiles API or the UploadRagFile API to import your data into the ANN RagManagedDb. However, unlike the KNN retrieval strategy, the ANN approach requires the underlying tree-based index to be rebuilt for optimal recall. To have Vertex AI RAG Engine rebuild your ANN index, set rebuild_ann_index to true in the ImportRagFiles API request.

Note the following:

  • Before you query the RAG corpus, you must rebuild the ANN index at least once.
  • Only one concurrent index rebuild is supported on a project in each location.

To upload your local file into your RAG corpus, see Upload a RAG file. The following code sample shows how to import data from Cloud Storage into your RAG corpus and trigger an ANN index rebuild. To learn about the supported data sources, see Data sources supported for RAG.

Python

from vertexai import rag
import vertexai

# Replace the placeholder strings with your own values.
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"
CORPUS_ID = "YOUR_CORPUS_ID"
PATHS = ["gs://my_bucket/my_files_dir"]
REBUILD_ANN_INDEX = True  # Choose True or False.

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

corpus_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/ragCorpora/{CORPUS_ID}"
# This is a non-blocking call. `await` requires an async context,
# such as a notebook cell or a coroutine.
response = await rag.import_files_async(
    corpus_name=corpus_name,
    paths=PATHS,
    rebuild_ann_index=REBUILD_ANN_INDEX
)

# Wait for the import to complete.
await response.result()

REST

PROJECT_ID=PROJECT_ID
LOCATION=LOCATION
CORPUS_ID=CORPUS_ID
GCS_URI=GCS_URI
REBUILD_ANN_INDEX=true  # Set to true or false.

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${CORPUS_ID}/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": ['\""${GCS_URI}"\"']
    },
    "rebuild_ann_index": '${REBUILD_ANN_INDEX}'
  }
}'
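
Because only one index rebuild can run at a time per project and location, one possible approach when you import several batches is to request the rebuild only on the final batch. The following is a minimal sketch of that planning step. The function name plan_import_batches is hypothetical, and deferring the rebuild to the last batch is an assumption about your workflow, not a documented requirement.

```python
def plan_import_batches(paths, batch_size):
    """Split paths into batches and flag only the last one for an ANN index rebuild.

    Returns a list of (batch_paths, rebuild_ann_index) tuples.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batches = [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]
    last_index = len(batches) - 1
    # Assumption: batches are imported sequentially, so the rebuild runs once,
    # after all of the data is in the corpus.
    return [(batch, index == last_index) for index, batch in enumerate(batches)]
```

Each tuple maps onto the paths and rebuild_ann_index arguments of one import request.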

Manage your encryption

By default, all user data within RagManagedDb is encrypted using a Google-owned and Google-managed encryption key. This default encryption helps secure your data without requiring any configuration.

If you need more control over the keys used for encryption, you can use Customer-Managed Encryption Keys (CMEK), which Vertex AI RAG Engine supports. With CMEK, you use your own cryptographic keys, managed within Cloud Key Management Service (KMS), to protect your RAG corpus data.

For information on CMEK limitations for RAG corpora, see CMEK limitations for Vertex AI RAG Engine.

Set up your KMS key and grant permissions

Before you create a RAG corpus encrypted with CMEK, you must set up a cryptographic key in Cloud Key Management Service and grant the Vertex AI RAG Engine service account the permissions to use this key.

Prerequisites

To perform the following setup steps, make sure that your user account has the appropriate Identity and Access Management (IAM) permissions in the Google Cloud project where you intend to create the KMS key and the RAG corpus. Typically, a role like the Cloud KMS Admin role (roles/cloudkms.admin) is required.

Enable the API

  1. Go to the Google Cloud console.
  2. Select the project where you want to manage your keys and create your RAG corpus.
  3. In the search bar, type "Key Management", and select the "Key Management" service.
  4. If the API isn't enabled, click Enable. It might take a few minutes for the API to be fully provisioned.

Create your KMS key ring and key

To create a key ring, do the following:

  1. In the Key Management section, click Create Key Ring.

    Enter the following:

    • Key ring name: Enter a unique name for your key ring, such as rag-engine-cmek-keys.
    • Location type: Select Region. The Cloud Key Management Service key ring must be in the same region as the Vertex AI RAG Engine endpoint that you use when you encrypt a RAG corpus with CMEK.
    • Location: Choose a region, such as us-central1. This region must match the region where your RAG Engine resources reside.
  2. Click Create.

To create a key within the key ring, do the following:

  1. After the key ring is created, you are prompted to create a key, or you can go to Create Key.

    Enter the following:

    • Key name: Enter a unique name for your key, such as my-rag-corpus-key.
    • Protection level: Choose a protection level (Software or HSM). If you require hardware-backed keys, select HSM.
    • Purpose: Select Symmetric encrypt/decrypt. This is required for CMEK.
    • Key material source: Select Generated key.
    • Rotation period: Optional but recommended. Configure a key rotation schedule according to your organization's security policies, for example, every 90 days.
  2. Click Create.

To copy the key resource name, do the following:

  1. After the key is created, go to its details page.

  2. Find the resource name. The format is projects/YOUR_PROJECT_ID/locations/YOUR_REGION/keyRings/YOUR_KEY_RING_NAME/cryptoKeys/YOUR_KEY_NAME/cryptoKeyVersions/1.

  3. Copy the resource name, and remove the /cryptoKeyVersions/VERSION_NUMBER part. The correctly formatted resource name is projects/YOUR_PROJECT_ID/locations/YOUR_REGION/keyRings/YOUR_KEY_RING_NAME/cryptoKeys/YOUR_KEY_NAME.
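
If you copy key names programmatically, the version suffix can be stripped with a regular expression. This is a minimal sketch; the function name to_crypto_key_name is hypothetical.

```python
import re

# Matches a trailing "/cryptoKeyVersions/<version>" segment.
_KEY_VERSION_SUFFIX = re.compile(r"/cryptoKeyVersions/[^/]+$")


def to_crypto_key_name(resource_name: str) -> str:
    """Remove the key version suffix, if present, from a KMS resource name."""
    return _KEY_VERSION_SUFFIX.sub("", resource_name)
```

A name that has no version suffix is returned unchanged, so the helper is safe to apply to either form.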

Grant permissions to the Vertex AI RAG Engine service agent

For Vertex AI RAG Engine to encrypt and decrypt data using your KMS key, its service agent needs appropriate permissions on that specific key.

To identify your Vertex AI RAG Engine service agent, do the following:

  1. Go to the IAM & Admin > IAM page in the Google Cloud console for your project.

  2. On the Identity and Access Management page, enable the Include Google-provided role grants checkbox.

  3. In the filter or search bar for the principals list, search for the Vertex AI RAG Engine service agent. It follows the pattern service-YOUR_PROJECT_NUMBER@gcp-sa-vertex-rag.iam.gserviceaccount.com.

    Replace YOUR_PROJECT_NUMBER with your Google Cloud project number.
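
The service agent address can also be built from the project number using the pattern above. This is a minimal sketch; the function name rag_service_agent_email is hypothetical.

```python
def rag_service_agent_email(project_number) -> str:
    """Build the Vertex AI RAG Engine service agent address for a project number."""
    number = str(project_number)
    # The pattern takes the numeric project number, not the project ID.
    if not number.isdigit():
        raise ValueError("project_number must contain only digits")
    return f"service-{number}@gcp-sa-vertex-rag.iam.gserviceaccount.com"
```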

If your Vertex AI RAG Engine service agent isn't present, do the following to trigger its creation:

  1. Enable the Resource Manager API.

  2. Run this command in the Cloud Shell or your command line:

    gcloud beta services identity create --service=aiplatform.googleapis.com \
        --project=PROJECT_ID

    Alternatively, you can send the following REST API call:

    curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json; charset=utf-8" -d "" "https://serviceusage.googleapis.com/v1beta1/projects/PROJECT_ID/services/aiplatform.googleapis.com:generateServiceIdentity"
    
  3. Verify that the Vertex AI RAG Engine service agent was created.

To grant permissions on the KMS key, do the following:

  1. Go back to the Key Management service in the Google Cloud console.

  2. Select the key ring containing the key you created.

  3. Select the specific key you created.

  4. In the key's details page, go to the Permissions tab.

  5. Click Add Principal.

  6. In the New principals field, type the Vertex AI RAG Engine service agent's email address.

  7. In the Select a role drop-down, select the Cloud KMS CryptoKey Encrypter/Decrypter role (roles/cloudkms.cryptoKeyEncrypterDecrypter). This role grants the service agent the necessary permissions to use the key for encryption and decryption operations.

  8. Click Save.
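
As a command-line alternative to the console steps above, the same role can be granted with gcloud kms keys add-iam-policy-binding. The following is a minimal sketch that only assembles the command arguments from the key resource name; the function name build_grant_command is hypothetical, and running the result (for example, with subprocess.run) requires an installed and authenticated gcloud CLI.

```python
from typing import List

ROLE = "roles/cloudkms.cryptoKeyEncrypterDecrypter"


def build_grant_command(key_name: str, service_agent_email: str) -> List[str]:
    """Build the gcloud arguments that grant the service agent access to a key."""
    parts = key_name.split("/")
    # Expected form: projects/P/locations/L/keyRings/R/cryptoKeys/K
    if len(parts) != 8 or parts[0::2] != ["projects", "locations", "keyRings", "cryptoKeys"]:
        raise ValueError("key_name must not include a cryptoKeyVersions suffix")
    project, location, key_ring, key = parts[1::2]
    return [
        "gcloud", "kms", "keys", "add-iam-policy-binding", key,
        f"--project={project}",
        f"--location={location}",
        f"--keyring={key_ring}",
        f"--member=serviceAccount:{service_agent_email}",
        f"--role={ROLE}",
    ]
```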

Create a RAG corpus with customer-managed encryption

The following code sample shows how to create a RAG corpus encrypted with a Customer-Managed Encryption Key (CMEK).

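Replace the variables in the following code samples:

  • YOUR_PROJECT_ID: Your project ID.
  • YOUR_RAG_ENGINE_LOCATION: The region to process the request.
  • YOUR_RAG_CORPUS_DISPLAY_NAME: The display name of the RAG corpus.
  • YOUR_KMS_KEY_NAME: The resource name of your KMS key, in the format projects/YOUR_PROJECT_ID/locations/YOUR_REGION/keyRings/YOUR_KEY_RING_NAME/cryptoKeys/YOUR_KEY_NAME.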

Python

import vertexai
from vertexai import rag
from google.cloud.aiplatform_v1.types.encryption_spec import EncryptionSpec

# Replace the placeholder strings with your own values.
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"
DISPLAY_NAME = "YOUR_RAG_CORPUS_DISPLAY_NAME"
# Key resource name without the /cryptoKeyVersions/VERSION_NUMBER suffix.
KMS_KEY_NAME = "YOUR_KMS_KEY_NAME"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_corpus = rag.create_corpus(
    display_name=DISPLAY_NAME,
    encryption_spec=EncryptionSpec(kms_key_name=KMS_KEY_NAME),
)

REST

PROJECT_ID=YOUR_PROJECT_ID
LOCATION=YOUR_RAG_ENGINE_LOCATION
CORPUS_DISPLAY_NAME=YOUR_RAG_CORPUS_DISPLAY_NAME
KMS_KEY_NAME=YOUR_KMS_KEY_NAME

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora \
-d '{
      "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
      "encryption_spec" : {
        "kms_key_name" : '\""${KMS_KEY_NAME}"\"'
      }
    }'

Quotas

When you use CMEK with Vertex AI services like Vertex AI RAG Engine, there is a quota on the number of unique Cloud KMS keys that can be in use per project, per region. This quota is tracked by the metric aiplatform.googleapis.com/in_use_customer_managed_encryption_keys.

Each time you use a new, unique KMS key to create a resource like a RAG corpus within a project and region, the KMS key consumes one unit of this quota. This quota unit isn't released even if the resources using that specific key are deleted.
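
The counting rule described above can be modeled as the number of distinct keys ever used per project and region. This is a minimal sketch for estimating consumption from your own records. The function name count_consumed_key_quota and the event tuple layout are hypothetical, and the authoritative figure is the quota metric itself.

```python
from collections import defaultdict


def count_consumed_key_quota(creation_events):
    """Count unique KMS keys used per (project, region).

    creation_events is an iterable of (project, region, kms_key_name) tuples,
    one per resource ever created, including resources that were later deleted.
    """
    keys_by_scope = defaultdict(set)
    for project, region, kms_key_name in creation_events:
        # Reusing a key doesn't consume more quota; a new key consumes one unit.
        keys_by_scope[(project, region)].add(kms_key_name)
    return {scope: len(keys) for scope, keys in keys_by_scope.items()}
```

Because deleted resources still count, reusing an existing key for new corpora is the simplest way to stay within the limit.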

If you need more unique keys than the current limit, you must request a quota increase for aiplatform.googleapis.com/in_use_customer_managed_encryption_keys for the selected region.

For more information on how to request a quota increase, see View and edit the quotas in the Google Cloud console.

What's next