Use Vertex AI Feature Store in RAG Engine

This page shows you how to set up the Vertex AI Feature Store as the vector database to use with RAG Engine.

RAG Engine uses a built-in vector database powered by Spanner to store and manage vector representations of text documents. The vector database retrieves relevant documents based on the documents' semantic similarity to a given query.

When you integrate Vertex AI Feature Store as an additional vector database, RAG Engine can handle large data volumes with low latency, which helps to improve the performance and scalability of your RAG applications.

Set up a Vertex AI Feature Store

Vertex AI Feature Store, a managed cloud-native service, is an essential component of Vertex AI. It simplifies machine learning (ML) feature management and online serving by letting you manage feature data within a BigQuery table or view. This enables low-latency online feature serving.

For FeatureOnlineStore instances created with optimized online serving, you can take advantage of a vector similarity search to retrieve a list of semantically similar or related entities, which are known as approximate nearest neighbors.

The following sections show you how to set up a Vertex AI Feature Store instance for your RAG application.

Create a BigQuery table schema

Use the Google Cloud console to create a BigQuery table schema. The table serves as the data source and must contain the following fields.

Field name         Data type  Status
corpus_id          String     Required
file_id            String     Required
chunk_id           String     Required
chunk_data_type    String     Nullable
chunk_data         String     Nullable
file_original_uri  String     Nullable
embeddings         Float      Repeated

This code sample demonstrates how to define your BigQuery table schema.

  # Use this SQL query as a reference for creating the table.
  CREATE TABLE `your-project-id.input_us_central1.rag_source_new` (
    `corpus_id` STRING,
    `file_id` STRING,
    `chunk_id` STRING,
    `chunk_data_type` STRING,
    `chunk_data` STRING,
    `embeddings` ARRAY<FLOAT64>,
    `file_original_uri` STRING
  );
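
If you want to check exported rows against the field requirements listed above, the schema can be mirrored in plain Python. This is only a minimal sketch: `RAG_SCHEMA` and `validate_row` are hypothetical names chosen for illustration and are not part of any Google Cloud library.

```python
# Hypothetical mirror of the table schema: field name -> (Python type, required?).
RAG_SCHEMA = {
    "corpus_id": (str, True),
    "file_id": (str, True),
    "chunk_id": (str, True),
    "chunk_data_type": (str, False),
    "chunk_data": (str, False),
    "file_original_uri": (str, False),
    "embeddings": (list, False),  # repeated float field; may be absent or empty
}


def validate_row(row: dict) -> list[str]:
    """Return the problems found in `row`; an empty list means the row fits the schema."""
    problems = []
    for name, (expected_type, required) in RAG_SCHEMA.items():
        value = row.get(name)
        if value is None:
            if required:
                problems.append(f"missing required field: {name}")
            continue
        if not isinstance(value, expected_type):
            problems.append(f"{name}: expected {expected_type.__name__}")
        elif name == "embeddings" and not all(isinstance(x, float) for x in value):
            problems.append("embeddings: every element must be a float")
    # Flag any column that the schema does not define.
    for name in row:
        if name not in RAG_SCHEMA:
            problems.append(f"unexpected field: {name}")
    return problems
```

A row that carries the three required IDs and a list of floats passes with no problems reported; a row without `corpus_id` is reported as missing a required field.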

Provision a FeatureOnlineStore instance

To enable online serving of features, use the Vertex AI Feature Store CreateFeatureOnlineStore API to set up a FeatureOnlineStore instance. If you are provisioning a FeatureOnlineStore for the first time, the operation might take approximately five minutes to complete.

  # TODO(developer): Update and uncomment the following lines:
  # PROJECT_ID="your-project-id"
  #
  # Set feature_online_store_id.
  # Example: "rag_fos_test"
  # FEATURE_ONLINE_STORE_ID="your-feature-online-store-id"

  # Call CreateFeatureOnlineStore to create a FeatureOnlineStore instance
  curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    "https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/featureOnlineStores?feature_online_store_id=${FEATURE_ONLINE_STORE_ID}" \
    -d '{
      "optimized": {}
    }'

  # TODO(developer): Update and uncomment the following lines:
  # Get operation_id returned in CreateFeatureOnlineStore
  # OPERATION_ID="your-operation-id"

  # Poll Operation status until done = true in the response
  curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/operations/${OPERATION_ID}
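
The "poll until done = true" step can be automated with a small loop. The following is a sketch under stated assumptions: `wait_for_operation` is a hypothetical helper, and `fetch` stands for any callable that performs the GET request above and returns the parsed JSON operation. The `done` and `error` fields are the standard long-running operation fields.

```python
import time
from typing import Callable


def wait_for_operation(
    fetch: Callable[[], dict],
    interval_s: float = 10.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `fetch` repeatedly until the operation reports done, then return it."""
    deadline = time.monotonic() + timeout_s
    while True:
        operation = fetch()
        if operation.get("done"):
            # A finished operation carries either a response or an error.
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not finish before the timeout")
        sleep(interval_s)
```

The `sleep` parameter exists only so that the wait can be replaced in tests; in normal use the default `time.sleep` is enough.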

Create a FeatureView resource

To connect the BigQuery table, which stores the feature data source, to the FeatureOnlineStore instance, call the CreateFeatureView API to create a FeatureView resource. When you create a FeatureView resource, choose the default distance metric DOT_PRODUCT_DISTANCE, which is defined as the negative of the dot product (smaller DOT_PRODUCT_DISTANCE indicates higher similarity).

This code sample demonstrates how to create a FeatureView resource.

  # TODO(developer): Update and uncomment the following lines:
  # Set feature_view_id
  # Example: "feature_view_test"
  # FEATURE_VIEW_ID="your-feature-view-id"
  #
  # The BigQuery URI of the table created in the BigQuery table schema step
  # The format is "bq://" + BigQuery table ID
  # Example: "bq://tester.ragtest1.rag_testdata"
  # BIG_QUERY_URI="your-big-query-uri"

  # Call CreateFeatureView API to create a FeatureView
  curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    "https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/featureOnlineStores/${FEATURE_ONLINE_STORE_ID}/featureViews?feature_view_id=${FEATURE_VIEW_ID}" \
    -d '{
      "vertex_rag_source": {
        "uri": '\""${BIG_QUERY_URI}"\"'
      }
    }'

  # Call ListFeatureViews API to verify the FeatureView is created successfully
  curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    "https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/featureOnlineStores/${FEATURE_ONLINE_STORE_ID}/featureViews"
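
To make the DOT_PRODUCT_DISTANCE definition concrete, the following sketch computes the negative dot product for a few toy vectors and ranks them. The vectors and chunk names are made up for illustration; real embeddings come from the embedding model and have many more dimensions.

```python
def dot_product_distance(a: list[float], b: list[float]) -> float:
    """Negative dot product: a smaller value means the vectors are more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return -sum(x * y for x, y in zip(a, b))


# Toy two-dimensional embeddings, invented for this example.
query = [1.0, 0.0]
chunks = {
    "chunk-a": [0.9, 0.1],
    "chunk-b": [0.2, 0.8],
    "chunk-c": [-1.0, 0.0],
}

# Sort ascending by distance so the most similar chunk comes first.
ranked = sorted(chunks, key=lambda cid: dot_product_distance(query, chunks[cid]))
print(ranked)  # ['chunk-a', 'chunk-b', 'chunk-c']
```

Here `chunk-a` has the largest dot product with the query and therefore the smallest distance, so it ranks first.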

Upload data and online serving

The RAG API handles the data upload and online serving.

Use Vertex AI Feature Store in RAG Engine

After you set up the Vertex AI Feature Store instance, use the following sections to configure it as the vector database for your RAG application.

Use the Vertex AI Feature Store instance as the vector database to create a RAG corpus

To create the RAG corpus, you must specify FEATURE_VIEW_RESOURCE_NAME. The RAG corpus is then created and automatically associated with the Vertex AI Feature Store instance. The RAG APIs use the generated rag_corpus_id to upload data to the Vertex AI Feature Store instance and to retrieve relevant contexts from the corpus.

This code sample demonstrates how to use the Vertex AI Feature Store instance as the vector database to create a RAG corpus.

REST

# TODO(developer): Update and uncomment the following lines:
# CORPUS_DISPLAY_NAME="your-corpus-display-name"
#
# Full feature view resource name
# Format: projects/${PROJECT_ID}/locations/us-central1/featureOnlineStores/${FEATURE_ONLINE_STORE_ID}/featureViews/${FEATURE_VIEW_ID}
# FEATURE_VIEW_RESOURCE_NAME="your-feature-view-resource-name"

# Call CreateRagCorpus API to create a new RAG corpus
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora \
-d '{
  "display_name": '\""${CORPUS_DISPLAY_NAME}"\"',
  "rag_vector_db_config": {
    "vertex_feature_store": {
      "feature_view_resource_name": '\""${FEATURE_VIEW_RESOURCE_NAME}"\"'
    }
  }
}'

# Call ListRagCorpora API to verify the RAG corpus is created successfully
curl -sS -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora"

Python

import vertexai
from vertexai.preview import rag
from vertexai.preview.generative_models import GenerativeModel, Tool
# Set Project
PROJECT_ID = "YOUR_PROJECT_ID"  # @param {type:"string"}
vertexai.init(project=PROJECT_ID, location="us-central1")

# Configure a Google first-party embedding model
embedding_model_config = rag.EmbeddingModelConfig(
    publisher_model="text-embedding-004"
)

# Configure a Vertex AI Feature Store instance for the corpus
FEATURE_VIEW_RESOURCE_NAME = "YOUR_FEATURE_VIEW_RESOURCE_NAME"  # @param {type:"string"}
vector_db = rag.VertexFeatureStore(
   resource_name=FEATURE_VIEW_RESOURCE_NAME,
)

# Name your corpus
DISPLAY_NAME = "YOUR_DISPLAY_NAME"  # @param {type:"string"}

rag_corpus = rag.create_corpus(
    display_name=DISPLAY_NAME, embedding_model_config=embedding_model_config, vector_db=vector_db
)

# Check the corpus just created
rag.list_corpora()
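
The resource-name formats used in this section are easy to mistype by hand. As a small sketch, the helpers below assemble the feature view resource name and pull the corpus ID out of a full RAG corpus resource name. Both function names are hypothetical; only the path formats come from this page.

```python
def feature_view_resource_name(
    project_id: str,
    feature_online_store_id: str,
    feature_view_id: str,
    location: str = "us-central1",
) -> str:
    """Build the full feature view resource name in the format shown in the REST sample."""
    return (
        f"projects/{project_id}/locations/{location}"
        f"/featureOnlineStores/{feature_online_store_id}"
        f"/featureViews/{feature_view_id}"
    )


def rag_corpus_id(rag_corpus_resource: str) -> str:
    """Return the trailing ID of a name like projects/.../ragCorpora/RAG_CORPUS_ID."""
    prefix, marker, corpus_id = rag_corpus_resource.rpartition("/ragCorpora/")
    if not marker or not corpus_id:
        raise ValueError(f"not a RAG corpus resource name: {rag_corpus_resource!r}")
    return corpus_id
```

For example, `feature_view_resource_name("my-project", "rag_fos_test", "feature_view_test")` yields a value that can be assigned to FEATURE_VIEW_RESOURCE_NAME.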

Import files into the BigQuery table using the RAG API

Use the ImportRagFiles API to import files from Google Cloud Storage or Google Drive into the BigQuery table of the Vertex AI Feature Store instance. The files are embedded and stored in the BigQuery table.

This code sample demonstrates how to import files into the BigQuery table using the RAG API.

REST

# TODO(developer): Update and uncomment the following lines:
# RAG_CORPUS_ID="your-rag-corpus-id"
#
# Google Cloud Storage bucket/file location.
# For example, "gs://rag-fos-test/"
# GCS_URIS="your-gcs-uris"

# Call ImportRagFiles API to embed files and store in the BigQuery table
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": ['\""${GCS_URIS}"\"']
    },
    "rag_file_chunking_config": {
      "chunk_size": 512
    }
  }
}'

# Call ListRagFiles API to verify the files are imported successfully
curl -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID}/ragFiles

Python

# PROJECT_ID is the value set in the corpus creation sample.
RAG_CORPUS_RESOURCE = f"projects/{PROJECT_ID}/locations/us-central1/ragCorpora/YOUR_RAG_CORPUS_ID"
GS_BUCKET = "YOUR_GS_BUCKET"

response = rag.import_files(
    corpus_name=RAG_CORPUS_RESOURCE,
    paths=[GS_BUCKET],
    chunk_size=512,  # Optional
    chunk_overlap=100,  # Optional
)
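
To show how `chunk_size` and `chunk_overlap` interact, here is a minimal sketch of fixed-size chunking with overlap over a list of tokens. It is an illustration only: `chunk_with_overlap` is a hypothetical function, and this page does not specify how RAG Engine tokenizes or splits files internally.

```python
def chunk_with_overlap(tokens: list, chunk_size: int = 512, chunk_overlap: int = 100) -> list[list]:
    """Split `tokens` into windows of `chunk_size` that share `chunk_overlap` items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be at least 0 and less than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


# With the sample's values, 1000 tokens yield windows starting at 0, 412, and 824.
sizes = [len(c) for c in chunk_with_overlap(list(range(1000)))]
print(sizes)  # [512, 512, 176]
```

A larger overlap repeats more text between neighboring chunks, which increases the number of chunks stored for the same file.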

Run a synchronization process to construct a FeatureOnlineStore index

After you upload your data into the BigQuery table, run a synchronization process to make the data available for online serving. The synchronization generates a FeatureOnlineStore index from the FeatureView and might take 20 minutes to complete.

This code sample demonstrates how to run a synchronization process to construct a FeatureOnlineStore index.

  # Call Feature Store SyncFeatureView API to run the synchronization process
  curl   "https://us-central1-aiplatform.googleapis.com/v1/${FEATURE_VIEW_RESOURCE_NAME}:sync" \
    -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8"

  # TODO(developer): Update and uncomment the following line:
  # FEATURE_VIEW_SYNC_ID="your-feature-view-sync-id"  # Returned by SyncFeatureView
  #
  # Call Vertex AI Feature Store GetFeatureViewSync API to check the status of the running synchronization
  curl   "https://us-central1-aiplatform.googleapis.com/v1/${FEATURE_VIEW_RESOURCE_NAME}/featureViewSyncs/${FEATURE_VIEW_SYNC_ID}" \
    -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8"
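
The sync ID needed for the status check can be extracted from the SyncFeatureView response instead of being copied by hand. The sketch below assumes that the response JSON contains a `featureViewSync` field holding the full sync resource name; treat that field name as an assumption and adjust it if your response differs. `feature_view_sync_id` is a hypothetical helper.

```python
import json


def feature_view_sync_id(sync_response_json: str, field: str = "featureViewSync") -> str:
    """Extract the trailing sync ID from a SyncFeatureView response body.

    Assumes `field` holds a name of the form
    <feature view resource name>/featureViewSyncs/<sync id>.
    """
    response = json.loads(sync_response_json)
    name = response[field]
    prefix, marker, sync_id = name.rpartition("/featureViewSyncs/")
    if not marker or not sync_id:
        raise ValueError(f"unexpected sync resource name: {name!r}")
    return sync_id
```

The returned value can be exported as FEATURE_VIEW_SYNC_ID for the GetFeatureViewSync call.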

Retrieve relevant contexts using the RAG API

After the synchronization process completes, you can retrieve relevant contexts from the FeatureOnlineStore index through the RetrieveContexts API.

REST

# TODO(developer): Update and uncomment the following lines:
# RETRIEVAL_QUERY="your-retrieval-query"
#
# Full RAG corpus resource name
# Format:
# "projects/${PROJECT_ID}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID}"
# RAG_CORPUS_RESOURCE="your-rag-corpus-resource"

# Call RetrieveContexts API to retrieve relevant contexts
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1:retrieveContexts \
  -d '{
    "vertex_rag_store": {
      "rag_resources": {
        "rag_corpus": '\""${RAG_CORPUS_RESOURCE}"\"'
      }
    },
    "query": {
      "text": '\""${RETRIEVAL_QUERY}"\"',
      "similarity_top_k": 10
    }
  }'

Python

# PROJECT_ID is the value set in the corpus creation sample.
RAG_CORPUS_RESOURCE = f"projects/{PROJECT_ID}/locations/us-central1/ragCorpora/YOUR_RAG_CORPUS_ID"
RETRIEVAL_QUERY = "YOUR_RETRIEVAL_QUERY"

response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
             rag_corpus=RAG_CORPUS_RESOURCE,
             # Optional: supply IDs from `rag.list_files()`.
             # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text=RETRIEVAL_QUERY,
    similarity_top_k=10,  # Optional
)
print(response)
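
If you send the REST request from a script, building the body with a JSON serializer avoids hand-written quoting and stray commas. This is a sketch only: `retrieve_contexts_body` is a hypothetical helper, and its field layout simply mirrors the REST sample in this section.

```python
import json


def retrieve_contexts_body(rag_corpus: str, text: str, similarity_top_k: int = 10) -> str:
    """Serialize a RetrieveContexts request body shaped like the REST sample."""
    body = {
        "vertex_rag_store": {
            "rag_resources": {"rag_corpus": rag_corpus},
        },
        "query": {
            "text": text,
            "similarity_top_k": similarity_top_k,
        },
    }
    # json.dumps escapes quotes and other special characters in the query text.
    return json.dumps(body)
```

The resulting string can be written to a file and passed to curl with `-d @file.json`.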

Generate content using Vertex AI Gemini API

Call the Vertex AI GenerateContent API to use Gemini models to generate content, and specify RAG_CORPUS_RESOURCE in the request to retrieve data from the FeatureOnlineStore index.

REST

# TODO(developer): Update and uncomment the following lines:
# MODEL_ID=gemini-1.5-flash-001
# GENERATE_CONTENT_PROMPT="your-generate-content-prompt"

# GenerateContent with contexts retrieved from the FeatureOnlineStore index
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/${MODEL_ID}:generateContent \
-d '{
  "contents": {
    "role": "user",
    "parts": {
      "text": '\""${GENERATE_CONTENT_PROMPT}"\"'
    }
  },
  "tools": {
    "retrieval": {
      "vertex_rag_store": {
        "rag_resources": {
          "rag_corpus": '\""${RAG_CORPUS_RESOURCE}"\"'
        },
        "similarity_top_k": 8
      }
    }
  }
}'

Python

# PROJECT_ID is the value set in the corpus creation sample.
RAG_CORPUS_RESOURCE = f"projects/{PROJECT_ID}/locations/us-central1/ragCorpora/YOUR_RAG_CORPUS_ID"

rag_resource = rag.RagResource(
    rag_corpus=RAG_CORPUS_RESOURCE,
    # Optional: supply IDs from `rag.list_files()`.
    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
)

# Wrap the corpus in a retrieval tool that the model can call.
rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[rag_resource],
            similarity_top_k=10,  # Optional
        ),
    )
)

rag_model = GenerativeModel(
    model_name="gemini-1.5-flash-001", tools=[rag_retrieval_tool]
)

GENERATE_CONTENT_PROMPT="YOUR_GENERATE_CONTENT_PROMPT"

response = rag_model.generate_content(GENERATE_CONTENT_PROMPT)
print(response.text)

What's next