This page shows you how to set up the Vertex AI Feature Store as the vector database to use with RAG Engine.
RAG Engine uses a built-in vector database powered by Spanner to store and manage vector representations of text documents. The vector database retrieves relevant documents based on the documents' semantic similarity to a given query.
Integrating Vertex AI Feature Store as an additional vector database lets RAG Engine handle large data volumes with low latency, which helps improve the performance and scalability of your RAG applications.
Set up a Vertex AI Feature Store
Vertex AI Feature Store, a managed cloud-native service, is an essential component of Vertex AI. It simplifies machine learning (ML) feature management and online serving by letting you manage feature data within a BigQuery table or view. This enables low-latency online feature serving.
For FeatureOnlineStore instances created with optimized online serving, you can take advantage of a vector similarity search to retrieve a list of semantically similar or related entities, which are known as approximate nearest neighbors.
The following sections show you how to set up a Vertex AI Feature Store instance for your RAG application.
Create a BigQuery table schema
Use the Google Cloud console to create a BigQuery table schema. The table serves as the data source and must contain the following fields.
Field name | Data type | Status |
---|---|---|
corpus_id | String | Required |
file_id | String | Required |
chunk_id | String | Required |
chunk_data_type | String | Nullable |
chunk_data | String | Nullable |
file_original_uri | String | Nullable |
embeddings | Float | Repeated |
This code sample demonstrates how to define your BigQuery table schema.
# Use this SQL query as a reference for creating the table.
CREATE TABLE `your-project-id.input_us_central1.rag_source_new` (
  `corpus_id` STRING,
  `file_id` STRING,
  `chunk_id` STRING,
  `chunk_data_type` STRING,
  `chunk_data` STRING,
  `embeddings` ARRAY<FLOAT64>,
  `file_original_uri` STRING
);
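As a rough illustration of the field requirements in the table above, the following sketch checks a candidate row against them before it is written to the table. The `RAG_SOURCE_SCHEMA` mapping and the `validate_row` helper are hypothetical names introduced for this example; they are not part of BigQuery or the RAG API.

```python
# Field types and modes copied from the schema table above.
# RAG_SOURCE_SCHEMA and validate_row are hypothetical, for illustration only.
RAG_SOURCE_SCHEMA = {
    "corpus_id": ("STRING", "REQUIRED"),
    "file_id": ("STRING", "REQUIRED"),
    "chunk_id": ("STRING", "REQUIRED"),
    "chunk_data_type": ("STRING", "NULLABLE"),
    "chunk_data": ("STRING", "NULLABLE"),
    "file_original_uri": ("STRING", "NULLABLE"),
    "embeddings": ("FLOAT", "REPEATED"),
}


def _is_number(value):
    """True for ints and floats, but not for booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_row(row):
    """Return a list of problems found in one candidate row (empty if none)."""
    problems = [f"unexpected field: {name}" for name in row if name not in RAG_SOURCE_SCHEMA]
    for name, (_data_type, mode) in RAG_SOURCE_SCHEMA.items():
        value = row.get(name)
        if mode == "REQUIRED":
            if not isinstance(value, str):
                problems.append(f"{name}: required string is missing")
        elif mode == "NULLABLE":
            if value is not None and not isinstance(value, str):
                problems.append(f"{name}: expected a string or no value")
        else:  # REPEATED: a list of numbers, possibly empty
            items = [] if value is None else value
            if not isinstance(items, list) or not all(_is_number(x) for x in items):
                problems.append(f"{name}: expected a list of floats")
    return problems


row = {"corpus_id": "c1", "file_id": "f1", "chunk_id": "k1", "embeddings": [0.1, 0.2]}
print(validate_row(row))
```

A check like this is only a local sanity test; BigQuery enforces the actual schema when rows are written.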
Provision a FeatureOnlineStore instance
To enable online serving of features, use the Vertex AI Feature Store CreateFeatureOnlineStore API to set up a FeatureOnlineStore instance. If you are provisioning a FeatureOnlineStore for the first time, the operation might take approximately five minutes to complete.
# TODO(developer): Update and uncomment the following lines:
# PROJECT_ID="your-project-id"
#
# Set feature_online_store_id.
# Example: "rag_fos_test"
# FEATURE_ONLINE_STORE_ID="your-feature-online-store-id"
# Call CreateFeatureOnlineStore to create a FeatureOnlineStore instance
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/featureOnlineStores?feature_online_store_id=${FEATURE_ONLINE_STORE_ID}" \
  -d '{
    "optimized": {}
  }'
# TODO(developer): Update and uncomment the following lines:
# Get operation_id returned in CreateFeatureOnlineStore
# OPERATION_ID="your-operation-id"
# Poll Operation status until done = true in the response
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/operations/${OPERATION_ID}
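The polling step above can be sketched as a small loop that repeats the GET request until the operation reports `done`. This is a minimal sketch, not an official client: `wait_for_operation` is a hypothetical helper, and `fetch` stands in for whatever function performs the GET request and returns the parsed JSON operation.

```python
import time


def wait_for_operation(fetch, interval_s=10.0, timeout_s=600.0, sleep=time.sleep):
    """Call fetch() until the returned operation dict has done == True.

    fetch is a caller-supplied function (an assumption of this sketch) that
    issues the GET request shown above and returns the parsed JSON response.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        operation = fetch()
        if operation.get("done"):
            # A finished long-running operation carries either "error" or "response".
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not finish before the timeout")
        sleep(interval_s)
```

The default timeout of ten minutes is an arbitrary choice that leaves headroom over the roughly five minutes mentioned above.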
Create a FeatureView resource
To connect the BigQuery table, which stores the feature data source, to the FeatureOnlineStore instance, call the CreateFeatureView API to create a FeatureView resource. When you create a FeatureView resource, choose the default distance metric DOT_PRODUCT_DISTANCE, which is defined as the negative of the dot product (smaller DOT_PRODUCT_DISTANCE indicates higher similarity).
This code sample demonstrates how to create a FeatureView resource.
# TODO(developer): Update and uncomment the following lines:
# Set feature_view_id
# Example: "feature_view_test"
# FEATURE_VIEW_ID="your-feature-view-id"
#
# The big_query_uri of the table created in the BigQuery table schema step above
# The format should be "bq://" + BigQuery table ID
# Example: "bq://tester.ragtest1.rag_testdata"
# BIG_QUERY_URI="your-big-query-uri"
# Call CreateFeatureView API to create a FeatureView
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/featureOnlineStores/${FEATURE_ONLINE_STORE_ID}/featureViews?feature_view_id=${FEATURE_VIEW_ID}" \
  -d '{
    "vertex_rag_source": {
      "uri": '\""${BIG_QUERY_URI}"\"'
    }
  }'
# Call ListFeatureViews API to verify the FeatureView is created successfully
curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/featureOnlineStores/${FEATURE_ONLINE_STORE_ID}/featureViews"
Upload data and online serving
The RAG API handles the data upload and online serving.
Use Vertex AI Feature Store in RAG Engine
After you set up the Vertex AI Feature Store instance, use the following sections to configure it as the vector database for your RAG application.
Use the Vertex AI Feature Store instance as the vector database to create a RAG corpus
To create the RAG corpus, you must use FEATURE_VIEW_RESOURCE_NAME. The RAG corpus is created and automatically associated with the Vertex AI Feature Store instance. RAG APIs use the generated rag_corpus_id to handle the data upload to the Vertex AI Feature Store instance and to retrieve relevant contexts from the rag_corpus_id.
This code sample demonstrates how to use the Vertex AI Feature Store instance as the vector database to create a RAG corpus.
REST
# TODO(developer): Update and uncomment the following lines:
# CORPUS_DISPLAY_NAME="your-corpus-display-name"
#
# Full feature view resource name
# Format: projects/${PROJECT_ID}/locations/us-central1/featureOnlineStores/${FEATURE_ONLINE_STORE_ID}/featureViews/${FEATURE_VIEW_ID}
# FEATURE_VIEW_RESOURCE_NAME="your-feature-view-resource-name"
# Call CreateRagCorpus API to create a new RAG corpus
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora" \
  -d '{
    "display_name": '\""${CORPUS_DISPLAY_NAME}"\"',
    "rag_vector_db_config": {
      "vertex_feature_store": {
        "feature_view_resource_name": '\""${FEATURE_VIEW_RESOURCE_NAME}"\"'
      }
    }
  }'
# Call ListRagCorpora API to verify the RAG corpus is created successfully
curl -sS -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora"
Python
import vertexai
from vertexai.preview import rag
from vertexai.preview.generative_models import GenerativeModel, Tool

# Set Project
PROJECT_ID = "YOUR_PROJECT_ID"  # @param {type:"string"}
vertexai.init(project=PROJECT_ID, location="us-central1")

# Configure a Google first-party embedding model
embedding_model_config = rag.EmbeddingModelConfig(
    publisher_model="text-embedding-004"
)

# Configure a Vertex AI Feature Store instance for the corpus
FEATURE_VIEW_RESOURCE_NAME = "YOUR_FEATURE_VIEW_RESOURCE_NAME"  # @param {type:"string"}
vector_db = rag.VertexFeatureStore(
    resource_name=FEATURE_VIEW_RESOURCE_NAME,
)

# Name your corpus
DISPLAY_NAME = "YOUR_DISPLAY_NAME"  # @param {type:"string"}
rag_corpus = rag.create_corpus(
    display_name=DISPLAY_NAME,
    embedding_model_config=embedding_model_config,
    vector_db=vector_db,
)

# Check the corpus just created
rag.list_corpora()
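Both samples above expect the full feature view resource name in the format shown in the REST comments. As a small convenience, the name can be assembled and checked in code. The two helpers below are hypothetical and only encode that documented format; they don't call any API.

```python
import re

# Pattern for the documented format:
# projects/{project}/locations/{location}/featureOnlineStores/{store}/featureViews/{view}
_FEATURE_VIEW_PATTERN = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/featureOnlineStores/(?P<store>[^/]+)/featureViews/(?P<view>[^/]+)$"
)


def feature_view_resource_name(project_id, store_id, view_id, location="us-central1"):
    """Build the full resource name from its parts (hypothetical helper)."""
    return (
        f"projects/{project_id}/locations/{location}"
        f"/featureOnlineStores/{store_id}/featureViews/{view_id}"
    )


def parse_feature_view_resource_name(name):
    """Split a resource name into its parts, or raise ValueError if malformed."""
    match = _FEATURE_VIEW_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a feature view resource name: {name!r}")
    return match.groupdict()


name = feature_view_resource_name("your-project-id", "rag_fos_test", "feature_view_test")
print(name)
print(parse_feature_view_resource_name(name))
```

Validating the name up front gives a clearer error than a failed CreateRagCorpus call caused by a mistyped path segment.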
Import files into the BigQuery table using the RAG API
Use the ImportRagFiles API to import files from Google Cloud Storage or Google Drive into the BigQuery table of the Vertex AI Feature Store instance. The files are embedded and stored in the BigQuery table.
This code sample demonstrates how to import files into the BigQuery table using the RAG API.
REST
# TODO(developer): Update and uncomment the following lines:
# RAG_CORPUS_ID="your-rag-corpus-id"
#
# Google Cloud Storage bucket/file location.
# For example, "gs://rag-fos-test/"
# GCS_URIS="your-gcs-uris"
# Call ImportRagFiles API to embed files and store in the BigQuery table
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import" \
  -d '{
    "import_rag_files_config": {
      "gcs_source": {
        "uris": ['\""${GCS_URIS}"\"']
      },
      "rag_file_chunking_config": {
        "chunk_size": 512
      }
    }
  }'
# Call ListRagFiles API to verify the files are imported successfully
curl -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID}/ragFiles
Python
# PROJECT_ID is the value set in the corpus creation sample.
RAG_CORPUS_RESOURCE = f"projects/{PROJECT_ID}/locations/us-central1/ragCorpora/YOUR_RAG_CORPUS_ID"
GS_BUCKET = "YOUR_GS_BUCKET"
response = rag.import_files(
    corpus_name=RAG_CORPUS_RESOURCE,
    paths=[GS_BUCKET],
    chunk_size=512,  # Optional
    chunk_overlap=100,  # Optional
)
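To give a feel for how chunk_size and chunk_overlap interact, the following sketch computes the offsets of fixed-size, overlapping chunks. It assumes both values are counted in tokens and that chunks advance by a fixed stride; this is an illustration of the parameters only, not the chunking logic that RAG Engine actually runs. `chunk_spans` is a hypothetical helper.

```python
def chunk_spans(total_tokens, chunk_size=512, chunk_overlap=100):
    """Return (start, end) token offsets for fixed-size chunks that overlap.

    Assumption of this sketch: each chunk starts chunk_size - chunk_overlap
    tokens after the previous one.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    spans = []
    start = 0
    while start < total_tokens:
        end = min(start + chunk_size, total_tokens)
        spans.append((start, end))
        if end == total_tokens:
            break
        start += step
    return spans


print(chunk_spans(1000))  # [(0, 512), (412, 924), (824, 1000)]
```

With the sample values, each chunk shares its first 100 tokens with the end of the previous chunk, so a 1,000-token file yields three chunks under these assumptions.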
Run a synchronization process to construct a FeatureOnlineStore index
After uploading your data into the BigQuery table, run a synchronization process to make your data available for online serving. You must generate a FeatureOnlineStore index using the FeatureView, and the synchronization process might take 20 minutes to complete.
This code sample demonstrates how to run a synchronization process to construct a FeatureOnlineStore index.
# Call Feature Store SyncFeatureView API to run the synchronization process
curl "https://us-central1-aiplatform.googleapis.com/v1/${FEATURE_VIEW_RESOURCE_NAME}:sync" \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8"
# Call Vertex AI Feature Store GetFeatureViewSync API to check the status of the running synchronization
# TODO(developer): Update and uncomment the following line:
# FEATURE_VIEW_SYNC_ID="your-feature-view-sync-id"  # Returned in the SyncFeatureView response
curl "https://us-central1-aiplatform.googleapis.com/v1/${FEATURE_VIEW_RESOURCE_NAME}/featureViewSyncs/${FEATURE_VIEW_SYNC_ID}" \
-X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8"
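The status request above needs only the trailing ID of the sync resource. If you have the full sync resource name, which the URL above implies has the form `${FEATURE_VIEW_RESOURCE_NAME}/featureViewSyncs/${FEATURE_VIEW_SYNC_ID}`, a small helper can extract the ID. `feature_view_sync_id` is a hypothetical name, and the assumption that you hold the full resource name should be checked against the actual SyncFeatureView response.

```python
def feature_view_sync_id(sync_resource_name):
    """Return the trailing ID from a .../featureViewSyncs/{id} resource name."""
    parts = sync_resource_name.strip("/").split("/")
    if len(parts) < 2 or parts[-2] != "featureViewSyncs" or not parts[-1]:
        raise ValueError(f"not a feature view sync resource name: {sync_resource_name!r}")
    return parts[-1]


name = (
    "projects/your-project-id/locations/us-central1/featureOnlineStores/rag_fos_test"
    "/featureViews/feature_view_test/featureViewSyncs/12345"
)
print(feature_view_sync_id(name))  # 12345
```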
Retrieve relevant contexts using the RAG API
After the synchronization process completes, you can retrieve relevant contexts from the FeatureOnlineStore index through the RetrieveContexts API.
REST
# TODO(developer): Update and uncomment the following lines:
# RETRIEVAL_QUERY="your-retrieval-query"
#
# Full RAG corpus resource name
# Format:
# "projects/${PROJECT_ID}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID}"
# RAG_CORPUS_RESOURCE="your-rag-corpus-resource"
# Call RetrieveContexts API to retrieve relevant contexts
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1:retrieveContexts" \
  -d '{
    "vertex_rag_store": {
      "rag_resources": {
        "rag_corpus": '\""${RAG_CORPUS_RESOURCE}"\"'
      }
    },
    "query": {
      "text": '\""${RETRIEVAL_QUERY}"\"',
      "similarity_top_k": 10
    }
  }'
Python
# PROJECT_ID is the value set in the corpus creation sample.
RAG_CORPUS_RESOURCE = f"projects/{PROJECT_ID}/locations/us-central1/ragCorpora/YOUR_RAG_CORPUS_ID"
RETRIEVAL_QUERY = "YOUR_RETRIEVAL_QUERY"
response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=RAG_CORPUS_RESOURCE,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text=RETRIEVAL_QUERY,
    similarity_top_k=10,  # Optional
)
print(response)
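When you call the REST endpoint from a script, interpolating the query into a shell string breaks if the query contains quotation marks. One option is to build the request body with a JSON serializer instead. The sketch below mirrors the field names of the REST sample above; `retrieve_contexts_body` is a hypothetical helper and doesn't send the request.

```python
import json


def retrieve_contexts_body(rag_corpus, query_text, similarity_top_k=10):
    """Serialize a RetrieveContexts request body shaped like the REST sample."""
    if similarity_top_k < 1:
        raise ValueError("similarity_top_k must be at least 1")
    body = {
        "vertex_rag_store": {
            "rag_resources": {"rag_corpus": rag_corpus},
        },
        "query": {
            "text": query_text,
            "similarity_top_k": similarity_top_k,
        },
    }
    # json.dumps escapes quotes and other special characters in the query text.
    return json.dumps(body, indent=2)


print(
    retrieve_contexts_body(
        "projects/your-project-id/locations/us-central1/ragCorpora/your-rag-corpus-id",
        'What does "optimized online serving" mean?',
    )
)
```

The resulting string can be written to a file and passed to curl with `-d @file`, which avoids shell quoting entirely.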
Generate content using Vertex AI Gemini API
Call the Vertex AI GenerateContent API to use Gemini models to generate content, and specify RAG_CORPUS_RESOURCE in the request to retrieve data from the FeatureOnlineStore index.
REST
# TODO(developer): Update and uncomment the following lines:
# MODEL_ID=gemini-1.5-flash-001
# GENERATE_CONTENT_PROMPT="your-generate-content-prompt"
# GenerateContent with contexts retrieved from the FeatureOnlineStore index
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/${MODEL_ID}:generateContent" \
  -d '{
    "contents": {
      "role": "user",
      "parts": {
        "text": '\""${GENERATE_CONTENT_PROMPT}"\"'
      }
    },
    "tools": {
      "retrieval": {
        "vertex_rag_store": {
          "rag_resources": {
            "rag_corpus": '\""${RAG_CORPUS_RESOURCE}"\"'
          },
          "similarity_top_k": 8
        }
      }
    }
  }'
Python
# PROJECT_ID is the value set in the corpus creation sample.
RAG_CORPUS_RESOURCE = f"projects/{PROJECT_ID}/locations/us-central1/ragCorpora/YOUR_RAG_CORPUS_ID"
rag_resource = rag.RagResource(
    rag_corpus=RAG_CORPUS_RESOURCE,
    # Optional: supply IDs from `rag.list_files()`.
    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
)
rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[rag_resource],
            similarity_top_k=10,  # Optional
        ),
    )
)
rag_model = GenerativeModel(
    model_name="gemini-1.5-flash-001", tools=[rag_retrieval_tool]
)
GENERATE_CONTENT_PROMPT = "YOUR_GENERATE_CONTENT_PROMPT"
response = rag_model.generate_content(GENERATE_CONTENT_PROMPT)
print(response.text)
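The Python SDK exposes the generated text as `response.text`. If you use the REST sample instead, you need to pull the text out of the JSON response yourself. The following sketch assumes the usual GenerateContent response layout, in which each candidate has a `content` object with a list of `parts`; `first_candidate_text` is a hypothetical helper.

```python
def first_candidate_text(response):
    """Join the text parts of the first candidate in a parsed JSON response."""
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    # Parts without a "text" field contribute nothing.
    return "".join(part.get("text", "") for part in parts)


sample = {
    "candidates": [
        {"content": {"role": "model", "parts": [{"text": "Hello, "}, {"text": "world."}]}}
    ]
}
print(first_candidate_text(sample))  # Hello, world.
```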
What's next
- To learn more about grounding, see Grounding overview.
- To learn more about RAG Engine, see Use RAG Engine.
- To learn more about grounding and RAG, see Ground responses using RAG.