This page introduces Vertex AI Search integration with the RAG Engine.
Vertex AI Search provides a solution for retrieving and managing data within your Vertex AI RAG applications. By using Vertex AI Search as your retrieval backend, you can improve performance, scalability, and ease of integration.
Enhanced performance and scalability: Vertex AI Search is designed to handle large volumes of data with exceptionally low latency. This translates to faster response times and improved performance for your RAG applications, especially when dealing with complex or extensive knowledge bases.
Simplified data management: Import your data from various sources, such as websites, BigQuery datasets, and Cloud Storage buckets, to streamline your data ingestion process.
Seamless integration: Vertex AI provides built-in integration with Vertex AI Search, which lets you select Vertex AI Search as the corpus backend for your RAG application. This simplifies the integration process and helps to ensure optimal compatibility between components.
Improved LLM output quality: By using the retrieval capabilities of Vertex AI Search, you can help to ensure that your RAG application retrieves the most relevant information from your corpus, which leads to more accurate and informative LLM-generated outputs.
Vertex AI Search
Vertex AI Search brings together the power of deep information retrieval, natural language processing, and the latest features in large language model (LLM) processing, which helps to understand user intent and to return the most relevant results for the user.
With Vertex AI Search, you can build a Google-quality search application using data that you control.
Configure Vertex AI Search
To set up Vertex AI Search, do the following:
Use Vertex AI Search as a retrieval backend for RAG Engine
After Vertex AI Search is set up, follow these steps to set it as the retrieval backend for your RAG application.
Set Vertex AI Search as the retrieval backend when you create a RAG corpus
These code samples show you how to configure Vertex AI Search as the retrieval backend for a RAG corpus.
REST
To use the command line to create a RAG corpus, do the following:
Create a RAG corpus
Replace the following variables used in the code sample:
- PROJECT_ID: The ID of your Google Cloud project.
- LOCATION: The region to process the request.
- DISPLAY_NAME: The display name of the RAG corpus that you want to create.
- ENGINE_NAME: The full resource name of the Vertex AI Search engine or Vertex AI Search Datastore.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora" \
  -d '{
    "display_name": "DISPLAY_NAME",
    "vertex_ai_search_config": {
      "serving_config": "ENGINE_NAME/servingConfigs/default_search"
    }
  }'
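The same request can also be assembled programmatically. The following is a minimal Python sketch, using only the standard library, that builds the URL and JSON body from the curl command above without sending anything. The helper name `build_create_corpus_request` is hypothetical and not part of any SDK; the endpoint and field names are copied from the curl sample.

```python
import json


def build_create_corpus_request(project_id, location, display_name, engine_name):
    """Return (url, body) for the RAG corpus create call shown in the curl sample.

    Hypothetical helper: it only assembles strings and does not call the API.
    """
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project_id}/locations/{location}/ragCorpora"
    )
    body = {
        "display_name": display_name,
        "vertex_ai_search_config": {
            # The serving config is the engine or datastore name plus a fixed suffix.
            "serving_config": f"{engine_name}/servingConfigs/default_search",
        },
    }
    return url, json.dumps(body)


url, body = build_create_corpus_request(
    "my-project", "us-central1", "my-corpus", "ENGINE_NAME"
)
print(url)
print(body)
```

Keeping the request construction separate from the HTTP call makes it easy to inspect the payload before sending it with an authenticated client.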
Monitor progress
Replace the following variables used in the code sample:
- PROJECT_ID: The ID of your Google Cloud project.
- LOCATION: The region to process the request.
- OPERATION_ID: The ID of the RAG corpus create operation.
curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID"
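The GET request above is typically repeated until the create operation finishes. The sketch below shows one way to poll, assuming the response follows the standard long-running operation shape with a boolean `done` field and an optional `error` field. The `wait_for_operation` helper and the `fetch` callable are hypothetical; in practice `fetch` would wrap the authenticated GET request.

```python
import time


def wait_for_operation(fetch, interval_s=2.0, timeout_s=60.0, sleep=time.sleep):
    """Poll fetch() until the operation reports done, then return it.

    fetch is a hypothetical callable that returns the operation as a dict,
    for example by issuing the GET request from the curl sample.
    """
    waited = 0.0
    while True:
        operation = fetch()
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation
        if waited >= timeout_s:
            raise TimeoutError("Operation did not finish in time")
        sleep(interval_s)
        waited += interval_s


# Usage with a stand-in fetcher that finishes on the third poll.
_responses = iter([{"done": False}, {}, {"done": True, "response": {"name": "corpus"}}])
result = wait_for_operation(lambda: next(_responses), sleep=lambda _: None)
print(result["response"])
```

The `sleep` parameter is injected so the loop can be exercised without real delays; the interval and timeout values are arbitrary defaults, not recommendations from the service.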
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Replace the following variables used in the sample code:
- PROJECT_ID: The ID of your Google Cloud project.
- LOCATION: The region to process the request.
- DISPLAY_NAME: The display name of the RAG corpus that you want to create.
- ENGINE_NAME: The full resource name of the Vertex AI Search engine or Vertex AI Search Datastore.
from vertexai.preview import rag
import vertexai
PROJECT_ID = "PROJECT_ID"
DISPLAY_NAME = "DISPLAY_NAME"
ENGINE_NAME = "ENGINE_NAME"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="LOCATION")
# Create a corpus
vertex_ai_search_config = rag.VertexAiSearchConfig(
    serving_config=f"{ENGINE_NAME}/servingConfigs/default_search",
)
rag_corpus = rag.create_corpus(
    display_name=DISPLAY_NAME,
    vertex_ai_search_config=vertex_ai_search_config,
)
# Check the corpus just created
new_corpus = rag.get_corpus(name=rag_corpus.name)
print(new_corpus)
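The only part of the configuration derived from ENGINE_NAME is the serving config path. As a hedged illustration, the helper below builds that path, tolerates a trailing slash, and rejects a name that already includes a serving config, which would otherwise produce a malformed path. The function name `to_serving_config` is hypothetical, and the example resource name is a placeholder whose exact shape is assumed for illustration.

```python
def to_serving_config(engine_name, serving_config_id="default_search"):
    """Build the serving config path used in the sample above (hypothetical helper)."""
    name = engine_name.rstrip("/")
    if not name:
        raise ValueError("engine_name must not be empty")
    if "/servingConfigs/" in name:
        raise ValueError("engine_name should not already include a serving config")
    return f"{name}/servingConfigs/{serving_config_id}"


# Placeholder resource name; the exact shape is an assumption for illustration.
example = "projects/my-project/locations/global/collections/default_collection/engines/my-engine/"
print(to_serving_config(example))
```

The returned string can be passed as the `serving_config` argument in place of the inline f-string.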
Retrieve contexts using the RAG API
After you create the RAG corpus, you can retrieve relevant contexts from Vertex AI Search through the RetrieveContexts API.
REST
This code sample demonstrates how to retrieve contexts using REST.
Replace the following variables used in the code sample:
- PROJECT_ID: The ID of your Google Cloud project.
- LOCATION: The region to process the request.
- RAG_CORPUS_RESOURCE: The name of the RAG corpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
- TEXT: The query text to get relevant contexts.
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts" \
  -d '{
    "vertex_rag_store": {
      "rag_resources": {
        "rag_corpus": "RAG_CORPUS_RESOURCE"
      }
    },
    "query": {
      "text": "TEXT"
    }
  }'
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Replace the following variables used in the code sample:
- PROJECT_ID: The ID of your Google Cloud project.
- LOCATION: The region to process the request.
- RAG_CORPUS_RESOURCE: The name of the RAG corpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
- TEXT: The query text to get relevant contexts.
from vertexai.preview import rag
import vertexai
PROJECT_ID = "PROJECT_ID"
# Full resource name, in the format
# projects/{project}/locations/{location}/ragCorpora/{rag_corpus}
CORPUS_NAME = "RAG_CORPUS_RESOURCE"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="LOCATION")
response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=CORPUS_NAME,
        )
    ],
    text="TEXT",
    similarity_top_k=10,  # Optional
)
print(response)
# Example response:
# contexts {
# contexts {
# source_uri: "gs://your-bucket-name/file.txt"
# text: "....
# ....
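To work with the retrieved contexts, it is often easier to flatten them into simple records. The sketch below assumes the response has been converted to a dict shaped like the example output above (a `contexts` object holding a `contexts` list with `source_uri` and `text` fields). That dict shape and the `summarize_contexts` helper are assumptions for illustration, not part of the SDK.

```python
def summarize_contexts(response, max_chars=80):
    """Return (source_uri, snippet) pairs from a retrieval response dict."""
    items = response.get("contexts", {}).get("contexts", [])
    summary = []
    for item in items:
        text = item.get("text", "")
        # Collapse whitespace so each snippet prints on one line.
        snippet = " ".join(text.split())[:max_chars]
        summary.append((item.get("source_uri", ""), snippet))
    return summary


# Stand-in response mirroring the example output's structure.
sample = {
    "contexts": {
        "contexts": [
            {"source_uri": "gs://your-bucket-name/file.txt", "text": "first\nchunk of text"},
            {"source_uri": "gs://your-bucket-name/other.txt", "text": "second chunk"},
        ]
    }
}
for uri, snippet in summarize_contexts(sample):
    print(uri, "->", snippet)
```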
Generate content using Vertex AI Gemini API
REST
To generate content using Gemini models, make a call to the Vertex AI GenerateContent API. When you specify the RAG_CORPUS_RESOURCE in the request, the model automatically retrieves data from Vertex AI Search.
Replace the following variables used in the sample code:
- PROJECT_ID: The ID of your Google Cloud project.
- LOCATION: The region to process the request.
- MODEL_ID: LLM model for content generation. For example, gemini-1.5-flash-002.
- GENERATION_METHOD: LLM method for content generation. For example, generateContent or streamGenerateContent.
- INPUT_PROMPT: The text that is sent to the LLM for content generation. Try to use a prompt relevant to the documents in Vertex AI Search.
- RAG_CORPUS_RESOURCE: The name of the RAG corpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
- SIMILARITY_TOP_K: Optional: The number of top contexts to retrieve.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD" \
  -d '{
    "contents": {
      "role": "user",
      "parts": {
        "text": "INPUT_PROMPT"
      }
    },
    "tools": {
      "retrieval": {
        "disable_attribution": false,
        "vertex_rag_store": {
          "rag_resources": {
            "rag_corpus": "RAG_CORPUS_RESOURCE"
          },
          "similarity_top_k": SIMILARITY_TOP_K
        }
      }
    }
  }'
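Because SIMILARITY_TOP_K is optional, a request builder can omit the field rather than send a placeholder. The following Python sketch assembles the same JSON body as the curl sample and includes `similarity_top_k` only when a value is supplied. The `build_generate_content_body` helper is hypothetical; the field names are taken from the sample above, and how the service behaves when the field is absent is not covered here.

```python
import json


def build_generate_content_body(prompt, rag_corpus, similarity_top_k=None):
    """Assemble the JSON body from the curl sample (hypothetical helper)."""
    rag_store = {"rag_resources": {"rag_corpus": rag_corpus}}
    if similarity_top_k is not None:
        # Only include the optional field when the caller sets it.
        rag_store["similarity_top_k"] = similarity_top_k
    return {
        "contents": {"role": "user", "parts": {"text": prompt}},
        "tools": {
            "retrieval": {
                "disable_attribution": False,
                "vertex_rag_store": rag_store,
            }
        },
    }


body = build_generate_content_body("INPUT_PROMPT", "RAG_CORPUS_RESOURCE", similarity_top_k=10)
print(json.dumps(body, indent=2))
```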
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Replace the following variables used in the sample code:
- PROJECT_ID: The ID of your Google Cloud project.
- LOCATION: The region to process the request.
- MODEL_ID: LLM model for content generation. For example, gemini-1.5-flash-002.
- GENERATION_METHOD: LLM method for content generation. For example, generateContent or streamGenerateContent.
- INPUT_PROMPT: The text that is sent to the LLM for content generation. Try to use a prompt relevant to the documents in Vertex AI Search.
- RAG_CORPUS_RESOURCE: The name of the RAG corpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
- SIMILARITY_TOP_K: Optional: The number of top contexts to retrieve.
from vertexai.preview import rag
from vertexai.preview.generative_models import GenerativeModel, Tool
import vertexai
PROJECT_ID = "PROJECT_ID"
# Full resource name, in the format
# projects/{project}/locations/{location}/ragCorpora/{rag_corpus}
CORPUS_NAME = "RAG_CORPUS_RESOURCE"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="LOCATION")
rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=CORPUS_NAME,
                )
            ],
            similarity_top_k=10,  # Optional
        ),
    )
)
rag_model = GenerativeModel(
    model_name="MODEL_ID", tools=[rag_retrieval_tool]
)
response = rag_model.generate_content("INPUT_PROMPT")
print(response.text)
# Example response:
# The sky appears blue due to a phenomenon called Rayleigh scattering.
# Sunlight, which contains all colors of the rainbow, is scattered
# by the tiny particles in the Earth's atmosphere....
# ...
What's next
- To learn more about choosing embedding models, see Use embedding models with RAG Engine.
- To learn more about RAG Engine, see Overview of RAG Engine.