This page lists Gemini models, self-deployed models, and models with
managed APIs on Vertex AI that support Vertex AI RAG Engine. The following table lists the Gemini models and their versions that
support Vertex AI RAG Engine: Fine-tuned Gemini models are unsupported when the Gemini
models use Vertex AI RAG Engine. Vertex AI RAG Engine supports all models in
Model Garden. Use Vertex AI RAG Engine with your self-deployed open model endpoints. Replace the variables used in the code sample: ENDPOINT_ID: Your endpoint ID. The models with managed APIs on Vertex AI that support
Vertex AI RAG Engine include the following: The following code sample demonstrates how to use the Gemini
Replace the variables used in the code sample: RAG_RETRIEVAL_TOOL: Your RAG retrieval tool. The following code sample demonstrates how to use the OpenAI compatible
Replace the variables used in the code sample: CONTENT: Your content.Gemini models
Self-deployed models
# Create a model instance with your self-deployed open model endpoint
rag_model = GenerativeModel(
"projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID",
tools=[rag_retrieval_tool]
)
Models with managed APIs on Vertex AI
GenerateContent
API to create a generative model instance. The model ID,
/publisher/meta/models/llama-3.1-405B-instruct-maas
, is found in the
model card.
# Create a model instance with Llama 3.1 MaaS endpoint
rag_model = GenerativeModel(
"projects/PROJECT_ID/locations/LOCATION/publisher/meta/models/llama-3.1-405B-instruct-maas",
tools=RAG_RETRIEVAL_TOOL
)
ChatCompletions
API to generate a model response.
meta/llama-3.1-405b-instruct-maas
. # Generate a response with Llama 3.1 MaaS endpoint
response = client.chat.completions.create(
model="MODEL_ID",
messages=[{"ROLE": "USER", "content": "CONTENT"}],
extra_body={
"extra_body": {
"google": {
"vertex_rag_store": {
"rag_resources": {
"rag_corpus": "RAG_CORPUS_ID"
},
"similarity_top_k": 10
}
}
}
},
)
What's next
Vertex AI RAG Engine supported models
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-27 UTC.