# Create a model instance with your self-deployed open model endpointrag_model=GenerativeModel("projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID",tools=[rag_retrieval_tool])
以下程式碼範例示範如何使用 Gemini GenerateContent API 建立生成模型執行個體。模型 ID /publisher/meta/models/llama-3.1-405B-instruct-maas 位於模型資訊卡中。
取代程式碼範例中使用的變數:
PROJECT_ID:您的專案 ID。
LOCATION:處理要求的區域。
RAG_RETRIEVAL_TOOL:您的 RAG 檢索工具。
# Create a model instance with Llama 3.1 MaaS endpointrag_model=GenerativeModel("projects/PROJECT_ID/locations/LOCATION/publisher/meta/models/llama-3.1-405B-instruct-maas",tools=RAG_RETRIEVAL_TOOL)
以下程式碼範例示範如何使用與 OpenAI 相容的 ChatCompletions API 生成模型回應。
INPUT_PROMPT:傳送至大型語言模型以生成內容的文字。使用與 Vertex AI Search 文件相關的提示。
RAG_CORPUS_ID:RAG 語料庫資源的 ID。
ROLE:您的角色。
USER:您的使用者名稱。
CONTENT:您的內容。
# Generate a response with Llama 3.1 MaaS endpointresponse=client.chat.completions.create(model="MODEL_ID",messages=[{"ROLE":"USER","content":"CONTENT"}],extra_body={"extra_body":{"google":{"vertex_rag_store":{"rag_resources":{"rag_corpus":"RAG_CORPUS_ID"},"similarity_top_k":10}}}},)
[[["容易理解","easyToUnderstand","thumb-up"],["確實解決了我的問題","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["難以理解","hardToUnderstand","thumb-down"],["資訊或程式碼範例有誤","incorrectInformationOrSampleCode","thumb-down"],["缺少我需要的資訊/範例","missingTheInformationSamplesINeed","thumb-down"],["翻譯問題","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["上次更新時間:2025-09-04 (世界標準時間)。"],[],[],null,["# Vertex AI RAG Engine supported models\n\n| The [VPC-SC security controls](/vertex-ai/generative-ai/docs/security-controls) and\n| CMEK are supported by Vertex AI RAG Engine. Data residency and AXT security controls aren't\n| supported.\n\nThis page lists Gemini models, self-deployed models, and models with\nmanaged APIs on Vertex AI that support Vertex AI RAG Engine.\n\nGemini models\n-------------\n\nThe following table lists the Gemini models and their versions that\nsupport Vertex AI RAG Engine:\n\n- [Gemini 2.5 Flash-Lite](/vertex-ai/generative-ai/docs/models/gemini/2-5-flash-lite)\n- [Gemini 2.5 Pro](/vertex-ai/generative-ai/docs/models/gemini/2-5-pro)\n- [Gemini 2.5 Flash](/vertex-ai/generative-ai/docs/models/gemini/2-5-flash)\n- [Gemini 2.0 Flash](/vertex-ai/generative-ai/docs/models/gemini/2-0-flash)\n\nFine-tuned Gemini models are unsupported when the Gemini\nmodels use Vertex AI RAG Engine.\n\nSelf-deployed models\n--------------------\n\nVertex AI RAG Engine supports all models in\n[Model Garden](/vertex-ai/generative-ai/docs/model-garden/explore-models).\n\nUse Vertex AI RAG Engine with your self-deployed open model endpoints.\n\nReplace the variables used in the code sample:\n\n- **\u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e**: Your project ID.\n- **\u003cvar translate=\"no\"\u003eLOCATION\u003c/var\u003e**: The region to process your request.\n- **\u003cvar translate=\"no\"\u003eENDPOINT_ID\u003c/var\u003e**: Your endpoint ID.\n\n # Create a model instance with your self-deployed open model endpoint\n rag_model = GenerativeModel(\n \"projects/\u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e/locations/\u003cvar translate=\"no\"\u003eLOCATION\u003c/var\u003e/endpoints/\u003cvar translate=\"no\"\u003eENDPOINT_ID\u003c/var\u003e\",\n tools=[rag_retrieval_tool]\n )\n\nModels with managed APIs on Vertex AI\n-------------------------------------\n\nThe models with managed APIs on Vertex AI that support\nVertex AI RAG Engine include the following:\n\n- [Mistral on Vertex AI](/vertex-ai/generative-ai/docs/partner-models/mistral)\n- [Llama 3.1 and 3.2](/vertex-ai/generative-ai/docs/partner-models/llama)\n\nThe following code sample demonstrates how to use the Gemini\n`GenerateContent` API to create a generative model instance. The model ID,\n`/publisher/meta/models/llama-3.1-405B-instruct-maas`, is found in the\n[model card](/vertex-ai/generative-ai/docs/model-garden/explore-models).\n\nReplace the variables used in the code sample:\n\n- **\u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e**: Your project ID.\n- **\u003cvar translate=\"no\"\u003eLOCATION\u003c/var\u003e**: The region to process your request.\n- **\u003cvar translate=\"no\"\u003eRAG_RETRIEVAL_TOOL\u003c/var\u003e**: Your RAG retrieval tool.\n\n # Create a model instance with Llama 3.1 MaaS endpoint\n rag_model = GenerativeModel(\n \"projects/\u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e/locations/\u003cvar translate=\"no\"\u003eLOCATION\u003c/var\u003e/publisher/meta/models/llama-3.1-405B-instruct-maas\",\n tools=\u003cvar translate=\"no\"\u003e\u003cspan class=\"devsite-syntax-n\"\u003eRAG_RETRIEVAL_TOOL\u003c/span\u003e\u003c/var\u003e\n )\n\nThe following code sample demonstrates how to use the OpenAI compatible\n`ChatCompletions` API to generate a model response.\n\nReplace the variables used in the code sample:\n\n- **\u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e**: Your project ID.\n- **\u003cvar translate=\"no\"\u003eLOCATION\u003c/var\u003e**: The region to process your request.\n- **\u003cvar translate=\"no\"\u003eMODEL_ID\u003c/var\u003e** : LLM model for content generation. For example, `meta/llama-3.1-405b-instruct-maas`.\n- **\u003cvar translate=\"no\"\u003eINPUT_PROMPT\u003c/var\u003e**: The text sent to the LLM for content generation. Use a prompt relevant to the documents in Vertex AI Search.\n- **\u003cvar translate=\"no\"\u003eRAG_CORPUS_ID\u003c/var\u003e**: The ID of the RAG corpus resource.\n- **\u003cvar translate=\"no\"\u003eROLE\u003c/var\u003e**: Your role.\n- **\u003cvar translate=\"no\"\u003eUSER\u003c/var\u003e**: Your username.\n- **\u003cvar translate=\"no\"\u003eCONTENT\u003c/var\u003e**: Your content.\n\n # Generate a response with Llama 3.1 MaaS endpoint\n response = client.chat.completions.create(\n model=\"\u003cvar translate=\"no\"\u003eMODEL_ID\u003c/var\u003e\",\n messages=[{\"\u003cvar translate=\"no\"\u003eROLE\u003c/var\u003e\": \"\u003cvar translate=\"no\"\u003eUSER\u003c/var\u003e\", \"content\": \"\u003cvar translate=\"no\"\u003eCONTENT\u003c/var\u003e\"}],\n extra_body={\n \"extra_body\": {\n \"google\": {\n \"vertex_rag_store\": {\n \"rag_resources\": {\n \"rag_corpus\": \"\u003cvar translate=\"no\"\u003eRAG_CORPUS_ID\u003c/var\u003e\"\n },\n \"similarity_top_k\": 10\n }\n }\n }\n },\n )\n\nWhat's next\n-----------\n\n- [Use Embedding models with Vertex AI RAG Engine](/vertex-ai/generative-ai/docs/use-embedding-models)."]]