RAG quickstart

This page shows you how to use the Vertex AI SDK to run Vertex AI RAG Engine tasks.

You can also follow along step by step in the notebook "Intro to Vertex AI RAG Engine".

Required roles

Grant roles to your user account. Run the following command once for each of the following IAM roles: roles/aiplatform.user

gcloud projects add-iam-policy-binding PROJECT_ID --member="user:USER_IDENTIFIER" --role=ROLE

Replace the following:

PROJECT_ID: Your project ID.
USER_IDENTIFIER: The identifier for your user account. For example, myemail@example.com.
ROLE: The IAM role that you grant to your user account.
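
Because the command runs once per role, it can help to generate the invocations from a list. The following is a minimal sketch, not part of the SDK: the helper name build_grant_command and the ROLES list are hypothetical, and the script only prints the commands rather than running them.

```python
import shlex

# Roles named on this page; extend the list if your setup needs more.
ROLES = ["roles/aiplatform.user"]


def build_grant_command(project_id: str, user_identifier: str, role: str) -> list[str]:
    """Return the gcloud argument list that grants one role to one user."""
    return [
        "gcloud",
        "projects",
        "add-iam-policy-binding",
        project_id,
        f"--member=user:{user_identifier}",
        f"--role={role}",
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your project ID and account.
    for role in ROLES:
        command = build_grant_command("PROJECT_ID", "myemail@example.com", role)
        print(shlex.join(command))
```

Printing the commands first makes it easy to review them before pasting them into a shell.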

Prepare your Google Cloud console

To use Vertex AI RAG Engine, do the following:

Install the Vertex AI SDK for Python.
Run this command in the Google Cloud console to set your project.

gcloud config set project {project}

Run this command to authorize your login.

gcloud auth application-default login
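
To check that the login step produced credentials, one option is to look for the Application Default Credentials file. The sketch below assumes the default gcloud configuration directory on Linux or macOS and honors the GOOGLE_APPLICATION_CREDENTIALS environment variable when it is set. The helper name default_adc_path is hypothetical, and other setups (Windows, a custom gcloud config directory) store the file elsewhere.

```python
import os
from pathlib import Path


def default_adc_path(environ, home) -> Path:
    """Return where Application Default Credentials are expected to live.

    Assumption: Linux/macOS layout with the default gcloud config directory.
    """
    explicit = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        # An explicit credentials file takes precedence over the default one.
        return Path(explicit)
    return Path(home) / ".config" / "gcloud" / "application_default_credentials.json"


if __name__ == "__main__":
    path = default_adc_path(os.environ, Path.home())
    print(path, "exists" if path.exists() else "is missing")
```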

Run Vertex AI RAG Engine

Copy and paste this sample code into the Google Cloud console to run Vertex AI RAG Engine.

Python

To learn how to install or update the Vertex AI SDK for Python, see "Install the Vertex AI SDK for Python". For more information, see the Python API reference documentation.

from vertexai import rag
from vertexai.generative_models import GenerativeModel, Tool
import vertexai

# Create a RAG Corpus, Import Files, and Generate a response

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# display_name = "test_corpus"
# paths = ["https://drive.google.com/file/d/123", "gs://my_bucket/my_files_dir"]  # Supports Google Cloud Storage and Google Drive Links

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

# Create RagCorpus
# Configure embedding model, for example "text-embedding-005".
embedding_model_config = rag.RagEmbeddingModelConfig(
    vertex_prediction_endpoint=rag.VertexPredictionEndpoint(
        publisher_model="publishers/google/models/text-embedding-005"
    )
)

rag_corpus = rag.create_corpus(
    display_name=display_name,
    backend_config=rag.RagVectorDbConfig(
        rag_embedding_model_config=embedding_model_config
    ),
)

# Import Files to the RagCorpus
rag.import_files(
    rag_corpus.name,
    paths,
    # Optional
    transformation_config=rag.TransformationConfig(
        chunking_config=rag.ChunkingConfig(
            chunk_size=512,
            chunk_overlap=100,
        ),
    ),
    max_embedding_requests_per_min=1000,  # Optional
)

# Direct context retrieval
rag_retrieval_config = rag.RagRetrievalConfig(
    top_k=3,  # Optional
    filter=rag.Filter(vector_distance_threshold=0.5),  # Optional
)
response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=rag_corpus.name,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="What is RAG and why it is helpful?",
    rag_retrieval_config=rag_retrieval_config,
)
print(response)

# Enhance generation
# Create a RAG retrieval tool
rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=rag_corpus.name,  # Currently only 1 corpus is allowed.
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            rag_retrieval_config=rag_retrieval_config,
        ),
    )
)

# Create a Gemini model instance
rag_model = GenerativeModel(
    model_name="gemini-2.0-flash-001", tools=[rag_retrieval_tool]
)

# Generate response
response = rag_model.generate_content("What is RAG and why it is helpful?")
print(response.text)
# Example response:
#   RAG stands for Retrieval-Augmented Generation.
#   It's a technique used in AI to enhance the quality of responses
# ...
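
The curl steps on this page take a RAG_CORPUS_ID, while the Python sample works with rag_corpus.name. Assuming the name follows the projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID pattern that the REST paths on this page use, a small helper can split it into its parts. The function parse_corpus_name is hypothetical and not part of the SDK.

```python
import re

# Assumed resource-name pattern, mirroring the REST paths used on this page.
_CORPUS_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/ragCorpora/(?P<corpus_id>[^/]+)$"
)


def parse_corpus_name(name: str) -> dict[str, str]:
    """Split a RAG corpus resource name into project, location, and corpus ID."""
    match = _CORPUS_NAME.match(name)
    if match is None:
        raise ValueError(f"Not a RAG corpus resource name: {name!r}")
    return match.groupdict()


if __name__ == "__main__":
    parts = parse_corpus_name("projects/my-project/locations/us-central1/ragCorpora/123")
    print(parts["corpus_id"])
```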

curl

Create a RAG corpus.

  # Replace the right-hand side of each export with your own value.
  export LOCATION=LOCATION
  export PROJECT_ID=PROJECT_ID
  export CORPUS_DISPLAY_NAME=CORPUS_DISPLAY_NAME

  # CreateRagCorpus
  # Output: CreateRagCorpusOperationMetadata
  curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora" \
  -d '{
        "display_name" : "'"${CORPUS_DISPLAY_NAME}"'"
    }'

For more information, see the "Create a RAG corpus" example.
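
The same request can be assembled in Python before it is sent with any HTTP client. This is a sketch under the assumption that the endpoint and body match the curl command above; the helper name build_create_corpus_request is hypothetical, and the code only builds the URL and JSON body without making a network call.

```python
import json


def build_create_corpus_request(project_id: str, location: str, display_name: str):
    """Return the (url, json_body) pair for the CreateRagCorpus REST call."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/ragCorpora"
    )
    # json.dumps takes care of quoting and escaping the display name.
    body = json.dumps({"display_name": display_name})
    return url, body


if __name__ == "__main__":
    url, body = build_create_corpus_request("my-project", "us-central1", "test_corpus")
    print(url)
    print(body)
```

Building the body with json.dumps avoids the shell-quoting pitfalls of embedding variables in a single-quoted JSON string.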

Import RAG files.

  # ImportRagFiles
  # Import a single Cloud Storage file or all files in a Cloud Storage bucket.
  # Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID, GCS_URIS
  # Replace the right-hand side of each export with your own value.
  export RAG_CORPUS_ID=RAG_CORPUS_ID
  export GCS_URIS=GCS_URIS
  export CHUNK_SIZE=CHUNK_SIZE
  export CHUNK_OVERLAP=CHUNK_OVERLAP
  export EMBEDDING_MODEL_QPM_RATE=EMBEDDING_MODEL_QPM_RATE

  # Output: ImportRagFilesOperationMetadataNumber
  # Use ListRagFiles, or import_result_sink to get the correct rag_file_id.
  # GCS_URIS holds one Cloud Storage URI here; add array entries for more.
  curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import" \
  -d '{
    "import_rag_files_config": {
      "gcs_source": {
        "uris": ["'"${GCS_URIS}"'"]
      },
      "rag_file_chunking_config": {
        "chunk_size": '"${CHUNK_SIZE}"',
        "chunk_overlap": '"${CHUNK_OVERLAP}"'
      },
      "max_embedding_requests_per_min": '"${EMBEDDING_MODEL_QPM_RATE}"'
    }
  }'

For more information, see the "Import RAG files" example.
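
To build intuition for how chunk_size and chunk_overlap interact, here is a conceptual sketch of fixed-length chunking with overlap. It is an illustration only: this page does not describe how RAG Engine tokenizes text or handles chunk boundaries, so the function chunk_with_overlap and its behavior are assumptions, not the service's implementation.

```python
def chunk_with_overlap(tokens, chunk_size=512, chunk_overlap=100):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if chunk_overlap < 0 or chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks


if __name__ == "__main__":
    for chunk in chunk_with_overlap(list(range(10)), chunk_size=4, chunk_overlap=1):
        print(chunk)
```

With the values from the Python sample (512 and 100), each new chunk in this sketch starts 412 tokens after the previous one, so consecutive chunks share 100 tokens of context.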

Run a RAG retrieval query.

  export RAG_CORPUS_RESOURCE=RAG_CORPUS_RESOURCE
  export VECTOR_DISTANCE_THRESHOLD=VECTOR_DISTANCE_THRESHOLD
  export SIMILARITY_TOP_K=SIMILARITY_TOP_K

Save the following request body as request.json, replacing each placeholder in the body and in the URL with its value. The file is not expanded by the shell, so edit the values in place.

  {
    "vertex_rag_store": {
      "rag_resources": {
        "rag_corpus": "RAG_CORPUS_RESOURCE"
      },
      "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
    },
    "query": {
      "text": "TEXT",
      "similarity_top_k": SIMILARITY_TOP_K
    }
  }

  curl -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json; charset=utf-8" \
      -d @request.json \
      "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts"

For more information, see the RAG Engine API.
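
The two retrieval knobs, the top-k count and the vector distance threshold, can be pictured as a filter followed by a cut. The sketch below is conceptual and makes two assumptions that this page does not confirm: a smaller distance means a closer match, and only results strictly below the threshold are kept. The function select_contexts is hypothetical and is not how the service is implemented.

```python
def select_contexts(scored_contexts, top_k=3, vector_distance_threshold=0.5):
    """Keep contexts under the distance threshold, then return the top_k closest."""
    kept = [
        (text, distance)
        for text, distance in scored_contexts
        if distance < vector_distance_threshold
    ]
    kept.sort(key=lambda item: item[1])  # closest matches first
    return kept[:top_k]


if __name__ == "__main__":
    candidates = [("a", 0.1), ("b", 0.7), ("c", 0.3), ("d", 0.2), ("e", 0.4)]
    print(select_contexts(candidates))
```

In this picture, lowering the threshold removes weak matches even when fewer than top_k results remain, while lowering top_k caps how many contexts are returned.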

Generate content.

Save the following request body as request.json, replacing each placeholder with its value:

{
  "contents": {
    "role": "USER",
    "parts": {
      "text": "INPUT_PROMPT"
    }
  },
  "tools": {
    "retrieval": {
      "disable_attribution": false,
      "vertex_rag_store": {
        "rag_resources": {
          "rag_corpus": "RAG_CORPUS_RESOURCE"
        },
        "similarity_top_k": SIMILARITY_TOP_K,
        "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
      }
    }
  }
}
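
Filling in placeholders by hand is error-prone, so the request body above can also be generated with a few lines of Python. This is a sketch that mirrors the JSON structure shown on this page; the helper name build_generate_request is hypothetical, and the example values are placeholders rather than recommended settings.

```python
import json


def build_generate_request(prompt, rag_corpus, similarity_top_k, vector_distance_threshold):
    """Return the request body for a RAG-grounded generation call as a dict."""
    return {
        "contents": {"role": "USER", "parts": {"text": prompt}},
        "tools": {
            "retrieval": {
                "disable_attribution": False,
                "vertex_rag_store": {
                    "rag_resources": {"rag_corpus": rag_corpus},
                    "similarity_top_k": similarity_top_k,  # integer, not a string
                    "vector_distance_threshold": vector_distance_threshold,
                },
            }
        },
    }


if __name__ == "__main__":
    body = build_generate_request(
        prompt="What is RAG and why it is helpful?",
        rag_corpus="projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID",
        similarity_top_k=3,
        vector_distance_threshold=0.5,
    )
    # Redirect this output to request.json to use it with the curl command.
    print(json.dumps(body, indent=2))
```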

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD"