
RAG quickstart

This page shows you how to use the Vertex AI SDK to run Vertex AI RAG Engine tasks.

You can also follow along using the notebook Intro to Vertex AI RAG Engine.

Required roles

Grant roles to your user account. Run the following command once for each of the following IAM roles: roles/aiplatform.user

gcloud projects add-iam-policy-binding PROJECT_ID --member="user:USER_IDENTIFIER" --role=ROLE

Replace the following:

PROJECT_ID: Your project ID.
USER_IDENTIFIER: The identifier for your user account. For example, myemail@example.com.
ROLE: The IAM role that you grant to your user account.
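
Because the grant command is run once per role, it can be convenient to generate the command lines programmatically when more than one role is involved. The following is a minimal sketch under stated assumptions: the helper name `build_grant_commands` and the example project and user values are hypothetical, and the code only prints the commands rather than running them.

```python
import shlex


def build_grant_commands(project_id, user_identifier, roles):
    """Return one `gcloud projects add-iam-policy-binding` command per role."""
    commands = []
    for role in roles:
        args = [
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member=user:{user_identifier}",
            f"--role={role}",
        ]
        # shlex.join quotes each argument so the line is safe to paste into a shell.
        commands.append(shlex.join(args))
    return commands


# Hypothetical values for illustration only.
for command in build_grant_commands(
    "my-project", "myemail@example.com", ["roles/aiplatform.user"]
):
    print(command)
```

Printing instead of executing keeps the sketch free of side effects; the output can be reviewed and then pasted into a terminal.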

Prepare your Google Cloud console

To use Vertex AI RAG Engine, do the following:

Install the Vertex AI SDK for Python.
Run this command in the Google Cloud console to set your project.

gcloud config set project PROJECT_ID
Run this command to authorize your login.

gcloud auth application-default login

Run Vertex AI RAG Engine

Copy and paste this sample code into the Google Cloud console to run Vertex AI RAG Engine.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from vertexai import rag
from vertexai.generative_models import GenerativeModel, Tool
import vertexai

# Create a RAG Corpus, Import Files, and Generate a response

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# display_name = "test_corpus"
# paths = ["https://drive.google.com/file/d/123", "gs://my_bucket/my_files_dir"]  # Supports Google Cloud Storage and Google Drive Links

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-east4")

# Create RagCorpus
# Configure embedding model, for example "text-embedding-005".
embedding_model_config = rag.RagEmbeddingModelConfig(
    vertex_prediction_endpoint=rag.VertexPredictionEndpoint(
        publisher_model="publishers/google/models/text-embedding-005"
    )
)

rag_corpus = rag.create_corpus(
    display_name=display_name,
    backend_config=rag.RagVectorDbConfig(
        rag_embedding_model_config=embedding_model_config
    ),
)

# Import Files to the RagCorpus
rag.import_files(
    rag_corpus.name,
    paths,
    # Optional
    transformation_config=rag.TransformationConfig(
        chunking_config=rag.ChunkingConfig(
            chunk_size=512,
            chunk_overlap=100,
        ),
    ),
    max_embedding_requests_per_min=1000,  # Optional
)

# Direct context retrieval
rag_retrieval_config = rag.RagRetrievalConfig(
    top_k=3,  # Optional
    filter=rag.Filter(vector_distance_threshold=0.5),  # Optional
)
response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=rag_corpus.name,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="What is RAG and why it is helpful?",
    rag_retrieval_config=rag_retrieval_config,
)
print(response)

# Enhance generation
# Create a RAG retrieval tool
rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=rag_corpus.name,  # Currently only 1 corpus is allowed.
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            rag_retrieval_config=rag_retrieval_config,
        ),
    )
)

# Create a Gemini model instance
rag_model = GenerativeModel(
    model_name="gemini-2.0-flash-001", tools=[rag_retrieval_tool]
)

# Generate response
response = rag_model.generate_content("What is RAG and why it is helpful?")
print(response.text)
# Example response:
#   RAG stands for Retrieval-Augmented Generation.
#   It's a technique used in AI to enhance the quality of responses
# ...
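
The `chunk_size=512` and `chunk_overlap=100` settings in the sample control how imported files are split before they are embedded. As a rough illustration of fixed-length chunking with overlap, the sketch below splits a list of tokens into overlapping windows. It is a conceptual model only: the function `chunk_tokens` is hypothetical, integers stand in for real tokens, and the service's actual tokenizer and boundary handling may differ.

```python
def chunk_tokens(tokens, chunk_size, chunk_overlap):
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


tokens = list(range(1200))  # stand-in for a 1,200-token document
chunks = chunk_tokens(tokens, chunk_size=512, chunk_overlap=100)
print([(c[0], c[-1]) for c in chunks])
# prints [(0, 511), (412, 923), (824, 1199)]
```

Each window starts 412 tokens after the previous one, so neighboring chunks share 100 tokens. A larger overlap preserves more context across chunk boundaries at the cost of more chunks to embed.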

curl

Create a RAG corpus.

  export LOCATION=LOCATION
  export PROJECT_ID=PROJECT_ID
  export CORPUS_DISPLAY_NAME=CORPUS_DISPLAY_NAME

  # CreateRagCorpus
  # Output: CreateRagCorpusOperationMetadata
  curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora \
  -d '{
        "display_name" : "'"CORPUS_DISPLAY_NAME"'"
    }'

For more information, see the example for creating a RAG corpus.
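
The same CreateRagCorpus request can also be assembled in Python, which avoids shell quoting issues in the JSON body. The sketch below uses only the standard library and builds the request without sending it. The helper name `build_create_corpus_request` and the example project, location, and display name are assumptions for illustration; the URL pattern and the `display_name` field are taken from the curl command above.

```python
import json
import urllib.request


def build_create_corpus_request(project_id, location, display_name, access_token):
    """Build (but do not send) the CreateRagCorpus POST request."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/"
        f"{project_id}/locations/{location}/ragCorpora"
    )
    body = json.dumps({"display_name": display_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values; obtain a real token with `gcloud auth print-access-token`.
request = build_create_corpus_request(
    "my-project", "us-east4", "test_corpus", "ACCESS_TOKEN"
)
print(request.full_url)
print(request.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(request)
```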

Import RAG files.

  # ImportRagFiles
  # Import a single Cloud Storage file or all files in a Cloud Storage bucket.
  # Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID, GCS_URIS
  export RAG_CORPUS_ID=RAG_CORPUS_ID
  export GCS_URIS=GCS_URIS
  export CHUNK_SIZE=CHUNK_SIZE
  export CHUNK_OVERLAP=CHUNK_OVERLAP
  export EMBEDDING_MODEL_QPM_RATE=EMBEDDING_MODEL_QPM_RATE

  # Output: ImportRagFilesOperationMetadataNumber
  # Use ListRagFiles, or import_result_sink to get the correct rag_file_id.
  curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import \
  -d '{
    "import_rag_files_config": {
      "gcs_source": {
        "uris": ["GCS_URIS"]
      },
      "rag_file_chunking_config": {
        "chunk_size": CHUNK_SIZE,
        "chunk_overlap": CHUNK_OVERLAP
      },
      "max_embedding_requests_per_min": EMBEDDING_MODEL_QPM_RATE
    }
  }'

For more information, see the example for importing RAG files.
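
When the import body is built by hand, it is easy to mistype a URI or pass inconsistent chunking values. The following sketch builds the same body in Python and applies two basic checks before serializing it. The function name `build_import_body` is hypothetical, and both checks (the `gs://` prefix and overlap smaller than chunk size) are sanity checks assumed for this sketch rather than documented API rules; the field names mirror the curl request above.

```python
import json


def build_import_body(gcs_uris, chunk_size, chunk_overlap, max_requests_per_min):
    """Return the ImportRagFiles request body as a dict."""
    for uri in gcs_uris:
        if not uri.startswith("gs://"):
            raise ValueError(f"not a Cloud Storage URI: {uri!r}")
    # Assumed sanity check: the overlap must be smaller than the chunk itself.
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    return {
        "import_rag_files_config": {
            "gcs_source": {"uris": list(gcs_uris)},
            "rag_file_chunking_config": {
                "chunk_size": chunk_size,
                "chunk_overlap": chunk_overlap,
            },
            "max_embedding_requests_per_min": max_requests_per_min,
        }
    }


# Bucket path and values reused from the Python sample on this page.
body = build_import_body(["gs://my_bucket/my_files_dir"], 512, 100, 1000)
print(json.dumps(body, indent=2))
```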

Run a RAG retrieval query.

  export RAG_CORPUS_RESOURCE=RAG_CORPUS_RESOURCE
  export VECTOR_DISTANCE_THRESHOLD=VECTOR_DISTANCE_THRESHOLD
  export SIMILARITY_TOP_K=SIMILARITY_TOP_K

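Save the following request body in a file named request.json, replacing the placeholders with your values:

  {
    "vertex_rag_store": {
      "rag_resources": {
        "rag_corpus": "RAG_CORPUS_RESOURCE"
      },
      "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
    },
    "query": {
      "text": "TEXT",
      "similarity_top_k": SIMILARITY_TOP_K
    }
  }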

  curl -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json; charset=utf-8" \
      -d @request.json \
      "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts"

For more information, see the RAG Engine API.
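
To build intuition for how the distance threshold and top-k settings interact, the sketch below filters a list of candidate contexts locally. It is a conceptual model under stated assumptions, not the service's implementation: the function `select_contexts` and the sample distances are hypothetical, and it assumes that a smaller vector distance means a closer match and that only contexts below the threshold are kept.

```python
def select_contexts(scored_contexts, top_k, vector_distance_threshold):
    """Keep contexts under the distance threshold, closest first, at most top_k."""
    kept = [
        (text, distance)
        for text, distance in scored_contexts
        if distance < vector_distance_threshold
    ]
    kept.sort(key=lambda pair: pair[1])  # smallest distance first
    return kept[:top_k]


# Hypothetical (context, distance) pairs for illustration.
candidates = [("a", 0.2), ("b", 0.7), ("c", 0.4), ("d", 0.1), ("e", 0.45)]
print(select_contexts(candidates, top_k=3, vector_distance_threshold=0.5))
# prints [('d', 0.1), ('a', 0.2), ('c', 0.4)]
```

In this model the threshold removes weak matches first, and top-k then caps how many of the remaining contexts are returned, so a strict threshold can yield fewer than top-k results.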

Generate content.

Save the following request body in a file named request.json, replacing the placeholders with your values:

{
  "contents": {
    "role": "USER",
    "parts": {
      "text": "INPUT_PROMPT"
    }
  },
  "tools": {
    "retrieval": {
      "disable_attribution": false,
      "vertex_rag_store": {
        "rag_resources": {
          "rag_corpus": "RAG_CORPUS_RESOURCE"
        },
        "similarity_top_k": SIMILARITY_TOP_K,
        "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
      }
    }
  }
}

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD"

For more information, see the RAG Engine API.
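
Because the curl command reads its body from request.json, one option is to generate that file from Python so that the JSON is always syntactically valid. The sketch below mirrors the generation request body shown above. The function name `write_generation_request` and the example corpus resource name, prompt, and numeric values are assumptions for illustration.

```python
import json
from pathlib import Path


def write_generation_request(path, prompt, rag_corpus, top_k, distance_threshold):
    """Write the generation request body to path and return it as a dict."""
    body = {
        "contents": {"role": "USER", "parts": {"text": prompt}},
        "tools": {
            "retrieval": {
                "disable_attribution": False,
                "vertex_rag_store": {
                    "rag_resources": {"rag_corpus": rag_corpus},
                    "similarity_top_k": top_k,
                    "vector_distance_threshold": distance_threshold,
                },
            }
        },
    }
    # json.dumps handles quoting, commas, and true/false literals.
    Path(path).write_text(json.dumps(body, indent=2), encoding="utf-8")
    return body


# Hypothetical corpus resource name and settings.
body = write_generation_request(
    "request.json",
    "What is RAG and why it is helpful?",
    "projects/my-project/locations/us-east4/ragCorpora/123",
    top_k=3,
    distance_threshold=0.5,
)
print(Path("request.json").read_text(encoding="utf-8"))
```

After the file is written, the curl command above can be run unchanged with `-d @request.json`.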

What's next

To learn more about the RAG API, see Vertex AI RAG Engine API.
To learn more about the responses from RAG, see Retrieval and Generation Output of Vertex AI RAG Engine.
To learn about Vertex AI RAG Engine, see the Vertex AI RAG Engine overview.