本頁面由 Cloud Translation API 翻譯而成。

搭配使用 Vertex AI Vector Search 和 Vertex AI RAG 引擎

本頁說明如何將 Vertex AI RAG 引擎連結至 Vertex AI Vector Search。

您也可以使用 Vertex AI RAG 引擎搭配 Vertex AI Vector Search 筆記本，逐步完成操作。

設定 Vertex AI Vector Search

Vertex AI Vector Search 採用 Google 研究團隊開發的 Vector Search 技術，透過向量搜尋，您可以運用為 Google 搜尋、YouTube 和 Google Play 等 Google 產品奠定基礎的相同基礎架構。

如要與 Vertex AI RAG 引擎整合，必須使用空白的 Vector Search 索引。

設定 Vertex AI SDK

如要設定 Vertex AI SDK，請參閱「設定」。

建立 Vector Search 索引

如要建立與 RAG Corpus 相容的向量搜尋索引，索引必須符合下列條件：

IndexUpdateMethod 必須為 STREAM_UPDATE，請參閱建立串流索引。
距離度量類型必須明確設為下列其中一項：
- DOT_PRODUCT_DISTANCE
- COSINE_DISTANCE
向量的維度必須與您打算在 RAG 語料庫中使用的嵌入模型一致。您可以根據選擇調整其他參數，這些選擇會決定是否可以調整其他參數。

Python

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK，請參閱「安裝 Python 適用的 Vertex AI SDK」。詳情請參閱 Python API 參考說明文件。

def vector_search_create_streaming_index(
    project: str, location: str, display_name: str, gcs_uri: Optional[str] = None
) -> aiplatform.MatchingEngineIndex:
    """Create a vector search index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        display_name (str): Required. The index display name
        gcs_uri (str): Optional. The Google Cloud Storage uri for index content

    Returns:
        The created MatchingEngineIndex.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create Index
    index = aiplatform.MatchingEngineIndex.create_tree_ah_index(
        display_name=display_name,
        contents_delta_uri=gcs_uri,
        description="Matching Engine Index",
        dimensions=100,
        approximate_neighbors_count=150,
        leaf_node_embedding_count=500,
        leaf_nodes_to_search_percent=7,
        index_update_method="STREAM_UPDATE",  # Options: STREAM_UPDATE, BATCH_UPDATE
        distance_measure_type=aiplatform.matching_engine.matching_engine_index_config.DistanceMeasureType.DOT_PRODUCT_DISTANCE,
    )

    return index

建立 Vector Search 索引端點

Vertex AI RAG 引擎支援公開端點。

Python

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK，請參閱「安裝 Python 適用的 Vertex AI SDK」。詳情請參閱 Python API 參考說明文件。

def vector_search_create_index_endpoint(
    project: str, location: str, display_name: str
) -> None:
    """Create a vector search index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        display_name (str): Required. The index endpoint display name
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create Index Endpoint
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
        display_name=display_name,
        public_endpoint_enabled=True,
        description="Matching Engine Index Endpoint",
    )

    print(index_endpoint.name)

將索引部署至索引端點

執行最鄰近搜尋前，必須先將索引部署至索引端點。

Python

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK，請參閱「安裝 Python 適用的 Vertex AI SDK」。詳情請參閱 Python API 參考說明文件。

def vector_search_deploy_index(
    project: str,
    location: str,
    index_name: str,
    index_endpoint_name: str,
    deployed_index_id: str,
) -> None:
    """Deploy a vector search index to a vector search index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_name (str): Required. The index to update. A fully-qualified index
          resource name or a index ID.  Example:
          "projects/123/locations/us-central1/indexes/my_index_id" or
          "my_index_id".
        index_endpoint_name (str): Required. Index endpoint to deploy the index
          to.
        deployed_index_id (str): Required. The user specified ID of the
          DeployedIndex.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index instance from an existing index
    index = aiplatform.MatchingEngineIndex(index_name=index_name)

    # Create the index endpoint instance from an existing endpoint.
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Deploy Index to Endpoint
    index_endpoint = index_endpoint.deploy_index(
        index=index, deployed_index_id=deployed_index_id
    )

    print(index_endpoint.deployed_indexes)

如果是首次將索引部署至索引端點，系統會自動建構並啟動後端，大約需要 30 分鐘，索引才能儲存。首次部署後，索引會在幾秒內準備就緒。如要查看索引部署狀態，請開啟 Vector Search Console，選取「索引端點」分頁，然後選擇索引端點。

找出索引和索引端點的資源名稱，格式如下：

projects/${PROJECT_NUMBER}/locations/${LOCATION_ID}/indexes/${INDEX_ID}
projects/${PROJECT_NUMBER}/locations/${LOCATION_ID}/indexEndpoints/${INDEX_ENDPOINT_ID}。

在 Vertex AI RAG 引擎中使用 Vertex AI Vector Search

設定 Vector Search 執行個體後，請按照本節的步驟，將 Vector Search 執行個體設為 RAG 應用程式的向量資料庫。

設定向量資料庫，建立 RAG 語料庫

建立 RAG 語料庫時，請只指定完整的 INDEX_ENDPOINT_NAME 和 INDEX_NAME。請務必使用索引和索引端點資源名稱的數字 ID。系統會建立 RAG 語料庫，並自動與向量搜尋索引建立關聯。系統會根據條件執行驗證。如未符合任何一項規定，系統就會拒絕要求。

Python

在試用這個範例之前，請先按照Python使用用戶端程式庫的 Vertex AI 快速入門中的操作說明進行設定。詳情請參閱 Vertex AI Python API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證」。


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# vector_search_index_name = "projects/{PROJECT_ID}/locations/{LOCATION}/indexes/{INDEX_ID}"
# vector_search_index_endpoint_name = "projects/{PROJECT_ID}/locations/{LOCATION}/indexEndpoints/{INDEX_ENDPOINT_ID}"
# display_name = "test_corpus"
# description = "Corpus Description"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

# Configure embedding model (Optional)
embedding_model_config = rag.RagEmbeddingModelConfig(
    vertex_prediction_endpoint=rag.VertexPredictionEndpoint(
        publisher_model="publishers/google/models/text-embedding-005"
    )
)

# Configure Vector DB
vector_db = rag.VertexVectorSearch(
    index=vector_search_index_name, index_endpoint=vector_search_index_endpoint_name
)

corpus = rag.create_corpus(
    display_name=display_name,
    description=description,
    backend_config=rag.RagVectorDbConfig(
        rag_embedding_model_config=embedding_model_config,
        vector_db=vector_db,
    ),
)
print(corpus)
# Example response:
# RagCorpus(name='projects/1234567890/locations/us-central1/ragCorpora/1234567890',
# display_name='test_corpus', description='Corpus Description', embedding_model_config=...
# ...

REST

  # TODO(developer): Update and un-comment the following lines:
  # CORPUS_DISPLAY_NAME = "YOUR_CORPUS_DISPLAY_NAME"
  # Full index/indexEndpoint resource name
  # Index: projects/${PROJECT_NUMBER}/locations/${LOCATION_ID}/indexes/${INDEX_ID}
  # IndexEndpoint: projects/${PROJECT_NUMBER}/locations/${LOCATION_ID}/indexEndpoints/${INDEX_ENDPOINT_ID}
  # INDEX_RESOURCE_NAME = "YOUR_INDEX_ENDPOINT_RESOURCE_NAME"
  # INDEX_NAME = "YOUR_INDEX_RESOURCE_NAME"
  # Call CreateRagCorpus API to create a new RagCorpus
  curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://${LOCATION_ID}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_NUMBER}/locations/${LOCATION_ID}/ragCorpora -d '{
        "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
        "rag_vector_db_config" : {
                "vertex_vector_search": {
                  "index":'\""${INDEX_NAME}"\"'
              "index_endpoint":'\""${INDEX_ENDPOINT_NAME}"\"'
                }
          }
    }'

  # Call ListRagCorpora API to verify the RagCorpus is created successfully
  curl -sS -X GET \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://${LOCATION_ID}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_NUMBER}/locations/${LOCATION_ID}/ragCorpora"

使用 RAG API 匯入檔案

使用 ImportRagFiles API 將 Cloud Storage 或 Google 雲端硬碟中的檔案匯入 Vector Search 索引。檔案會嵌入並儲存在 Vector Search 索引中。

REST

# TODO(developer): Update and uncomment the following lines:
# RAG_CORPUS_ID = "your-rag-corpus-id"
#
# Google Cloud Storage bucket/file location.
# For example, "gs://rag-fos-test/"
# GCS_URIS= "your-gcs-uris"

# Call ImportRagFiles API to embed files and store in the BigQuery table
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_NUMBER}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": '\""${GCS_URIS}"\"'
    },
    "rag_file_chunking_config": {
      "chunk_size": 512
    }
  }
}'

# Call ListRagFiles API to verify the files are imported successfully
curl -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_NUMBER}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID}/ragFiles

Python

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK，請參閱「安裝 Python 適用的 Vertex AI SDK」。詳情請參閱 Python API 參考說明文件。


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"
# paths = ["https://drive.google.com/file/123", "gs://my_bucket/my_files_dir"]  # Supports Google Cloud Storage and Google Drive Links

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

response = rag.import_files(
    corpus_name=corpus_name,
    paths=paths,
    transformation_config=rag.TransformationConfig(
        rag.ChunkingConfig(chunk_size=512, chunk_overlap=100)
    ),
    import_result_sink="gs://sample-existing-folder/sample_import_result_unique.ndjson",  # Optional, this has to be an existing storage bucket folder, and file name has to be unique (non-existent).
    max_embedding_requests_per_min=900,  # Optional
)
print(f"Imported {response.imported_rag_files_count} files.")
# Example response:
# Imported 2 files.

使用 RAG API 擷取相關背景資訊

檔案匯入完成後，您可以使用 RetrieveContexts API 從向量搜尋索引擷取相關內容。

REST

# TODO(developer): Update and uncomment the following lines:
# RETRIEVAL_QUERY="your-retrieval-query"
#
# Full RAG corpus resource name
# Format:
# &quot;projects/${PROJECT_NUMBER}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID}&quot;
# RAG_CORPUS_RESOURCE=&quot;your-rag-corpus-resource&quot;

# Call RetrieveContexts API to retrieve relevant contexts
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_NUMBER}/locations/us-central1:retrieveContexts \
  -d '{
    "vertex_rag_store": {
      "rag_resources": {
          "rag_corpus": '\""${RAG_CORPUS_RESOURCE}"\"',
        },
    },
    "query": {
      "text": '\""${RETRIEVAL_QUERY}"\"',
      "similarity_top_k": 10
    }
  }'

Python

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK，請參閱「安裝 Python 適用的 Vertex AI SDK」。詳情請參閱 Python API 參考說明文件。


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/[PROJECT_ID]/locations/us-central1/ragCorpora/[rag_corpus_id]"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=corpus_name,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="Hello World!",
    rag_retrieval_config=rag.RagRetrievalConfig(
        top_k=10,
        filter=rag.utils.resources.Filter(vector_distance_threshold=0.5),
    ),
)
print(response)
# Example response:
# contexts {
#   contexts {
#     source_uri: "gs://your-bucket-name/file.txt"
#     text: "....
#   ....

使用 Vertex AI Gemini API 生成內容

如要使用 Gemini 模型生成內容，請呼叫 Vertex AI GenerateContent API。在要求中指定 RAG_CORPUS_RESOURCE，API 就會自動從 Vector Search 索引擷取資料。

REST

# TODO(developer): Update and uncomment the following lines:
# MODEL_ID=gemini-2.0-flash
# GENERATE_CONTENT_PROMPT="your-generate-content-prompt"

# GenerateContent with contexts retrieved from the FeatureStoreOnline index
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json"  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_NUMBER}/locations/us-central1/publishers/google/models/${MODEL_ID}:generateContent \
-d '{
  "contents": {
    "role": "user",
    "parts": {
      "text": '\""${GENERATE_CONTENT_PROMPT}"\"'
    }
  },
  "tools": {
    "retrieval": {
      "vertex_rag_store": {
        "rag_resources": {
            "rag_corpus": '\""${RAG_CORPUS_RESOURCE}"\"',
          },
        "similarity_top_k": 8,
        "vector_distance_threshold": 0.32
      }
    }
  }
}'

Python

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK，請參閱「安裝 Python 適用的 Vertex AI SDK」。詳情請參閱 Python API 參考說明文件。


from vertexai import rag
from vertexai.generative_models import GenerativeModel, Tool
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=corpus_name,
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            rag_retrieval_config=rag.RagRetrievalConfig(
                top_k=10,
                filter=rag.utils.resources.Filter(vector_distance_threshold=0.5),
            ),
        ),
    )
)

rag_model = GenerativeModel(
    model_name="gemini-2.0-flash-001", tools=[rag_retrieval_tool]
)
response = rag_model.generate_content("Why is the sky blue?")
print(response.text)
# Example response:
#   The sky appears blue due to a phenomenon called Rayleigh scattering.
#   Sunlight, which contains all colors of the rainbow, is scattered
#   by the tiny particles in the Earth's atmosphere....
#   ...

後續步驟

使用 Vertex AI RAG 引擎，將 Vertex AI Search 做為檢索後端