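Use Pinecone with RAG Engine

This page describes how to connect a RAG corpus to a Pinecone database.

You can also follow along using this notebook: RAG Engine with Pinecone.

You can use a Pinecone database instance with RAG Engine for indexing and vector-based similarity search. Similarity search finds pieces of text that are similar to the text you're looking for, and it requires an embedding model. The embedding model produces vector data for each piece of text being compared. Similarity search retrieves the semantic context used for grounding so that the LLM returns the most accurate content.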
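
To make the idea concrete, the following minimal sketch ranks a few text chunks by cosine similarity to a query. The three-dimensional vectors and chunk names are made up for illustration. A real embedding model returns much larger vectors (for example, 768 dimensions), and the vector database performs this comparison at scale.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vector, chunks):
    """Sort (name, vector) pairs from most to least similar to the query."""
    return sorted(
        chunks,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )

# Toy vectors standing in for embedding model output (hypothetical values).
chunks = [
    ("billing-faq", [0.9, 0.1, 0.0]),
    ("setup-guide", [0.1, 0.9, 0.2]),
    ("release-notes", [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.1]

ranked = rank_by_similarity(query, chunks)
print([name for name, _ in ranked])
```

With these toy values, the chunk whose vector points in nearly the same direction as the query ranks first.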

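With RAG Engine, you can continue to use a fully managed vector database instance that you are responsible for provisioning. RAG Engine uses your vector database for storage, index management, and search.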

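Consider whether to use Pinecone with RAG Engine

Consider the following to determine whether a Pinecone database is the best choice for your RAG application:

You must create, configure, and manage the scaling of your Pinecone database instance.
RAG Engine uses the default namespace on your index. Make sure that nothing else can modify this namespace.
You must provide a Pinecone API key so that RAG Engine can interact with the Pinecone database. RAG Engine doesn't store or manage your Pinecone API key. Instead, you must do the following:
- Store your key in Google Cloud Secret Manager.
- Grant your project's service account permission to access your secret.
- Give RAG Engine access to your secret's resource name.
- When you interact with your RAG corpus, RAG Engine uses your service account to access your secret resource.
A RAG corpus and a Pinecone index have a one-to-one mapping. This association is made as part of the CreateRagCorpus API call or the UpdateRagCorpus API call.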

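Create your Pinecone index

To create your Pinecone index, you must follow these steps:

See the Pinecone quickstart guide to find out which index configurations you must specify on your index to make it compatible with a RAG corpus.
Make sure that the location of the Pinecone index is the same as or close to where you use RAG Engine, for the following reasons:
- You want to keep latency low.
- You want to meet the data residency requirements set by applicable laws.
During Pinecone index creation, specify the embedding dimension to use with RAG Engine. The following table provides the dimension sizes or where to find them: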

Model	Dimension size
First-party Gecko	768
Fine-tuned first-party Gecko	768
E5	See Use OSS embedding models.

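Choose one of the following supported distance metrics:
- cosine
- dotproduct
- euclidean
Optional: When you create a pod-based index, you must specify file_id on the pod.metadata_config.indexed field. For more information, see Selective metadata indexing.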
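
The requirements above can be collected into a small pre-flight check to run before you create the index. This is a sketch only: `validate_index_plan`, the model keys, and the dictionary layout are hypothetical and aren't part of the Pinecone or Vertex AI SDKs. The dimension table covers only the models listed on this page.

```python
SUPPORTED_METRICS = {"cosine", "dotproduct", "euclidean"}

# Dimension sizes from the table on this page; other models must be looked up.
KNOWN_DIMENSIONS = {
    "first-party-gecko": 768,
    "fine-tuned-first-party-gecko": 768,
}

def validate_index_plan(plan):
    """Return a list of problems found in a planned index configuration."""
    problems = []
    if plan.get("metric") not in SUPPORTED_METRICS:
        problems.append(
            "metric must be one of: " + ", ".join(sorted(SUPPORTED_METRICS))
        )
    expected = KNOWN_DIMENSIONS.get(plan.get("model"))
    if expected is not None and plan.get("dimension") != expected:
        problems.append(f"dimension must be {expected} for model {plan['model']}")
    # Pod-based indexes need file_id listed under pod.metadata_config.indexed.
    pod = plan.get("pod")
    if pod is not None:
        indexed = pod.get("metadata_config", {}).get("indexed", [])
        if "file_id" not in indexed:
            problems.append("pod.metadata_config.indexed must include file_id")
    return problems

plan = {
    "model": "first-party-gecko",
    "dimension": 768,
    "metric": "cosine",
    "pod": {"metadata_config": {"indexed": ["file_id"]}},
}
print(validate_index_plan(plan))  # an empty list means no problems were found
```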

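Create your Pinecone API key

RAG Engine can connect to your Pinecone index only by using your API key for authentication and authorization. You must follow the official Pinecone authentication guide to configure API key-based authentication in your Pinecone project.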

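Store your API key in Secret Manager

An API key holds sensitive personally identifiable information (SPII), which is subject to legal requirements. If SPII data is compromised or misused, an individual might experience significant risk or harm. To minimize risks to an individual while using RAG Engine, don't store and manage your API key, and avoid sharing the unencrypted API key.

To protect SPII, you must do the following:

Store your API key in Secret Manager.
Grant your RAG Engine service account permission to access your secret, and manage access control at the secret resource level.
1. Navigate to your project's permissions.
2. Enable the option Include Google-provided role grants.
3. Find the service account, which has the format:
  
  service-{project number}@gcp-sa-vertex-rag.iam.gserviceaccount.com
4. Edit the service account's principals.
5. Add the Secret Manager Secret Accessor role to the service account.
During the creation or update of the RAG corpus, pass the secret resource name to RAG Engine, and store the secret resource name.

When making API requests to your Pinecone index, RAG Engine uses the service account to read the API key from the corresponding secret resource in Secret Manager in your project.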
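
As a rough illustration of the access grant described above, the following sketch assembles the IAM binding that gives the RAG Engine service account the Secret Manager Secret Accessor role (`roles/secretmanager.secretAccessor`). The helper names are hypothetical, and the sketch only builds the data structure. Applying it to the secret, through the console, the gcloud CLI, or a client library, is left out.

```python
def rag_service_account(project_number):
    """Build the RAG Engine service account email for a project number."""
    return f"service-{project_number}@gcp-sa-vertex-rag.iam.gserviceaccount.com"

def secret_accessor_binding(project_number):
    """Build an IAM policy binding that grants read access to a secret's payload."""
    return {
        "role": "roles/secretmanager.secretAccessor",
        "members": [f"serviceAccount:{rag_service_account(project_number)}"],
    }

# The project number here is a placeholder.
binding = secret_accessor_binding("123456789")
print(binding["members"][0])
```

Scoping the binding to the individual secret, rather than to the whole project, matches the guidance to manage access control at the secret resource level.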

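Provision your RAG Engine service account

When you create the first RAG corpus in your project, RAG Engine creates a dedicated service account. You can find your service account on your project's Identity and Access Management page.

The service account follows this fixed format:

service-{project number}@gcp-sa-vertex-rag.iam.gserviceaccount.com

For example,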

service-123456789@gcp-sa-vertex-rag.iam.gserviceaccount.com

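Prepare your RAG corpus

To use your Pinecone index with RAG Engine, you must associate the index with a RAG corpus during the corpus's creation stage. After the association is made, the binding is permanent for the lifetime of the RAG corpus. You can make the association using either the CreateRagCorpus or the UpdateRagCorpus API.

For the association to be considered complete, you must set three key fields on the RAG corpus:

rag_vector_db_config.pinecone: This field selects the vector database to associate with your RAG corpus, and it must be set during the CreateRagCorpus API call. If it isn't set, the default vector database choice, RagManagedDb, is assigned to your RAG corpus.
rag_vector_db_config.pinecone.index_name: This is the name of the Pinecone index to use with the RAG corpus. You can set the name during the CreateRagCorpus call, or specify it when you call the UpdateRagCorpus API.
rag_vector_db_config.api_auth.api_key_config.api_key_secret_version: This is the full resource name of the secret stored in Secret Manager that contains your Pinecone API key. You can set the field during the CreateRagCorpus call, or specify it when you call the UpdateRagCorpus API. Until you specify this field, you can't import data into the RAG corpus.
This field should have the following format: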
projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}
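
Because a malformed resource name only surfaces as an error later, it can help to build and check the value up front. The sketch below does this with the standard library. The helper names are hypothetical, and the pattern only checks the overall shape shown above; it doesn't verify that the secret exists.

```python
import re

# Shape of a Secret Manager secret version resource name.
_SECRET_VERSION_PATTERN = re.compile(
    r"^projects/[^/]+/secrets/[^/]+/versions/[^/]+$"
)

def secret_version_name(project_number, secret_id, version_id="latest"):
    """Assemble the full resource name of a secret version."""
    return f"projects/{project_number}/secrets/{secret_id}/versions/{version_id}"

def is_secret_version_name(value):
    """Return True if value has the expected resource name shape."""
    return bool(_SECRET_VERSION_PATTERN.match(value))

# Placeholder project number and secret ID.
name = secret_version_name("123456789", "pinecone-api-key", "1")
print(name)
print(is_secret_version_name(name))
print(is_secret_version_name("pinecone-api-key"))
```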

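Create your RAG corpus

If you have access to your Pinecone index name and the secret resource name with your permissions set, then you can create your RAG corpus and associate it with your Pinecone index, as demonstrated in the following sample code.

When you create a RAG corpus for the first time, you won't have the service account information ready. However, the fields are optional and can be associated with the RAG corpus later using the UpdateRagCorpus API.

For an example of how to create the RAG corpus without providing the service account information, see Create RAG corpus without an index name or an API key.

Python

Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.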


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# pinecone_index_name = "pinecone-index-name"
# pinecone_api_key_secret_manager_version = "projects/{PROJECT_ID}/secrets/{SECRET_NAME}/versions/latest"
# display_name = "test_corpus"
# description = "Corpus Description"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

# Configure embedding model (Optional)
embedding_model_config = rag.EmbeddingModelConfig(
    publisher_model="publishers/google/models/text-embedding-004"
)

# Configure Vector DB
vector_db = rag.Pinecone(
    index_name=pinecone_index_name,
    api_key=pinecone_api_key_secret_manager_version,
)

corpus = rag.create_corpus(
    display_name=display_name,
    description=description,
    embedding_model_config=embedding_model_config,
    vector_db=vector_db,
)
print(corpus)
# Example response:
# RagCorpus(name='projects/1234567890/locations/us-central1/ragCorpora/1234567890',
# display_name='test_corpus', description='Corpus Description', embedding_model_config=...
# ...

REST

   # Set your project ID under which you want to create the corpus
   PROJECT_ID="YOUR_PROJECT_ID"

   # Choose a display name for your corpus
   CORPUS_DISPLAY_NAME=YOUR_CORPUS_DISPLAY_NAME

   # Set your Pinecone index name
   PINECONE_INDEX_NAME=YOUR_INDEX_NAME

   # Set the full resource name of your secret. Follows the format
   # projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}
   SECRET_RESOURCE_NAME=YOUR_SECRET_RESOURCE_NAME

   # Call CreateRagCorpus API with all the Vector DB information.
   # You can also add the embedding model choice or set other RAG corpus parameters on
   # this call per your choice.
   curl -X POST \
   -H "Authorization: Bearer $(gcloud auth print-access-token)" \
   -H "Content-Type: application/json" \
   https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora -d '{
         "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
         "rag_vector_db_config" : {
            "pinecone": {"index_name": '\""${PINECONE_INDEX_NAME}"\"'},
            "api_auth": {"api_key_config":
                  {"api_key_secret_version": '\""${SECRET_RESOURCE_NAME}"\"'}
            }
         }
      }'

   # To poll the status of your RAG corpus creation, get the operation_id returned in
   # response of your CreateRagCorpus call.
   OPERATION_ID="YOUR_OPERATION_ID"

   # Poll Operation status until done = true in the response.
   # The response to this call will contain the ID for your created RAG corpus
   curl -X GET \
   -H "Authorization: Bearer $(gcloud auth print-access-token)" \
   -H "Content-Type: application/json" \
   https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/operations/${OPERATION_ID}
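
The nested shell quoting in the curl call above is easy to get wrong. As an alternative, the sketch below builds the same CreateRagCorpus URL and JSON body in Python with only the standard library. It doesn't send the request. The function name is hypothetical, and the field names are the ones used in the REST example.

```python
import json

def build_create_corpus_request(project_id, display_name, index_name=None,
                                secret_version=None, location="us-central1"):
    """Return (url, body) for a CreateRagCorpus call that selects Pinecone."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project_id}/locations/{location}/ragCorpora"
    )
    # An empty "pinecone" object still selects Pinecone as the vector database.
    pinecone = {"index_name": index_name} if index_name else {}
    vector_db_config = {"pinecone": pinecone}
    if secret_version:
        vector_db_config["api_auth"] = {
            "api_key_config": {"api_key_secret_version": secret_version}
        }
    body = {"display_name": display_name, "rag_vector_db_config": vector_db_config}
    return url, json.dumps(body, indent=2)

# All argument values below are placeholders.
url, body = build_create_corpus_request(
    project_id="my-project",
    display_name="test_corpus",
    index_name="my-index",
    secret_version="projects/123456789/secrets/pinecone-api-key/versions/1",
)
print(url)
print(body)
```

Leaving out `index_name` and `secret_version` produces a body with an empty `pinecone` object, which corresponds to creating the corpus first and supplying the details later.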

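Create RAG corpus without an index name or an API key

If this is your first RAG corpus and you don't have access to your service account details, or you haven't completed the provisioning steps for your Pinecone index, you can still create your RAG corpus. You can then associate the RAG corpus with an empty Pinecone configuration, and add the details later.

The following must be taken into consideration:

If you don't provide the index name and the API key secret name, files can't be imported into the RAG corpus.
If you choose Pinecone as the vector database for your RAG corpus, you can't switch to a different database later.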
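
The first consideration can be expressed as a simple readiness check to run before an import. This is only a sketch: `can_import_files` and the flat dictionary keys are hypothetical and don't mirror an SDK type.

```python
def can_import_files(pinecone_config):
    """Return True when both the index name and the API key secret are set."""
    return bool(pinecone_config.get("index_name")) and bool(
        pinecone_config.get("api_key_secret_version")
    )

# A corpus created with an empty Pinecone configuration.
empty_config = {}

# The same corpus after the missing details are added (placeholder values).
complete_config = {
    "index_name": "my-index",
    "api_key_secret_version": "projects/123456789/secrets/pinecone-api-key/versions/1",
}

print(can_import_files(empty_config))
print(can_import_files(complete_config))
```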

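This code example demonstrates how to create a RAG corpus with Pinecone without providing a Pinecone index name or an API key secret name. You can use the UpdateRagCorpus API later to specify the missing information.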

Python

import vertexai
from vertexai.preview import rag

# Set Project
PROJECT_ID = "YOUR_PROJECT_ID"
vertexai.init(project=PROJECT_ID, location="us-central1")

# Configure the Pinecone vector DB information
vector_db = rag.Pinecone()

# Name your corpus
DISPLAY_NAME = "YOUR_CORPUS_NAME"

rag_corpus = rag.create_corpus(display_name=DISPLAY_NAME, vector_db=vector_db)

REST

# Set your project ID under which you want to create the corpus
PROJECT_ID="YOUR_PROJECT_ID"

# Choose a display name for your corpus
CORPUS_DISPLAY_NAME=YOUR_CORPUS_DISPLAY_NAME

# Call CreateRagCorpus API with an empty Pinecone configuration.
# You can also add the embedding model choice or set other RAG corpus parameters on
# this call per your choice.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora -d '{
      "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
      "rag_vector_db_config" : {
         "pinecone": {}
      }
   }'

# To poll the status of your RAG corpus creation, get the operation_id returned in
# response of your CreateRagCorpus call.
OPERATION_ID="YOUR_OPERATION_ID"

# Poll Operation status until done = true in the response.
# The response to this call will contain the ID for your created RAG corpus
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/operations/${OPERATION_ID}
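
The "poll until done" step can be wrapped in a small loop. The sketch below assumes the operation is returned as a dictionary with the standard long-running operation fields `done`, `error`, and `response`. The `fetch_operation` callable is a hypothetical stand-in for the GET request above, so the loop can be shown without network access.

```python
import time

def wait_for_operation(fetch_operation, poll_interval=5.0, timeout=300.0,
                       sleep=time.sleep):
    """Call fetch_operation() until it reports done, then return the operation."""
    deadline = time.monotonic() + timeout
    while True:
        operation = fetch_operation()
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("Operation did not finish before the timeout")
        sleep(poll_interval)

# Stand-in for the GET request: reports done on the third poll.
responses = iter([
    {"name": "operations/123"},
    {"name": "operations/123"},
    {"name": "operations/123", "done": True, "response": {"name": "ragCorpora/456"}},
])
result = wait_for_operation(lambda: next(responses), sleep=lambda seconds: None)
print(result["response"]["name"])
```

Passing `sleep` as a parameter keeps the example fast to run; in real use the default `time.sleep` spaces out the requests.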

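Update your RAG corpus

The UpdateRagCorpus API lets you update the vector database configuration. If the Pinecone index name and the API key secret version weren't set previously, you can use UpdateRagCorpus to set them. The choice of vector database can't be updated. Providing the API key secret is optional. However, until you specify the API key secret, you can't import data into the RAG corpus.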

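Field	Mutability	Required or optional
`rag_vector_db_config.vector_db`	Can't be changed after you make your choice.	Required
`rag_vector_db_config.pinecone.index_name`	Immutable after you set the field on the RAG corpus.	Required
`rag_vector_db_config.api_auth.api_key_config.api_key_secret_version`	Mutable. After you set the API key, you can't remove the key.	Optional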
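
The mutability rules in the table can be sketched as a client-side check that runs before an update is sent. The function and the flat dictionary keys are hypothetical simplifications of the nested fields; the service remains the source of truth for what it accepts.

```python
def check_corpus_update(current, proposed):
    """Return reasons why a proposed vector DB config update would be rejected."""
    errors = []
    # The vector database choice is fixed when the corpus is created.
    if proposed.get("vector_db") != current.get("vector_db"):
        errors.append("the vector database choice can't be changed")
    # The index name can be set once, then it's immutable.
    old_index = current.get("index_name")
    if old_index and proposed.get("index_name") != old_index:
        errors.append("index_name is immutable after it's set")
    # The API key secret can be replaced, but not removed.
    if current.get("api_key_secret_version") and not proposed.get(
        "api_key_secret_version"
    ):
        errors.append("the API key secret can be replaced but not removed")
    return errors

# A corpus created with an empty Pinecone configuration.
current = {"vector_db": "pinecone", "index_name": None, "api_key_secret_version": None}
proposed = {
    "vector_db": "pinecone",
    "index_name": "my-index",
    "api_key_secret_version": "projects/123456789/secrets/pinecone-api-key/versions/1",
}
print(check_corpus_update(current, proposed))  # an empty list means the update passes
```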

Python

import vertexai
from vertexai.preview import rag

# Set Project
PROJECT_ID = "YOUR_PROJECT_ID"
vertexai.init(project=PROJECT_ID, location="us-central1")

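# Full resource name of the RAG corpus to update
CORPUS_NAME = "projects/YOUR_PROJECT_ID/locations/us-central1/ragCorpora/YOUR_RAG_CORPUS_ID"

# Configure the Pinecone vector DB information
vector_db = rag.Pinecone(
    index_name="YOUR_INDEX_NAME",
    api_key="projects/YOUR_PROJECT_NUMBER/secrets/YOUR_SECRET_ID/versions/YOUR_VERSION_ID",
)

# Update the existing corpus instead of creating a new one
rag_corpus = rag.update_corpus(corpus_name=CORPUS_NAME, vector_db=vector_db)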

REST

# Set your project ID for the corpus that you want to update.
PROJECT_ID="YOUR_PROJECT_ID"

# Set the ID of the RAG corpus that you want to update.
RAG_CORPUS_ID="YOUR_RAG_CORPUS_ID"

# Set your Pinecone index name
PINECONE_INDEX_NAME=YOUR_INDEX_NAME

# Set the full resource name of your secret. Follows the format
# projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}
SECRET_RESOURCE_NAME=YOUR_SECRET_RESOURCE_NAME

# Call UpdateRagCorpus API with the Vector DB information.
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID} -d '{
      "rag_vector_db_config" : {
         "pinecone": {"index_name": '\""${PINECONE_INDEX_NAME}"\"'},
         "api_auth": {"api_key_config":
               {"api_key_secret_version": '\""${SECRET_RESOURCE_NAME}"\"'}
         }
      }
   }'

# To poll the status of your RAG corpus update, get the operation_id returned in
# response of your UpdateRagCorpus call.
OPERATION_ID="YOUR_OPERATION_ID"

# Poll Operation status until done = true in the response.
# The response to this call will contain the updated RAG corpus
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/operations/${OPERATION_ID}

What's next

To learn more about choosing embedding models, see Use embedding models with RAG Engine.
To learn more about importing files, see Import RAG files.