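Use a Weaviate database with RAG Engine

This page describes how to connect a RAG Engine corpus to a Weaviate database.

You can also follow along using the notebook RAG Engine with Weaviate.

You can use your Weaviate database instance, which is an open source database, with RAG Engine to index content and run vector-based similarity searches. A similarity search finds pieces of text that are similar to the text you're looking for, and it requires an embedding model. The embedding model produces vector data for each piece of text being compared. The similarity search retrieves semantic context for grounding, so that your LLM returns the most accurate content.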
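To make the idea of a vector-based similarity search concrete, the following is a minimal sketch that ranks text chunks by cosine similarity to a query vector. Cosine similarity is only one common metric and is an assumption here; this page doesn't state which metric RAG Engine or Weaviate uses. The three-dimensional vectors and the chunk names are made up for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


# Hypothetical embeddings; a real embedding model emits many more dimensions.
query = [1.0, 0.0, 1.0]
chunks = {"chunk-a": [1.0, 0.0, 1.0], "chunk-b": [0.0, 1.0, 0.0]}

# Rank chunk names from most similar to least similar.
ranked = sorted(
    chunks, key=lambda name: cosine_similarity(query, chunks[name]), reverse=True
)
print(ranked)
```

In a real deployment the vector database performs this ranking with an index rather than a full scan.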

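With RAG Engine, you can continue to use your fully managed vector database instance, which you are responsible for provisioning. RAG Engine uses the vector database for storage, index management, and search.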

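Considerations

Consider the following before you use the Weaviate database:

  1. You must create, configure, and deploy your Weaviate database instance and collection. Follow the instructions in Create your Weaviate collection to set up a collection based on your schema.
  2. You must provide a Weaviate API key so that RAG Engine can interact with the Weaviate database. RAG Engine supports API key-based AuthN and AuthZ to connect to your Weaviate database, and it supports HTTPS connections.
  3. RAG Engine doesn't store or manage your Weaviate API key. Instead, you must do the following:
    1. Store your key in Google Cloud Secret Manager.
    2. Grant your project's service account permission to access your secret.
    3. Give RAG Engine access to your secret's resource name.
    4. When you interact with your Weaviate database, RAG Engine accesses your secret resource using your service account.
  4. A RAG Engine corpus and a Weaviate collection have a one-to-one mapping. RAG files are stored in a Weaviate database collection. When you call the CreateRagCorpus API or the UpdateRagCorpus API, the RAG corpus is associated with the database collection.
  5. In addition to semantic search based on dense embeddings, RAG Engine supports hybrid search through a Weaviate database. You can also adjust the weight between dense and sparse vector similarity in a hybrid search.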
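The weighting mentioned in the last item can be pictured as a blend of two scores. The following is an illustrative sketch only: it assumes a simple convex combination controlled by a hypothetical parameter named alpha, and it isn't taken from RAG Engine or Weaviate. The actual fusion method and parameter name may differ.

```python
def hybrid_score(dense: float, sparse: float, alpha: float) -> float:
    """Blend a dense-vector score and a sparse-vector score.

    alpha = 1.0 uses only the dense score; alpha = 0.0 uses only the sparse score.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * dense + (1.0 - alpha) * sparse


# Hypothetical, already-normalized scores for one chunk.
print(hybrid_score(dense=0.8, sparse=0.2, alpha=0.5))
```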

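Provision the Weaviate database

Before using the Weaviate database with RAG Engine, you must do the following:

  1. Configure and deploy your Weaviate database instance.
  2. Prepare the HTTPS endpoint.
  3. Create your Weaviate collection.
  4. Provision Weaviate using your API key with AuthN and AuthZ.
  5. Provision your RAG Engine service account.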

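Configure and deploy your Weaviate database instance

Follow the quickstart in the official Weaviate guide. Optionally, you can use the Google Cloud Marketplace guide instead.

You can set up your Weaviate instance anywhere, as long as the Weaviate endpoint is accessible from your project for configuration and deployment. You can then fully manage your Weaviate database instance.

Because RAG Engine isn't involved in any stage of your Weaviate database instance lifecycle, it is your responsibility to grant permissions to RAG Engine so that it can store and search for data in your Weaviate database. It is also your responsibility to make sure that the data in your database can be used by RAG Engine. For example, if you change your data, RAG Engine isn't responsible for any unexpected behavior caused by those changes.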

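Prepare the HTTPS endpoint

During Weaviate provisioning, make sure that you create an HTTPS endpoint. Although HTTP connections are supported, HTTPS is preferred for traffic between RAG Engine and the Weaviate database.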
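As a quick sanity check before you pass an endpoint to RAG Engine, a sketch like the following can confirm which scheme the endpoint uses. The function name is hypothetical, and the check only inspects the URL string; it doesn't contact the server.

```python
from urllib.parse import urlparse


def check_weaviate_endpoint(endpoint: str) -> str:
    """Return "https" or "http" for a usable endpoint; raise ValueError otherwise."""
    parsed = urlparse(endpoint)
    if parsed.scheme not in ("https", "http") or not parsed.netloc:
        raise ValueError(f"not an HTTP(S) endpoint: {endpoint!r}")
    return parsed.scheme


scheme = check_weaviate_endpoint("https://your.weaviate.endpoint.com")
if scheme == "http":
    print("HTTP works, but HTTPS is preferred.")
```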

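Create your Weaviate collection

Because a RAG Engine corpus and a Weaviate collection have a one-to-one mapping, you must create a collection in your Weaviate database before you associate the collection with the RAG Engine corpus. This one-time association is made when you call the CreateRagCorpus API or the UpdateRagCorpus API.

When you create a collection in Weaviate, you must use the following schema:

Property name      Data type
fileId             text
corpusId           text
chunkId            text
chunkDataType      text
chunkData          text
fileOriginalUri    text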
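The schema can also be written down as data and checked before you send it to Weaviate. The following sketch builds a class definition in the shape that Weaviate's /v1/schema endpoint accepts and reports any required property that is missing. The helper names are hypothetical, and the data type follows the table above.

```python
import json

REQUIRED_PROPERTIES = (
    "fileId",
    "corpusId",
    "chunkId",
    "chunkDataType",
    "chunkData",
    "fileOriginalUri",
)


def build_collection_schema(collection_name: str) -> dict:
    """Build a Weaviate class definition with the required properties."""
    return {
        "class": collection_name,
        "properties": [
            {"name": name, "dataType": ["text"]} for name in REQUIRED_PROPERTIES
        ],
    }


def missing_properties(schema: dict) -> list[str]:
    """Return the required property names that the schema doesn't define."""
    present = {prop.get("name") for prop in schema.get("properties", [])}
    return [name for name in REQUIRED_PROPERTIES if name not in present]


schema = build_collection_schema("MyCollectionName")
print(json.dumps(schema, indent=2))
```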

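Provision Weaviate using your API key with AuthN and AuthZ

Provisioning the Weaviate API key involves these steps:

  1. Create the Weaviate API key.
  2. Configure Weaviate using your Weaviate API key.
  3. Store your Weaviate API key in Secret Manager.

Create the API key

RAG Engine can connect to your Weaviate database instance only by using your API key for authentication and authorization. Follow the official Weaviate authentication guide to configure API key-based authentication in your Weaviate database instance.

If creating the Weaviate API key requires identity information from RAG Engine, you must create your first corpus, and then use your RAG Engine service account as the identity.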

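Store your API key in Secret Manager

An API key holds Sensitive Personally Identifiable Information (SPII), which is subject to legal requirements. If SPII data is compromised or misused, an individual might experience significant risk or harm. To minimize risk to individuals while using RAG Engine, don't store and manage the API key yourself, and avoid sharing the unencrypted API key.

To protect SPII, do the following:

  1. Store your API key in Secret Manager.
  2. Grant your RAG Engine service account permission to your secret, and manage access control at the secret resource level.
    1. Go to your project's Permissions.
    2. Enable the option Include Google-provided role grants.
    3. Find the service account, which has the format

      service-{project number}@gcp-sa-vertex-rag.iam.gserviceaccount.com

    4. Edit the service account's principals.
    5. Add the Secret Manager Secret Accessor role to the service account.
  3. When you create or update the RAG corpus, pass the secret resource name to RAG Engine, and store the secret resource name.

When you make API requests to your Weaviate database instance, RAG Engine uses the service account to read the API key that corresponds to your secret resource in Secret Manager in your project.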

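Provision your RAG Engine service account

When you create the first resource in your project, RAG Engine creates a dedicated service account. You can find your service account on your project's IAM page. The service account follows this format:

service-{project number}@gcp-sa-vertex-rag.iam.gserviceaccount.com

For example, service-123456789@gcp-sa-vertex-rag.iam.gserviceaccount.com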
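Because the service account name is derived from the project number, it can be computed rather than looked up. The following is a small sketch with hypothetical helper names. The serviceAccount: prefix is the standard form for a service account member in an IAM policy binding.

```python
def rag_service_account(project_number: str) -> str:
    """Return the RAG Engine service account email for a project number."""
    number = str(project_number)
    if not number.isdigit():
        raise ValueError("project number must contain only digits")
    return f"service-{number}@gcp-sa-vertex-rag.iam.gserviceaccount.com"


def iam_member(project_number: str) -> str:
    """Return the member string to use in an IAM policy binding."""
    return f"serviceAccount:{rag_service_account(project_number)}"


print(rag_service_account("123456789"))
```

Note that the format uses the project number, not the project ID.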

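When you integrate with the Weaviate database, your service account is used in the following scenarios:

  • You can use your service account to generate your Weaviate API key for authentication. In some cases, generating the API key doesn't require any user information, which means that a service account isn't required to generate the API key.
  • You can bind your service account to the API key in your Weaviate database to configure authentication (AuthN) and authorization (AuthZ). However, you aren't required to use your service account.
  • You can store the API key in Secret Manager in your project, and you can grant your service account permission to these secret resources.
  • RAG Engine uses the service account to access the API key from Secret Manager in your project.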

Set up your Google Cloud console environment

Select one of the following tabs to learn how to set up your environment:

Python

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Enable the Vertex AI API.

    Enable the API

  4. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  5. If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

  6. Install or update the Vertex AI SDK for Python by running the following command:

    pip3 install --upgrade "google-cloud-aiplatform>=1.38"
        

Node.js

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Enable the Vertex AI API.

    Enable the API

  4. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  5. If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

  6. Install or update the Vertex AI SDK for Node.js by running the following command:

    npm install @google-cloud/vertexai
        

Java

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Enable the Vertex AI API.

    Enable the API

  4. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  5. If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

  6. To add google-cloud-vertexai as a dependency, add the appropriate code for your environment:

    Maven with BOM

    Add the following XML to your pom.xml:

    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>com.google.cloud</groupId>
          <artifactId>libraries-bom</artifactId>
          <version>26.32.0</version>
          <type>pom</type>
          <scope>import</scope>
        </dependency>
      </dependencies>
    </dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>com.google.cloud</groupId>
        <artifactId>google-cloud-vertexai</artifactId>
      </dependency>
    </dependencies>
                

    Maven without BOM

    Add the following XML to your pom.xml:

    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>google-cloud-vertexai</artifactId>
      <version>0.4.0</version>
    </dependency>
              

    Gradle without BOM

    Add the following to your build.gradle

    implementation 'com.google.cloud:google-cloud-vertexai:0.4.0'

Go

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Enable the Vertex AI API.

    Enable the API

  4. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  5. If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

  6. Review the available Vertex AI API Go packages to determine which package best meets your project's needs:

    • Package cloud.google.com/go/vertexai (recommended)

      vertexai is a human-authored package that provides access to common capabilities and features.

      This package is recommended as the starting point for most developers building with the Vertex AI API. To access capabilities and features not yet covered by this package, use the auto-generated aiplatform package instead.

    • Package cloud.google.com/go/aiplatform

      aiplatform is an auto-generated package.

      This package is intended for projects that need Vertex AI API capabilities and features that the human-authored vertexai package doesn't yet provide.

  7. Install the Go package that fits your project's needs by running one of the following commands:

    # Human authored package. Recommended for most developers.
    go get cloud.google.com/go/vertexai

    # Auto-generated package.
    go get cloud.google.com/go/aiplatform

C#

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Enable the Vertex AI API.

    Enable the API

  4. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  5. If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

REST

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Enable the Vertex AI API.

    Enable the API

  4. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  5. Enter the following commands to configure environment variables. Replace PROJECT_ID with the ID of your Google Cloud project.
    MODEL_ID="gemini-1.5-flash-002"
    PROJECT_ID="PROJECT_ID"

  6. Provision the endpoint:
    gcloud beta services identity create --service=aiplatform.googleapis.com --project=${PROJECT_ID}

  7. Optional: If you're using Cloud Shell and you're asked to authorize Cloud Shell, click Authorize.

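Prepare your RAG corpus

To access data in your Weaviate database, RAG Engine must have access to a RAG corpus. This section describes the steps to create a single RAG corpus and additional RAG corpora.

Use the CreateRagCorpus and UpdateRagCorpus APIs

When you call the CreateRagCorpus and UpdateRagCorpus APIs, you must specify the following fields:

  • rag_vector_db_config.weaviate: The vector database configuration is chosen when you call the CreateRagCorpus API. The vector database configuration contains all of the configuration fields. If the rag_vector_db_config.weaviate field isn't set, then rag_vector_db_config.rag_managed_db is set by default.
  • weaviate.http_endpoint: The HTTPS or HTTP Weaviate endpoint that is created when you provision the Weaviate database instance.
  • weaviate.collection_name: The name of the collection that is created during Weaviate instance provisioning. The name must start with a capital letter.
  • api_auth.api_key_config: The configuration that specifies the use of an API key to authorize your access to the vector database.
  • api_key_config.api_key_secret_version: The resource name of the secret stored in Secret Manager that contains your Weaviate API key.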
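The fields above nest as shown in the following sketch, which assembles a CreateRagCorpus request body as a Python dictionary and rejects a collection name that doesn't start with a capital letter. The function name and the sample values are hypothetical; the nesting mirrors the REST request body used in the corpus creation example on this page.

```python
import json


def build_create_corpus_body(
    display_name: str,
    http_endpoint: str,
    collection_name: str,
    api_key_secret_version: str,
) -> dict:
    """Assemble a request body that points a RAG corpus at a Weaviate collection."""
    if not collection_name[:1].isupper():
        raise ValueError("collection_name must start with a capital letter")
    return {
        "display_name": display_name,
        "rag_vector_db_config": {
            "weaviate": {
                "http_endpoint": http_endpoint,
                "collection_name": collection_name,
            },
            "api_auth": {
                "api_key_config": {
                    "api_key_secret_version": api_key_secret_version,
                }
            },
        },
    }


body = build_create_corpus_body(
    display_name="SpecialCorpus",
    http_endpoint="https://your.weaviate.endpoint.com",
    collection_name="MyCollectionName",
    api_key_secret_version="projects/my-project/secrets/MyWeaviateApiKeySecret/versions/latest",
)
print(json.dumps(body, indent=2))
```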

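You can create a RAG corpus and associate it with a Weaviate collection in your database instance. However, you might need the service account to generate your API key and to configure your Weaviate database instance. The service account is generated when you create your first RAG corpus. After you create your first RAG corpus, the association between the Weaviate database and the API key might not be ready for use in creating another RAG corpus.

If your database and key aren't ready to be associated with your RAG corpus, do the following with your RAG corpus:

  1. Set the weaviate field in rag_vector_db_config.

    • You can't change the associated vector database.
    • Leave both the http_endpoint and collection_name fields empty. You can update both fields later.
  2. If your API key isn't stored in Secret Manager, you can leave the api_auth field empty. When you call the UpdateRagCorpus API, you can update the api_auth field. Weaviate requires that you do the following:

    1. Set api_key_config in the api_auth field.
    2. Set the api_key_secret_version of your Weaviate API key in Secret Manager. The api_key_secret_version field uses the following format:

      projects/{project}/secrets/{secret}/versions/{version}

  3. If you specify fields that can be set only once, such as http_endpoint or collection_name, you can't change them unless you delete your RAG corpus and create it again. Other fields, such as the API key field api_key_secret_version, can be updated.

  4. When you call UpdateRagCorpus, you can set the vector_db field. Your CreateRagCorpus API call should set vector_db to weaviate. Otherwise, the system chooses the RAG-managed database option, which is the default. This option can't be changed when you call the UpdateRagCorpus API. When you call UpdateRagCorpus and the vector_db field is partially set, you can update the fields that are marked as changeable (also referred to as mutable).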
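The api_key_secret_version format described above is easy to get wrong by hand. The following is a minimal sketch, with hypothetical helper names, that builds the resource name and parses it back into its parts.

```python
import re

_SECRET_VERSION_RE = re.compile(
    r"^projects/([^/]+)/secrets/([^/]+)/versions/([^/]+)$"
)


def secret_version_name(project: str, secret: str, version: str = "latest") -> str:
    """Build projects/{project}/secrets/{secret}/versions/{version}."""
    name = f"projects/{project}/secrets/{secret}/versions/{version}"
    if not _SECRET_VERSION_RE.match(name):
        raise ValueError(f"invalid secret version name: {name!r}")
    return name


def parse_secret_version_name(name: str) -> tuple[str, str, str]:
    """Split a secret version resource name into (project, secret, version)."""
    match = _SECRET_VERSION_RE.match(name)
    if match is None:
        raise ValueError(f"invalid secret version name: {name!r}")
    return match.group(1), match.group(2), match.group(3)


name = secret_version_name("my-project", "MyWeaviateApiKeySecret")
print(name)
```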

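The following table lists the mutable and immutable WeaviateConfig fields that are used in your code.

Field name                Mutable or immutable
http_endpoint             Immutable once set
collection_name           Immutable once set
api_key_authentication    Mutable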
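The set-once rule in the table can be expressed as a small client-side check. The following sketch is illustrative only and isn't part of any SDK: it compares a current configuration with a proposed update and lists the fields whose change would be rejected. The dictionary keys follow the field names used on this page.

```python
# Fields that can't be changed after they hold a non-empty value.
IMMUTABLE_ONCE_SET = ("http_endpoint", "collection_name")


def rejected_changes(current: dict, update: dict) -> list[str]:
    """Return the immutable fields that the update tries to change."""
    return sorted(
        field
        for field in IMMUTABLE_ONCE_SET
        if current.get(field) and field in update and update[field] != current[field]
    )


current = {"http_endpoint": "", "collection_name": "MyCollectionName"}
update = {
    "http_endpoint": "https://your.weaviate.endpoint.com",  # still empty, so allowed
    "collection_name": "OtherCollection",  # already set, so rejected
    "api_key_secret_version": "projects/p/secrets/s/versions/2",  # mutable
}
print(rejected_changes(current, update))
```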

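Create the first RAG corpus

If the RAG Engine service account doesn't exist, do the following:

  1. Create a RAG corpus in RAG Engine with an empty Weaviate configuration, which initiates RAG Engine provisioning to create the service account.
  2. Identify your RAG Engine service account, whose name follows this format:

    service-{project number}@gcp-sa-vertex-rag.iam.gserviceaccount.com

    For example, service-123456789@gcp-sa-vertex-rag.iam.gserviceaccount.com

  3. Using your service account, access the secret that is stored in your project's Secret Manager and contains your Weaviate API key.
  4. After Weaviate provisioning completes, get the following information:
    • Your Weaviate HTTPS or HTTP endpoint.
    • The name of your Weaviate collection.
  5. Call the CreateRagCorpus API to create a RAG corpus with an empty Weaviate configuration, and then call the UpdateRagCorpus API to update the RAG corpus with the following information:
    • Your Weaviate HTTPS or HTTP endpoint.
    • The name of your Weaviate collection.
    • The API key resource name.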

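Create another RAG corpus

If the RAG Engine service account already exists, do the following:

  1. Get your RAG Engine service account from your project's Permissions.
  2. Enable the option Include Google-provided role grants.
  3. Identify your RAG Engine service account, whose name follows this format:

    service-{project number}@gcp-sa-vertex-rag.iam.gserviceaccount.com

  4. Using your service account, access the secret that is stored in your project's Secret Manager and contains your Weaviate API key.
  5. During Weaviate provisioning, get the following information:
    • The Weaviate HTTPS or HTTP endpoint.
    • The name of your Weaviate collection.
  6. Create a RAG corpus in RAG Engine, and associate it with your Weaviate collection by doing one of the following:
    1. Make a CreateRagCorpus API call to create a RAG corpus with a populated Weaviate configuration, which is the preferred option.
    2. Make a CreateRagCorpus API call to create a RAG corpus with an empty Weaviate configuration, and then make an UpdateRagCorpus API call to update the RAG corpus with the following information:
      • The Weaviate database HTTP endpoint
      • The Weaviate collection name
      • The API key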

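Examples

This section presents sample code that demonstrates how to set up your Weaviate database, Secret Manager, the RAG corpus, and the RAG file. Sample code is also provided to demonstrate how to import files, retrieve context, generate content, and delete the RAG corpus and RAG files.

To use the Model Garden RAG API notebook, see Use Weaviate with Llama 3.

Set up your Weaviate database

This code sample demonstrates how to set up your Weaviate data and Secret Manager.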

REST

# TODO(developer): Update the variables.
# The HTTPS/HTTP Weaviate endpoint you created during provisioning.
HTTP_ENDPOINT_NAME="https://your.weaviate.endpoint.com"

# Your Weaviate API Key.
WEAVIATE_API_KEY="example-api-key"

# Select your Weaviate collection name, which roughly corresponds to a RAG Engine corpus.
# For example, "MyCollectionName"
# Note that the first letter needs to be capitalized.
# Otherwise, Weaviate will capitalize it for you.
WEAVIATE_COLLECTION_NAME="MyCollectionName"

# Create a collection in Weaviate which includes the required schema fields shown below.
echo '{
  "class": "'${WEAVIATE_COLLECTION_NAME}'",
  "properties": [
    { "name": "fileId", "dataType": [ "string" ] },
    { "name": "corpusId", "dataType": [ "string" ] },
    { "name": "chunkId", "dataType": [ "string" ] },
    { "name": "chunkDataType", "dataType": [ "string" ] },
    { "name": "chunkData", "dataType": [ "string" ] },
    { "name": "fileOriginalUri", "dataType": [ "string" ] }
  ]
}' | curl \
    -X POST \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer ${WEAVIATE_API_KEY}" \
    -d @- \
    ${HTTP_ENDPOINT_NAME}/v1/schema

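Set up your Secret Manager

To set up your Secret Manager, you must enable Secret Manager and set permissions.

Create a secret

To enable your Secret Manager, do the following:

Console

  1. Go to the Secret Manager page.

    Go to Secret Manager

  2. Click + Create Secret.

  3. Enter the name of your secret. Secret names can contain only English letters (A-Z), numbers (0-9), dashes (-), and underscores (_).

  4. The following fields are optional:

    1. To upload a file that contains your secret, click Browse.
    2. Read the Replication policy.
    3. If you want to manually manage the locations for your secret, select Manually manage locations for this secret. You must select at least one region.
    4. Select your encryption option.
    5. If you want to manually set your rotation period, select Set rotation period.
    6. If you want to specify Pub/Sub topics to receive event notifications, click Add topics.
    7. By default, the secret never expires. If you want to set an expiration date, select Set expiration date.
    8. By default, secret versions are destroyed upon request. To delay the destruction of secret versions, select Set duration for delayed destruction.
    9. If you want to use labels to organize and categorize your secrets, click + Add label.
    10. If you want to use annotations to attach non-identifying metadata to your secrets, click + Add annotation.
  5. Click Create secret.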
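The naming rule in step 3 can be checked before you create the secret. The following is a minimal sketch with a hypothetical function name; it enforces only the character set stated above and nothing else.

```python
import re


def is_valid_secret_name(name: str) -> bool:
    """Check that a name uses only letters, digits, dashes, and underscores."""
    return re.fullmatch(r"[A-Za-z0-9_-]+", name) is not None


print(is_valid_secret_name("MyWeaviateApiKeySecret"))
```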

REST

# Create a secret in SecretManager.
curl "https://secretmanager.googleapis.com/v1/projects/${PROJECT_ID}/secrets?secretId=${SECRET_NAME}" \
    --request "POST" \
    --header "authorization: Bearer $(gcloud auth print-access-token)" \
    --header "content-type: application/json" \
    --data "{\"replication\": {\"automatic\": {}}}"

Python

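Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.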

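from typing import Optional

# Imported at module level because the type hints below are evaluated
# when the function is defined.
from google.cloud import secretmanager


def create_secret(
    project_id: str, secret_id: str, ttl: Optional[str] = None
) -> secretmanager.Secret: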
    """
    Create a new secret with the given name. A secret is a logical wrapper
    around a collection of secret versions. Secret versions hold the actual
    secret material.

     Args:
        project_id (str): The project ID where the secret is to be created.
        secret_id (str): The ID to assign to the new secret. This ID must be unique within the project.
        ttl (Optional[str]): An optional string that specifies the secret's time-to-live in seconds with
                             format (e.g., "900s" for 15 minutes). If specified, the secret
                             versions will be automatically deleted upon reaching the end of the TTL period.

    Returns:
        secretmanager.Secret: An object representing the newly created secret, containing details like the
                              secret's name, replication settings, and optionally its TTL.

    Example:
        # Create a secret with automatic replication and no TTL
        new_secret = create_secret("my-project", "my-new-secret")

        # Create a secret with a TTL of 90 days (7,776,000 seconds)
        new_secret_with_ttl = create_secret("my-project", "my-timed-secret", "7776000s")
    """

    # Import the Secret Manager client library.
    from google.cloud import secretmanager

    # Create the Secret Manager client.
    client = secretmanager.SecretManagerServiceClient()

    # Build the resource name of the parent project.
    parent = f"projects/{project_id}"

    # Create the secret.
    response = client.create_secret(
        request={
            "parent": parent,
            "secret_id": secret_id,
            "secret": {"replication": {"automatic": {}}, "ttl": ttl},
        }
    )

    # Print the new secret name.
    print(f"Created secret: {response.name}")

    # Return the created secret, as the return type hint declares.
    return response

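Set permissions

You must grant Secret Manager permissions to your service account.

Console

  1. In the IAM & Admin section of the Google Cloud console, find your service account, and click the pencil icon to edit.

  2. In the Role field, select Secret Manager Secret Accessor.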

Python

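Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.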

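# Imported at module level because the return type hint is evaluated
# when the function is defined.
from google.cloud import secretmanager
from google.iam.v1 import policy_pb2  # type: ignore


def iam_grant_access(
    project_id: str, secret_id: str, member: str
) -> policy_pb2.Policy: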
    """
    Grant the given member access to a secret.
    """

    # Import the Secret Manager client library.
    from google.cloud import secretmanager

    # Create the Secret Manager client.
    client = secretmanager.SecretManagerServiceClient()

    # Build the resource name of the secret.
    name = client.secret_path(project_id, secret_id)

    # Get the current IAM policy.
    policy = client.get_iam_policy(request={"resource": name})

    # Add the given member with access permissions.
    policy.bindings.add(role="roles/secretmanager.secretAccessor", members=[member])

    # Update the IAM Policy.
    new_policy = client.set_iam_policy(request={"resource": name, "policy": policy})

    # Print data about the secret.
    print(f"Updated IAM policy on {secret_id}")

    # Return the updated IAM policy.
    return new_policy

Add a secret version

REST

# TODO(developer): Update the variables.
# Select a resource name for your Secret, which contains your API Key.
SECRET_NAME="MyWeaviateApiKeySecret"

# Your Weaviate API Key.
WEAVIATE_API_KEY="example-api-key"
# Encode your WEAVIATE_API_KEY using base 64.
# printf avoids the trailing newline that echo would add to the stored key.
SECRET_DATA=$(printf '%s' "${WEAVIATE_API_KEY}" | base64 | tr -d '\n')

# Create a new version of your secret which uses SECRET_DATA as payload
curl "https://secretmanager.googleapis.com/v1/projects/${PROJECT_ID}/secrets/${SECRET_NAME}:addVersion" \
    --request "POST" \
    --header "authorization: Bearer $(gcloud auth print-access-token)" \
    --header "content-type: application/json" \
    --data "{\"payload\": {\"data\": \"${SECRET_DATA}\"}}"
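The payload sent to the addVersion method is the base64 encoding of the exact key bytes. A shell pipeline that uses echo without -n appends a newline, which would become part of the stored secret. The following sketch, with a hypothetical function name, shows the same encoding step in Python, where no newline is added.

```python
import base64


def encode_secret_payload(api_key: str) -> str:
    """Base64-encode the exact bytes of the API key for the addVersion payload."""
    return base64.b64encode(api_key.encode("utf-8")).decode("ascii")


payload = encode_secret_payload("example-api-key")

# Decoding returns the original key with no extra whitespace.
assert base64.b64decode(payload).decode("utf-8") == "example-api-key"
print(payload)
```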

Python

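Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.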

from google.cloud import secretmanager
import google_crc32c  # type: ignore


def add_secret_version(
    project_id: str, secret_id: str, payload: str
) -> secretmanager.SecretVersion:
    """
    Add a new secret version to the given secret with the provided payload.
    """

    # Create the Secret Manager client.
    client = secretmanager.SecretManagerServiceClient()

    # Build the resource name of the parent secret.
    parent = client.secret_path(project_id, secret_id)

    # Convert the string payload into a bytes. This step can be omitted if you
    # pass in bytes instead of a str for the payload argument.
    payload_bytes = payload.encode("UTF-8")

    # Calculate payload checksum. Passing a checksum in add-version request
    # is optional.
    crc32c = google_crc32c.Checksum()
    crc32c.update(payload_bytes)

    # Add the secret version.
    response = client.add_secret_version(
        request={
            "parent": parent,
            "payload": {
                "data": payload_bytes,
                "data_crc32c": int(crc32c.hexdigest(), 16),
            },
        }
    )

    # Print the new secret version name.
    print(f"Added secret version: {response.name}")

    # Return the new secret version, as the return type hint declares.
    return response

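Use Weaviate with Llama 3

The Model Garden RAG API notebook demonstrates how to use the Vertex AI SDK for Python with a Weaviate corpus and a Llama 3 model. To use the notebook, you must do the following:

  1. Set up your Weaviate database.

  2. Set up your Secret Manager.

  3. Use the Model Garden RAG API notebook.

For more examples, see Examples.

Create a RAG corpus

This code sample demonstrates how to create a RAG corpus and set the Weaviate instance as its vector database.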

REST

  # TODO(developer): Update the variables.
  PROJECT_ID="YOUR_PROJECT_ID"
  # The HTTPS/HTTP Weaviate endpoint you created during provisioning.
  HTTP_ENDPOINT_NAME="https://your.weaviate.endpoint.com"

  # Your Weaviate collection name, which roughly corresponds to a RAG Engine corpus.
  # For example, "MyCollectionName"
  # Note that the first letter needs to be capitalized.
  # Otherwise, Weaviate will capitalize it for you.
  WEAVIATE_COLLECTION_NAME="MyCollectionName"

  # The name of the Secret that contains your Weaviate API key.
  SECRET_NAME="MyWeaviateApiKeySecret"
  # The Secret Manager resource name containing the API Key for your Weaviate endpoint.
  # For example, projects/{project}/secrets/{secret}/versions/latest
  APIKEY_SECRET_VERSION="projects/${PROJECT_ID}/secrets/${SECRET_NAME}/versions/latest"

  # Select a Corpus display name.
  CORPUS_DISPLAY_NAME="SpecialCorpus"

  # Call CreateRagCorpus API and set all Vector DB Config parameters for Weaviate to create a new corpus associated to your selected Weaviate collection.
  curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora \
  -d '{
        "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
        "rag_vector_db_config" : {
                "weaviate": {
                      "http_endpoint": '\""${HTTP_ENDPOINT_NAME}"\"',
                      "collection_name": '\""${WEAVIATE_COLLECTION_NAME}"\"'
                },
          "api_auth" : {
                  "api_key_config": {
                        "api_key_secret_version": '\""${APIKEY_SECRET_VERSION}"\"'
                  }
          }
        }
    }'

  # TODO(developer): Update the variables.
  # Get operation_id returned in CreateRagCorpus.
  OPERATION_ID="your-operation-id"

  # Poll Operation status until done = true in the response.
  curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/operations/${OPERATION_ID}

  # Call ListRagCorpora API to verify the RAG corpus is created successfully.
  curl -sS -X GET \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora"

Python

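Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.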


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# weaviate_http_endpoint = "weaviate-http-endpoint"
# weaviate_collection_name = "weaviate-collection-name"
# weaviate_api_key_secret_manager_version = "projects/{PROJECT_ID}/secrets/{SECRET_NAME}/versions/latest"
# display_name = "test_corpus"
# description = "Corpus Description"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

# Configure embedding model (Optional)
embedding_model_config = rag.EmbeddingModelConfig(
    publisher_model="publishers/google/models/text-embedding-004"
)

# Configure Vector DB
vector_db = rag.Weaviate(
    weaviate_http_endpoint=weaviate_http_endpoint,
    collection_name=weaviate_collection_name,
    api_key=weaviate_api_key_secret_manager_version,
)

corpus = rag.create_corpus(
    display_name=display_name,
    description=description,
    embedding_model_config=embedding_model_config,
    vector_db=vector_db,
)
print(corpus)
# Example response:
# RagCorpus(name='projects/1234567890/locations/us-central1/ragCorpora/1234567890',
# display_name='test_corpus', description='Corpus Description', embedding_model_config=...
# ...
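The REST flow for creating a corpus polls the returned operation until done is true. The following is a minimal sketch of that polling loop. The function name is hypothetical, and the caller supplies a get_operation callable that returns the operation as a dictionary, for example by issuing the GET request shown in the REST example. The done and error keys follow the standard long-running operation response.

```python
import time
from typing import Callable


def wait_for_operation(
    get_operation: Callable[[], dict],
    interval_s: float = 5.0,
    timeout_s: float = 300.0,
) -> dict:
    """Call get_operation until the operation reports done, then return it."""
    deadline = time.monotonic() + timeout_s
    while True:
        operation = get_operation()
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not finish before the timeout")
        time.sleep(interval_s)


# Simulated responses stand in for real API calls in this example.
responses = iter([{"name": "op"}, {"name": "op", "done": True, "response": {}}])
result = wait_for_operation(lambda: next(responses), interval_s=0)
print(result["done"])
```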

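Use RAG files

The RAG API handles file upload, import, listing, and deletion.

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • INPUT_FILE: The path of a local file.
  • FILE_DISPLAY_NAME: The display name of the RagFile.
  • RAG_FILE_DESCRIPTION: The description of the RagFile.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload

Request JSON body: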

{
 "rag_file": {
  "display_name": "FILE_DISPLAY_NAME",
  "description": "RAG_FILE_DESCRIPTION"
 }
}

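To send your request, choose one of these options:

curl

Save the request body in a file named INPUT_FILE, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @INPUT_FILE \
"https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload"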

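PowerShell

Save the request body in a file named INPUT_FILE, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }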

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile INPUT_FILE `
-Uri "https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload" | Select-Object -Expand Content
A successful response returns the RagFile resource. The last component of the RagFile.name field is the server-generated rag_file_id.
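Because the rag_file_id is the last component of RagFile.name, it can be extracted with a simple split. The following is a small sketch with a hypothetical helper name; the sample resource name uses made-up IDs.

```python
def rag_file_id(rag_file_name: str) -> str:
    """Return the last component of a RagFile resource name."""
    parts = rag_file_name.split("/")
    if len(parts) < 2 or parts[-2] != "ragFiles" or not parts[-1]:
        raise ValueError(f"not a RagFile resource name: {rag_file_name!r}")
    return parts[-1]


name = "projects/my-project/locations/us-central1/ragCorpora/1234567890/ragFiles/09876543"
print(rag_file_id(name))
```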

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"
# path = "path/to/local/file.txt"
# display_name = "file_display_name"
# description = "file description"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_file = rag.upload_file(
    corpus_name=corpus_name,
    path=path,
    display_name=display_name,
    description=description,
)
print(rag_file)
# RagFile(name='projects/[PROJECT_ID]/locations/us-central1/ragCorpora/1234567890/ragFiles/09876543',
#  display_name='file_display_name', description='file description')

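Import RAG files

You can import files and folders from Google Drive or Cloud Storage.

REST

Use response.metadata to view partial failures, request time, and response time in the SDK's response object.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • GCS_URIS: A list of Cloud Storage locations. Example: gs://my-bucket1, gs://my-bucket2
  • DRIVE_RESOURCE_ID: The ID of the Google Drive resource. Examples:
    • https://drive.google.com/file/d/ABCDE
    • https://drive.google.com/corp/drive/u/0/folders/ABCDEFG
  • DRIVE_RESOURCE_TYPE: The type of the Google Drive resource. Options:
    • RESOURCE_TYPE_FILE - File
    • RESOURCE_TYPE_FOLDER - Folder
  • CHUNK_SIZE: Optional. The number of tokens each chunk should have.
  • CHUNK_OVERLAP: Optional. The number of tokens that overlap between chunks.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import

Request JSON body: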

{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": GCS_URIS
    },
    "google_drive_source": {
      "resource_ids": {
        "resource_id": DRIVE_RESOURCE_ID,
        "resource_type": DRIVE_RESOURCE_TYPE
      }
    },
    "rag_file_chunking_config": {
      "chunk_size": CHUNK_SIZE,
      "chunk_overlap": CHUNK_OVERLAP
    }
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import"

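PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }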

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import" | Select-Object -Expand Content

A successful response returns the ImportRagFilesOperationMetadata resource.

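The following sample demonstrates how to import a file from Cloud Storage. Use the max_embedding_requests_per_min control field to limit the rate at which RAG Engine calls the embedding model during the ImportRagFiles indexing process. The field has a default value of 1000 calls per minute.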

# Cloud Storage bucket/file location.
# Such as "gs://rag-e2e-test/"
GCS_URIS=YOUR_GCS_LOCATION

# Enter the QPM rate to limit RAG's access to your embedding model
# Example: 1000
EMBEDDING_MODEL_QPM_RATE=MAX_EMBEDDING_REQUESTS_PER_MIN_LIMIT

# ImportRagFiles
# Import a single Cloud Storage file or all files in a Cloud Storage bucket.
# Input: ENDPOINT, PROJECT_ID, RAG_CORPUS_ID, GCS_URIS
# Output: ImportRagFilesOperationMetadataNumber
# Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": '\""${GCS_URIS}"\"'
    },
    "rag_file_chunking_config": {
      "chunk_size": 512
    },
    "max_embedding_requests_per_min": '"${EMBEDDING_MODEL_QPM_RATE}"'
  }
}'

# Poll the operation status.
# The response contains the number of files imported.
OPERATION_ID=OPERATION_ID
poll_op_wait ${OPERATION_ID}
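
The poll_op_wait helper called above is not shown here. As a rough illustration of what such a helper might do, the sketch below polls a long-running operation until its done field is true. poll_until_done and fetch_operation are hypothetical names; fetch_operation stands in for whatever call returns the operation as a dict (for example, a GET on the operations URL).

```python
import time


def poll_until_done(fetch_operation, interval_s=5.0, timeout_s=600.0):
    """Call fetch_operation() repeatedly until the result has done == True.

    Raises TimeoutError if the operation is still running after timeout_s.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        operation = fetch_operation()
        if operation.get("done"):
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not finish in time")
        time.sleep(interval_s)


# Simulated operation that finishes on the third poll.
_fake_responses = iter([
    {"done": False},
    {"done": False},
    {"done": True, "response": {}},
])
final_operation = poll_until_done(lambda: next(_fake_responses), interval_s=0.0)
print(final_operation)
```

In real use, a longer interval or an exponential backoff keeps the polling from consuming request quota.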

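The following example demonstrates how to import a file from Google Drive. Use the max_embedding_requests_per_min control field to limit the rate at which RAG Engine calls the embedding model during the ImportRagFiles indexing process. The field has a default value of 1,000 calls per minute.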

# Google Drive folder location.
FOLDER_RESOURCE_ID=YOUR_GOOGLE_DRIVE_FOLDER_RESOURCE_ID

# Enter the QPM rate to limit RAG Engine's access to your embedding model.
# Example: 1000
EMBEDDING_MODEL_QPM_RATE=MAX_EMBEDDING_REQUESTS_PER_MIN_LIMIT

# ImportRagFiles
# Import all files in a Google Drive folder.
# Input: ENDPOINT, PROJECT_ID, LOCATION, RAG_CORPUS_ID, FOLDER_RESOURCE_ID
# Output: ImportRagFilesOperationMetadataNumber
# Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "google_drive_source": {
      "resource_ids": {
        "resource_id": '\""${FOLDER_RESOURCE_ID}"\"',
        "resource_type": "RESOURCE_TYPE_FOLDER"
      }
    },
    "max_embedding_requests_per_min": '"${EMBEDDING_MODEL_QPM_RATE}"'
  }
}'

# Poll the operation status.
# The response contains the number of files imported.
OPERATION_ID=OPERATION_ID
poll_op_wait ${OPERATION_ID}

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"
# paths = ["https://drive.google.com/file/123", "gs://my_bucket/my_files_dir"]  # Supports Google Cloud Storage and Google Drive Links

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

response = rag.import_files(
    corpus_name=corpus_name,
    paths=paths,
    chunk_size=512,  # Optional
    chunk_overlap=100,  # Optional
    max_embedding_requests_per_min=900,  # Optional
)
print(f"Imported {response.imported_rag_files_count} files.")
# Example response:
# Imported 2 files.
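
To make the chunk_size and chunk_overlap parameters used above more concrete, the following sketch splits a token list into overlapping windows. It is only an illustration of the sliding-window idea: chunk_tokens is a hypothetical function, whitespace-separated words stand in for real tokens, and the chunking that RAG Engine performs internally may differ.

```python
def chunk_tokens(tokens, chunk_size, chunk_overlap):
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current window already reaches the end of the input.
        if start + chunk_size >= len(tokens):
            break
        start += step
    return chunks


words = "the quick brown fox jumps over the lazy dog today".split()
chunks = chunk_tokens(words, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

Each chunk starts with the last token of the previous chunk, so text near a boundary appears in both neighbors.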

Get a RAG file

REST

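Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • RAG_FILE_ID: The ID of the RagFile resource.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"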

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content

A successful response returns the RagFile resource.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# file_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_file = rag.get_file(name=file_name)
print(rag_file)
# Example response:
# RagFile(name='projects/1234567890/locations/us-central1/ragCorpora/11111111111/ragFiles/22222222222',
# display_name='file_display_name', description='file description')
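
The SDK takes a full resource name, whereas the REST URL takes the individual IDs. The following sketch shows one way to split a RagFile resource name into its parts. parse_rag_file_name is a hypothetical helper, not part of the SDK, and it assumes the name follows the projects/.../locations/.../ragCorpora/.../ragFiles/... pattern shown in the example above.

```python
import re

# Pattern for a RagFile resource name, based on the format shown on this page.
_RAG_FILE_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/ragCorpora/(?P<rag_corpus>[^/]+)"
    r"/ragFiles/(?P<rag_file>[^/]+)$"
)


def parse_rag_file_name(name):
    """Return the project, location, rag_corpus, and rag_file parts of a name."""
    match = _RAG_FILE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a RagFile resource name: {name!r}")
    return match.groupdict()


parts = parse_rag_file_name(
    "projects/1234567890/locations/us-central1"
    "/ragCorpora/11111111111/ragFiles/22222222222"
)
print(parts)
```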

List RAG files

REST

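Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • PAGE_SIZE: The standard list page size. You can adjust the number of RagFiles returned per page by updating the page_size parameter.
  • PAGE_TOKEN: The standard list page token. It is typically obtained from ListRagFilesResponse.next_page_token of the previous VertexRagDataService.ListRagFiles call.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"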

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content

You should receive a successful status code (2xx) and a list of RagFiles under the given RAG_CORPUS_ID.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

files = rag.list_files(corpus_name=corpus_name)
for file in files:
    print(file.display_name)
    print(file.name)
# Example response:
# g-drive_file.txt
# projects/1234567890/locations/us-central1/ragCorpora/111111111111/ragFiles/222222222222
# g_cloud_file.txt
# projects/1234567890/locations/us-central1/ragCorpora/111111111111/ragFiles/333333333333
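
When you call the REST endpoint directly, PAGE_SIZE and PAGE_TOKEN have to be threaded through successive requests yourself. The loop below is a sketch of that pattern under stated assumptions: iter_rag_files and fetch_page are hypothetical names, and the response keys ragFiles and nextPageToken are assumed JSON spellings of the rag_files and next_page_token fields.

```python
def iter_rag_files(fetch_page, page_size=100):
    """Yield every RagFile by following page tokens until none is returned."""
    page_token = ""
    while True:
        page = fetch_page(page_size=page_size, page_token=page_token)
        yield from page.get("ragFiles", [])
        page_token = page.get("nextPageToken", "")
        if not page_token:
            return


# Simulated two-page listing keyed by the page token that requests it.
_fake_pages = {
    "": {
        "ragFiles": [{"displayName": "a.txt"}, {"displayName": "b.txt"}],
        "nextPageToken": "token-1",
    },
    "token-1": {"ragFiles": [{"displayName": "c.txt"}]},
}


def fake_fetch_page(page_size, page_token):
    return _fake_pages[page_token]


names = [f["displayName"] for f in iter_rag_files(fake_fetch_page, page_size=2)]
print(names)
```

In real use, fetch_page would issue the GET request shown in the REST tab and return the parsed JSON response.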

Delete a RAG file

REST

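Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • RAG_FILE_ID: The ID of the RagFile resource, which is the final segment of the resource name. Resource name format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}

HTTP method and URL:

DELETE https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X DELETE \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"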

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method DELETE `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content

A successful response returns the DeleteOperationMetadata resource.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# file_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag.delete_file(name=file_name)
print(f"File {file_name} deleted.")
# Example response:
# Successfully deleted the RagFile.
# File projects/1234567890/locations/us-central1/ragCorpora/1111111111/ragFiles/2222222222 deleted.

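Retrieve context

When a user asks a question or provides a prompt, the retrieval component in RAG searches its knowledge base to find information that is relevant to the query.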

REST

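Before using any of the request data, make the following replacements:

  • LOCATION: The region to process the request.
  • PROJECT_ID: Your project ID.
  • RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}
  • VECTOR_DISTANCE_THRESHOLD: Only contexts with a vector distance smaller than the threshold are returned.
  • TEXT: The query text to get relevant contexts for.
  • SIMILARITY_TOP_K: The number of top contexts to retrieve.

HTTP method and URL: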

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts

Request JSON body:

{
  "vertex_rag_store": {
    "rag_resources": {
      "rag_corpus": "RAG_CORPUS_RESOURCE"
    },
    "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
  },
  "query": {
    "text": "TEXT",
    "similarity_top_k": SIMILARITY_TOP_K
  }
}

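To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts"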

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts" | Select-Object -Expand Content

You should receive a successful status code (2xx) and a list of related RagFiles.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/[PROJECT_ID]/locations/us-central1/ragCorpora/[rag_corpus_id]"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=corpus_name,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="Hello World!",
    similarity_top_k=10,  # Optional
    vector_distance_threshold=0.5,  # Optional
)
print(response)
# Example response:
# contexts {
#   contexts {
#     source_uri: "gs://your-bucket-name/file.txt"
#     text: "....
#   ....
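
The similarity_top_k and vector_distance_threshold parameters narrow the results in two different ways: the threshold drops contexts that are too far from the query, and top-k caps how many of the remaining contexts are returned. The sketch below illustrates that interaction on made-up distances. select_contexts is a hypothetical function; how the service applies these parameters internally is not described on this page.

```python
def select_contexts(scored_chunks, vector_distance_threshold, similarity_top_k):
    """Keep chunks closer than the threshold, then return the k nearest.

    scored_chunks is a list of (text, distance) pairs; smaller is closer.
    """
    kept = [c for c in scored_chunks if c[1] < vector_distance_threshold]
    kept.sort(key=lambda c: c[1])
    return kept[:similarity_top_k]


scored = [("chunk-a", 0.20), ("chunk-b", 0.70), ("chunk-c", 0.40), ("chunk-d", 0.45)]
top_contexts = select_contexts(scored, vector_distance_threshold=0.5, similarity_top_k=2)
print(top_contexts)
```

Here chunk-b is removed by the threshold and chunk-d by the top-k limit, which leaves chunk-a and chunk-c.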

Generate content

A prediction controls the LLM method that generates content.

REST

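Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • MODEL_ID: The LLM model for content generation. Example: gemini-1.5-pro-002
  • GENERATION_METHOD: The LLM method for content generation. Options: generateContent, streamGenerateContent
  • INPUT_PROMPT: The text sent to the LLM for content generation. Try to use a prompt that is relevant to the uploaded RAG files.
  • RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}
  • SIMILARITY_TOP_K (optional): The number of top contexts to retrieve.
  • VECTOR_DISTANCE_THRESHOLD (optional): Only contexts with a vector distance smaller than the threshold are returned.

HTTP method and URL: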

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD

Request JSON body:

{
 "contents": {
  "role": "user",
  "parts": {
    "text": "INPUT_PROMPT"
  }
 },
 "tools": {
  "retrieval": {
   "disable_attribution": false,
   "vertex_rag_store": {
    "rag_resources": {
      "rag_corpus": "RAG_CORPUS_RESOURCE"
    },
    "similarity_top_k": SIMILARITY_TOP_K,
    "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
   }
  }
 }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD" | Select-Object -Expand Content

A successful response returns the generated content with citations.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
from vertexai.preview.generative_models import GenerativeModel, Tool
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=corpus_name,
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            similarity_top_k=3,  # Optional
            vector_distance_threshold=0.5,  # Optional
        ),
    )
)

rag_model = GenerativeModel(
    model_name="gemini-1.5-flash-001", tools=[rag_retrieval_tool]
)
response = rag_model.generate_content("Why is the sky blue?")
print(response.text)
# Example response:
#   The sky appears blue due to a phenomenon called Rayleigh scattering.
#   Sunlight, which contains all colors of the rainbow, is scattered
#   by the tiny particles in the Earth's atmosphere....
#   ...
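
For the REST variant of this request, the URL and body can be assembled from the same placeholders listed in the REST tab. The following is a sketch only: build_generate_request is a hypothetical helper that mirrors the JSON shape shown in the REST tab and validates GENERATION_METHOD against the two documented options. It does not send the request.

```python
import json

ALLOWED_METHODS = ("generateContent", "streamGenerateContent")


def build_generate_request(project_id, location, model_id, method, prompt,
                           rag_corpus, similarity_top_k=None,
                           vector_distance_threshold=None):
    """Return (url, body) for a RAG-grounded generation request."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"method must be one of {ALLOWED_METHODS}")
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1beta1"
        f"/projects/{project_id}/locations/{location}"
        f"/publishers/google/models/{model_id}:{method}"
    )
    store = {"rag_resources": {"rag_corpus": rag_corpus}}
    # Optional fields are only sent when a value is supplied.
    if similarity_top_k is not None:
        store["similarity_top_k"] = similarity_top_k
    if vector_distance_threshold is not None:
        store["vector_distance_threshold"] = vector_distance_threshold
    body = {
        "contents": {"role": "user", "parts": {"text": prompt}},
        "tools": {
            "retrieval": {
                "disable_attribution": False,
                "vertex_rag_store": store,
            }
        },
    }
    return url, body


url, body = build_generate_request(
    project_id="my-project",
    location="us-central1",
    model_id="gemini-1.5-pro-002",
    method="generateContent",
    prompt="Why is the sky blue?",
    rag_corpus="projects/my-project/locations/us-central1/ragCorpora/123",
    similarity_top_k=3,
)
print(url)
print(json.dumps(body, indent=1))
```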

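The Weaviate database supports hybrid search, which combines semantic search and keyword search to improve the relevance of search results. During retrieval, the similarity scores from the semantic match (dense vectors) and the keyword match (sparse vectors) are combined to produce the final ranked results.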
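
As a rough illustration of how a weight can balance the two kinds of similarity, the sketch below ranks candidates by a weighted sum of a dense score and a sparse score. This is a simplified model, not Weaviate's implementation: hybrid_score, rank_hybrid, and the alpha parameter are hypothetical names, scores are assumed to be already normalized so that higher means more similar, and Weaviate's own fusion may normalize or rank scores differently.

```python
def hybrid_score(dense_score, sparse_score, alpha):
    """Blend two similarity scores; alpha=1 is dense only, alpha=0 is sparse only."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * dense_score + (1.0 - alpha) * sparse_score


def rank_hybrid(candidates, alpha):
    """Sort (doc_id, dense_score, sparse_score) tuples by blended score, best first."""
    return sorted(
        candidates,
        key=lambda c: hybrid_score(c[1], c[2], alpha),
        reverse=True,
    )


candidates = [
    ("doc-a", 0.9, 0.1),  # strong semantic match, weak keyword match
    ("doc-b", 0.5, 0.8),  # moderate semantic match, strong keyword match
    ("doc-c", 0.2, 0.3),
]
for alpha in (1.0, 0.5, 0.0):
    order = [doc_id for doc_id, _, _ in rank_hybrid(candidates, alpha)]
    print(alpha, order)
```

With alpha at 1.0, doc-a ranks first on semantic similarity alone. At 0.5 and 0.0, doc-b moves to the top because its keyword match carries more weight.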

Use hybrid search with the RAG Engine retrieval API

The following example shows how to enable hybrid search using the RAG Engine retrieval API.

REST

  # TODO(developer): Update the variables.
  PROJECT_ID="YOUR_PROJECT_ID"
  # The HTTPS/HTTP Weaviate endpoint you created during provisioning.
  HTTP_ENDPOINT_NAME="https://your.weaviate.endpoint.com"

  # Your Weaviate collection name, which roughly corresponds to a RAG Engine corpus.
  # For example, "MyCollectionName"
  # Note that the first letter needs to be capitalized.
  # Otherwise, Weaviate will capitalize it for you.
  WEAVIATE_COLLECTION_NAME="MyCollectionName"

  # The name of the Secret Manager secret that stores your Weaviate API key.
  SECRET_NAME="MyWeaviateApiKeySecret"
  # The Secret Manager resource name containing the API Key for your Weaviate endpoint.
  # For example, projects/{project}/secrets/{secret}/versions/latest
  APIKEY_SECRET_VERSION="projects/${PROJECT_ID}/secrets/${SECRET_NAME}/versions/latest"

  # Select a Corpus display name.
  CORPUS_DISPLAY_NAME="SpecialCorpus"

  # Call CreateRagCorpus API and set all Vector DB Config parameters for Weaviate to create a new corpus associated to your selected Weaviate collection.
  curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora \
  -d '{
        "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
        "rag_vector_db_config" : {
                "weaviate": {
                      "http_endpoint": '\""${HTTP_ENDPOINT_NAME}"\"',
                      "collection_name": '\""${WEAVIATE_COLLECTION_NAME}"\"'
                },
          "api_auth" : {
                  "api_key_config": {
                        "api_key_secret_version": '\""${APIKEY_SECRET_VERSION}"\"'
                  }
          }
        }
    }'

  # TODO(developer): Update the variables.
  # Get operation_id returned in CreateRagCorpus.
  OPERATION_ID="your-operation-id"

  # Poll Operation status until done = true in the response.
  curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/operations/${OPERATION_ID}

  # Call ListRagCorpora API to verify the RAG corpus is created successfully.
  curl -sS -X GET \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora"

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/[PROJECT_ID]/locations/us-central1/ragCorpora/[rag_corpus_id]"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=corpus_name,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="Hello World!",
    similarity_top_k=10,  # Optional
    vector_distance_threshold=0.5,  # Optional
)
print(response)
# Example response:
# contexts {
#   contexts {
#     source_uri: "gs://your-bucket-name/file.txt"
#     text: "....
#   ....
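
The setup variables in the REST tab of this section carry two conventions that are easy to get wrong: the collection name starts with a capital letter, and the API key is referenced by its Secret Manager version resource name. The sketch below captures both as small helpers. normalize_collection_name and secret_version_name are hypothetical names, and the capitalization rule is taken from the comment in the REST example rather than from Weaviate's documentation.

```python
def normalize_collection_name(name):
    """Return the name with its first letter capitalized, leaving the rest unchanged."""
    if not name:
        raise ValueError("collection name must not be empty")
    return name[0].upper() + name[1:]


def secret_version_name(project_id, secret_name, version="latest"):
    """Build a Secret Manager version resource name in the format used above."""
    return f"projects/{project_id}/secrets/{secret_name}/versions/{version}"


collection = normalize_collection_name("myCollectionName")
secret_version = secret_version_name("my-project", "MyWeaviateApiKeySecret")
print(collection)
print(secret_version)
```

Normalizing the name up front keeps the name stored in the corpus configuration identical to the name that Weaviate ends up using.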

Use hybrid search and RAG Engine for grounded generation

The following example shows how to use hybrid search and RAG Engine for grounded generation.

REST

  # TODO(developer): Update the variables.
  PROJECT_ID="YOUR_PROJECT_ID"
  # The HTTPS/HTTP Weaviate endpoint you created during provisioning.
  HTTP_ENDPOINT_NAME="https://your.weaviate.endpoint.com"

  # Your Weaviate collection name, which roughly corresponds to a RAG Engine corpus.
  # For example, "MyCollectionName"
  # Note that the first letter needs to be capitalized.
  # Otherwise, Weaviate will capitalize it for you.
  WEAVIATE_COLLECTION_NAME="MyCollectionName"

  # The name of the Secret Manager secret that stores your Weaviate API key.
  SECRET_NAME="MyWeaviateApiKeySecret"
  # The Secret Manager resource name containing the API Key for your Weaviate endpoint.
  # For example, projects/{project}/secrets/{secret}/versions/latest
  APIKEY_SECRET_VERSION="projects/${PROJECT_ID}/secrets/${SECRET_NAME}/versions/latest"

  # Select a Corpus display name.
  CORPUS_DISPLAY_NAME="SpecialCorpus"

  # Call CreateRagCorpus API and set all Vector DB Config parameters for Weaviate to create a new corpus associated to your selected Weaviate collection.
  curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora \
  -d '{
        "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
        "rag_vector_db_config" : {
                "weaviate": {
                      "http_endpoint": '\""${HTTP_ENDPOINT_NAME}"\"',
                      "collection_name": '\""${WEAVIATE_COLLECTION_NAME}"\"'
                },
          "api_auth" : {
                  "api_key_config": {
                        "api_key_secret_version": '\""${APIKEY_SECRET_VERSION}"\"'
                  }
          }
        }
    }'

  # TODO(developer): Update the variables.
  # Get operation_id returned in CreateRagCorpus.
  OPERATION_ID="your-operation-id"

  # Poll Operation status until done = true in the response.
  curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/operations/${OPERATION_ID}

  # Call ListRagCorpora API to verify the RAG corpus is created successfully.
  curl -sS -X GET \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora"

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
from vertexai.preview.generative_models import GenerativeModel, Tool
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=corpus_name,
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            similarity_top_k=3,  # Optional
            vector_distance_threshold=0.5,  # Optional
        ),
    )
)

rag_model = GenerativeModel(
    model_name="gemini-1.5-flash-001", tools=[rag_retrieval_tool]
)
response = rag_model.generate_content("Why is the sky blue?")
print(response.text)
# Example response:
#   The sky appears blue due to a phenomenon called Rayleigh scattering.
#   Sunlight, which contains all colors of the rainbow, is scattered
#   by the tiny particles in the Earth's atmosphere....
#   ...

What's next