Use a Weaviate database with Vertex AI RAG Engine

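This page shows you how to connect your RAG Engine corpus to your Weaviate database.

You can also follow along using the notebook RAG Engine with Weaviate.

You can use a Weaviate database instance, which is an open source database, with RAG Engine to build indexes and run vector-based similarity searches. A similarity search finds pieces of text that are similar to the text that you're looking for, which requires an embedding model. The embedding model produces vector data for each piece of text being compared. RAG Engine uses the similarity search to retrieve semantic context for grounding, so that your LLM returns the most accurate content.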
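
As a rough illustration of vector-based similarity search, the following sketch ranks text chunks by cosine similarity to a query vector. The three-dimensional vectors, chunk IDs, and function names are made up for illustration. In practice, an embedding model produces the vectors and Weaviate performs the search, possibly with a different distance metric.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_chunks(query_vector, chunk_vectors):
    """Return chunk IDs ordered from most to least similar to the query."""
    scores = {
        chunk_id: cosine_similarity(query_vector, vector)
        for chunk_id, vector in chunk_vectors.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical embeddings; real embeddings have hundreds of dimensions.
chunks = {
    "chunk-a": [0.9, 0.1, 0.0],
    "chunk-b": [0.0, 1.0, 0.0],
    "chunk-c": [0.7, 0.7, 0.0],
}
ranking = rank_chunks([1.0, 0.0, 0.0], chunks)
print(ranking)
```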

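With RAG Engine, you can continue to use your fully managed vector database instance, which you are responsible for provisioning. RAG Engine uses your vector database for storage, index management, and search.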

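Considerations

Consider the following before you use the Weaviate database:

  1. You must create, configure, and deploy your Weaviate database instance and collection. Follow the instructions in Create your Weaviate collection to set up a collection based on your schema.
  2. You must provide a Weaviate API key, which lets RAG Engine interact with the Weaviate database. RAG Engine supports API key-based authentication (AuthN) and authorization (AuthZ) to connect to your Weaviate database, and it supports HTTPS connections.
  3. RAG Engine doesn't store or manage your Weaviate API key. Instead, do the following:
    1. Store your key in Google Cloud Secret Manager.
    2. Grant your project's service account permissions to access your secret.
    3. Provide RAG Engine with your secret's resource name.
    4. When RAG Engine interacts with your Weaviate database, it uses your service account to access the secret resource.
  4. A RAG Engine corpus and a Weaviate collection have a one-to-one mapping. RAG files are stored in a Weaviate database collection. When you call the CreateRagCorpus API or the UpdateRagCorpus API, the RAG corpus is associated with the database collection.
  5. In addition to dense embeddings-based semantic searches, RAG Engine supports hybrid search through a Weaviate database. In a hybrid search, you can also adjust the weight between dense and sparse vector similarity.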

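Provision the Weaviate database

Before using the Weaviate database with RAG Engine, complete the following steps:

  1. Configure and deploy your Weaviate database instance.
  2. Prepare the HTTPS endpoint.
  3. Create your Weaviate collection.
  4. Use your API key to provision Weaviate with AuthN and AuthZ.
  5. Provision your RAG Engine service account.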

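Configure and deploy your Weaviate database instance

Follow the quickstart in the official Weaviate guide. Optionally, you can use the Google Cloud Marketplace guide instead.

You can set up your Weaviate instance anywhere, as long as the Weaviate endpoint is accessible to configure and deploy in your project. You can then fully manage your Weaviate database instance.

Because RAG Engine isn't involved in any stage of your Weaviate database instance lifecycle, it is your responsibility to grant RAG Engine the permissions it needs to store and search data in your Weaviate database. It is also your responsibility to ensure that the data in your database can be used by RAG Engine. For example, if you change your data, RAG Engine isn't responsible for any unexpected behavior caused by those changes.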

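Prepare the HTTPS endpoint

During Weaviate provisioning, make sure that you create an HTTPS endpoint. Although HTTP connections are supported, we recommend that traffic between RAG Engine and the Weaviate database use an HTTPS connection.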

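Create your Weaviate collection

Because a RAG Engine corpus and a Weaviate collection have a one-to-one mapping, you must create a collection in your Weaviate database before you associate the collection with the RAG Engine corpus. This one-time association is made when you call the CreateRagCorpus API or the UpdateRagCorpus API.

When you create a collection in Weaviate, you must use the following schema:

Property name     Data type
fileId            text
corpusId          text
chunkId           text
chunkDataType     text
chunkData         text
fileOriginalUri   text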
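
The schema above can be expressed as a collection definition. The following is a minimal sketch that builds such a definition as a Python dictionary. The payload shape (class, properties, name, dataType) mirrors the /v1/schema request in the REST sample later on this page, and the function name is hypothetical. The sketch uses the text data type from the table, whereas the REST sample sends string.

```python
import json

# Property names required by the schema table.
REQUIRED_PROPERTIES = [
    "fileId",
    "corpusId",
    "chunkId",
    "chunkDataType",
    "chunkData",
    "fileOriginalUri",
]


def build_collection_definition(collection_name):
    """Build a collection definition with the six required text properties."""
    return {
        "class": collection_name,
        "properties": [
            {"name": name, "dataType": ["text"]} for name in REQUIRED_PROPERTIES
        ],
    }


definition = build_collection_definition("MyCollectionName")
print(json.dumps(definition, indent=2))
```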

Use your API key to provision Weaviate with AuthN and AuthZ

To provision the Weaviate API key, complete the following steps:

  1. Create the Weaviate API key.
  2. Configure Weaviate to use the Weaviate API key.
  3. Store your Weaviate API key in Secret Manager.

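Create the API key

RAG Engine can connect to your Weaviate database instance only by using an API key for authentication and authorization. You must follow the official Weaviate authentication guide to configure API key-based authentication in your Weaviate database instance.

If creating the Weaviate API key requires identity information that is associated with RAG Engine, you must first create your first corpus and then use your RAG Engine service account as the identity.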

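Store your API key in Secret Manager

An API key holds Sensitive Personally Identifiable Information (SPII), which is subject to legal requirements. If SPII data is compromised or misused, an individual might experience significant risk or harm. To minimize risks to an individual while using RAG Engine, don't store or manage your API key outside Secret Manager, and avoid sharing the unencrypted API key.

To protect SPII, do the following:

  1. Store your API key in Secret Manager.
  2. Grant your RAG Engine service account the permissions to your secrets, and manage access control at the secret resource level.
    1. Go to your project's permissions.
    2. Enable the option Include Google-provided role grants.
    3. Find the service account, which has the format

      service-{project number}@gcp-sa-vertex-rag.iam.gserviceaccount.com

    4. Edit the service account's principals.
    5. Add the Secret Manager Secret Accessor role to the service account.
  3. When you create or update the RAG corpus, pass the secret resource name to RAG Engine, and store the secret resource name.

When API requests are made to your Weaviate database instance, RAG Engine uses the service account to read the API key from the corresponding secret resource in Secret Manager in your project.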

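Provision your RAG Engine service account

When you create the first resource in your project, RAG Engine creates a dedicated service account. You can find your service account on your project's IAM page. The service account follows this format:

service-{project number}@gcp-sa-vertex-rag.iam.gserviceaccount.com

For example, service-123456789@gcp-sa-vertex-rag.iam.gserviceaccount.com.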
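
The naming format can be captured in a small helper. This is a sketch only: the function names are hypothetical, and the serviceAccount: prefix is the standard IAM member form that you would pass when granting access to a secret.

```python
def rag_service_account(project_number):
    """Return the RAG Engine service account email for a project number."""
    number = str(project_number)
    if not number.isdigit():
        raise ValueError("project number must contain only digits")
    return f"service-{number}@gcp-sa-vertex-rag.iam.gserviceaccount.com"


def iam_member(project_number):
    """Return the IAM member string for the RAG Engine service account."""
    return f"serviceAccount:{rag_service_account(project_number)}"


print(rag_service_account(123456789))
print(iam_member(123456789))
```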

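When integrating with the Weaviate database, your service account is used in the following scenarios:

  • You can use your service account to generate your Weaviate API key for authentication. In some cases, generating the API key doesn't require any user information, which means that a service account isn't required when generating the API key.
  • You can bind your service account with the API key in your Weaviate database to configure authentication (AuthN) and authorization (AuthZ). However, your service account isn't required.
  • You can store the API key in Secret Manager in your project, and you can grant your service account permissions to these secret resources.
  • RAG Engine uses the service account to access the API key from Secret Manager in your project.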

Set up your Google Cloud console environment

Choose the instructions for your language or interface:

Python

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
  3. Enable the Vertex AI API.
  4. In the Google Cloud console, activate Cloud Shell.

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  5. If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

    If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.

  6. Install or update the Vertex AI SDK for Python by running the following command:

    pip3 install --upgrade "google-cloud-aiplatform>=1.38"

Node.js

  1. Complete the sign-in, project selection, API enablement, Cloud Shell, and local authentication steps listed under Python.
  2. Install or update the Vertex AI SDK for Node.js by running the following command:

    npm install @google-cloud/vertexai

Java

  1. Complete the sign-in, project selection, API enablement, Cloud Shell, and local authentication steps listed under Python.
  2. To add google-cloud-vertexai as a dependency, add the appropriate code for your environment:

    Maven with BOM

    Add the following XML to your pom.xml:

    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>com.google.cloud</groupId>
          <artifactId>libraries-bom</artifactId>
          <version>26.32.0</version>
          <type>pom</type>
          <scope>import</scope>
        </dependency>
      </dependencies>
    </dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>com.google.cloud</groupId>
        <artifactId>google-cloud-vertexai</artifactId>
      </dependency>
    </dependencies>

    Maven without BOM

    Add the following XML to your pom.xml:

    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>google-cloud-vertexai</artifactId>
      <version>0.4.0</version>
    </dependency>

    Gradle without BOM

    Add the following to your build.gradle:

    implementation 'com.google.cloud:google-cloud-vertexai:0.4.0'

Go

  1. Complete the sign-in, project selection, API enablement, Cloud Shell, and local authentication steps listed under Python.
  2. Review the available Vertex AI API Go packages to determine which package best meets your project's needs:

    • Package cloud.google.com/go/vertexai (recommended)

      vertexai is a human-authored package that provides access to common capabilities and features.

      This package is recommended as the starting point for most developers building with the Vertex AI API. To access capabilities and features not yet covered by this package, use the auto-generated aiplatform package instead.

    • Package cloud.google.com/go/aiplatform

      aiplatform is an auto-generated package.

      This package is intended for projects that require access to Vertex AI API capabilities and features not yet provided by the human-authored vertexai package.

  3. Install the Go package that you need by running one of the following commands:

    # Human authored package. Recommended for most developers.
    go get cloud.google.com/go/vertexai

    # Auto-generated package.
    go get cloud.google.com/go/aiplatform

C#

  1. Complete the sign-in, project selection, API enablement, Cloud Shell, and local authentication steps listed under Python.

REST

  1. Complete the sign-in, project selection, API enablement, and Cloud Shell steps listed under Python.
  2. Configure environment variables by entering the following. Replace PROJECT_ID with the ID of your Google Cloud project.

    MODEL_ID="gemini-2.0-flash-001"
    PROJECT_ID="PROJECT_ID"

  3. Provision the endpoint:

    gcloud beta services identity create --service=aiplatform.googleapis.com --project=${PROJECT_ID}

  4. Optional: If you are using Cloud Shell and you are asked to authorize Cloud Shell, click Authorize.

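Prepare your RAG corpus

To access data from your Weaviate database, RAG Engine must have access to a RAG corpus. This section describes how to create a single RAG corpus and additional RAG corpora.

Use the CreateRagCorpus and UpdateRagCorpus APIs

When you call the CreateRagCorpus and UpdateRagCorpus APIs, you must specify the following fields:

  • rag_vector_db_config.weaviate: The vector database configuration is chosen when you call the CreateRagCorpus API. It contains all of the configuration fields. If the rag_vector_db_config.weaviate field isn't set, then rag_vector_db_config.rag_managed_db is set by default.
  • weaviate.http_endpoint: The HTTPS or HTTP Weaviate endpoint that is created when you provision the Weaviate database instance.
  • weaviate.collection_name: The name of the collection that is created during Weaviate instance provisioning. The name must start with a capital letter.
  • api_auth.api_key_config: The configuration that specifies using an API key to authorize your access to the vector database.
  • api_key_config.api_key_secret_version: The resource name of the secret that is stored in Secret Manager, which contains your Weaviate API key.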
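
To show how these fields fit together, here is a minimal sketch that assembles a CreateRagCorpus request body as a Python dictionary. The nesting follows the CreateRagCorpus REST sample later on this page. The function name and the client-side capital-letter check are assumptions for illustration and are not part of any SDK.

```python
import json


def build_create_corpus_body(
    display_name, http_endpoint, collection_name, api_key_secret_version
):
    """Assemble a CreateRagCorpus request body that points at Weaviate."""
    # The collection name must start with a capital letter.
    if not collection_name or not collection_name[0].isupper():
        raise ValueError("collection name must start with a capital letter")
    return {
        "display_name": display_name,
        "rag_vector_db_config": {
            "weaviate": {
                "http_endpoint": http_endpoint,
                "collection_name": collection_name,
            },
            "api_auth": {
                "api_key_config": {
                    "api_key_secret_version": api_key_secret_version,
                }
            },
        },
    }


body = build_create_corpus_body(
    "SpecialCorpus",
    "https://your.weaviate.endpoint.com",
    "MyCollectionName",
    "projects/my-project/secrets/MyWeaviateApiKeySecret/versions/latest",
)
print(json.dumps(body, indent=2))
```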

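You can create a RAG corpus and associate it with the Weaviate collection in your database instance. However, you might need the service account to generate your API key and to configure your Weaviate database instance. The service account is generated when you create your first RAG corpus. After you create your first RAG corpus, the association between the Weaviate database and the API key might not be ready for use in the creation of another RAG corpus.

In case your database and key aren't ready to be associated with your RAG corpus, do the following to your RAG corpus:

  1. Set the weaviate field in rag_vector_db_config.

    • You can't change the associated vector database.
    • Leave both the http_endpoint and the collection_name fields empty. Both fields can be updated at a later time.
  2. If your API key isn't stored in Secret Manager yet, you can leave the api_auth field empty. You can update the api_auth field when you call the UpdateRagCorpus API. Weaviate requires that the following be done:

    1. Set the api_key_config in the api_auth field.
    2. Set the api_key_secret_version of your Weaviate API key in Secret Manager. The api_key_secret_version field uses the following format:

      projects/{project}/secrets/{secret}/versions/{version}

  3. If you specify fields that can be set only once, such as http_endpoint or collection_name, you can't change them unless you delete your RAG corpus and create it again. Other fields, such as the API key field api_key_secret_version, can be updated.

  4. When you call UpdateRagCorpus, you can set the vector_db field. The vector_db should be set to weaviate by your CreateRagCorpus API call. Otherwise, the system chooses the RAG Managed Database option, which is the default. This option can't be changed when you call the UpdateRagCorpus API. When you call UpdateRagCorpus and the vector_db field is partially set, you can update the fields that are marked as changeable (also referred to as mutable).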
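
The api_key_secret_version format from the steps above can be built and checked with a short helper. This is an illustrative sketch: the function names are hypothetical, and the regular expression only checks the overall shape of the resource name, not whether the secret exists.

```python
import re

# Matches projects/{project}/secrets/{secret}/versions/{version}.
_SECRET_VERSION_RE = re.compile(r"projects/([^/]+)/secrets/([^/]+)/versions/([^/]+)")


def secret_version_name(project, secret, version="latest"):
    """Build the resource name of a Secret Manager secret version."""
    return f"projects/{project}/secrets/{secret}/versions/{version}"


def parse_secret_version(name):
    """Split a secret version resource name into its three components."""
    match = _SECRET_VERSION_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a secret version resource name: {name!r}")
    project, secret, version = match.groups()
    return {"project": project, "secret": secret, "version": version}


name = secret_version_name("my-project", "MyWeaviateApiKeySecret")
print(name)
print(parse_secret_version(name))
```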

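The following table lists the mutable and immutable WeaviateConfig fields that are used in your code.

Field name               Mutable or immutable
http_endpoint            Immutable once set
collection_name          Immutable once set
api_key_authentication   Mutable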
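
The rules in the table can be sketched as a client-side pre-check that runs before you call UpdateRagCorpus. The dictionary representation and the function name are assumptions for illustration; the service itself enforces the actual rules.

```python
# Fields that can't be changed after they hold a non-empty value.
IMMUTABLE_ONCE_SET = ("http_endpoint", "collection_name")


def rejected_fields(current, requested):
    """Return the requested fields that would change an already-set immutable value."""
    rejected = []
    for field in IMMUTABLE_ONCE_SET:
        old_value = current.get(field)
        new_value = requested.get(field)
        # An empty or missing current value can still be filled in once.
        if old_value and new_value is not None and new_value != old_value:
            rejected.append(field)
    return rejected


empty_config = {"http_endpoint": "", "collection_name": ""}
set_config = {
    "http_endpoint": "https://your.weaviate.endpoint.com",
    "collection_name": "MyCollectionName",
}

# Filling in empty fields is allowed.
print(rejected_fields(empty_config, set_config))
# Changing an endpoint that is already set is not allowed.
print(rejected_fields(set_config, {"http_endpoint": "https://other.example.com"}))
```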

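Create the first RAG corpus

When the RAG Engine service account doesn't exist, do the following:

  1. Create a RAG corpus in RAG Engine with an empty Weaviate configuration. This initiates RAG Engine provisioning, which creates the service account.
  2. Locate your RAG Engine service account, whose name follows this format:

    service-{project number}@gcp-sa-vertex-rag.iam.gserviceaccount.com

    For example, service-123456789@gcp-sa-vertex-rag.iam.gserviceaccount.com.

  3. Use your service account to access the secret that is stored in your project's Secret Manager, which contains your Weaviate API key.
  4. After Weaviate provisioning completes, get the following information:
    • Your Weaviate HTTPS or HTTP endpoint.
    • The name of your Weaviate collection.
  5. Call the CreateRagCorpus API to create a RAG corpus with an empty Weaviate configuration, and call the UpdateRagCorpus API to update the RAG corpus with the following information:
    • Your Weaviate HTTPS or HTTP endpoint.
    • The name of your Weaviate collection.
    • The API key resource name.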

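Create another RAG corpus

When the RAG Engine service account already exists, do the following:

  1. Get your RAG Engine service account from your project's permissions.
  2. Enable the option Include Google-provided role grants.
  3. Locate your RAG Engine service account, whose name follows this format:

    service-{project number}@gcp-sa-vertex-rag.iam.gserviceaccount.com

  4. Use your service account to access the secret that is stored in your project's Secret Manager, which contains your Weaviate API key.
  5. During Weaviate provisioning, get the following information:
    • The Weaviate HTTPS or HTTP endpoint.
    • The name of your Weaviate collection.
  6. Create a RAG corpus in RAG Engine, and connect it with your Weaviate collection by doing one of the following:
    1. Make a CreateRagCorpus API call to create a RAG corpus with a populated Weaviate configuration, which is the preferred option.
    2. Make a CreateRagCorpus API call to create a RAG corpus with an empty Weaviate configuration, and then make an UpdateRagCorpus API call to update the RAG corpus with the following information:
      • Weaviate database HTTP endpoint
      • Weaviate collection name
      • API key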

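Examples

This section presents sample code that demonstrates how to set up your Weaviate database, Secret Manager, the RAG corpus, and the RAG file. Sample code is also provided to demonstrate how to import files, retrieve context, generate content, and delete the RAG corpus and RAG files.

To use the Model Garden RAG API notebook, see Use Weaviate with Llama 3.

Set up your Weaviate database

This code sample demonstrates how to set up your Weaviate data and Secret Manager.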

REST

# TODO(developer): Update the variables.
# The HTTPS/HTTP Weaviate endpoint you created during provisioning.
HTTP_ENDPOINT_NAME="https://your.weaviate.endpoint.com"

# Your Weaviate API Key.
WEAVIATE_API_KEY="example-api-key"

# Select your Weaviate collection name, which roughly corresponds to a Vertex AI Knowledge Engine Corpus.
# For example, "MyCollectionName"
# Note that the first letter needs to be capitalized.
# Otherwise, Weaviate will capitalize it for you.
WEAVIATE_COLLECTION_NAME="MyCollectionName"

# Create a collection in Weaviate which includes the required schema fields shown below.
echo '{
  "class": "'${WEAVIATE_COLLECTION_NAME}'",
  "properties": [
    { "name": "fileId", "dataType": [ "string" ] },
    { "name": "corpusId", "dataType": [ "string" ] },
    { "name": "chunkId", "dataType": [ "string" ] },
    { "name": "chunkDataType", "dataType": [ "string" ] },
    { "name": "chunkData", "dataType": [ "string" ] },
    { "name": "fileOriginalUri", "dataType": [ "string" ] }
  ]
}' | curl \
    -X POST \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer ${WEAVIATE_API_KEY}" \
    -d @- \
    "${HTTP_ENDPOINT_NAME}/v1/schema"

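Set up your Secret Manager

To set up Secret Manager, you must enable Secret Manager and set permissions.

Create a secret

To create your secret in Secret Manager, do the following:

Console

  1. Go to the Secret Manager page.

  2. Click + Create secret.

  3. Enter the name of your secret. Secret names can only contain English letters (A-Z), numbers (0-9), dashes (-), and underscores (_).

  4. Specifying the following fields is optional:

    1. To upload the file with your secret, click Browse.
    2. Read the Replication policy.
    3. If you want to manually manage the locations for your secret, check Manually manage locations for this secret. At least one region must be selected.
    4. Select your encryption option.
    5. If you want to manually set your rotation period, check Set rotation period.
    6. If you want to specify Publish or subscribe topic(s) to receive event notifications, click Add topics.
    7. By default, the secret never expires. If you want to set an expiration date, check Set expiration date.
    8. By default, secret versions are destroyed upon request. To delay the destruction of secret versions, check Set duration for delayed destruction.
    9. If you want to use labels to organize and categorize your secrets, click + Add label.
    10. If you want to use annotations to attach non-identifying metadata to your secrets, click + Add annotation.
  5. Click Create secret.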
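
The naming rule in step 3 can be checked before you create the secret. The following sketch assumes that letters of either case are accepted. The function name is hypothetical, and the check covers only the allowed character set, not any length limit.

```python
import re

# Letters, digits, dashes, and underscores only.
_SECRET_ID_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_secret_id(secret_id):
    """Return True if the secret ID uses only the allowed characters."""
    return _SECRET_ID_RE.fullmatch(secret_id) is not None


print(is_valid_secret_id("MyWeaviateApiKeySecret"))
print(is_valid_secret_id("my secret"))
```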

REST

# Create a secret in SecretManager.
curl "https://secretmanager.googleapis.com/v1/projects/${PROJECT_ID}/secrets?secretId=${SECRET_NAME}" \
    --request "POST" \
    --header "authorization: Bearer $(gcloud auth print-access-token)" \
    --header "content-type: application/json" \
    --data "{\"replication\": {\"automatic\": {}}}"

Python

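Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from typing import Optional

# Import the Secret Manager client library.
from google.cloud import secretmanager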


def create_secret(
    project_id: str, secret_id: str, ttl: Optional[str] = None
) -> secretmanager.Secret:
    """
    Create a new secret with the given name. A secret is a logical wrapper
    around a collection of secret versions. Secret versions hold the actual
    secret material.

     Args:
        project_id (str): The project ID where the secret is to be created.
        secret_id (str): The ID to assign to the new secret. This ID must be unique within the project.
        ttl (Optional[str]): An optional string that specifies the secret's time-to-live in seconds with
                             format (e.g., "900s" for 15 minutes). If specified, the secret
                             versions will be automatically deleted upon reaching the end of the TTL period.

    Returns:
        secretmanager.Secret: An object representing the newly created secret, containing details like the
                              secret's name, replication settings, and optionally its TTL.

    Example:
        # Create a secret with automatic replication and no TTL
        new_secret = create_secret("my-project", "my-new-secret")

        # Create a secret with a TTL of 90 days
        new_secret_with_ttl = create_secret("my-project", "my-timed-secret", "7776000s")
    """

    # Create the Secret Manager client.
    client = secretmanager.SecretManagerServiceClient()

    # Build the resource name of the parent project.
    parent = f"projects/{project_id}"

    # Create the secret.
    response = client.create_secret(
        request={
            "parent": parent,
            "secret_id": secret_id,
            "secret": {"replication": {"automatic": {}}, "ttl": ttl},
        }
    )

    # Print the new secret name.
    print(f"Created secret: {response.name}")

    return response

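Set permissions

You must grant Secret Manager permissions to your service account.

Console

  1. In the IAM & Admin section of your Google Cloud console, find your service account, and click the pencil icon to edit.

  2. In the Role field, select Secret Manager Secret Accessor.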

Python

# Import the Secret Manager client library and the IAM policy type.
from google.cloud import secretmanager
from google.iam.v1 import policy_pb2  # type: ignore


def iam_grant_access(
    project_id: str, secret_id: str, member: str
) -> policy_pb2.Policy:
    """
    Grant the given member access to a secret.
    """

    # Create the Secret Manager client.
    client = secretmanager.SecretManagerServiceClient()

    # Build the resource name of the secret.
    name = client.secret_path(project_id, secret_id)

    # Get the current IAM policy.
    policy = client.get_iam_policy(request={"resource": name})

    # Add the given member with access permissions.
    policy.bindings.add(role="roles/secretmanager.secretAccessor", members=[member])

    # Update the IAM Policy.
    new_policy = client.set_iam_policy(request={"resource": name, "policy": policy})

    # Print data about the secret.
    print(f"Updated IAM policy on {secret_id}")

    return new_policy

Add a secret version

REST

# TODO(developer): Update the variables.
# Select a resource name for your Secret, which contains your API Key.
SECRET_NAME="MyWeaviateApiKeySecret"

# Your Weaviate API Key.
WEAVIATE_API_KEY="example-api-key"
# Encode your WEAVIATE_API_KEY using base64. printf avoids the trailing
# newline that echo would add to the stored secret.
SECRET_DATA=$(printf '%s' "${WEAVIATE_API_KEY}" | base64)

# Create a new version of your secret which uses SECRET_DATA as payload
curl "https://secretmanager.googleapis.com/v1/projects/${PROJECT_ID}/secrets/${SECRET_NAME}:addVersion" \
    --request "POST" \
    --header "authorization: Bearer $(gcloud auth print-access-token)" \
    --header "content-type: application/json" \
    --data "{\"payload\": {\"data\": \"${SECRET_DATA}\"}}"
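
The addVersion request expects the payload data to be base64-encoded. As a sketch of the same encoding step in Python, the following helper (a hypothetical name) encodes the key exactly as given. Note that a trailing newline, if present in the input, becomes part of the stored secret.

```python
import base64


def encode_secret_payload(api_key):
    """Return the base64 text to place in the payload.data field."""
    return base64.b64encode(api_key.encode("utf-8")).decode("ascii")


payload = encode_secret_payload("example-api-key")
print(payload)

# A trailing newline changes the encoded value, and therefore the secret.
print(encode_secret_payload("example-api-key\n") == payload)
```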

Python

from google.cloud import secretmanager
import google_crc32c  # type: ignore


def add_secret_version(
    project_id: str, secret_id: str, payload: str
) -> secretmanager.SecretVersion:
    """
    Add a new secret version to the given secret with the provided payload.
    """

    # Create the Secret Manager client.
    client = secretmanager.SecretManagerServiceClient()

    # Build the resource name of the parent secret.
    parent = client.secret_path(project_id, secret_id)

    # Convert the string payload into a bytes. This step can be omitted if you
    # pass in bytes instead of a str for the payload argument.
    payload_bytes = payload.encode("UTF-8")

    # Calculate payload checksum. Passing a checksum in add-version request
    # is optional.
    crc32c = google_crc32c.Checksum()
    crc32c.update(payload_bytes)

    # Add the secret version.
    response = client.add_secret_version(
        request={
            "parent": parent,
            "payload": {
                "data": payload_bytes,
                "data_crc32c": int(crc32c.hexdigest(), 16),
            },
        }
    )

    # Print the new secret version name.
    print(f"Added secret version: {response.name}")

    return response

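Use Weaviate with Llama 3

The Model Garden RAG API notebook demonstrates how to use the Vertex AI SDK for Python with a Weaviate corpus and the Llama 3 model. To use the notebook, do the following:

  1. Set up your Weaviate database.

  2. Set up your Secret Manager.

  3. Use the Model Garden RAG API notebook.

For more examples, see Examples.

Create a RAG corpus

This code sample demonstrates how to create a RAG corpus and set the Weaviate instance as its vector database.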

REST

  # TODO(developer): Update the variables.
  PROJECT_ID="YOUR_PROJECT_ID"
  # The HTTPS/HTTP Weaviate endpoint you created during provisioning.
  HTTP_ENDPOINT_NAME="https://your.weaviate.endpoint.com"

  # Your Weaviate collection name, which roughly corresponds to a Vertex AI Knowledge Engine Corpus.
  # For example, "MyCollectionName"
  # Note that the first letter needs to be capitalized.
  # Otherwise, Weaviate will capitalize it for you.
  WEAVIATE_COLLECTION_NAME="MyCollectionName"

  # The name of the secret that holds your Weaviate API key.
  SECRET_NAME="MyWeaviateApiKeySecret"
  # The Secret Manager resource name containing the API Key for your Weaviate endpoint.
  # For example, projects/{project}/secrets/{secret}/versions/latest
  APIKEY_SECRET_VERSION="projects/${PROJECT_ID}/secrets/${SECRET_NAME}/versions/latest"

  # Select a Corpus display name.
  CORPUS_DISPLAY_NAME="SpecialCorpus"

  # Call CreateRagCorpus API and set all Vector DB Config parameters for Weaviate to create a new corpus associated to your selected Weaviate collection.
  curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora \
  -d '{
        "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
        "rag_vector_db_config" : {
                "weaviate": {
                      "http_endpoint": '\""${HTTP_ENDPOINT_NAME}"\"',
                      "collection_name": '\""${WEAVIATE_COLLECTION_NAME}"\"'
                },
          "api_auth" : {
                  "api_key_config": {
                        "api_key_secret_version": '\""${APIKEY_SECRET_VERSION}"\"'
                  }
          }
        }
    }'

  # TODO(developer): Update the variables.
  # Get operation_id returned in CreateRagCorpus.
  OPERATION_ID="your-operation-id"

  # Poll Operation status until done = true in the response.
  curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/operations/${OPERATION_ID}

  # Call ListRagCorpora API to verify the RAG corpus is created successfully.
  curl -sS -X GET \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora"

Python


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# weaviate_http_endpoint = "weaviate-http-endpoint"
# weaviate_collection_name = "weaviate-collection-name"
# weaviate_api_key_secret_manager_version = "projects/{PROJECT_ID}/secrets/{SECRET_NAME}/versions/latest"
# display_name = "test_corpus"
# description = "Corpus Description"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

# Configure embedding model (Optional)
embedding_model_config = rag.EmbeddingModelConfig(
    publisher_model="publishers/google/models/text-embedding-004"
)

# Configure Vector DB
vector_db = rag.Weaviate(
    weaviate_http_endpoint=weaviate_http_endpoint,
    collection_name=weaviate_collection_name,
    api_key=weaviate_api_key_secret_manager_version,
)

corpus = rag.create_corpus(
    display_name=display_name,
    description=description,
    embedding_model_config=embedding_model_config,
    vector_db=vector_db,
)
print(corpus)
# Example response:
# RagCorpus(name='projects/1234567890/locations/us-central1/ragCorpora/1234567890',
# display_name='test_corpus', description='Corpus Description', embedding_model_config=...
# ...

Use the RAG file

The RAG API handles file upload, import, listing, and deletion.

REST

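Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • INPUT_FILE: The path of a local file.
  • FILE_DISPLAY_NAME: The display name of the RagFile.
  • RAG_FILE_DESCRIPTION: The description of the RagFile.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload

Request JSON body: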

{
 "rag_file": {
  "display_name": "FILE_DISPLAY_NAME",
  "description": "RAG_FILE_DESCRIPTION"
 }
}

To send your request, choose one of these options:

curl

Save the request body in a file named INPUT_FILE, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @INPUT_FILE \
"https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload"

PowerShell

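Save the request body in a file named INPUT_FILE, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }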

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile INPUT_FILE `
-Uri "https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload" | Select-Object -Expand Content

A successful response returns the RagFile resource. The last component of the RagFile.name field is the server-generated rag_file_id.
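
Because the rag_file_id is the last component of RagFile.name, it can be extracted with a short helper. This is a minimal sketch with a hypothetical function name. It assumes the name ends in ragFiles/{rag_file_id}, as in the example response shown in the Python sample.

```python
def rag_file_id_from_name(name):
    """Return the server-generated rag_file_id from a RagFile resource name."""
    parts = name.split("/")
    if len(parts) < 2 or parts[-2] != "ragFiles" or not parts[-1]:
        raise ValueError(f"not a RagFile resource name: {name!r}")
    return parts[-1]


rag_file_name = (
    "projects/my-project/locations/us-central1/ragCorpora/1234567890/ragFiles/09876543"
)
print(rag_file_id_from_name(rag_file_name))
```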

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"
# path = "path/to/local/file.txt"
# display_name = "file_display_name"
# description = "file description"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_file = rag.upload_file(
    corpus_name=corpus_name,
    path=path,
    display_name=display_name,
    description=description,
)
print(rag_file)
# RagFile(name='projects/[PROJECT_ID]/locations/us-central1/ragCorpora/1234567890/ragFiles/09876543',
#  display_name='file_display_name', description='file description')

Import RAG files

You can import files and folders from Drive or Cloud Storage.

REST

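You can use response.metadata to view partial failures, request time, and response time in the SDK's response object.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • GCS_URIS: A list of Cloud Storage locations. Example: gs://my-bucket1, gs://my-bucket2.
  • DRIVE_RESOURCE_ID: The ID of the Drive resource. Examples:
    • https://drive.google.com/file/d/ABCDE
    • https://drive.google.com/corp/drive/u/0/folders/ABCDEFG
  • DRIVE_RESOURCE_TYPE: The type of the Drive resource. Options:
    • RESOURCE_TYPE_FILE - File
    • RESOURCE_TYPE_FOLDER - Folder
  • CHUNK_SIZE: (Optional) The number of tokens each chunk should have.
  • CHUNK_OVERLAP: (Optional) The number of tokens that overlap between chunks.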
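
To illustrate how CHUNK_SIZE and CHUNK_OVERLAP interact, the following sketch splits a token list into overlapping chunks. It is not RAG Engine's chunker: splitting on whitespace stands in for a real tokenizer, and the function name is hypothetical.

```python
def chunk_tokens(tokens, chunk_size, chunk_overlap):
    """Split tokens into chunks of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a chunk reaches the end of the token list.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Whitespace-separated words stand in for tokens in this sketch.
words = "a b c d e f g h i j".split()
chunks = chunk_tokens(words, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

With a chunk size of 4 and an overlap of 1, each chunk starts with the last token of the previous chunk.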

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import

Request JSON body:

{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": GCS_URIS
    },
    "google_drive_source": {
      "resource_ids": {
        "resource_id": DRIVE_RESOURCE_ID,
        "resource_type": DRIVE_RESOURCE_TYPE
      }
    }
  }
}
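The request body above can also be assembled programmatically, which avoids hand-editing JSON. The sketch below uses only the Python standard library. The helper name `build_import_request` and the sample URIs are assumptions for illustration, and the optional `rag_file_chunking_config` block follows the shape used by the curl samples on this page.

```python
import json
from typing import Dict, List, Optional


def build_import_request(
    gcs_uris: Optional[List[str]] = None,
    drive_resource_id: Optional[str] = None,
    drive_resource_type: str = "RESOURCE_TYPE_FOLDER",
    chunk_size: Optional[int] = None,
) -> Dict:
    """Build an ImportRagFiles request body as a plain dictionary."""
    config: Dict = {}
    if gcs_uris:
        config["gcs_source"] = {"uris": list(gcs_uris)}
    if drive_resource_id:
        # Mirrors the shape shown in the request body on this page.
        config["google_drive_source"] = {
            "resource_ids": {
                "resource_id": drive_resource_id,
                "resource_type": drive_resource_type,
            }
        }
    if not config:
        raise ValueError("Provide gcs_uris or drive_resource_id")
    if chunk_size is not None:
        config["rag_file_chunking_config"] = {"chunk_size": chunk_size}
    return {"import_rag_files_config": config}


# Hypothetical bucket names.
body = build_import_request(
    gcs_uris=["gs://my-bucket1", "gs://my-bucket2"], chunk_size=512
)
print(json.dumps(body, indent=2))
```

Writing the output of `json.dumps` to request.json gives a body that is valid JSON by construction.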

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import" | Select-Object -Expand Content

A successful response returns the ImportRagFilesOperationMetadata resource.

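The following sample demonstrates how to import a file from Cloud Storage. Use the max_embedding_requests_per_min control field to limit the rate at which RAG Engine calls the embedding model during the ImportRagFiles indexing process. The field has a default value of 1000 calls per minute.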

# Cloud Storage bucket or file location.
# For example: "gs://rag-e2e-test/"
GCS_URIS=YOUR_GCS_LOCATION

# Enter the QPM rate to limit RAG's access to your embedding model.
# Example: 1000
EMBEDDING_MODEL_QPM_RATE=MAX_EMBEDDING_REQUESTS_PER_MIN_LIMIT

# ImportRagFiles
# Import a single Cloud Storage file or all files in a Cloud Storage bucket.
# Input: ENDPOINT, PROJECT_ID, LOCATION, RAG_CORPUS_ID, GCS_URIS
# Output: ImportRagFilesOperationMetadataNumber
# Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": '\""${GCS_URIS}"\"'
    },
    "rag_file_chunking_config": {
      "chunk_size": 512
    },
    "max_embedding_requests_per_min": '"${EMBEDDING_MODEL_QPM_RATE}"'
  }
}'

# Poll the operation status.
# The response contains the number of files imported.
OPERATION_ID=OPERATION_ID
poll_op_wait ${OPERATION_ID}

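The following sample demonstrates how to import a file from Google Drive. Use the max_embedding_requests_per_min control field to limit the rate at which RAG Engine calls the embedding model during the ImportRagFiles indexing process. The field has a default value of 1000 calls per minute.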

# Google Drive folder location.
FOLDER_RESOURCE_ID=YOUR_GOOGLE_DRIVE_FOLDER_RESOURCE_ID

# Enter the QPM rate to limit RAG's access to your embedding model.
# Example: 1000
EMBEDDING_MODEL_QPM_RATE=MAX_EMBEDDING_REQUESTS_PER_MIN_LIMIT

# ImportRagFiles
# Import all files in a Google Drive folder.
# Input: ENDPOINT, PROJECT_ID, LOCATION, RAG_CORPUS_ID, FOLDER_RESOURCE_ID
# Output: ImportRagFilesOperationMetadataNumber
# Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "google_drive_source": {
      "resource_ids": {
        "resource_id": '\""${FOLDER_RESOURCE_ID}"\"',
        "resource_type": "RESOURCE_TYPE_FOLDER"
      }
    },
    "max_embedding_requests_per_min": '"${EMBEDDING_MODEL_QPM_RATE}"'
  }
}'

# Poll the operation status.
# The response contains the number of files imported.
OPERATION_ID=OPERATION_ID
poll_op_wait ${OPERATION_ID}
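
The samples above call poll_op_wait, a helper that this page does not define. As a rough idea of what such a helper does, the following sketch polls a long-running operation until its done field is true. The function name `wait_for_operation`, its parameters, and the stubbed `fake_get_operation` are assumptions for illustration; in practice the callable would send a GET request for the operation resource.

```python
import time
from typing import Callable, Dict


def wait_for_operation(
    get_operation: Callable[[], Dict],
    poll_interval_s: float = 5.0,
    timeout_s: float = 600.0,
) -> Dict:
    """Call get_operation repeatedly until the operation reports done."""
    deadline = time.monotonic() + timeout_s
    while True:
        operation = get_operation()
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("Operation did not finish before the timeout")
        time.sleep(poll_interval_s)


# Stub that reports completion on the third poll (made-up payload).
_responses = iter(
    [
        {"done": False},
        {"done": False},
        {"done": True, "response": {"note": "stub payload"}},
    ]
)


def fake_get_operation() -> Dict:
    return next(_responses)


result = wait_for_operation(fake_get_operation, poll_interval_s=0.0)
print(result["response"])
```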

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"
# paths = ["https://drive.google.com/file/123", "gs://my_bucket/my_files_dir"]  # Supports Google Cloud Storage and Google Drive Links

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

response = rag.import_files(
    corpus_name=corpus_name,
    paths=paths,
    transformation_config=rag.TransformationConfig(
        rag.ChunkingConfig(chunk_size=512, chunk_overlap=100)
    ),
    import_result_sink="gs://sample-existing-folder/sample_import_result_unique.ndjson",  # Optional, this has to be an existing storage bucket folder, and file name has to be unique (non-existent).
    max_embedding_requests_per_min=900,  # Optional
)
print(f"Imported {response.imported_rag_files_count} files.")
# Example response:
# Imported 2 files.

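Get a RAG file

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • RAG_FILE_ID: The ID of the RagFile resource.

HTTP method and URL: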

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID

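To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \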
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content

A successful response returns the RagFile resource.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# file_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_file = rag.get_file(name=file_name)
print(rag_file)
# Example response:
# RagFile(name='projects/1234567890/locations/us-central1/ragCorpora/11111111111/ragFiles/22222222222',
# display_name='file_display_name', description='file description')

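List RAG files

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • PAGE_SIZE: The standard list page size. To adjust the number of RagFiles returned per page, update the page_size parameter.
  • PAGE_TOKEN: The standard list page token. Typically obtained from ListRagFilesResponse.next_page_token of the previous VertexRagDataService.ListRagFiles call.

HTTP method and URL: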

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN

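To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \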
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content

You should receive a successful status code (2xx) along with a list of RagFiles under the given RAG_CORPUS_ID.
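
Listing a large corpus means following next_page_token until it comes back empty. The sketch below shows that loop against a stubbed fetch function. `list_all_rag_files`, `fake_fetch_page`, and the canned pages are hypothetical, and the response keys `ragFiles` and `nextPageToken` assume the usual camelCase JSON rendering of ListRagFilesResponse.

```python
from typing import Callable, Dict, Iterator, Optional


def list_all_rag_files(
    fetch_page: Callable[[Optional[str]], Dict],
) -> Iterator[Dict]:
    """Yield every RagFile by following page tokens until none is returned."""
    page_token: Optional[str] = None
    while True:
        page = fetch_page(page_token)
        yield from page.get("ragFiles", [])
        page_token = page.get("nextPageToken")
        if not page_token:
            break  # no further pages


# Canned two-page response standing in for the REST call (made-up data).
_PAGES = {
    None: {
        "ragFiles": [{"displayName": "a.txt"}, {"displayName": "b.txt"}],
        "nextPageToken": "token-1",
    },
    "token-1": {"ragFiles": [{"displayName": "c.txt"}]},
}


def fake_fetch_page(page_token: Optional[str]) -> Dict:
    return _PAGES[page_token]


names = [f["displayName"] for f in list_all_rag_files(fake_fetch_page)]
print(names)
```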

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

files = rag.list_files(corpus_name=corpus_name)
for file in files:
    print(file.display_name)
    print(file.name)
# Example response:
# g-drive_file.txt
# projects/1234567890/locations/us-central1/ragCorpora/111111111111/ragFiles/222222222222
# g_cloud_file.txt
# projects/1234567890/locations/us-central1/ragCorpora/111111111111/ragFiles/333333333333

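Delete a RAG file

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • RAG_FILE_ID: The ID of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}

HTTP method and URL: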

DELETE https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID

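To send your request, choose one of these options:

curl

Execute the following command:

curl -X DELETE \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \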
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method DELETE `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content

A successful response returns the DeleteOperationMetadata resource.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# file_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag.delete_file(name=file_name)
print(f"File {file_name} deleted.")
# Example response:
# Successfully deleted the RagFile.
# File projects/1234567890/locations/us-central1/ragCorpora/1111111111/ragFiles/2222222222 deleted.

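Retrieve context

When a user asks a question or provides a prompt, the retrieval component in RAG searches its knowledge base to find information that is relevant to the query.

REST

Before using any of the request data, make the following replacements:

  • LOCATION: The region to process the request.
  • PROJECT_ID: Your project ID.
  • RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}
  • VECTOR_DISTANCE_THRESHOLD: Only contexts with a vector distance smaller than the threshold are returned.
  • TEXT: The query text used to get relevant contexts.
  • SIMILARITY_TOP_K: The number of top contexts to retrieve.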
HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts

Request JSON body:

{
  "vertex_rag_store": {
    "rag_resources": {
      "rag_corpus": "RAG_CORPUS_RESOURCE"
    },
    "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
  },
  "query": {
    "text": "TEXT",
    "similarity_top_k": SIMILARITY_TOP_K
  }
}

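To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \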
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts" | Select-Object -Expand Content

You should receive a successful status code (2xx) and a list of related RagFiles.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/[PROJECT_ID]/locations/us-central1/ragCorpora/[rag_corpus_id]"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=corpus_name,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="Hello World!",
    rag_retrieval_config=rag.RagRetrievalConfig(
        top_k=10,
        filter=rag.utils.resources.Filter(vector_distance_threshold=0.5),
    ),
)
print(response)
# Example response:
# contexts {
#   contexts {
#     source_uri: "gs://your-bucket-name/file.txt"
#     text: "....
#   ....

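Generate content

A prediction controls the LLM method that generates content.

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • MODEL_ID: The LLM model used for content generation. Example: gemini-2.5-flash
  • GENERATION_METHOD: The LLM method used for content generation. Options: generateContent, streamGenerateContent
  • INPUT_PROMPT: The text sent to the LLM for content generation. Try to use a prompt that is relevant to the uploaded RAG files.
  • RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}
  • SIMILARITY_TOP_K: (Optional) The number of top contexts to retrieve.
  • VECTOR_DISTANCE_THRESHOLD: (Optional) Only contexts with a vector distance smaller than the threshold are returned.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD

Request JSON body: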

{
 "contents": {
  "role": "user",
  "parts": {
    "text": "INPUT_PROMPT"
  }
 },
 "tools": {
  "retrieval": {
   "disable_attribution": false,
   "vertex_rag_store": {
    "rag_resources": {
      "rag_corpus": "RAG_CORPUS_RESOURCE"
    },
    "similarity_top_k": SIMILARITY_TOP_K,
    "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
   }
  }
 }
}

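To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \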
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD" | Select-Object -Expand Content

A successful response returns the generated content with citations.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
from vertexai.generative_models import GenerativeModel, Tool
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=corpus_name,
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            rag_retrieval_config=rag.RagRetrievalConfig(
                top_k=10,
                filter=rag.utils.resources.Filter(vector_distance_threshold=0.5),
            ),
        ),
    )
)

rag_model = GenerativeModel(
    model_name="gemini-2.0-flash-001", tools=[rag_retrieval_tool]
)
response = rag_model.generate_content("Why is the sky blue?")
print(response.text)
# Example response:
#   The sky appears blue due to a phenomenon called Rayleigh scattering.
#   Sunlight, which contains all colors of the rainbow, is scattered
#   by the tiny particles in the Earth's atmosphere....
#   ...

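The Weaviate database supports hybrid search, which combines semantic and keyword searches to improve the relevance of search results. During retrieval, the similarity scores from semantic matching (dense vectors) and keyword matching (sparse vectors) are combined to produce the final ranked results.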
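
One common way to combine the two signals is a weighted sum of normalized scores. The sketch below illustrates that idea with a weight named alpha (1.0 means dense only, 0.0 means sparse only). It is not Weaviate's or RAG Engine's actual fusion algorithm; `hybrid_rank`, `min_max_normalize`, and the sample scores are hypothetical.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so dense and sparse values are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (value - lo) / (hi - lo) for doc, value in scores.items()}


def hybrid_rank(
    dense: Dict[str, float], sparse: Dict[str, float], alpha: float
) -> List[Tuple[str, float]]:
    """Rank documents by alpha * dense + (1 - alpha) * sparse."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    d, s = min_max_normalize(dense), min_max_normalize(sparse)
    combined = {
        doc: alpha * d.get(doc, 0.0) + (1.0 - alpha) * s.get(doc, 0.0)
        for doc in set(d) | set(s)
    }
    # Highest combined score first; ties are broken by document ID.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Made-up similarity scores, where higher means more similar.
dense_scores = {"doc1": 0.9, "doc2": 0.6, "doc3": 0.3}
sparse_scores = {"doc2": 12.0, "doc3": 4.0, "doc4": 8.0}
ranking = hybrid_rank(dense_scores, sparse_scores, alpha=0.5)
print([doc for doc, _ in ranking])
```

In this toy data, a document that scores reasonably on both signals outranks one that tops only the dense list, which is the behavior hybrid search is meant to provide.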

Use hybrid search in the RAG Engine retrieval API

This sample demonstrates how to enable hybrid search in the RAG Engine retrieval API.

REST

  # TODO(developer): Update the variables.
  PROJECT_ID="YOUR_PROJECT_ID"
  # The HTTPS/HTTP Weaviate endpoint you created during provisioning.
  HTTP_ENDPOINT_NAME="https://your.weaviate.endpoint.com"

  # Your Weaviate collection name, which roughly corresponds to a Vertex AI Knowledge Engine Corpus.
  # For example, "MyCollectionName"
  # Note that the first letter needs to be capitalized.
  # Otherwise, Weaviate will capitalize it for you.
  WEAVIATE_COLLECTION_NAME="MyCollectionName"

  # The name of the secret that holds your Weaviate API key.
  SECRET_NAME="MyWeaviateApiKeySecret"
  # The Secret Manager resource name containing the API Key for your Weaviate endpoint.
  # For example, projects/{project}/secrets/{secret}/versions/latest
  APIKEY_SECRET_VERSION="projects/${PROJECT_ID}/secrets/${SECRET_NAME}/versions/latest"

  # Select a Corpus display name.
  CORPUS_DISPLAY_NAME="SpecialCorpus"

  # Call CreateRagCorpus API and set all Vector DB Config parameters for Weaviate to create a new corpus associated to your selected Weaviate collection.
  curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora \
  -d '{
        "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
        "rag_vector_db_config" : {
                "weaviate": {
                      "http_endpoint": '\""${HTTP_ENDPOINT_NAME}"\"',
                      "collection_name": '\""${WEAVIATE_COLLECTION_NAME}"\"'
                },
          "api_auth" : {
                  "api_key_config": {
                        "api_key_secret_version": '\""${APIKEY_SECRET_VERSION}"\"'
                  }
          }
        }
    }'

  # TODO(developer): Update the variables.
  # Get operation_id returned in CreateRagCorpus.
  OPERATION_ID="your-operation-id"

  # Poll Operation status until done = true in the response.
  curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/operations/${OPERATION_ID}

  # Call ListRagCorpora API to verify the RAG corpus is created successfully.
  curl -sS -X GET \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora"

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/[PROJECT_ID]/locations/us-central1/ragCorpora/[rag_corpus_id]"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=corpus_name,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="Hello World!",
    rag_retrieval_config=rag.RagRetrievalConfig(
        top_k=10,
        filter=rag.utils.resources.Filter(vector_distance_threshold=0.5),
    ),
)
print(response)
# Example response:
# contexts {
#   contexts {
#     source_uri: "gs://your-bucket-name/file.txt"
#     text: "....
#   ....

Use hybrid search and RAG Engine for grounded generation

This sample demonstrates how to use hybrid search and RAG Engine for grounded generation.

REST

  # TODO(developer): Update the variables.
  PROJECT_ID="YOUR_PROJECT_ID"
  # The HTTPS/HTTP Weaviate endpoint you created during provisioning.
  HTTP_ENDPOINT_NAME="https://your.weaviate.endpoint.com"

  # Your Weaviate collection name, which roughly corresponds to a Vertex AI Knowledge Engine Corpus.
  # For example, "MyCollectionName"
  # Note that the first letter needs to be capitalized.
  # Otherwise, Weaviate will capitalize it for you.
  WEAVIATE_COLLECTION_NAME="MyCollectionName"

  # The name of the secret that holds your Weaviate API key.
  SECRET_NAME="MyWeaviateApiKeySecret"
  # The Secret Manager resource name containing the API Key for your Weaviate endpoint.
  # For example, projects/{project}/secrets/{secret}/versions/latest
  APIKEY_SECRET_VERSION="projects/${PROJECT_ID}/secrets/${SECRET_NAME}/versions/latest"

  # Select a Corpus display name.
  CORPUS_DISPLAY_NAME="SpecialCorpus"

  # Call CreateRagCorpus API and set all Vector DB Config parameters for Weaviate to create a new corpus associated to your selected Weaviate collection.
  curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora \
  -d '{
        "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
        "rag_vector_db_config" : {
                "weaviate": {
                      "http_endpoint": '\""${HTTP_ENDPOINT_NAME}"\"',
                      "collection_name": '\""${WEAVIATE_COLLECTION_NAME}"\"'
                },
          "api_auth" : {
                  "api_key_config": {
                        "api_key_secret_version": '\""${APIKEY_SECRET_VERSION}"\"'
                  }
          }
        }
    }'

  # TODO(developer): Update the variables.
  # Get operation_id returned in CreateRagCorpus.
  OPERATION_ID="your-operation-id"

  # Poll Operation status until done = true in the response.
  curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/operations/${OPERATION_ID}

  # Call ListRagCorpora API to verify the RAG corpus is created successfully.
  curl -sS -X GET \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora"

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
from vertexai.generative_models import GenerativeModel, Tool
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=corpus_name,
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            rag_retrieval_config=rag.RagRetrievalConfig(
                top_k=10,
                filter=rag.utils.resources.Filter(vector_distance_threshold=0.5),
            ),
        ),
    )
)

rag_model = GenerativeModel(
    model_name="gemini-2.0-flash-001", tools=[rag_retrieval_tool]
)
response = rag_model.generate_content("Why is the sky blue?")
print(response.text)
# Example response:
#   The sky appears blue due to a phenomenon called Rayleigh scattering.
#   Sunlight, which contains all colors of the rainbow, is scattered
#   by the tiny particles in the Earth's atmosphere....
#   ...

What's next