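Improve search and RAG quality with the ranking API

As part of your retrieval-augmented generation (RAG) experience in AI Applications, you can rank a set of documents based on a query.

The ranking API takes a list of documents and reranks them according to how relevant each document is to a query. Compared with embeddings, which consider only the semantic similarity between a document and a query, the ranking API returns a precise score for how well a document answers a specific query. After you retrieve an initial set of candidate documents, you can use the ranking API to improve the quality of the search results.

The ranking API is stateless, so you don't need to index documents before calling it. You only pass in the query and the documents. This makes the API well suited to reranking documents from Vector Search and other search solutions.

This page describes how to use the ranking API to rank a set of documents based on a query.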

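Use cases

The primary use case for the ranking API is to improve the quality of search results.

However, the ranking API is useful in any scenario where you need to find the content that is most relevant to a user's query. For example, the ranking API can help you with the following:

Finding the right content to give to an LLM for grounding
Improving the relevance of an existing search experience
Identifying the relevant sections of a document

The following flow outlines how you might use the ranking API to improve the quality of results for chunked documents:

Use the Document AI Layout Parser API to split a set of documents into chunks.
Use an embeddings API to create an embedding for each chunk.
Load the embeddings into Vector Search or another search solution.
Query the search index and retrieve the most relevant chunks.
Use the ranking API to rerank the relevant chunks.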
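
The last two steps of this flow (retrieve candidates, then rerank them) can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: the two-dimensional embeddings are toy values, and `toy_score` is a hypothetical stand-in for a call to the ranking API, not the API itself.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: Sequence[float], index: Dict[str, Sequence[float]], k: int) -> List[str]:
    """First stage: IDs of the k chunks whose embeddings are closest to the query."""
    ranked = sorted(index, key=lambda cid: cosine(query_vec, index[cid]), reverse=True)
    return ranked[:k]


def rerank(
    query: str,
    candidate_ids: List[str],
    chunks: Dict[str, str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Second stage: rescore each candidate and sort by descending score."""
    scored = [(cid, score_fn(query, chunks[cid])) for cid in candidate_ids]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_score(query: str, text: str) -> float:
    """Hypothetical stand-in scorer: fraction of query words found in the text."""
    words = set(query.lower().split())
    return sum(word in text.lower() for word in words) / len(words)


# Toy chunks and toy embeddings (assumed values, not produced by a real model).
chunks = {
    "a": "Rayleigh scattering makes the sky blue.",
    "b": "A poem about the blue sky.",
    "c": "Recipe for blueberry pie.",
}
index = {"a": [0.8, 0.6], "b": [0.9, 0.4], "c": [0.1, 1.0]}

candidates = retrieve([1.0, 0.3], index, k=2)
reranked = rerank("what scattering makes the sky blue", candidates, chunks, toy_score)
print(candidates)
print([cid for cid, _ in reranked])
```

With these toy values the first stage places chunk "b" ahead of "a", and the second stage reverses that order. In a real pipeline the second stage would send the candidate chunks to the ranking API as records instead of calling a local scorer.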

Input data

The ranking API requires the following inputs:

The query that you are ranking the records against.

For example:
```
"query": "Why is the sky blue?"
```

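A set of records that are relevant to the query. The records are provided as an array of objects. Each record can include a unique ID, a title, and the content of the document. Each record must include a title, content, or both. The maximum number of tokens supported per record depends on the model version: models up to version 003 support 512 tokens, and version 004 supports 1024 tokens. If the combined length of the title and content exceeds the model's token limit, the extra content is truncated. Each request can include up to 200 records.

For example, a records array looks something like the following. In practice, the array would contain many more records and the content would be much longer: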

```
"records": [
   {
       "id": "1",
       "title": "The Color of the Sky: A Poem",
       "content": "A canvas stretched across the day,\nWhere sunlight learns to dance and play.\nBlue, a hue of scattered light,\nA gentle whisper, soft and bright."
   },
   {
       "id": "2",
       "title": "The Science of a Blue Sky",
       "content": "The sky appears blue due to a phenomenon called Rayleigh scattering. Sunlight is comprised of all the colors of the rainbow. Blue light has shorter wavelengths than other colors, and is thus scattered more easily."
   }
]
```

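Optional: The maximum number of records that you want the ranking API to return. By default, all records are returned, but you can use the topN field to return fewer. All records are ranked regardless of the value that you set.

For example, the following returns the top 10 ranked records: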
```
"topN": 10,
```
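
Optional: A setting that specifies whether the API returns only the record IDs or also the record titles and content. By default, the full record is returned. The main reason to set this field is to reduce the size of the response payload.

For example, when set to true, only the record ID is returned, without the title or content: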
```
"ignoreRecordDetailsInResponse": true,
```
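
Optional: The model name. This specifies the model used to rank the documents. If no model is specified, semantic-ranker-default@latest is used, which automatically points to the latest available model. To point to a specific model, specify one of the model names listed in Supported models, for example semantic-ranker-512-003.

In the following example, model is set to semantic-ranker-default@latest, which means that the ranking API always uses the latest available model.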
```
"model": "semantic-ranker-default@latest"
```
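
The inputs described above can be assembled into a single request body. The following is a minimal sketch, not an official client: `build_rank_body` and its parameter names are hypothetical, while the JSON field names (`query`, `records`, `topN`, `ignoreRecordDetailsInResponse`, `model`) and the limits it checks are the ones given on this page.

```python
import json
from typing import Any, Dict, List, Optional

MAX_RECORDS_PER_REQUEST = 200  # per-request limit stated on this page


def build_rank_body(
    query: str,
    records: List[Dict[str, str]],
    top_n: Optional[int] = None,
    ids_only: bool = False,
    model: str = "semantic-ranker-default@latest",
) -> Dict[str, Any]:
    """Assemble the JSON body for a rank request (hypothetical helper)."""
    if len(records) > MAX_RECORDS_PER_REQUEST:
        raise ValueError(f"at most {MAX_RECORDS_PER_REQUEST} records per request")
    for record in records:
        # Each record must carry a title, content, or both.
        if not (record.get("title") or record.get("content")):
            raise ValueError(f"record {record.get('id')!r} has no title or content")

    body: Dict[str, Any] = {"model": model, "query": query, "records": records}
    if top_n is not None:
        body["topN"] = top_n  # limits what is returned, not what is ranked
    if ids_only:
        body["ignoreRecordDetailsInResponse"] = True
    return body


body = build_rank_body(
    "Why is the sky blue?",
    [
        {"id": "1", "title": "The Color of the Sky: A Poem"},
        {"id": "2", "content": "The sky appears blue due to Rayleigh scattering."},
    ],
    top_n=10,
    ids_only=True,
)
print(json.dumps(body, indent=2))
```

Validating on the client side is a design choice made for this sketch; it surfaces a malformed record before any request is sent.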

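Output data

The ranking API returns a ranked list of records with the following outputs:

Score: A float value between 0 and 1 that indicates the relevance of the record.
ID: The unique ID of the record.
If requested, the full object: the ID, title, and content.

For example: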

```
{
    "records": [
        {
            "id": "2",
            "score": 0.98,
            "title": "The Science of a Blue Sky",
            "content": "The sky appears blue due to a phenomenon called Rayleigh scattering. Sunlight is comprised of all the colors of the rainbow. Blue light has shorter wavelengths than other colors, and is thus scattered more easily."
        },
        {
            "id": "1",
            "score": 0.64,
            "title": "The Color of the Sky: A Poem",
            "content": "A canvas stretched across the day,\nWhere sunlight learns to dance and play.\nBlue, a hue of scattered light,\nA gentle whisper, soft and bright."
        }
    ]
}
```
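
Because each returned record carries a score, one common follow-up is to keep only the records above a cutoff before passing them to an LLM for grounding. The sketch below assumes the response shape shown above (with the content fields left out for brevity); the `select_relevant` helper and the 0.7 threshold are illustrative assumptions, not values recommended by the API.

```python
import json
from typing import Any, Dict, List

# Abbreviated copy of the example response (content fields omitted).
RESPONSE_JSON = """
{
    "records": [
        {"id": "2", "score": 0.98, "title": "The Science of a Blue Sky"},
        {"id": "1", "score": 0.64, "title": "The Color of the Sky: A Poem"}
    ]
}
"""


def select_relevant(response: Dict[str, Any], min_score: float) -> List[Dict[str, Any]]:
    """Keep records whose score meets the threshold, preserving the ranked order."""
    return [
        record
        for record in response.get("records", [])
        # Treat a missing score as 0.0 (a defensive assumption of this sketch).
        if record.get("score", 0.0) >= min_score
    ]


response = json.loads(RESPONSE_JSON)
grounding = select_relevant(response, min_score=0.7)
print([record["id"] for record in grounding])  # prints ['2']
```

A suitable threshold depends on your data and query mix, so treat the cutoff as something to tune rather than a fixed value.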

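Rank (or rerank) a set of records according to a query

Typically, you provide the ranking API with a query and a set of records that are relevant to that query and have already been ranked by another method, such as keyword search or vector search. You then use the ranking API to improve the quality of the ranking and to obtain a score that indicates how relevant each record is to the query.

Get the query and the resulting records. Make sure that each record has an ID and a title, content, or both.

The maximum number of tokens supported per record depends on the model version. Models up to version 003, such as semantic-ranker-512-003, support 512 tokens per record. Starting with version 004, this limit increases to 1024 tokens. If the combined length of the title and content exceeds the model's token limit, the extra content is truncated.

Call the rankingConfigs.rank method using the following code: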

REST

```
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-H "X-Goog-User-Project: PROJECT_ID" \
"https://discoveryengine.googleapis.com/v1/projects/PROJECT_ID/locations/global/rankingConfigs/default_ranking_config:rank" \
-d '{
"model": "semantic-ranker-default@latest",
"query": "QUERY",
"records": [
    {
        "id": "RECORD_ID_1",
        "title": "TITLE_1",
        "content": "CONTENT_1"
    },
    {
        "id": "RECORD_ID_2",
        "title": "TITLE_2",
        "content": "CONTENT_2"
    },
    {
        "id": "RECORD_ID_3",
        "title": "TITLE_3",
        "content": "CONTENT_3"
    }
]
}'
```

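Replace the following:

PROJECT_ID: the ID of your Google Cloud project.
QUERY: the query against which the records are ranked and scored.
RECORD_ID_n: a unique string that identifies the record.
TITLE_n: the title of the record.
CONTENT_n: the content of the record.

For general information about this method, see rankingConfigs.rank.

Example curl command and response: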

```
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-H "X-Goog-User-Project: my-project-123" \
"https://discoveryengine.googleapis.com/v1/projects/my-project-123/locations/global/rankingConfigs/default_ranking_config:rank" \
-d '{
    "model": "semantic-ranker-default@latest",
    "query": "what is Google gemini?",
    "records": [
        {
            "id": "1",
            "title": "Gemini",
            "content": "The Gemini zodiac symbol often depicts two figures standing side-by-side."
        },
        {
            "id": "2",
            "title": "Gemini",
            "content": "Gemini is a cutting edge large language model created by Google."
        },
        {
            "id": "3",
            "title": "Gemini Constellation",
            "content": "Gemini is a constellation that can be seen in the night sky."
        }
    ]
}'
```

```
{
    "records": [
        {
            "id": "2",
            "title": "Gemini",
            "content": "Gemini is a cutting edge large language model created by Google.",
            "score": 0.97
        },
        {
            "id": "3",
            "title": "Gemini Constellation",
            "content": "Gemini is a constellation that can be seen in the night sky.",
            "score": 0.18
        },
        {
            "id": "1",
            "title": "Gemini",
            "content": "The Gemini zodiac symbol often depicts two figures standing side-by-side.",
            "score": 0.05
        }
    ]
}
```
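
If you set ignoreRecordDetailsInResponse to true to shrink the payload, you need to join the returned IDs back to the records you sent. The following sketch shows one way to do that. It assumes that an ID-only response contains just `id` and `score` for each record; the `attach_details` helper is hypothetical, and the scores are reused from the example response above.

```python
from typing import Any, Dict, List


def attach_details(
    ranked: Dict[str, Any], originals: List[Dict[str, str]]
) -> List[Dict[str, Any]]:
    """Merge ID-only ranked results with the original records, keeping ranked order."""
    by_id = {record["id"]: record for record in originals}
    return [
        {**by_id[item["id"]], "score": item.get("score", 0.0)}
        for item in ranked.get("records", [])
    ]


# The records that were sent in the request.
originals = [
    {"id": "1", "title": "Gemini", "content": "The Gemini zodiac symbol often depicts two figures standing side-by-side."},
    {"id": "2", "title": "Gemini", "content": "Gemini is a cutting edge large language model created by Google."},
    {"id": "3", "title": "Gemini Constellation", "content": "Gemini is a constellation that can be seen in the night sky."},
]

# Assumed shape of a response when only IDs and scores are returned.
ids_only_response = {
    "records": [
        {"id": "2", "score": 0.97},
        {"id": "3", "score": 0.18},
        {"id": "1", "score": 0.05},
    ]
}

merged = attach_details(ids_only_response, originals)
print([(record["id"], record["title"], record["score"]) for record in merged])
```

This keeps long document content out of the response while still giving downstream code the full record in ranked order.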

Python

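For more information, see the AI Applications Python API reference documentation.

To authenticate to AI Applications, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.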

```python
from google.cloud import discoveryengine_v1 as discoveryengine

# TODO(developer): Replace with your Google Cloud project ID before running the sample.
project_id = "YOUR_PROJECT_ID"

client = discoveryengine.RankServiceClient()

# The full resource name of the ranking config.
# Format: projects/{project_id}/locations/{location}/rankingConfigs/default_ranking_config
ranking_config = client.ranking_config_path(
    project=project_id,
    location="global",
    ranking_config="default_ranking_config",
)
request = discoveryengine.RankRequest(
    ranking_config=ranking_config,
    model="semantic-ranker-default@latest",
    top_n=10,
    query="What is Google Gemini?",
    records=[
        discoveryengine.RankingRecord(
            id="1",
            title="Gemini",
            content="The Gemini zodiac symbol often depicts two figures standing side-by-side.",
        ),
        discoveryengine.RankingRecord(
            id="2",
            title="Gemini",
            content="Gemini is a cutting edge large language model created by Google.",
        ),
        discoveryengine.RankingRecord(
            id="3",
            title="Gemini Constellation",
            content="Gemini is a constellation that can be seen in the night sky.",
        ),
    ],
)

response = client.rank(request=request)

# Handle the response
print(response)
```
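
Because each request accepts at most 200 records, a larger candidate set has to be split across several requests. The sketch below shows only the splitting step, using a hypothetical `batched` helper and made-up candidate records; each resulting batch could then be used as the `records` of its own request. This page does not state whether scores from separate requests are directly comparable, so treat any merging of results across batches as an assumption to verify.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MAX_RECORDS_PER_REQUEST = 200  # per-request limit stated on this page


def batched(items: Sequence[T], size: int = MAX_RECORDS_PER_REQUEST) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# 450 made-up candidate records, for illustration only.
candidates = [{"id": str(i), "content": f"chunk {i}"} for i in range(450)]
batch_sizes = [len(batch) for batch in batched(candidates)]
print(batch_sizes)  # prints [200, 200, 50]
```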

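Supported models

The following models are available.

Model name	Latest model (`semantic-ranker-default@latest`)	Input	Context window (tokens)	Release date	Discontinuation date
`semantic-ranker-default-004`	Yes	Text (25 languages)	1024	April 9, 2025	To be determined
`semantic-ranker-fast-004`	No	Text (25 languages)	1024	April 9, 2025	To be determined
`semantic-ranker-default-003`	No	Text (25 languages)	512	September 10, 2024	To be determined
`semantic-ranker-default-002`	No	Text (English only)	512	June 3, 2024	To be determined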

What's next

Learn how to use the ranking method together with other RAG APIs to generate grounded answers from unstructured data.