自 2025 年 4 月 29 日起，Gemini 1.5 Pro 和 Gemini 1.5 Flash 模型將無法用於先前未使用這些模型的專案，包括新專案。詳情請參閱「模型版本和生命週期」。

本頁面由 Cloud Translation API 翻譯而成。

記錄要求與回應

Vertex AI 可以記錄 Gemini 和支援的合作夥伴模型的要求和回應樣本。記錄會儲存至 BigQuery 表格，方便您查看及分析。本頁面說明如何為基礎基礎模型和微調模型設定要求/回應記錄。

支援的記錄 API 方法

凡是使用 generateContent 或 streamGenerateContent 的 Gemini 模型，都支援要求/回應記錄。

系統也支援下列使用 rawPredict 或 streamrawPredict 的合作夥伴模型：

Anthropic Claude

基礎模型的要求/回應記錄

您可以使用 REST API 或 Python SDK，為基礎基礎模型設定要求/回應記錄。記錄設定可能需要幾分鐘才會生效。

啟用要求/回應記錄

選取下列任一分頁標籤，查看如何為基礎模型啟用要求/回應記錄。

對於 Anthropic 模型，記錄設定僅支援 REST。透過 REST API 啟用記錄設定，方法是將發布者設為 anthropic，並將模型名稱設為支援的 Claude 模型之一。

Python SDK

這個方法可用於建立或更新 PublisherModelConfig。

publisher_model = GenerativeModel('gemini-2.0-pro-001')

# Set logging configuration
publisher_model.set_request_response_logging_config(
    enabled=True,
    sampling_rate=1.0,
    bigquery_destination="bq://PROJECT_ID.DATASET_NAME.TABLE_NAME",
    enable_otel_logging=True
    )

REST API

使用 setPublisherModelConfig 建立或更新 PublisherModelConfig：

使用任何要求資料之前，請先替換以下項目：

ENDPOINT_PREFIX：模型資源的區域，後接 -。例如 us-central1-。如要使用全域端點，請留空。模型支援的所有地區都支援要求/回應記錄。
PROJECT_ID：您的專案 ID。
LOCATION：模型資源的區域。如要使用全域端點，請輸入 global。
PUBLISHER：發布者名稱。例如：google。
MODEL：基礎模型名稱。例如：gemini-2.0-flash-001。
SAMPLING_RATE：如要降低儲存空間費用，您可以設定介於 0 到 1 之間的數字，定義要記錄的要求比例。舉例來說，值為 1 會記錄所有要求，值為 0.1 則會記錄 10% 的要求。
BQ_URI：用於記錄的 BigQuery 資料表。如果您只指定專案名稱，系統會建立名為 logging_ENDPOINT_DISPLAY_NAME\_ENDPOINT_ID 的新資料集，其中 ENDPOINT_DISPLAY_NAME 遵循 BigQuery 命名規則。如未指定資料表名稱，系統會建立名為 request_response_logging 的新資料表。

HTTP 方法和網址：

POST https://ENDPOINT_PREFIXaiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig

JSON 要求主體：

{
  "publisherModelConfig": {
     "loggingConfig": {
       "enabled": true,
       "samplingRate": SAMPLING_RATE,
       "bigqueryDestination": {
         "outputUri": "BQ_URI"
       },
       "enableOtelLogging": true
     }
   }
 }

如要傳送要求，請選擇以下其中一個選項：

curl

注意： 下列指令假設您已執行 gcloud init 或 gcloud auth login，透過使用者帳戶登入 gcloud CLI，或使用 Cloud Shell，自動登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://ENDPOINT_PREFIXaiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig"

PowerShell

注意： 下列指令假設您已執行 gcloud init 或 gcloud auth login，透過使用者帳戶登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://ENDPOINT_PREFIXaiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應：

回應

{
  "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1beta1.SetPublisherModelConfigOperationMetadata",
    "genericMetadata": {
      "createTime": "2025-03-11T22:42:54.283184Z",
      "updateTime": "2025-03-11T22:42:54.283184Z"
    }
  }
}

取得記錄設定

使用 REST API 取得基礎模型的請求/回應記錄設定。

REST API

使用 fetchPublisherModelConfig 取得要求/回應記錄設定：

使用任何要求資料之前，請先替換以下項目：

PROJECT_ID：您的專案 ID。
LOCATION：模型資源的位置。
PUBLISHER：發布者名稱。例如：google。
MODEL：基礎模型名稱。例如：gemini-2.0-flash-001。

HTTP 方法和網址：

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:fetchPublisherModelConfig

如要傳送要求，請選擇以下其中一個選項：

curl

執行下列指令：

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:fetchPublisherModelConfig"

PowerShell

注意： 下列指令假設您已執行 gcloud init 或 gcloud auth login，透過使用者帳戶登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:fetchPublisherModelConfig" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應：

回應

{
  "loggingConfig": {
    "enabled": true,
    "samplingRate": 1,
    "bigqueryDestination": {
      "outputUri": "bq://output-uri"
    },
    "enableOtelLogging": true
  }
}

停用記錄

使用 REST API 或 Python SDK，停用基礎模型的請求/回應記錄功能。

Python SDK

publisher_model.set_request_response_logging_config(
  enabled=False,
  sampling_rate=0,
  bigquery_destination=''
  )

REST API

使用 setPublisherModelConfig 停用記錄功能：

使用任何要求資料之前，請先替換以下項目：

PROJECT_ID：您的專案 ID。
LOCATION：模型資源的位置。
PUBLISHER：發布者名稱。例如：google。
MODEL：基礎模型名稱。例如：gemini-2.0-flash-001。

HTTP 方法和網址：

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig

JSON 要求主體：

{
  "publisherModelConfig": {
     "loggingConfig": {
       "enabled": false
     }
  }
}

如要傳送要求，請選擇以下其中一個選項：

curl

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig"

PowerShell

注意： 下列指令假設您已執行 gcloud init 或 gcloud auth login，透過使用者帳戶登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應：

回應

{
  "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1beta1.SetPublisherModelConfigOperationMetadata",
    "genericMetadata": {
      "createTime": "2025-03-11T22:42:54.283184Z",
      "updateTime": "2025-03-11T22:42:54.283184Z"
    }
  }
}

微調模型的要求/回應記錄

您可以使用 REST API 或 Python SDK，為微調模型設定要求/回應記錄。

啟用要求/回應記錄

如要瞭解如何為微調模型啟用要求/回應記錄，請選取下列任一分頁標籤。

Python SDK

這個方法可用於更新端點的要求/回應記錄設定。

tuned_model = GenerativeModel("projects/PROJECT_ID/locations/REGION/endpoints/ENDPOINT_ID")

# Set logging configuration
tuned_model.set_request_response_logging_config(
    enabled=True,
    sampling_rate=1.0,
    bigquery_destination="bq://PROJECT_ID.DATASET_NAME.TABLE_NAME",
    enable_otel_logging=True
    )

REST API

您只能在透過 projects.locations.endpoints.create 建立端點時啟用要求/回應記錄，或是透過 projects.locations.endpoints.patch 修補現有端點時啟用。

系統會在端點層級記錄要求和回應，因此傳送至同一端點下任何已部署模型的要求都會記錄下來。

建立或修補端點時，請在端點資源的 predictRequestResponseLoggingConfig 欄位中填入下列項目：

enabled：設為 True 即可啟用要求/回應記錄。
samplingRate：如要降低儲存空間費用，您可以設定介於 0 到 1 之間的數字，定義要記錄的要求比例。舉例來說，值為 1 會記錄所有要求，值為 0.1 則會記錄 10% 的要求。
BigQueryDestination：用於記錄的 BigQuery 資料表。如果只指定專案名稱，系統會建立名為 logging_ENDPOINT_DISPLAY_NAME_ENDPOINT_ID 的新資料集，其中 ENDPOINT_DISPLAY_NAME 須符合 BigQuery 命名規則。如果未指定資料表名稱，系統會建立名為 request_response_logging 的新資料表。
enableOtelLogging：設為 true 可啟用 OpenTelemetry (OTEL) 記錄，以及預設的要求/回應記錄。

如要查看 BigQuery 資料表結構定義，請參閱記錄資料表結構定義。

以下為設定範例：

{
  "predictRequestResponseLoggingConfig": {
    "enabled": true,
    "samplingRate": 0.5,
    "bigqueryDestination": {
      "outputUri": "bq://PROJECT_ID.DATASET_NAME.TABLE_NAME"
    },
    "enableOtelLogging": true
  }
}

取得記錄設定

使用 REST API 取得微調模型的請求/回應記錄設定。

REST API

使用任何要求資料之前，請先替換以下項目：

PROJECT_ID：您的專案 ID。
LOCATION：端點資源的位置。
MODEL：基礎模型名稱。例如：gemini-2.0-flash-001。
ENDPOINT_ID：端點的 ID。

HTTP 方法和網址：

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID

如要傳送要求，請選擇以下其中一個選項：

curl

執行下列指令：

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID"

PowerShell

注意： 下列指令假設您已執行 gcloud init 或 gcloud auth login，透過使用者帳戶登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應：

回應

{
  "loggingConfig": {
    "enabled": true,
    "samplingRate": 1,
    "bigqueryDestination": {
      "outputUri": "bq://output-uri"
    },
    "enableOtelLogging": true
  }
}

停用記錄設定

停用端點的要求/回應記錄設定。

Python SDK

tuned_model = GenerativeModel("projects/PROJECT_ID/locations/REGION/endpoints/ENDPOINT_ID")

# Set logging configuration
tuned_model.set_request_response_logging_config(
    enabled=False,
    sampling_rate=1.0,
    bigquery_destination="bq://PROJECT_ID.DATASET_NAME.TABLE_NAME",
    enable_otel_logging=False
    )

REST API

{
"predictRequestResponseLoggingConfig": {
  "enabled": false
}
}

記錄資料表結構定義

在 BigQuery 中，系統會使用下列結構定義記錄記錄：

欄位名稱	類型	附註
endpoint	STRING	部署調整後模型的端點資源名稱。
deployed_model_id	STRING	部署至端點的微調模型 ID。
logging_time	TIMESTAMP	執行記錄作業的時間。這大約是系統傳回回應的時間。
request_id	NUMERIC	系統根據 API 要求自動產生的整數要求 ID。
request_payload	STRING	適用於合作夥伴模型記錄，並可向後相容於 Vertex AI 端點要求與回覆記錄。
response_payload	STRING	適用於合作夥伴模型記錄，並可向後相容於 Vertex AI 端點要求與回覆記錄。
模型	STRING	模型資源名稱。
model_version	STRING	模型版本。Gemini 模型通常會使用「預設」值。
api_method	STRING	generateContent、streamGenerateContent、rawPredict、streamRawPredict
full_request	JSON	完整`GenerateContentRequest`。
full_response	JSON	完整`GenerateContentResponse`。
中繼資料	JSON	呼叫的任何中繼資料，包含要求延遲時間。
otel_log	JSON	OpenTelemetry 結構定義格式的記錄。只有在記錄設定中啟用 `otel_logging` 時，才能使用這個選項。

請注意，如果要求/回應配對超過 BigQuery 寫入 API 的 10 MB 資料列大小上限，系統就不會記錄。

後續步驟

估算線上預測記錄的價格。
使用 Google Cloud 控制台或 Vertex AI API 部署模型。
瞭解如何建立 BigQuery 資料表。

記錄要求與回應 透過集合功能整理內容 你可以依據偏好儲存及分類內容。

支援的記錄 API 方法

基礎模型的要求/回應記錄

啟用要求/回應記錄

Python SDK

REST API

curl

PowerShell

回應

取得記錄設定

REST API

curl

PowerShell

回應

停用記錄

Python SDK

REST API

curl

PowerShell

回應

微調模型的要求/回應記錄

啟用要求/回應記錄

Python SDK

REST API

取得記錄設定

REST API

curl

PowerShell

回應

停用記錄設定

Python SDK

REST API

記錄資料表結構定義

後續步驟

記錄要求與回應