This page shows how to use the GenAI Client in the Vertex AI SDK to evaluate generative AI models and applications across a range of use cases.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
  Roles required to select or create a project
  - Select a project: Selecting a project doesn't require a specific IAM role; you can select any project that you've been granted a role on.
  - Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
- Verify that billing is enabled for your Google Cloud project.
Install the Vertex AI SDK for Python (the leading ! runs the command from a notebook cell):
    !pip install google-cloud-aiplatform[evaluation]
Set up your credentials. If you are running this tutorial in Colaboratory, run the following:
    from google.colab import auth
    auth.authenticate_user()
For other environments, see "Authenticate to Vertex AI".
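The later steps call methods on a `client` object that this page never constructs. The following is a minimal sketch of one way to create it, assuming the SDK exposes a `Client` class in the `vertexai` package that accepts `project` and `location` arguments. The helper name `make_client` and the example values in the usage note are hypothetical.

```python
def make_client(project_id, location):
    """Return a GenAI Client for the given project and region.

    Assumption: `vertexai.Client(project=..., location=...)` is the
    constructor provided by google-cloud-aiplatform[evaluation].
    """
    # Imported lazily so this module loads even where the SDK is absent.
    from vertexai import Client

    return Client(project=project_id, location=location)
```

For example, `client = make_client("my-project-id", "us-central1")`, where both values are placeholders to replace with your own project ID and region.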
Prepare the dataset as a Pandas DataFrame:
    import pandas as pd

    # Each row holds one prompt to send to the model.
    eval_df = pd.DataFrame({
        "prompt": [
            "Explain software 'technical debt' using a concise analogy of planting a garden.",
            "Write a Python function to find the nth Fibonacci number using recursion with memoization, but without using any imports.",
            "Write a four-line poem about a lonely robot, where every line must be a question and the word 'and' cannot be used.",
            "A drawer has 10 red socks and 10 blue socks. In complete darkness, what is the minimum number of socks you must pull out to guarantee you have a matching pair?",
            "An AI discovers a cure for a major disease, but the cure is based on private data it analyzed without consent. Should the cure be released? Justify your answer.",
        ]
    })
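Before building the DataFrame, it can help to check the prompt list for blank or repeated entries, since each row is presumably sent to the model as its own request. The sketch below is not part of the SDK: `clean_prompts` is a hypothetical helper that uses only the Python standard library.

```python
def clean_prompts(prompts):
    """Strip surrounding whitespace, then reject empty or duplicate prompts."""
    cleaned = []
    seen = set()
    for index, prompt in enumerate(prompts):
        text = prompt.strip()
        if not text:
            raise ValueError(f"prompt at index {index} is empty")
        if text in seen:
            raise ValueError(f"prompt at index {index} is a duplicate")
        seen.add(text)
        cleaned.append(text)
    return cleaned


prompts = clean_prompts([
    "  Explain software 'technical debt' using a concise analogy of planting a garden.  ",
    "Write a four-line poem about a lonely robot, where every line must be a question and the word 'and' cannot be used.",
])
```

The cleaned list can then serve as the "prompt" column, for example `pd.DataFrame({"prompt": prompts})`.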
Generate model responses with run_inference(), called on a GenAI Client instance named client:
    eval_dataset = client.evals.run_inference(
        model="gemini-2.5-flash",
        src=eval_df,
    )
Call .show() on the EvaluationDataset object to visualize the inference results and inspect the model outputs alongside the original prompts and references:
    eval_dataset.show()
Evaluate the model responses with the default GENERAL_QUALITY adaptive rubric-based metric:
    eval_result = client.evals.evaluate(dataset=eval_dataset)
Call .show() on the EvaluationResult object to chart the evaluation results, including summary metrics and detailed results:
    eval_result.show()
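Generate responses
Use run_inference() to generate model responses for the dataset, as in the run_inference() call above.
The following figure shows the evaluation dataset, which contains the prompts and their corresponding generated responses.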
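Run the evaluation
Run evaluate() to evaluate the model responses, as in the evaluate() call above.
The following figure shows the evaluation report, which lists summary metrics and detailed results for each prompt-response pair.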
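The two calls in this walkthrough can be chained in a single helper. The sketch below uses only the `client.evals.run_inference()` and `client.evals.evaluate()` calls shown on this page, with the same arguments. The helper name `run_quick_eval` and its parameter names are hypothetical, and `client` is assumed to be an already constructed GenAI Client.

```python
def run_quick_eval(client, eval_df, model="gemini-2.5-flash"):
    """Generate responses for eval_df, then score them with the default metric.

    Returns both objects so the caller can call .show() on either one.
    """
    # Step 1: generate one model response per prompt row.
    eval_dataset = client.evals.run_inference(model=model, src=eval_df)
    # Step 2: evaluate the responses; with no metric given, this page
    # states that the default GENERAL_QUALITY metric is used.
    eval_result = client.evals.evaluate(dataset=eval_dataset)
    return eval_dataset, eval_result
```

In a notebook, `eval_dataset, eval_result = run_quick_eval(client, eval_df)` followed by `eval_result.show()` would reproduce the steps on this page in two lines.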
Clean up
This tutorial doesn't create any Vertex AI resources.