Tutorial: Perform evaluation using the GenAI client in Vertex AI SDK

This page shows you how to evaluate generative AI models and applications across a range of use cases by using the GenAI client in Vertex AI SDK.

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
- Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

Verify that billing is enabled for your Google Cloud project.
Install the Vertex AI SDK for Python:
```
!pip install google-cloud-aiplatform[evaluation]
```
Set up your credentials. If you are running this tutorial in Colaboratory, run the following:
```
from google.colab import auth
auth.authenticate_user()
```
For other environments, see Authenticate to Vertex AI.
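The calls in the following sections use a `client` object. As a minimal sketch of one way to create it, the snippet below assumes that the installed SDK version exposes a `Client` class in the `vertexai` package that accepts `project` and `location`. The helper name `make_client`, the environment variable names, and the fallback values are illustrative assumptions, so replace them with your own project ID and region.

```python
import os

# Placeholder values (assumptions): replace with your own project ID and region.
PROJECT_ID = os.environ.get("GOOGLE_CLOUD_PROJECT", "your-project-id")
LOCATION = os.environ.get("GOOGLE_CLOUD_LOCATION", "us-central1")


def make_client(project_id: str = PROJECT_ID, location: str = LOCATION):
    """Create the client used by the evaluation calls (hypothetical helper)."""
    # Imported lazily so this snippet can be loaded before the SDK is installed.
    # Assumption: the installed SDK version exposes `vertexai.Client`.
    from vertexai import Client

    return Client(project=project_id, location=location)
```

After you authenticate, calling `client = make_client()` would give you the object that the later `client.evals` calls expect.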

Generate responses

Use run_inference() to generate model responses for your dataset. First, prepare the evaluation dataset as a pandas DataFrame:

```
import pandas as pd

eval_df = pd.DataFrame({
    "prompt": [
        "Explain software 'technical debt' using a concise analogy of planting a garden.",
        "Write a Python function to find the nth Fibonacci number using recursion with memoization, but without using any imports.",
        "Write a four-line poem about a lonely robot, where every line must be a question and the word 'and' cannot be used.",
        "A drawer has 10 red socks and 10 blue socks. In complete darkness, what is the minimum number of socks you must pull out to guarantee you have a matching pair?",
        "An AI discovers a cure for a major disease, but the cure is based on private data it analyzed without consent. Should the cure be released? Justify your answer."
    ]
})
```
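
Each prompt in the dataset presumably leads to a model call, so it can be worth checking the prompt list before you build the DataFrame. The following is a minimal sketch that uses only the standard library. `clean_prompts` is a hypothetical helper for illustration and is not part of the Vertex AI SDK.

```python
def clean_prompts(prompts):
    """Strip whitespace, then reject empty or duplicate prompts (hypothetical helper)."""
    cleaned = []
    seen = set()
    for index, prompt in enumerate(prompts):
        text = prompt.strip()
        if not text:
            raise ValueError(f"Prompt at index {index} is empty.")
        if text in seen:
            raise ValueError(f"Prompt at index {index} is a duplicate.")
        seen.add(text)
        cleaned.append(text)
    return cleaned
```

You could then pass the cleaned list as the `prompt` column, for example `pd.DataFrame({"prompt": clean_prompts(prompts)})`.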

Generate the model responses using run_inference():

```
# Assumes `client` is an initialized GenAI client from the Vertex AI SDK.
eval_dataset = client.evals.run_inference(
    model="gemini-2.5-flash",
    src=eval_df,
)
```

Visualize the inference results by calling .show() on the EvaluationDataset object, which lets you inspect the model outputs alongside your original prompts and references:
```
eval_dataset.show()
```

[Figure: a table showing the evaluation dataset, with columns for the prompts and their corresponding generated responses.]

Run the evaluation

Run evaluate() to evaluate the model responses.

Evaluate the model responses using the default GENERAL_QUALITY adaptive rubric-based metric:
```
eval_result = client.evals.evaluate(dataset=eval_dataset)
```
Visualize the evaluation results by calling .show() on the EvaluationResult object, which displays the summary metrics and the detailed results:
```
eval_result.show()
```

[Figure: an evaluation report showing the summary metrics along with detailed results for each prompt and response pair.]

Clean up

No Vertex AI resources are created during this tutorial.

What's next

Define your evaluation metrics.