Tutorial: Perform an evaluation using the GenAI client in the Vertex AI SDK

This page shows you how to use the GenAI client in the Vertex AI SDK to evaluate generative AI models and applications across a variety of use cases.

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
- Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

Verify that billing is enabled for your Google Cloud project.

Install the Vertex AI SDK for Python.

```
!pip install google-cloud-aiplatform[evaluation]
```

Set up your credentials. If you are running this tutorial in Colaboratory, run the following:
```
from google.colab import auth
auth.authenticate_user()
```
For other environments, see Authenticate to Vertex AI.
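
If the same notebook may run both inside and outside Colaboratory, the authentication step can be guarded. The following is a minimal sketch, not part of the original tutorial: the helper name `running_in_colab` and the `auth_hint` message are hypothetical, and checking `sys.modules` for `google.colab` is a commonly used heuristic rather than an official detection API.

```python
import sys


def running_in_colab() -> bool:
    """Heuristic check: treat the session as Colab if google.colab is already loaded."""
    return "google.colab" in sys.modules


if running_in_colab():
    # Same calls as the Colaboratory step in this tutorial.
    from google.colab import auth

    auth.authenticate_user()
    auth_hint = "Authenticated through google.colab."
else:
    # Outside Colab, credentials must come from the environment instead.
    auth_hint = "Not in Colab: set up credentials as described in 'Authenticate to Vertex AI'."

print(auth_hint)
```

Outside Colaboratory the sketch only prints a reminder; it does not configure credentials for you.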

Generate responses

Use run_inference() to generate model responses for your dataset.

Prepare your dataset as a Pandas DataFrame.

```
import pandas as pd

eval_df = pd.DataFrame({
    "prompt": [
        "Explain software 'technical debt' using a concise analogy of planting a garden.",
        "Write a Python function to find the nth Fibonacci number using recursion with memoization, but without using any imports.",
        "Write a four-line poem about a lonely robot, where every line must be a question and the word 'and' cannot be used.",
        "A drawer has 10 red socks and 10 blue socks. In complete darkness, what is the minimum number of socks you must pull out to guarantee you have a matching pair?",
        "An AI discovers a cure for a major disease, but the cure is based on private data it analyzed without consent. Should the cure be released? Justify your answer."
    ]
})
```
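
Before sending prompts for inference, it can help to sanity-check them. The following is a minimal sketch in plain Python; the helper `validate_prompts` is hypothetical and is not part of the Vertex AI SDK or pandas. It flags empty or non-string prompts and exact duplicates.

```python
def validate_prompts(prompts):
    """Return a list of human-readable problems found in a list of prompts."""
    problems = []
    seen = set()
    for i, prompt in enumerate(prompts):
        # Reject non-strings and strings that contain only whitespace.
        if not isinstance(prompt, str) or not prompt.strip():
            problems.append(f"row {i}: prompt is empty or not a string")
            continue
        key = prompt.strip()
        # Flag prompts that repeat an earlier row exactly.
        if key in seen:
            problems.append(f"row {i}: duplicate prompt")
        seen.add(key)
    return problems


issues = validate_prompts(["Explain technical debt.", "", "Explain technical debt."])
for issue in issues:
    print(issue)
```

With the DataFrame from this tutorial, the same check could be applied as `validate_prompts(eval_df["prompt"].tolist())`; an empty list means no problems were found.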

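Generate model responses using run_inference():

```
import vertexai
# Assumption: `client` is a Vertex AI SDK GenAI client; the project and location are placeholders.
client = vertexai.Client(project="your-project-id", location="us-central1")

eval_dataset = client.evals.run_inference(
    model="gemini-2.5-flash",
    src=eval_df,
)
```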

Visualize the inference results by calling .show() on the EvaluationDataset object to inspect the model's outputs alongside your original prompts and references:
```
eval_dataset.show()
```

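The following image shows the evaluation dataset with the prompts and their corresponding generated responses:

[Image: a table showing the evaluation dataset with prompt and response columns]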

Run the evaluation

Run evaluate() to evaluate the model responses.

Evaluate the model responses using the default GENERAL_QUALITY adaptive rubric-based metric:
```
eval_result = client.evals.evaluate(dataset=eval_dataset)
```
Visualize the evaluation results by calling .show() on the EvaluationResult object to display the summary metrics and detailed results:
```
eval_result.show()
```
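
The report pairs summary metrics with per-case results. As a rough illustration of that relationship only, the sketch below aggregates a list of per-case scores into summary statistics. The scores are made-up values, and the `summarize` helper and its field names are hypothetical; they do not reflect the actual structure of the EvaluationResult object.

```python
from statistics import mean

# Hypothetical per-case scores for five prompt-response pairs (made-up values).
case_scores = [1.0, 0.5, 0.75, 1.0, 0.25]


def summarize(scores):
    """Aggregate per-case scores into simple summary statistics."""
    return {
        "num_cases": len(scores),
        "mean_score": mean(scores),
        "min_score": min(scores),
        "max_score": max(scores),
    }


summary = summarize(case_scores)
print(summary)
```

Looking at the minimum alongside the mean is one way to spot a single weak response that an average alone would hide.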

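The following image shows the evaluation report, which includes the summary metrics and the detailed results for each prompt-response pair:

[Image: an evaluation report showing summary metrics along with detailed results for each prompt-response pair]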

Clean up

No Vertex AI resources are created in this tutorial.

What's next

Define your evaluation metrics