Tutorial: Run an evaluation with the Python SDK

This page shows you how to run a model-based evaluation with the Gen AI evaluation service by using the Vertex AI SDK for Python.

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
- Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

Verify that billing is enabled for your Google Cloud project.
Install the Vertex AI SDK for Python with the Gen AI evaluation service dependency:
```
!pip install google-cloud-aiplatform[evaluation]
```

Set up your credentials. If you are running this tutorial in Colaboratory, run the following:

from google.colab import auth
auth.authenticate_user()

For information about authenticating in other environments, see Authenticate to Vertex AI.
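If the same notebook might run both inside and outside Colaboratory, the authentication step can be guarded so that it only runs in Colab. The following is a minimal sketch, assuming the common idiom of checking whether the `google.colab` module is already loaded; the helper name `running_in_colab` is hypothetical and not part of any SDK.

```python
import sys


def running_in_colab() -> bool:
    """Return True if the google.colab module is already loaded.

    Assumption: Colab kernels load google.colab at startup, so its
    presence in sys.modules is used here as a heuristic.
    """
    return "google.colab" in sys.modules


if running_in_colab():
    # Only import and call the Colab auth helper when running in Colab.
    from google.colab import auth

    auth.authenticate_user()
```

Outside Colab the guarded block is skipped, and authentication is expected to come from whatever mechanism that environment uses.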

Import libraries

Import your libraries and set up your project and location.

import pandas as pd

import vertexai
from vertexai.evaluation import EvalTask, PointwiseMetric, PointwiseMetricPromptTemplate
from google.cloud import aiplatform

PROJECT_ID = "PROJECT_ID"
LOCATION = "LOCATION"
EXPERIMENT_NAME = "EXPERIMENT_NAME"

vertexai.init(
    project=PROJECT_ID,
    location=LOCATION,
)

EXPERIMENT_NAME can contain only lowercase alphanumeric characters and hyphens, and can be at most 127 characters long.
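The naming rule above can be checked locally before the name is sent to the service. The following is a minimal sketch that encodes only the rule stated on this page (lowercase alphanumeric characters and hyphens, at most 127 characters); the helper `is_valid_experiment_name` is hypothetical, and the service may enforce additional constraints that this check does not cover.

```python
import re

# Encodes only the rule stated in the text: lowercase letters, digits,
# and hyphens, between 1 and 127 characters. This is an assumption about
# what is sufficient; the service may apply further checks.
_EXPERIMENT_NAME_PATTERN = re.compile(r"[a-z0-9-]{1,127}")


def is_valid_experiment_name(name: str) -> bool:
    """Return True if the whole name matches the stated naming rule."""
    return _EXPERIMENT_NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_experiment_name("my-eval-experiment")` returns `True`, while a name containing uppercase letters or underscores, such as `"My_Eval"`, returns `False`.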

Set up evaluation metrics based on your criteria

The following metric definition evaluates the quality of text generated by a large language model based on two criteria: fluency and entertaining. The code defines a metric named custom_text_quality that uses these two criteria:

custom_text_quality = PointwiseMetric(
    metric="custom_text_quality",
    metric_prompt_template=PointwiseMetricPromptTemplate(
        criteria={
            "fluency": (
                "Sentences flow smoothly and are easy to read, avoiding awkward"
                " phrasing or run-on sentences. Ideas and sentences connect"
                " logically, using transitions effectively where needed."
            ),
            "entertaining": (
                "Short, amusing text that incorporates emojis, exclamations and"
                " questions to convey quick and spontaneous communication and"
                " diversion."
            ),
        },
        rating_rubric={
            "1": "The response performs well on both criteria.",
            "0": "The response is somewhat aligned with both criteria",
            "-1": "The response falls short on both criteria",
        },
    ),
)
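Before passing a rubric to the metric, it can help to confirm that its keys form the score scale you intend. The following is a minimal sketch of such a pre-flight check; the helper `rubric_score_range` is hypothetical, and the assumption that every rubric key should parse as an integer is a convention of this sketch, not a documented SDK requirement.

```python
def rubric_score_range(rating_rubric):
    """Return (lowest, highest) integer score defined by a rubric dict.

    Assumes keys are strings that parse as integers, as in the rubric
    used on this page ("1", "0", "-1").
    """
    if not rating_rubric:
        raise ValueError("rating_rubric must define at least one score")
    scores = []
    for key, description in rating_rubric.items():
        try:
            scores.append(int(key))
        except ValueError:
            raise ValueError(f"rubric key {key!r} is not an integer score") from None
        if not description.strip():
            raise ValueError(f"rubric entry {key!r} has an empty description")
    return min(scores), max(scores)


# The rubric from the metric definition above.
rubric = {
    "1": "The response performs well on both criteria.",
    "0": "The response is somewhat aligned with both criteria",
    "-1": "The response falls short on both criteria",
}
lowest, highest = rubric_score_range(rubric)
```

With this rubric, the scale runs from -1 to 1, which matches the three quality levels used in the example responses.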

Prepare your dataset

Add the following code to prepare your dataset:

responses = [
    # An example of good custom_text_quality
    "Life is a rollercoaster, full of ups and downs, but it's the thrill that keeps us coming back for more!",
    # An example of medium custom_text_quality
    "The weather is nice today, not too hot, not too cold.",
    # An example of poor custom_text_quality
    "The weather is, you know, whatever.",
]

eval_dataset = pd.DataFrame({
    "response" : responses,
})
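Because each response is sent to a judge model, it can be worth checking the list before building the DataFrame. The following is a minimal sketch using only the standard library; the helper `find_invalid_responses` is hypothetical, and treating non-string or blank entries as invalid is an assumption of this sketch rather than a rule stated by the service.

```python
def find_invalid_responses(responses):
    """Return a list of (index, reason) pairs for unusable responses.

    A response is flagged if it is not a string or if it contains only
    whitespace. An empty result means every entry passed both checks.
    """
    problems = []
    for index, response in enumerate(responses):
        if not isinstance(response, str):
            problems.append((index, "not a string"))
        elif not response.strip():
            problems.append((index, "empty or whitespace only"))
    return problems
```

Calling `find_invalid_responses(responses)` on the three example responses above returns an empty list, so the dataset can be built as shown.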

Run the evaluation with your dataset

Run the evaluation:

eval_task = EvalTask(
    dataset=eval_dataset,
    metrics=[custom_text_quality],
    experiment=EXPERIMENT_NAME
)

pointwise_result = eval_task.evaluate()

View the evaluation results for each response in the metrics_table pandas DataFrame:

pointwise_result.metrics_table
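To get a quick overview of the per-response scores, the table rows can be aggregated. The following is a minimal sketch that works on plain dictionaries so it runs without any cloud access. The column name `custom_text_quality/score` is an assumption about how the table labels the score column, the helper `summarize_scores` is hypothetical, and the example rows are placeholders for illustration, not real evaluation output.

```python
from collections import Counter
from statistics import mean

# Assumed column name; check the actual columns of your metrics table.
SCORE_COLUMN = "custom_text_quality/score"


def summarize_scores(rows, score_column=SCORE_COLUMN):
    """Return count, mean, and distribution of the scores in the rows.

    Rows without a value in score_column are skipped.
    """
    scores = [row[score_column] for row in rows if row.get(score_column) is not None]
    if not scores:
        return {"count": 0, "mean": None, "distribution": {}}
    return {
        "count": len(scores),
        "mean": mean(scores),
        "distribution": dict(Counter(scores)),
    }


# Placeholder rows for illustration only; not real evaluation output.
example_rows = [
    {"response": "first response", SCORE_COLUMN: 1},
    {"response": "second response", SCORE_COLUMN: 0},
    {"response": "third response", SCORE_COLUMN: -1},
]
summary = summarize_scores(example_rows)
```

With real results, the rows could be obtained through `pointwise_result.metrics_table.to_dict("records")`, assuming the score column is named as above.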

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps:

Delete the ExperimentRun created by the evaluation:

aiplatform.ExperimentRun(
    run_name=pointwise_result.metadata["experiment_run"],
    experiment=pointwise_result.metadata["experiment"],
).delete()