As of April 29, 2025, the Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have never used them, including new projects. For details, see Model versions and lifecycle.
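Tutorial: Perform evaluation using the Python SDK

This page shows you how to perform a model-based evaluation with Gen AI evaluation service using the Vertex AI SDK for Python.

For a complete example, run the "Getting Started with the Vertex AI Python SDK for Gen AI evaluation service" notebook, available on GitHub: https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/evaluation/intro_to_gen_ai_evaluation_service_sdk.ipynb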
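Before you begin

1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account (https://console.cloud.google.com/freetrial) to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

   In the Google Cloud console, on the project selector page (https://console.cloud.google.com/projectselector2/home/dashboard), select or create a Google Cloud project.

   Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

   Verify that billing is enabled for your Google Cloud project (/billing/docs/how-to/verify-billing-enabled#confirm_billing_is_enabled_on_a_project).

2. Install the Vertex AI SDK for Python with the Gen AI evaluation service dependency:

       !pip install google-cloud-aiplatform[evaluation]

3. Set up your credentials. If you are running this quickstart in Colaboratory, run the following:

       from google.colab import auth
       auth.authenticate_user()

   For other environments, refer to Authenticate to Vertex AI (/vertex-ai/docs/authentication#client-libraries).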
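Import libraries

Import your libraries and set up your project and location.

    import pandas as pd

    import vertexai
    from vertexai.evaluation import EvalTask, PointwiseMetric, PointwiseMetricPromptTemplate
    from google.cloud import aiplatform

    PROJECT_ID = "PROJECT_ID"
    LOCATION = "LOCATION"
    EXPERIMENT_NAME = "EXPERIMENT_NAME"

    vertexai.init(
        project=PROJECT_ID,
        location=LOCATION,
    )

Note that EXPERIMENT_NAME can only contain lowercase alphanumeric characters and hyphens, up to a maximum of 127 characters.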
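The naming rule for EXPERIMENT_NAME can be checked locally before any request is sent. The following is a minimal sketch, not part of the SDK: the helper name `is_valid_experiment_name` is hypothetical, and the pattern encodes only the rule stated above, so the service may enforce additional constraints that this check does not catch.

```python
import re

# Assumption: this pattern covers only the documented rule (lowercase
# alphanumeric characters and hyphens, 1 to 127 characters). The service
# may apply further rules that are not modeled here.
_EXPERIMENT_NAME_PATTERN = re.compile(r"[a-z0-9-]{1,127}")


def is_valid_experiment_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rule described above."""
    return _EXPERIMENT_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_experiment_name("my-eval-experiment-01"))  # True
print(is_valid_experiment_name("My_Eval_Experiment"))     # False
```

Running a check like this first turns a naming mistake into an immediate local failure instead of an error returned partway through an evaluation run.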
Set up evaluation metrics based on your criteria

The following metric definition evaluates the quality of text generated by a large language model based on two criteria: Fluency and Entertaining. The code defines a metric called custom_text_quality using those two criteria:
    custom_text_quality = PointwiseMetric(
        metric="custom_text_quality",
        metric_prompt_template=PointwiseMetricPromptTemplate(
            criteria={
                "fluency": (
                    "Sentences flow smoothly and are easy to read, avoiding awkward"
                    " phrasing or run-on sentences. Ideas and sentences connect"
                    " logically, using transitions effectively where needed."
                ),
                "entertaining": (
                    "Short, amusing text that incorporates emojis, exclamations and"
                    " questions to convey quick and spontaneous communication and"
                    " diversion."
                ),
            },
            rating_rubric={
                "1": "The response performs well on both criteria.",
                "0": "The response is somewhat aligned with both criteria",
                "-1": "The response falls short on both criteria",
            },
        ),
    )
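In a model-based evaluation, the criteria and the rating rubric end up as instructions to a judge model. The sketch below illustrates that idea only: `render_judge_prompt` is a hypothetical helper written for this example, and its layout is an assumption, not the template that PointwiseMetricPromptTemplate actually produces.

```python
def render_judge_prompt(
    criteria: dict[str, str],
    rating_rubric: dict[str, str],
    response: str,
) -> str:
    """Assemble an illustrative judge prompt (hypothetical layout, not the SDK's)."""
    lines = ["# Criteria"]
    for name, description in criteria.items():
        lines.append(f"- {name}: {description}")

    lines.extend(["", "# Rating rubric"])
    for score, meaning in rating_rubric.items():
        lines.append(f"- {score}: {meaning}")

    lines.extend(["", "# Response to evaluate", response])
    return "\n".join(lines)


prompt = render_judge_prompt(
    criteria={"fluency": "Sentences flow smoothly and are easy to read."},
    rating_rubric={"1": "Performs well.", "-1": "Falls short."},
    response="The weather is nice today, not too hot, not too cold.",
)
print(prompt)
```

The point of the sketch is that each rubric key is a score the judge may return, so the rubric descriptions should be mutually exclusive and cover every outcome you expect.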
Prepare your dataset

Add the following code to prepare your dataset:
    responses = [
        # An example of good custom_text_quality
        "Life is a rollercoaster, full of ups and downs, but it's the thrill that keeps us coming back for more!",
        # An example of medium custom_text_quality
        "The weather is nice today, not too hot, not too cold.",
        # An example of poor custom_text_quality
        "The weather is, you know, whatever.",
    ]

    eval_dataset = pd.DataFrame({
        "response": responses,
    })
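Because the dataset here carries pre-generated responses in a single response column, it can help to confirm that every row has a non-empty value before evaluating. The following is a minimal sketch under that assumption; `find_invalid_rows` is a hypothetical helper, not an SDK function, and it works on plain dictionaries so that it runs without any dependencies.

```python
def find_invalid_rows(rows, required_columns=("response",)):
    """Return (row_index, column) pairs whose value is missing or blank."""
    problems = []
    for index, row in enumerate(rows):
        for column in required_columns:
            value = row.get(column)
            # Anything that is not a non-blank string counts as invalid,
            # which also covers None and NaN placeholders.
            if not isinstance(value, str) or not value.strip():
                problems.append((index, column))
    return problems


rows = [
    {"response": "The weather is nice today, not too hot, not too cold."},
    {"response": "   "},
    {},
]
print(find_invalid_rows(rows))  # [(1, 'response'), (2, 'response')]
```

With the DataFrame from this tutorial, the same check could be applied as `find_invalid_rows(eval_dataset.to_dict("records"))`.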
Run evaluation with your dataset

Run the evaluation:

    eval_task = EvalTask(
        dataset=eval_dataset,
        metrics=[custom_text_quality],
        experiment=EXPERIMENT_NAME
    )

    pointwise_result = eval_task.evaluate()

View the evaluation results for each response in the metrics_table Pandas DataFrame:

    pointwise_result.metrics_table

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

Delete the ExperimentRun created by the evaluation:

    aiplatform.ExperimentRun(
        run_name=pointwise_result.metadata["experiment_run"],
        experiment=pointwise_result.metadata["experiment"],
    ).delete()

What's next

- Define your evaluation metrics (/vertex-ai/generative-ai/docs/models/determine-eval).
- Prepare your evaluation dataset (/vertex-ai/generative-ai/docs/models/evaluation-dataset).

Last updated 2025-09-04 UTC.