Starting April 29, 2025, Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have no prior usage of these models, including new projects. For details, see
Model versions and lifecycle.
Evaluate model performance
Stay organized with collections
Save and categorize content based on your preferences.
This sample code demonstrates how to evaluate the performance of a GenAI model. It showcases how to define the evaluation specification, evaluate the model, and retrieve the evaluation metrics.
Explore further
For detailed documentation that includes this code sample, see the following:
Code sample
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],[],[],[],null,["# Evaluate model performance\n\nThis sample code demonstrates how to evaluate the performance of a GenAI model. It showcases how to define the evaluation specification, evaluate the model, and retrieve the evaluation metrics.\n\nExplore further\n---------------\n\n\nFor detailed documentation that includes this code sample, see the following:\n\n- [Run a computation-based evaluation pipeline](/vertex-ai/generative-ai/docs/models/computation-based-eval-pipeline)\n\nCode sample\n-----------\n\n### Python\n\n\nBefore trying this sample, follow the Python setup instructions in the\n[Vertex AI quickstart using\nclient libraries](/vertex-ai/docs/start/client-libraries).\n\n\nFor more information, see the\n[Vertex AI Python API\nreference documentation](/python/docs/reference/aiplatform/latest).\n\n\nTo authenticate to Vertex AI, set up Application Default Credentials.\nFor more information, see\n\n[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).\n\n import os\n\n from google.auth import default\n\n import https://cloud.google.com/python/docs/reference/vertexai/latest/\n from vertexai.preview.language_models import (\n https://cloud.google.com/python/docs/reference/vertexai/latest/vertexai.preview.language_models.EvaluationTextClassificationSpec.html,\n TextGenerationModel,\n )\n\n PROJECT_ID = os.getenv(\"GOOGLE_CLOUD_PROJECT\")\n\n\n def evaluate_model() -\u003e object:\n \"\"\"Evaluate the performance of a generative AI model.\"\"\"\n\n # Set credentials for the pipeline components used in the evaluation task\n credentials, _ = default(scopes=[\"https://www.googleapis.com/auth/cloud-platform\"])\n\n https://cloud.google.com/python/docs/reference/vertexai/latest/.init(project=PROJECT_ID, location=\"us-central1\", credentials=credentials)\n\n # Create a reference to a generative AI model\n model = TextGenerationModel.from_pretrained(\"text-bison@002\")\n\n # Define the evaluation specification for a text classification task\n task_spec = EvaluationTextClassificationSpec(\n ground_truth_data=[\n \"gs://cloud-samples-data/ai-platform/generative_ai/llm_classification_bp_input_prompts_with_ground_truth.jsonl\"\n ],\n class_names=[\"nature\", \"news\", \"sports\", \"health\", \"startups\"],\n target_column_name=\"ground_truth\",\n )\n\n # Evaluate the model\n eval_metrics = model.evaluate(task_spec=task_spec)\n print(eval_metrics)\n # Example response:\n # ...\n # PipelineJob run completed.\n # Resource name: projects/123456789/locations/us-central1/pipelineJobs/evaluation-llm-classification-...\n # EvaluationClassificationMetric(label_name=None, auPrc=0.53833705, auRoc=0.8...\n\n return eval_metrics\n\nWhat's next\n-----------\n\n\nTo search and filter code samples for other Google Cloud products, see the\n[Google Cloud sample browser](/docs/samples?product=generativeaionvertexai)."]]