Starting April 29, 2025, the Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have no prior usage of these models, including new projects. For details, see Model versions and lifecycle.
# Evaluate model performance
This sample code demonstrates how to evaluate the performance of a GenAI model. It showcases how to define the evaluation specification, evaluate the model, and retrieve the evaluation metrics.
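If you want to assemble your own ground-truth dataset rather than use the public sample file referenced in the code sample below, the evaluation spec expects a JSONL file in which each line is one labeled example. The exact schema here is an assumption: the `ground_truth` field matches the sample's `target_column_name`, while the `prompt` field name is hypothetical.

```python
# A minimal sketch of a ground-truth JSONL file for the classification task below.
# Assumption: each line holds the model input plus a label column whose name
# matches target_column_name ("ground_truth"); the "prompt" key is hypothetical.
import json

rows = [
    {"prompt": "Story on a marathon world record attempt", "ground_truth": "sports"},
    {"prompt": "Coverage of a seed-stage funding round", "ground_truth": "startups"},
]

with open("ground_truth.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")
```

The code sample references its dataset by a `gs://` URI, so a file like this would typically be uploaded to Cloud Storage before the evaluation runs.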
Explore further
---------------
For detailed documentation that includes this code sample, see the following:

- [Run a computation-based evaluation pipeline](/vertex-ai/generative-ai/docs/models/computation-based-eval-pipeline)
Code sample
-----------

### Python

Before trying this sample, follow the Python setup instructions in the
[Vertex AI quickstart using client libraries](/vertex-ai/docs/start/client-libraries).
For more information, see the
[Vertex AI Python API reference documentation](/python/docs/reference/aiplatform/latest).

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

```python
import os

from google.auth import default

import vertexai
from vertexai.preview.language_models import (
    EvaluationTextClassificationSpec,
    TextGenerationModel,
)

PROJECT_ID = os.getenv("GOOGLE_CLOUD_PROJECT")


def evaluate_model() -> object:
    """Evaluate the performance of a generative AI model."""

    # Set credentials for the pipeline components used in the evaluation task
    credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])

    vertexai.init(project=PROJECT_ID, location="us-central1", credentials=credentials)

    # Create a reference to a generative AI model
    model = TextGenerationModel.from_pretrained("text-bison@002")

    # Define the evaluation specification for a text classification task
    task_spec = EvaluationTextClassificationSpec(
        ground_truth_data=[
            "gs://cloud-samples-data/ai-platform/generative_ai/llm_classification_bp_input_prompts_with_ground_truth.jsonl"
        ],
        class_names=["nature", "news", "sports", "health", "startups"],
        target_column_name="ground_truth",
    )

    # Evaluate the model
    eval_metrics = model.evaluate(task_spec=task_spec)
    print(eval_metrics)
    # Example response:
    # ...
    # PipelineJob run completed.
    # Resource name: projects/123456789/locations/us-central1/pipelineJobs/evaluation-llm-classification-...
    # EvaluationClassificationMetric(label_name=None, auPrc=0.53833705, auRoc=0.8...

    return eval_metrics
```
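As the example response shows, `model.evaluate()` launches a Vertex AI pipeline job and waits for it to complete, which can take some time. Below is a minimal usage sketch, assuming `GOOGLE_CLOUD_PROJECT` is set in an authenticated environment; the `auPrc` and `auRoc` attribute names are inferred from the printed repr above, not from a documented API surface.

```python
# Hypothetical driver for the sample above. Assumes Application Default
# Credentials are configured and GOOGLE_CLOUD_PROJECT points at a project
# with Vertex AI enabled.
metrics = evaluate_model()

# Attribute names inferred from the example response repr (an assumption):
print(f"AU-PRC: {metrics.auPrc}")
print(f"AU-ROC: {metrics.auRoc}")
```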
[[["容易理解","easyToUnderstand","thumb-up"],["確實解決了我的問題","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["難以理解","hardToUnderstand","thumb-down"],["資訊或程式碼範例有誤","incorrectInformationOrSampleCode","thumb-down"],["缺少我需要的資訊/範例","missingTheInformationSamplesINeed","thumb-down"],["翻譯問題","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],[],[],[],null,["# Evaluate model performance\n\nThis sample code demonstrates how to evaluate the performance of a GenAI model. It showcases how to define the evaluation specification, evaluate the model, and retrieve the evaluation metrics.\n\nExplore further\n---------------\n\n\nFor detailed documentation that includes this code sample, see the following:\n\n- [Run a computation-based evaluation pipeline](/vertex-ai/generative-ai/docs/models/computation-based-eval-pipeline)\n\nCode sample\n-----------\n\n### Python\n\n\nBefore trying this sample, follow the Python setup instructions in the\n[Vertex AI quickstart using\nclient libraries](/vertex-ai/docs/start/client-libraries).\n\n\nFor more information, see the\n[Vertex AI Python API\nreference documentation](/python/docs/reference/aiplatform/latest).\n\n\nTo authenticate to Vertex AI, set up Application Default Credentials.\nFor more information, see\n\n[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).\n\n import os\n\n from google.auth import default\n\n import https://cloud.google.com/python/docs/reference/vertexai/latest/\n from vertexai.preview.language_models import (\n https://cloud.google.com/python/docs/reference/vertexai/latest/vertexai.preview.language_models.EvaluationTextClassificationSpec.html,\n TextGenerationModel,\n )\n\n PROJECT_ID = os.getenv(\"GOOGLE_CLOUD_PROJECT\")\n\n\n def evaluate_model() -\u003e object:\n \"\"\"Evaluate the performance of a generative AI model.\"\"\"\n\n # Set credentials for the pipeline components used in the evaluation task\n credentials, _ = default(scopes=[\"https://www.googleapis.com/auth/cloud-platform\"])\n\n https://cloud.google.com/python/docs/reference/vertexai/latest/.init(project=PROJECT_ID, location=\"us-central1\", credentials=credentials)\n\n # Create a reference to a generative AI model\n model = TextGenerationModel.from_pretrained(\"text-bison@002\")\n\n # Define the evaluation specification for a text classification task\n task_spec = EvaluationTextClassificationSpec(\n ground_truth_data=[\n \"gs://cloud-samples-data/ai-platform/generative_ai/llm_classification_bp_input_prompts_with_ground_truth.jsonl\"\n ],\n class_names=[\"nature\", \"news\", \"sports\", \"health\", \"startups\"],\n target_column_name=\"ground_truth\",\n )\n\n # Evaluate the model\n eval_metrics = model.evaluate(task_spec=task_spec)\n print(eval_metrics)\n # Example response:\n # ...\n # PipelineJob run completed.\n # Resource name: projects/123456789/locations/us-central1/pipelineJobs/evaluation-llm-classification-...\n # EvaluationClassificationMetric(label_name=None, auPrc=0.53833705, auRoc=0.8...\n\n return eval_metrics\n\nWhat's next\n-----------\n\n\nTo search and filter code samples for other Google Cloud products, see the\n[Google Cloud sample browser](/docs/samples?product=generativeaionvertexai)."]]