Mulai 29 April 2025, model Gemini 1.5 Pro dan Gemini 1.5 Flash tidak tersedia di project yang belum pernah menggunakan model ini, termasuk project baru. Untuk mengetahui detailnya, lihat
Versi dan siklus proses model.
Mengevaluasi performa model
Tetap teratur dengan koleksi
Simpan dan kategorikan konten berdasarkan preferensi Anda.
Kode contoh ini menunjukkan cara mengevaluasi performa model AI generatif. Contoh ini menunjukkan cara menentukan spesifikasi evaluasi, mengevaluasi model, dan mengambil metrik evaluasi.
Mempelajari lebih lanjut
Untuk dokumentasi mendetail yang menyertakan contoh kode ini, lihat artikel berikut:
Contoh kode
Kecuali dinyatakan lain, konten di halaman ini dilisensikan berdasarkan Lisensi Creative Commons Attribution 4.0, sedangkan contoh kode dilisensikan berdasarkan Lisensi Apache 2.0. Untuk mengetahui informasi selengkapnya, lihat Kebijakan Situs Google Developers. Java adalah merek dagang terdaftar dari Oracle dan/atau afiliasinya.
[[["Mudah dipahami","easyToUnderstand","thumb-up"],["Memecahkan masalah saya","solvedMyProblem","thumb-up"],["Lainnya","otherUp","thumb-up"]],[["Sulit dipahami","hardToUnderstand","thumb-down"],["Informasi atau kode contoh salah","incorrectInformationOrSampleCode","thumb-down"],["Informasi/contoh yang saya butuhkan tidak ada","missingTheInformationSamplesINeed","thumb-down"],["Masalah terjemahan","translationIssue","thumb-down"],["Lainnya","otherDown","thumb-down"]],[],[],[],null,["# Evaluate model performance\n\nThis sample code demonstrates how to evaluate the performance of a GenAI model. It showcases how to define the evaluation specification, evaluate the model, and retrieve the evaluation metrics.\n\nExplore further\n---------------\n\n\nFor detailed documentation that includes this code sample, see the following:\n\n- [Run a computation-based evaluation pipeline](/vertex-ai/generative-ai/docs/models/computation-based-eval-pipeline)\n\nCode sample\n-----------\n\n### Python\n\n\nBefore trying this sample, follow the Python setup instructions in the\n[Vertex AI quickstart using\nclient libraries](/vertex-ai/docs/start/client-libraries).\n\n\nFor more information, see the\n[Vertex AI Python API\nreference documentation](/python/docs/reference/aiplatform/latest).\n\n\nTo authenticate to Vertex AI, set up Application Default Credentials.\nFor more information, see\n\n[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).\n\n import os\n\n from google.auth import default\n\n import https://cloud.google.com/python/docs/reference/vertexai/latest/\n from vertexai.preview.language_models import (\n https://cloud.google.com/python/docs/reference/vertexai/latest/vertexai.preview.language_models.EvaluationTextClassificationSpec.html,\n TextGenerationModel,\n )\n\n PROJECT_ID = os.getenv(\"GOOGLE_CLOUD_PROJECT\")\n\n\n def evaluate_model() -\u003e object:\n \"\"\"Evaluate the performance of a generative AI model.\"\"\"\n\n # Set credentials for the pipeline components used in the evaluation task\n credentials, _ = default(scopes=[\"https://www.googleapis.com/auth/cloud-platform\"])\n\n https://cloud.google.com/python/docs/reference/vertexai/latest/.init(project=PROJECT_ID, location=\"us-central1\", credentials=credentials)\n\n # Create a reference to a generative AI model\n model = TextGenerationModel.from_pretrained(\"text-bison@002\")\n\n # Define the evaluation specification for a text classification task\n task_spec = EvaluationTextClassificationSpec(\n ground_truth_data=[\n \"gs://cloud-samples-data/ai-platform/generative_ai/llm_classification_bp_input_prompts_with_ground_truth.jsonl\"\n ],\n class_names=[\"nature\", \"news\", \"sports\", \"health\", \"startups\"],\n target_column_name=\"ground_truth\",\n )\n\n # Evaluate the model\n eval_metrics = model.evaluate(task_spec=task_spec)\n print(eval_metrics)\n # Example response:\n # ...\n # PipelineJob run completed.\n # Resource name: projects/123456789/locations/us-central1/pipelineJobs/evaluation-llm-classification-...\n # EvaluationClassificationMetric(label_name=None, auPrc=0.53833705, auRoc=0.8...\n\n return eval_metrics\n\nWhat's next\n-----------\n\n\nTo search and filter code samples for other Google Cloud products, see the\n[Google Cloud sample browser](/docs/samples?product=generativeaionvertexai)."]]