The metric used for running evaluations.
Optional. The aggregation metrics to use.
metric_spec
Union type
metric_spec can be only one of the following:
The spec for a pre-defined metric.
Spec for an LLM based metric.
Spec for pointwise metric.
Spec for pairwise metric.
Spec for exact match metric.
Spec for BLEU metric.
Spec for ROUGE metric.
JSON representation |
---|
{ "aggregationMetrics": [ enum ( |
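
The union rule above (at most one `metric_spec` member per Metric) can be checked client-side before sending a request. The following is a minimal sketch, not the service's own validation. Only `aggregationMetrics` appears in the JSON representation here; the seven union member names in `METRIC_SPEC_FIELDS` are assumptions inferred from the descriptions, and `build_metric` and `check_union` are hypothetical helpers.

```python
# Assumed field names for the metric_spec union members. Only
# "aggregationMetrics" is shown in the JSON representation above.
METRIC_SPEC_FIELDS = (
    "predefinedMetricSpec",
    "llmBasedMetricSpec",
    "pointwiseMetricSpec",
    "pairwiseMetricSpec",
    "exactMatchSpec",
    "bleuSpec",
    "rougeSpec",
)


def build_metric(spec_field, spec, aggregation_metrics=None):
    """Return a Metric-shaped dict with a single metric_spec member set."""
    if spec_field not in METRIC_SPEC_FIELDS:
        raise ValueError(f"unknown metric_spec field: {spec_field!r}")
    metric = {spec_field: spec}
    if aggregation_metrics:  # the field is optional, so omit it when empty
        metric["aggregationMetrics"] = list(aggregation_metrics)
    return metric


def check_union(metric):
    """Return the metric_spec member that is set, or None if none is set.

    Raises ValueError when more than one union member is present.
    """
    present = [name for name in METRIC_SPEC_FIELDS if name in metric]
    if len(present) > 1:
        raise ValueError(f"metric_spec allows only one member, got {present}")
    return present[0] if present else None


# Hypothetical payload: an exact match metric with two aggregations.
metric = build_metric("exactMatchSpec", {}, ["AVERAGE", "MEDIAN"])
```

Keeping the member names in one tuple makes the check easy to update if the actual field names differ from the ones assumed here.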
LLMBasedMetricSpec
Specification for an LLM based metric.
rubrics_source
Union type
rubrics_source can be only one of the following:
rubricGroupKey
string
Use a pre-defined group of rubrics associated with the input. Refers to a key in the rubricGroups map of EvaluationInstance.
Dynamically generate rubrics using this specification.
Dynamically generate rubrics using a predefined spec.
metricPromptTemplate
string
Required. Template for the prompt sent to the judge model.
systemInstruction
string
Optional. System instructions for the judge model.
Optional. Configuration for the judge LLM (Autorater).
Optional. Additional configuration for the metric.
JSON representation |
---|
{ // rubrics_source "rubricGroupKey": string, "rubricGenerationSpec": { object ( |
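
As an illustration of the fields described above, the sketch below assembles an LLMBasedMetricSpec-shaped dict. It enforces the two constraints stated in this section: `metricPromptTemplate` is required, and `rubrics_source` accepts only one member. The helper `build_llm_based_metric_spec`, the prompt text, and the key `fluency_rubrics` are hypothetical. The predefined-spec member of the union is left out because its field name is not shown in the JSON representation here.

```python
# rubrics_source members whose names appear in the JSON representation above.
RUBRICS_SOURCE_FIELDS = ("rubricGroupKey", "rubricGenerationSpec")


def build_llm_based_metric_spec(metric_prompt_template,
                                system_instruction=None,
                                **rubrics_source):
    """Return an LLMBasedMetricSpec-shaped dict (client-side sketch only)."""
    if not metric_prompt_template:
        raise ValueError("metricPromptTemplate is required")
    unknown = sorted(set(rubrics_source) - set(RUBRICS_SOURCE_FIELDS))
    if unknown:
        raise ValueError(f"unknown rubrics_source field(s): {unknown}")
    if len(rubrics_source) > 1:
        raise ValueError("rubrics_source allows only one member")

    spec = {"metricPromptTemplate": metric_prompt_template}
    if system_instruction is not None:  # optional field
        spec["systemInstruction"] = system_instruction
    spec.update(rubrics_source)
    return spec


# Hypothetical values: the key must exist in the rubricGroups map of the
# EvaluationInstance being evaluated.
llm_spec = build_llm_based_metric_spec(
    "Rate the fluency of the response from 1 to 5.",
    system_instruction="You are a strict grader.",
    rubricGroupKey="fluency_rubrics",
)
```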
AggregationMetric
The aggregation metrics supported by EvaluationService.EvaluateDataset.
Enums | |
---|---|
AGGREGATION_METRIC_UNSPECIFIED | Unspecified aggregation metric. |
AVERAGE | Average aggregation metric. Not supported for pairwise metric. |
MODE | Mode aggregation metric. |
STANDARD_DEVIATION | Standard deviation aggregation metric. Not supported for pairwise metric. |
VARIANCE | Variance aggregation metric. Not supported for pairwise metric. |
MINIMUM | Minimum aggregation metric. Not supported for pairwise metric. |
MAXIMUM | Maximum aggregation metric. Not supported for pairwise metric. |
MEDIAN | Median aggregation metric. Not supported for pairwise metric. |
PERCENTILE_P90 | 90th percentile aggregation metric. Not supported for pairwise metric. |
PERCENTILE_P95 | 95th percentile aggregation metric. Not supported for pairwise metric. |
PERCENTILE_P99 | 99th percentile aggregation metric. Not supported for pairwise metric. |
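
To make the enum values above concrete, the sketch below does two things locally: it flags aggregations that the table marks as not supported for pairwise metrics, and it computes several of the aggregations over a list of per-instance scores with Python's `statistics` module. This is an illustration, not the service's implementation. The use of population variance and standard deviation is an assumption, since the definitions are not stated here, and the percentile values are omitted because their interpolation method is not specified. The helper names and the sample scores are hypothetical.

```python
import statistics

# Enum values the table above marks "Not supported for pairwise metric".
NOT_SUPPORTED_FOR_PAIRWISE = {
    "AVERAGE", "STANDARD_DEVIATION", "VARIANCE", "MINIMUM", "MAXIMUM",
    "MEDIAN", "PERCENTILE_P90", "PERCENTILE_P95", "PERCENTILE_P99",
}

# Local stand-ins for a subset of the aggregations.
LOCAL_AGGREGATORS = {
    "AVERAGE": statistics.mean,
    "MODE": statistics.mode,
    "MEDIAN": statistics.median,
    "MINIMUM": min,
    "MAXIMUM": max,
    "VARIANCE": statistics.pvariance,        # assumption: population variance
    "STANDARD_DEVIATION": statistics.pstdev,  # assumption: population std dev
}


def unsupported_for_pairwise(aggregation_metrics):
    """Return the requested aggregations that cannot be used with a pairwise metric."""
    return [m for m in aggregation_metrics if m in NOT_SUPPORTED_FOR_PAIRWISE]


def aggregate(scores, aggregation_metrics):
    """Compute each requested aggregation over a non-empty list of scores."""
    return {name: LOCAL_AGGREGATORS[name](scores) for name in aggregation_metrics}


# Hypothetical per-instance scores from a pointwise metric.
scores = [1, 2, 2, 5]
summary = aggregate(scores, ["AVERAGE", "MODE", "MEDIAN", "MINIMUM", "MAXIMUM"])
rejected = unsupported_for_pairwise(["MODE", "AVERAGE", "PERCENTILE_P90"])
```

A check like `unsupported_for_pairwise` can run before a request is built, so that an unsupported combination is caught on the client rather than by the service.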