- Resource: ModelEvaluation
  - ClassificationEvaluationMetrics
  - ConfidenceMetricsEntry
  - ConfusionMatrix
  - Row
- Methods
Resource: ModelEvaluation
Evaluation results of a model.
JSON representation

{
  "name": string,
  "annotationSpecId": string,
  "displayName": string,
  "createTime": string,
  "evaluatedExampleCount": number,
  "classificationEvaluationMetrics": {
    object(ClassificationEvaluationMetrics)
  }
}
Fields | |
---|---|
name | Output only. Resource name of the model evaluation. Format: projects/{project-id}/locations/{location-id}/models/{model-id}/modelEvaluations/{model-evaluation-id} |
annotationSpecId | Output only. The ID of the annotation spec that the model evaluation applies to. The ID is empty for the overall model evaluation. |
displayName | Output only. The display name of the annotation spec that the model evaluation applies to. The displayName is empty for the overall model evaluation. |
createTime | Output only. Timestamp when this model evaluation was created. A timestamp in RFC3339 UTC "Zulu" format, accurate to nanoseconds. Example: "2014-10-02T15:01:23.045123456Z". |
evaluatedExampleCount | Output only. The number of examples used for model evaluation, i.e. for which ground truth from time of model creation is compared against the predicted annotations created by the model. For the overall ModelEvaluation (i.e. with annotationSpecId not set) this is the total number of all examples used for evaluation. Otherwise, this is the count of examples that according to the ground truth were annotated by the annotationSpecId. |
classificationEvaluationMetrics | Evaluation metrics for classification models. |
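To show how these fields fit together, here is a minimal sketch that summarizes one ModelEvaluation object after its JSON has been decoded into a dict. The helper name summarize_evaluation is illustrative only, not part of the API.

```python
def summarize_evaluation(evaluation: dict) -> str:
    """Summarize one ModelEvaluation object decoded from JSON."""
    # An empty annotationSpecId marks the overall model evaluation.
    label = evaluation.get("annotationSpecId") or "OVERALL"
    count = evaluation.get("evaluatedExampleCount", 0)
    metrics = evaluation.get("classificationEvaluationMetrics", {})
    return (f"{evaluation['name']}: label={label}, "
            f"examples={count}, auPrc={metrics.get('auPrc')}")
```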
ClassificationEvaluationMetrics
Model evaluation metrics for classification problems. These metrics only describe the quality of predictions where the type is set to segment_classification. For information on the prediction type, see BatchPredictRequest.params.
JSON representation

{
  "auPrc": number,
  "baseAuPrc": number,
  "auRoc": number,
  "logLoss": number,
  "confidenceMetricsEntry": [
    {
      object(ConfidenceMetricsEntry)
    }
  ],
  "confusionMatrix": {
    object(ConfusionMatrix)
  },
  "annotationSpecId": [
    string
  ]
}
Fields | |
---|---|
auPrc | Output only. The Area Under Precision-Recall Curve metric. Micro-averaged for the overall evaluation. |
baseAuPrc | Output only. The Area Under Precision-Recall Curve metric based on priors. Micro-averaged for the overall evaluation. Deprecated. |
auRoc | Output only. The Area Under Receiver Operating Characteristic curve metric. Micro-averaged for the overall evaluation. |
logLoss | Output only. The Log Loss metric. |
confidenceMetricsEntry[] | Output only. Metrics for each confidenceThreshold in 0.00, 0.05, 0.10, ..., 0.95, 0.96, 0.97, 0.98, 0.99 and positionThreshold = INT32_MAX_VALUE. ROC and precision-recall curves, and other aggregated metrics, are derived from them. Confidence metrics entries may also be supplied for additional values of positionThreshold, but no aggregated metrics are computed from those. |
confusionMatrix | Output only. Confusion matrix of the evaluation. Only set for MULTICLASS classification problems where the number of labels is no more than 10. Only set for the model-level evaluation, not for evaluation per label. |
annotationSpecId[] | Output only. The annotation spec IDs used for this evaluation. |
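Because the aggregated curves are derived only from entries at positionThreshold = INT32_MAX_VALUE, code that scans confidenceMetricsEntry for an operating point should filter on that value first. A minimal sketch, assuming the metrics object has already been decoded from JSON; best_threshold is a hypothetical helper, not an API method.

```python
INT32_MAX = 2**31 - 1  # the INT32_MAX_VALUE positionThreshold

def best_threshold(metrics: dict) -> dict:
    """Return the aggregated ConfidenceMetricsEntry with the highest f1Score."""
    entries = [e for e in metrics.get("confidenceMetricsEntry", [])
               if e.get("positionThreshold", INT32_MAX) == INT32_MAX]
    if not entries:
        raise ValueError("no aggregated confidence metrics entries")
    return max(entries, key=lambda e: e.get("f1Score", 0.0))
```

The entry this returns carries the confidenceThreshold you would then apply to predictions at serving time.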
ConfidenceMetricsEntry
Metrics for a single confidence threshold.
JSON representation

{
  "confidenceThreshold": number,
  "positionThreshold": number,
  "recall": number,
  "precision": number,
  "falsePositiveRate": number,
  "f1Score": number,
  "recallAt1": number,
  "precisionAt1": number,
  "falsePositiveRateAt1": number,
  "f1ScoreAt1": number,
  "truePositiveCount": string,
  "falsePositiveCount": string,
  "falseNegativeCount": string,
  "trueNegativeCount": string
}
Fields | |
---|---|
confidenceThreshold | Output only. Metrics are computed with the assumption that the model never returns predictions with a score lower than this value. |
positionThreshold | Output only. Metrics are computed with the assumption that the model always returns at most this many predictions (ordered by score, descending), all of which still need to meet the confidenceThreshold. |
recall | Output only. Recall (True Positive Rate) for the given confidence threshold. |
precision | Output only. Precision for the given confidence threshold. |
falsePositiveRate | Output only. False Positive Rate for the given confidence threshold. |
f1Score | Output only. The harmonic mean of recall and precision. |
recallAt1 | Output only. The Recall (True Positive Rate) when only considering the label that has the highest prediction score, and is not below the confidence threshold, for each example. |
precisionAt1 | Output only. The precision when only considering the label that has the highest prediction score, and is not below the confidence threshold, for each example. |
falsePositiveRateAt1 | Output only. The False Positive Rate when only considering the label that has the highest prediction score, and is not below the confidence threshold, for each example. |
f1ScoreAt1 | Output only. The harmonic mean of recallAt1 and precisionAt1. |
truePositiveCount | Output only. The number of model-created labels that match a ground truth label. |
falsePositiveCount | Output only. The number of model-created labels that do not match a ground truth label. |
falseNegativeCount | Output only. The number of ground truth labels that are not matched by a model-created label. |
trueNegativeCount | Output only. The number of labels that the model did not create and that, had they been created, would not have matched a ground truth label. |
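The four count fields are int64 values, which the JSON encoding carries as strings, and the rate fields above follow from them by the standard formulas. A small sketch of those formulas; the helper name rates_from_counts is illustrative only.

```python
def rates_from_counts(entry: dict) -> dict:
    """Recompute the rate metrics from the raw counts in one entry."""
    # int64 fields arrive as JSON strings, so convert before dividing.
    tp = int(entry["truePositiveCount"])
    fp = int(entry["falsePositiveCount"])
    fn = int(entry["falseNegativeCount"])
    tn = int(entry["trueNegativeCount"])
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    # f1Score is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "falsePositiveRate": fpr, "f1Score": f1}
```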
ConfusionMatrix
Confusion matrix of the model running the classification.
JSON representation

{
  "annotationSpecId": [
    string
  ],
  "row": [
    {
      object(Row)
    }
  ]
}
Fields | |
---|---|
annotationSpecId[] | Output only. IDs of the annotation specs used in the confusion matrix. |
row[] | Output only. Rows in the confusion matrix. The number of rows is equal to the size of annotationSpecId. |
Row
Output only. A row in the confusion matrix.
JSON representation

{
  "exampleCount": [
    number
  ]
}
Fields | |
---|---|
exampleCount[] | Output only. Value of the specific cell in the confusion matrix. The number of values each row has (i.e. the length of the row) is equal to the length of the annotationSpecId field. |
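Row i holds the examples whose ground truth label is annotationSpecId[i], so under the usual convention that the diagonal cell counts correct predictions, per-label recall falls out directly. A minimal sketch under that assumption; per_label_recall is a hypothetical helper.

```python
def per_label_recall(confusion_matrix: dict) -> dict:
    """Diagonal cell over row sum gives recall for each label."""
    ids = confusion_matrix["annotationSpecId"]
    recalls = {}
    for i, row in enumerate(confusion_matrix["row"]):
        counts = row["exampleCount"]
        total = sum(counts)  # all ground truth examples for label i
        recalls[ids[i]] = counts[i] / total if total else 0.0
    return recalls
```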
Methods | |
---|---|
get | Gets a model evaluation. |
list | Lists model evaluations. |
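For orientation, here is a sketch of calling these methods over REST with the requests library. The v1beta1 endpoint, the resource-name shapes, and the modelEvaluation response key are assumptions based on the usual AutoML conventions, not guarantees of this page.

```python
import requests

API_ROOT = "https://automl.googleapis.com/v1beta1"  # assumed AutoML endpoint

def get_model_evaluation(token: str, name: str) -> dict:
    """Gets a model evaluation by its full resource name."""
    resp = requests.get(f"{API_ROOT}/{name}",
                        headers={"Authorization": f"Bearer {token}"})
    resp.raise_for_status()
    return resp.json()

def list_model_evaluations(token: str, model_name: str) -> list:
    """Lists model evaluations under a model resource name."""
    resp = requests.get(f"{API_ROOT}/{model_name}/modelEvaluations",
                        headers={"Authorization": f"Bearer {token}"})
    resp.raise_for_status()
    # "modelEvaluation" is the assumed list key in the response body.
    return resp.json().get("modelEvaluation", [])
```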