Evaluating annotation stores

This page describes how to use the projects.locations.datasets.annotationStores.evaluate method to evaluate the quality of annotation records generated by a machine learning algorithm.

Overview

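The evaluate method compares the annotation records in one annotation store (eval_store) with those in a manually annotated ground truth annotation store (golden_store) that describes the same resources. The annotated resource is defined in the AnnotationSource of each store.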

The annotation records in eval_store or golden_store can be created individually with projects.locations.datasets.annotationStores.annotations.create, or generated in either of the following ways:

Calling datasets.deidentify with an AnnotationConfig object
Calling projects.locations.datasets.annotationStores.import

Evaluation requirements

To run an evaluation, all of the following conditions must be met:

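In the eval_store, each annotated resource defined in AnnotationSource can have only one annotation record for each annotation type.
A SensitiveTextAnnotation must store the quote obtained from the annotated resource. If you generated the annotation records with datasets.deidentify, set store_quote to true in the AnnotationConfig.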
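
Both requirements can be checked before calling evaluate. The following is a minimal sketch, not part of the API. It assumes the annotation records have already been fetched and flattened into plain dictionaries with hypothetical keys `source` (the annotated resource from AnnotationSource), `type` (the annotation type), and `quotes` (the quote stored for each finding of a text annotation). The function names and sample records are also illustrative.

```python
from collections import Counter


def duplicate_annotations(records):
    """Return (source, type) pairs that have more than one annotation record."""
    counts = Counter((r["source"], r["type"]) for r in records)
    return sorted(pair for pair, n in counts.items() if n > 1)


def sources_missing_quotes(records):
    """Return sources whose text annotation has a finding with an empty quote."""
    missing = set()
    for r in records:
        if r["type"] != "text":
            continue
        # An empty quote may indicate the record was created without store_quote.
        if any(not q for q in r.get("quotes", [])):
            missing.add(r["source"])
    return sorted(missing)


# Hypothetical, simplified records for illustration only.
eval_records = [
    {"source": "doc-1", "type": "text", "quotes": ["Jane Doe", "1970-01-01"]},
    {"source": "doc-2", "type": "text", "quotes": ["", "555-0100"]},
    {"source": "doc-2", "type": "text", "quotes": ["John Roe"]},
]

print(duplicate_annotations(eval_records))   # [('doc-2', 'text')]
print(sources_missing_quotes(eval_records))  # ['doc-2']
```

An empty result from both functions suggests the store satisfies the two conditions, under the assumed record shape.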

Evaluation output

The evaluate method reports the evaluation metrics to BigQuery. It writes a row to the specified BigQuery table with the following schema:

Field name	Type	Mode	Description
`opTimestamp`	`TIMESTAMP`	`NULLABLE`	Timestamp at which the method was called
`opName`	`STRING`	`NULLABLE`	Name of the evaluate long-running operation (LRO)
`evalStore`	`STRING`	`NULLABLE`	Name of the `eval_store`
`goldenStore`	`STRING`	`NULLABLE`	Name of the `golden_store`
`goldenCount`	`INTEGER`	`NULLABLE`	Number of annotation records in the `golden_store`
`matchedCount`	`INTEGER`	`NULLABLE`	Number of annotation records in the `eval_store` that match annotation records in the `golden_store`
`averageResults`	`RECORD`	`NULLABLE`	Average results across all infoTypes
`averageResults.sensitiveTextMetrics`	`RECORD`	`NULLABLE`	Average results for `SensitiveTextAnnotation`
`averageResults.sensitiveTextMetrics.truePositives`	`INTEGER`	`NULLABLE`	Number of correct predictions
`averageResults.sensitiveTextMetrics.falsePositives`	`INTEGER`	`NULLABLE`	Number of incorrect predictions
`averageResults.sensitiveTextMetrics.falseNegatives`	`INTEGER`	`NULLABLE`	Number of missed predictions
`averageResults.sensitiveTextMetrics.precision`	`FLOAT`	`NULLABLE`	`truePositives / (truePositives + falsePositives)`; ranges over `[0..1]`, where `1.0` means every prediction is correct
`averageResults.sensitiveTextMetrics.recall`	`FLOAT`	`NULLABLE`	`truePositives / (truePositives + falseNegatives)`; ranges over `[0..1]`, where `1.0` means no predictions are missed
`averageResults.sensitiveTextMetrics.fScore`	`FLOAT`	`NULLABLE`	`2 * precision * recall / (precision + recall)`, the harmonic mean of precision and recall; ranges over `[0..1]`, where `1.0` means perfect predictions
`infoResults`	`RECORD`	`REPEATED`	Similar to `averageResults`, but broken down by infoType
`infoResults.sensitiveTextMetrics`	`RECORD`	`NULLABLE`	infoType results for `SensitiveTextAnnotation`
`infoResults.sensitiveTextMetrics.infoType`	`STRING`	`NULLABLE`	infoType category
`infoResults.sensitiveTextMetrics.truePositives`	`INTEGER`	`NULLABLE`	Number of correct predictions
`infoResults.sensitiveTextMetrics.falsePositives`	`INTEGER`	`NULLABLE`	Number of incorrect predictions
`infoResults.sensitiveTextMetrics.falseNegatives`	`INTEGER`	`NULLABLE`	Number of missed predictions
`infoResults.sensitiveTextMetrics.precision`	`FLOAT`	`NULLABLE`	`truePositives / (truePositives + falsePositives)`; ranges over `[0..1]`, where `1.0` means every prediction is correct
`infoResults.sensitiveTextMetrics.recall`	`FLOAT`	`NULLABLE`	`truePositives / (truePositives + falseNegatives)`; ranges over `[0..1]`, where `1.0` means no predictions are missed
`infoResults.sensitiveTextMetrics.fScore`	`FLOAT`	`NULLABLE`	`2 * precision * recall / (precision + recall)`, the harmonic mean of precision and recall; ranges over `[0..1]`, where `1.0` means perfect predictions
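
The precision, recall, and fScore formulas in the schema can be reproduced from the three count fields. The sketch below is illustrative only: the function name is hypothetical, and returning `0.0` when a denominator is zero is an assumption, since the schema does not say what the method reports in that case.

```python
def sensitive_text_metrics(true_positives, false_positives, false_negatives):
    """Compute precision, recall, and fScore from the three count fields."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives

    # Assumption: report 0.0 instead of dividing by zero.
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0

    denominator = precision + recall
    f_score = 2 * precision * recall / denominator if denominator else 0.0

    return {"precision": precision, "recall": recall, "fScore": f_score}


# Example counts: 8 correct, 2 incorrect, and 4 missed predictions.
metrics = sensitive_text_metrics(8, 2, 4)
print(metrics)
```

With these example counts, precision is 8 / 10 = 0.8, recall is 8 / 12 (about 0.667), and fScore is 16 / 22 (about 0.727).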

For the detailed definition of the method, see EvaluateAnnotationStore.
