Represents the evaluation result of a conversation model.
JSON representation
{"name": string,"displayName": string,"evaluationConfig": {object (EvaluationConfig)},"createTime": string,"rawHumanEvalTemplateCsv": string,// Union field metrics can be only one of the following:"smartReplyMetrics": {object (SmartReplyMetrics)}// End of list of possible types for union field metrics.}
Fields
name
string
The resource name of the evaluation. Format: projects/<Project ID>/conversationModels/<Conversation Model ID>/evaluations/<Evaluation ID>
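The resource name format can be split into its three identifiers with a small helper. The following is a minimal sketch, not part of any client library: the names EvaluationName, parse_evaluation_name and format_evaluation_name are hypothetical, and the pattern assumes that only the format documented above is in use.

```python
import re
from typing import NamedTuple


class EvaluationName(NamedTuple):
    """Hypothetical container for the three IDs in an evaluation resource name."""
    project_id: str
    conversation_model_id: str
    evaluation_id: str


# Assumption: only the documented format is accepted, and IDs never contain "/".
_NAME_PATTERN = re.compile(
    r"^projects/([^/]+)/conversationModels/([^/]+)/evaluations/([^/]+)$"
)


def parse_evaluation_name(name: str) -> EvaluationName:
    """Split a resource name into its IDs, or raise ValueError if it does not match."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not an evaluation resource name: {name!r}")
    return EvaluationName(*match.groups())


def format_evaluation_name(parts: EvaluationName) -> str:
    """Build the resource name back from its IDs."""
    return (
        f"projects/{parts.project_id}"
        f"/conversationModels/{parts.conversation_model_id}"
        f"/evaluations/{parts.evaluation_id}"
    )
```

Parsing and then formatting a valid name should return the original string, which makes the pair convenient for validating names before sending a request.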
displayName
string
Optional. The display name of the model evaluation. At most 64 bytes long.
createTime
string
Uses RFC 3339, where generated output is always Z-normalized and uses 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".
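A client that reads this timestamp has to cope with the "Z" suffix, explicit offsets, and up to nine fractional digits. The sketch below shows one way to do that with the Python standard library. The helper name parse_rfc3339 is hypothetical. Python's datetime only stores microseconds, so any digits beyond the sixth are truncated here.

```python
import re
from datetime import datetime, timezone

# Date-time, optional fraction of 1 to 9 digits, then "Z" or a numeric offset.
_RFC3339 = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_rfc3339(value: str) -> datetime:
    """Parse an RFC 3339 timestamp into an aware datetime normalized to UTC."""
    match = _RFC3339.match(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 timestamp: {value!r}")
    base, fraction, offset = match.groups()
    # Pad or truncate the fraction to exactly six digits (microseconds).
    micros = (fraction or "").ljust(6, "0")[:6]
    if offset == "Z":
        offset = "+00:00"
    parsed = datetime.fromisoformat(f"{base}.{micros}{offset}")
    return parsed.astimezone(timezone.utc)
```

Normalizing to UTC on the way in means that values with different offsets compare and sort consistently.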
rawHumanEvalTemplateCsv
string
Output only. Human evaluation template in CSV format. It takes real-world conversations provided through the input dataset and generates example suggestions so that the customer can verify the quality of the model. For Smart Reply, the generated CSV file contains the columns Context, (Suggestions, Q1, Q2)*3, Actual reply.
Context contains at most the 10 latest messages in the conversation prior to the current suggestion.
Q1: "Would you send it as the next message of agent?" Evaluated based on whether the suggestion is appropriate to be sent by the agent in the current context.
Q2: "Does the suggestion move the conversation closer to resolution?" Evaluated based on whether the suggestion provides solutions, answers the customer's question, or collects information from the customer to resolve the customer's issue.
The Actual reply column contains the actual agent reply sent in the context.
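The column layout described for Smart Reply (one Context column, three groups of Suggestion, Q1 and Q2, and one Actual reply column, 11 columns in total) can be read positionally. The following is a rough sketch under stated assumptions: the class and function names are hypothetical, the presence of a header row is assumed and can be switched off, and the exact header text and the format of the Q1/Q2 answers are not specified in this reference, so they are kept as plain strings.

```python
import csv
import io
from dataclasses import dataclass
from typing import List

EXPECTED_COLUMNS = 11  # Context + (Suggestion, Q1, Q2) * 3 + Actual reply


@dataclass
class SuggestionReview:
    suggestion: str
    q1: str  # answer to "Would you send it as the next message of agent?"
    q2: str  # answer to "Does the suggestion move the conversation closer to resolution?"


@dataclass
class TemplateRow:
    context: str
    reviews: List[SuggestionReview]
    actual_reply: str


def parse_template(csv_text: str, has_header: bool = True) -> List[TemplateRow]:
    """Group each CSV row into context, three reviewed suggestions, and the actual reply."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if has_header:  # assumption: the first non-empty row holds column titles
        rows = rows[1:]
    parsed = []
    for row in rows:
        if len(row) != EXPECTED_COLUMNS:
            raise ValueError(f"expected {EXPECTED_COLUMNS} columns, got {len(row)}")
        reviews = [SuggestionReview(*row[i:i + 3]) for i in range(1, 10, 3)]
        parsed.append(TemplateRow(row[0], reviews, row[10]))
    return parsed
```

Reading by position rather than by header name avoids depending on column titles that this page does not spell out.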
Union field metrics. Metrics details. metrics can be only one of the following:
smartReplyMetrics
object (SmartReplyMetrics)
Only available when the model is for smart reply.
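Because metrics is a union field, a decoded evaluation carries at most one of its members. A minimal sketch of how a client might check this on a decoded JSON object follows. The helper get_metrics and the constant METRICS_UNION_FIELDS are hypothetical, and the tuple lists only the one member named in this reference.

```python
import json
from typing import Any, Dict, Optional, Tuple

# Members of the "metrics" union as listed in the JSON representation.
METRICS_UNION_FIELDS = ("smartReplyMetrics",)


def get_metrics(evaluation: Dict[str, Any]) -> Optional[Tuple[str, Any]]:
    """Return (field name, value) for the union member that is set, or None."""
    present = [field for field in METRICS_UNION_FIELDS if field in evaluation]
    if len(present) > 1:
        raise ValueError(f"union field metrics has several members set: {present}")
    if not present:
        return None
    return present[0], evaluation[present[0]]


# Usage with an illustrative, made-up payload.
decoded = json.loads('{"displayName": "demo", "smartReplyMetrics": {}}')
member = get_metrics(decoded)
```

Returning the member name together with its value lets calling code branch on the metric type without hard-coding the field lookup in several places.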
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-03-05 UTC."],[[["`ConversationModelEvaluation` represents the evaluation results of a conversation model, providing insights into its performance."],["The JSON representation of `ConversationModelEvaluation` includes fields like `name`, `displayName`, `evaluationConfig`, `createTime`, and `rawHumanEvalTemplateCsv`, detailing the evaluation's attributes and results."],["The `rawHumanEvalTemplateCsv` is generated in a csv format containing the context of a conversation, with generated suggestions to assess the model quality, and contains an actual reply column."],["The evaluation metrics can include `smartReplyMetrics`, specifically for models designed for smart replies."],["There are three methods for managing conversation model evaluations: creating, getting, and listing evaluations of conversation models."]]],[]]