Concepts
The following concepts and features are used in this product:
Concept
Definition
Review
The process of visually comparing the extracted field values against the actual values in the document, correcting any incorrect extractions, and adding fields that the DocAI processors missed.
Labeler
The person who reviews the extracted document. The customer can use their own workforce (Bring-your-own-labeler, or BYOL) or use Google labelers for HITL Review.
Task
A queue of extracted documents that labelers review. A processor generates a single task when configured for HITL Review.
Labeler Workbench
The UI used by a Labeler to review documents. The UI presents documents from the queue, which the labeler can review, correct, and then either submit or reject.
BYOL labelers need to have a Google Workforce or Gmail account to access the labeling UI.
Labelers can access the Workbench through a link sent via email from the Labeling Manager upon task assignment.
Answer Time
The time taken by a labeler to process a document. The Labeler Workbench tracks document submission time and presents efficiency analytics (for example, for each document a labeler reviews).
Labeling Manager
One or more labeling managers are assigned to a pool of labelers, so that they can:
Add labelers to or remove labelers from labeler pools.
Assign tasks to or unassign tasks from a labeler. All tasks in the project are accessible to a Labeling Manager, who may change task assignments as task priorities change.
Pause tasks so that labelers can work on the next tasks assigned to them.
In the BYOL scenario, Labeling Managers are provided by the customer.
When Google labelers are used, Google provides the Labeling Manager.
Labeling Manager Console
The UI used by a Labeling Manager to manage labeler pools and task assignments.
Enqueued, Answered, Completed, Rejected Documents in a Task
A task is a continual workflow. A document goes through the following states:
Enqueued - As documents are processed by the processor, they're enqueued (added) to the HITL task.
Answered - A document is answered when a Labeler reviews, corrects, and submits it. The result is saved in the customer's configured Cloud Storage bucket.
Completed - A document is completed when all Labelers have answered it, if the task has replication activated (multiple labelers working on each document in the task). When the task has no replication (each document is reviewed by a single labeler), Answered is the same as Completed.
Rejected - A document may be rejected if it is an invalid document (different doc-type, forged, etc.) or of poor quality (glare, edge cut off, etc.).
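The state transitions above can be sketched as a small state model. This is only an illustration under stated assumptions: the class `ReviewDocument`, its `replication` count, and the method names are hypothetical and are not part of any Document AI API.

```python
from dataclasses import dataclass, field
from enum import Enum


class DocState(Enum):
    ENQUEUED = "enqueued"
    ANSWERED = "answered"
    COMPLETED = "completed"
    REJECTED = "rejected"


@dataclass
class ReviewDocument:
    """Hypothetical model of one document in a HITL task."""
    replication: int = 1  # assumed: number of labelers who must answer
    answered_by: set = field(default_factory=set)
    state: DocState = DocState.ENQUEUED

    def submit(self, labeler: str) -> DocState:
        """Record a labeler's submitted review and update the state."""
        if self.state in (DocState.COMPLETED, DocState.REJECTED):
            raise ValueError(f"document is already {self.state.value}")
        self.answered_by.add(labeler)
        # Completed once every required labeler has answered. With no
        # replication (replication == 1), Answered and Completed coincide.
        if len(self.answered_by) >= self.replication:
            self.state = DocState.COMPLETED
        else:
            self.state = DocState.ANSWERED
        return self.state

    def reject(self) -> DocState:
        """Mark the document as invalid or of poor quality."""
        if self.state is DocState.COMPLETED:
            raise ValueError("document is already completed")
        self.state = DocState.REJECTED
        return self.state
```

With `replication=2`, the first `submit` leaves the document Answered and the second moves it to Completed; with the default of 1, a single `submit` completes it.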
Single Task per Processor
We do not support multiple tasks per processor. If customers need to process a single document type (invoices, for example) in different tasks, they can configure multiple processors with HITL Review.
Task Assignment vs Labeler Pools
The Labeling Manager adds labelers to a pool. Once added, any labeler from the pool can be assigned to a task.
Note: a "labeler pool" is not to be confused with the "group" of labelers assigned to a task. A pool is managed at the project level and is used to determine labeler access to the analytics and the tasks. Any labeler from the pool can be assigned to one or more tasks in the project.
Labeler Pool
A pool of labelers is created at the project level and is not to be confused with task assignments. The Labeling Manager can assign any labeler in the pool to a task, so that multiple labelers can review documents in parallel and complete the task more quickly. A labeler pool can be assigned to any task in the project by the customer.
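The difference between a project-level pool and per-task assignment can be sketched with two separate data structures. The `Project` class and its method names below are hypothetical and only illustrate the relationship described above; they do not correspond to a real API.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical model: one labeler pool per project, many tasks."""
    pool: set = field(default_factory=set)
    # Maps a task name to the group of labelers assigned to that task.
    assignments: dict = field(default_factory=dict)

    def add_to_pool(self, labeler: str) -> None:
        self.pool.add(labeler)

    def assign(self, task: str, labeler: str) -> None:
        """Assign a labeler to a task; only pool members are eligible."""
        if labeler not in self.pool:
            raise ValueError(f"{labeler} is not in the project's labeler pool")
        self.assignments.setdefault(task, set()).add(labeler)

    def unassign(self, task: str, labeler: str) -> None:
        self.assignments.get(task, set()).discard(labeler)
```

A labeler added once to the pool can then be assigned to several tasks, while someone outside the pool cannot be assigned to any.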
Validation filters and thresholds
Extracted fields have a confidence score (0-100) representing the confidence that the DocAI extraction is accurate. Customers can configure the validation threshold for each field, so that only pages with fields below this validation threshold are enqueued for review; fields above the threshold do not cause a page to be enqueued.
There are 3 types of validation filters customers can configure:
Field-level filter - select the important fields that need to be reviewed and specify a confidence threshold for each field. If this threshold is set at 100% for any field, all pages containing this field are sent for review.
Document-level filter - select an overall document-level confidence threshold. If any field is below the threshold, the entire page is sent for review. If this threshold is set at 100%, all documents predicted are sent for review.
No filter - every document posted to the HITL end-point is sent for review.
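The three filter types can be sketched as a single decision function. This is a minimal illustration, not the product's implementation: the function `needs_review`, its parameters, and the field names in the usage example are hypothetical. One assumption is made explicit in the code: a threshold of 100 always triggers review, which matches the descriptions above even for a field whose confidence is exactly 100.

```python
from typing import Mapping, Optional


def needs_review(
    confidences: Mapping[str, float],
    field_thresholds: Optional[Mapping[str, float]] = None,
    document_threshold: Optional[float] = None,
) -> bool:
    """Decide whether a page is enqueued for review (hypothetical sketch).

    confidences maps field name -> confidence score in the range 0-100.
    """
    if field_thresholds is not None:
        # Field-level filter: only the selected fields are checked, each
        # against its own threshold. Assumption: a threshold of 100 sends
        # every page that contains the field to review.
        return any(
            threshold >= 100 or confidences[name] < threshold
            for name, threshold in field_thresholds.items()
            if name in confidences
        )
    if document_threshold is not None:
        # Document-level filter: one threshold applies to every field.
        if document_threshold >= 100:
            return True
        return any(c < document_threshold for c in confidences.values())
    # No filter: everything is sent for review.
    return True
```

For example, with `page = {"total_amount": 72.0, "invoice_id": 98.5}`, a field-level filter of `{"total_amount": 80}` sends the page to review, while `{"invoice_id": 90}` does not.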
Analytics
The Labeling Manager gets analytics for each Task and each Labeler, including Enqueued, Answered, Skipped, and Completed counts, average handling time per document, and total answer time.
Analytics are accessed in the Analytics tab of the Labeling Manager Console.
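As a rough illustration of how per-labeler figures like these could be derived, the sketch below aggregates answer times from a list of review records. The record shape and the function `summarize_answer_times` are hypothetical, and it assumes that average handling time is simply total answer time divided by the number of documents; the console may compute its metrics differently.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def summarize_answer_times(
    records: Iterable[Tuple[str, float]],
) -> Dict[str, Dict[str, float]]:
    """Aggregate (labeler, answer_time_seconds) records per labeler."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for labeler, seconds in records:
        totals[labeler] += seconds
        counts[labeler] += 1
    return {
        labeler: {
            "documents": counts[labeler],
            "total_answer_time": totals[labeler],
            # Assumed definition: mean answer time per document.
            "average_handling_time": totals[labeler] / counts[labeler],
        }
        for labeler in totals
    }
```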
Note: Document AI Human-in-the-Loop (HITL) is being deprecated and will no longer be available on Google Cloud after January 16, 2025. New customers are not being allowlisted.
Last updated 2025-08-01 UTC.