Document AI Workbench Preview
Extract data from any document by creating custom ML models that are specific to your business needs. With Document AI Workbench you can achieve a higher degree of accuracy for extracting data from unstructured documents by training or uptraining machine learning models.
- Create your own custom model with one click
- Uptrain existing models to improve performance
- Evaluate, iterate, and choose the best model
- New Google Cloud customers get $300 in free credits to fully explore
Benefits
Achieve higher document processing accuracy
Experience fewer errors and enjoy faster and more accurate document processing workflows with custom models that are trained on your business documents.
Process a wide range of document types
Now you don't need to depend on pretrained models for your document processing needs. You can work with a wide range of documents that are not supported by Google’s pretrained models.
Train models without machine learning skills
With Document AI Workbench, even business users without extensive machine learning skills can start training or uptraining models through a friendly user interface.
Key features
Uptrain existing models to improve performance
Uptraining means that you begin with a prebuilt ML model, and then train this model with your own data to improve its accuracy for your organization’s documents. With Document AI Workbench you don’t have to start from scratch each time you need custom document extraction.
Create Custom Document Extractor models for your documents
Create Custom Document Extractors (CDE) that handle the elements specific to your documents, such as checkboxes, and that are trained and evaluated with your data for higher accuracy and performance. You can also use the trained model on additional documents.
Manage and label datasets to prepare for training
A labeled dataset of documents is required to train, uptrain, or evaluate an ML model. With Document AI Workbench you can apply labels from your model schema to imported documents in your dataset. If an existing version of your model is available, you can use it to get a head start on labeling. You can also hand off the labeling of your documents to a team of labeling specialists, either in your organization or at a third party, and manage their work.
Train, evaluate, and deploy models to production
Easily train and deploy your custom ML models for document processing. Document AI generates evaluation metrics, such as precision and recall, to help you determine the predictive performance of your models. These evaluation metrics are generated by comparing the entities returned by the models (the predictions) against the annotations in the test documents.
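As a rough illustration of how precision and recall follow from comparing predictions against annotations, here is a minimal Python sketch. It assumes each entity is a (type, text) pair and that a prediction counts as correct only on an exact match with an annotation. That matching rule, the function name, and the sample values are all assumptions for illustration; they are not taken from Document AI.

```python
from collections import Counter


def precision_recall(predicted, annotated):
    """Compare predicted entities with test-set annotations.

    Both arguments are lists of (entity_type, text) pairs.
    Exact-match comparison is an assumption made for this sketch.
    """
    pred = Counter(predicted)
    gold = Counter(annotated)
    # A true positive is a predicted pair that also appears in the annotations.
    true_positives = sum((pred & gold).values())
    precision = true_positives / sum(pred.values()) if pred else 0.0
    recall = true_positives / sum(gold.values()) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical entities from one test document.
    predicted = [
        ("employer_name", "Acme Corp"),
        ("wages", "52000.00"),
        ("ssn", "123-45-6789"),
    ]
    annotated = [
        ("employer_name", "Acme Corp"),
        ("wages", "52,000.00"),
        ("ssn", "123-45-6789"),
        ("tax_year", "2022"),
    ]
    p, r = precision_recall(predicted, annotated)
    print(f"precision={p:.2f} recall={r:.2f}")
```

In this sample, two of the three predictions match an annotation, and two of the four annotations are found, so precision is about 0.67 and recall is 0.50.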
"Document AI Workbench is helping us expand document automation more quickly and effectively. By using this new product, we have been able to train our own document parser models in a fraction of the time and with less resources. We feel this will help us realize important operational improvements for our business and help us serve our customers much better."
Daniel Ordaz Palacios, Global Head Business Process & Operations
Documentation
Uptrain a specialized processor
Uptraining means that you begin with a pretrained model, and then train this model with your own data to improve its accuracy. Find out how in this guide.
Create a Custom Document Extractor
Learn how to use Document AI Workbench to create and train a Custom Document Extractor, using W-2 (US tax form) documents as an example.
Create a labeled dataset
A labeled dataset of documents is required to train, uptrain, or evaluate an ML model. Learn how to create a dataset, import documents, and define a schema.
Label documents
Learn how to apply labels from your model schema to imported documents in your dataset.
Train or uptrain ML models
See how you can train a new custom document processing model from scratch or uptrain an existing ML model for document processing tasks specific to your needs.
Evaluate model performance
An evaluation is automatically run whenever you train or uptrain a model. See how to run a manual evaluation to get updated metrics after modifying the test set.
Use cases
Workbench can handle documents with printed or handwritten text, tables and other nested entities, checkboxes, and more. Workbench can use document images whether they were professionally scanned or captured in a quick photo. You can import data in multiple formats, such as PDFs, common image formats, and JSON documents.
Instead of having to pay to spin up servers and wait while models are trained, you can create and evaluate ML models for free. You simply pay as you go once processors are deployed and used to extract data from documents.
With one click, train a model via uptraining or from scratch. If you are working with a document type similar in layout and schema to an existing document processor, then uptrain the relevant processor to get accurate results faster. If there is no relevant processor available for the document you’re trying to process, then create a model from scratch.
You can configure Human in the Loop to review and correct predictions with confidence levels below your threshold. With human review, you can correct or confirm output before you use it in production and can leverage the corrected data to train the model and improve the accuracy of future predictions.
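The threshold-based routing described above can be sketched in a few lines of Python. This is only an illustration of the idea: the `Prediction` class, the `split_for_review` function, and the sample values are hypothetical and are not part of the Document AI or Human in the Loop APIs.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    # Hypothetical shape of one extracted entity.
    entity_type: str
    text: str
    confidence: float


def split_for_review(predictions, threshold):
    """Separate predictions into accepted and needs-review lists.

    Predictions with confidence below the threshold go to human review.
    """
    accepted = [p for p in predictions if p.confidence >= threshold]
    needs_review = [p for p in predictions if p.confidence < threshold]
    return accepted, needs_review


if __name__ == "__main__":
    predictions = [
        Prediction("employer_name", "Acme Corp", 0.97),
        Prediction("wages", "52,000.00", 0.62),
    ]
    accepted, needs_review = split_for_review(predictions, threshold=0.8)
    print("accepted:", [p.entity_type for p in accepted])
    print("needs review:", [p.entity_type for p in needs_review])
```

Once reviewers confirm or correct the entries in the needs-review list, the corrected values could be added back to the labeled dataset for a later training run.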
Pricing
Pay for what you use
With Document AI Workbench, you pay only for hosting and prediction; there is no cost for importing data or training.