Document AI: A unified AI agent for your document processing needs
Sudheera Vanguri
Head of Product - Language & Document AI
Because of challenges like these, in 2020, we launched Document AI, an AI agent that lets organizations apply machine learning (ML) to their hardest document automation problems. Since then, we have introduced specialized models to extract data for industry-specific use cases such as mortgage processing and procurement. With the launch of Document AI Workbench and Document AI Warehouse at Google Cloud Next ’22, we’ve continued to take significant steps in our mission to help organizations simplify and automate document processing. Let’s take a closer look at each of these announcements.
Custom document processing with Document AI Workbench
With Document AI Workbench, organizations can process documents by creating custom ML models that are specific to their business needs and extract unstructured data with a high degree of accuracy. Thanks to the user-friendly interface, even business users who do not have extensive ML skills can get started training or uptraining models.
Moreover, if an organization wants to transfer learning from a pretrained model and enhance that model further to, say, include new fields, users can now do so through what we call “uptraining.” Uptraining is especially valuable for the most common yet complex use cases because it saves time and resources: businesses don’t have to start from scratch. Uptraining for the invoice, purchase order (PO), contract, W2, 1099-R, payslip, and 1040 pretrained models unlocks new possibilities for improving accuracy, adding new language support, and customizing schemas.
We’re continuing to invest in these pretrained models. At Next ’22, we announced an update to our invoice and expense pretrained models with improvements to normalization and line-item entity detection, as well as new ID proofing capabilities via a flexible API designed to spot fake, altered, or doctored ID documents. We’ve also added support for five new languages across the invoice and expense models, in addition to the 12 previously supported languages, and expanded availability to the Canada and Australia regions, in addition to the previously supported US, EU, and Singapore regions.
According to Daan De Groodt, Managing Director, Deloitte Consulting LLP, Document AI Workbench “is poised to be a game changer, because we can now uptrain various text documents and forms utilizing powerful Google Machine Learning models to get the desired accuracy creating greater time and resource efficiencies for our clients.”
And customers are already seeing benefits. Libeo used Document AI to uptrain an invoice parser with 1,600 documents and increase its testing accuracy from 75.6% to 83.9%. “Thanks to uptraining, the Document AI results now beat the results of a competitor and will help Libeo save ~20% on the overall cost for model training over the long run,” said Libeo chief technology officer, Pierre-Antoine Glandier.
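To put the accuracy figures above in perspective, here is a minimal sketch of the arithmetic. It assumes that "testing accuracy" means the percentage of correct predictions on a test set, so that the remainder can be read as an error rate; the function name `accuracy_gain` is hypothetical and not part of any Document AI API.

```python
from typing import Tuple


def accuracy_gain(before_pct: float, after_pct: float) -> Tuple[float, float]:
    """Return (gain in percentage points, relative error reduction).

    Assumes both inputs are accuracies expressed as percentages,
    and that (100 - accuracy) can be treated as the error rate.
    """
    if not (0.0 <= before_pct < 100.0 and 0.0 <= after_pct <= 100.0):
        raise ValueError("accuracies must be percentages, with before < 100")
    gain_points = after_pct - before_pct
    error_before = 100.0 - before_pct
    error_after = 100.0 - after_pct
    # Share of the original errors that no longer occur after uptraining.
    relative_error_reduction = (error_before - error_after) / error_before
    return gain_points, relative_error_reduction


# Figures quoted above: 75.6% before uptraining, 83.9% after.
gain, reduction = accuracy_gain(75.6, 83.9)
print(f"{gain:.1f} points, {reduction:.0%} fewer errors")
# prints "8.3 points, 34% fewer errors"
```

Under that reading, the 8.3-point improvement corresponds to removing roughly a third of the errors the parser made before uptraining.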
Google-powered document search with Document AI Warehouse
With Document AI Warehouse, we are bringing the best of Google’s semantic search to documents. Document AI Warehouse lets enterprises search, store, govern, and manage documents and their AI-extracted data and metadata in a single platform. With its simple and intuitive web-accessible user interface, users can explore, view, bulk update, and organize documents into folders. Document AI Warehouse offers robust enterprise control and governance, so you can control who has access at the document and folder levels and assign users and groups permissions to view, edit, and manage (share, delete) documents. You can migrate, sync, or federate documents from other repositories, such as Microsoft SharePoint, Amazon S3, and IBM FileNet. Or, if that’s not an option, we simply index the content and any extracted or tagged metadata.
We will also consolidate a number of next-generation product enhancements to Document AI OCR and Form Parser by the end of this year, including deeper insights into document quality and semantics, a unified document OCR experience, expanded language coverage for Form Parser, and advanced tooling for model lifecycle management. Google's DeepMind team developed a new method that allows the creation of document parsing ML models for utility bills and purchase orders with 50%-70% less training data than was previously needed for Document AI. We’re working on integrating this method into Document AI Workbench in the coming months.[1]
Getting started
I’m very excited about what the future holds for Document AI as a platform for businesses to simplify document automation. Learn more about all these developments in my session at Next ’22, or try out one of our offerings today.
1. DeepMind used thousands of Google's internal documents, such as utility bills and purchase orders from a variety of vendors, to develop and evaluate this method. Performance will vary depending on the evaluation dataset.