Extract data from, classify, and split documents by creating custom machine learning models that are specific to your business. With Document AI Workbench, you can automate business processes by easily training machine learning models for documents like insurance forms, bills of lading, and more.
Create your own custom model with one click
Uptrain existing models to improve performance
Extract structured data from, classify, and split documents
Extract data from and summarize documents using generative AI
Experience fewer errors and enjoy faster and more accurate document processing workflows with custom models that are trained on your business documents.
You no longer need to depend on pretrained models for your document processing needs. You can work with a wide range of documents that Google’s pretrained models do not support.
With Document AI Workbench, even users without extensive machine learning skills can start training or uptraining models through a friendly user interface.
Uptraining means that you begin with a prebuilt ML model and then train it with your own data to improve its accuracy on your organization’s documents. Prebuilt models offer a base model relevant to your document type and help automatically label data, so you can build production-ready models faster. Document AI generates evaluation metrics, such as precision and recall, to help you determine the predictive performance of your models. These metrics are generated by comparing the entities returned by the models (the predictions) against the annotations in the test documents.
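To make the comparison between predictions and annotations concrete, here is a minimal sketch of how precision and recall could be computed for one test document. It is an illustration only, not Document AI's actual evaluation logic: the function name `precision_recall`, the `(entity_type, text)` representation, and the exact-match rule are all assumptions made for this example.

```python
from collections import Counter


def precision_recall(predicted, annotated):
    """Compare predicted entities against ground-truth annotations.

    Both arguments are lists of (entity_type, text) pairs. As a simplifying
    assumption, a prediction counts as correct only when both the type and
    the text match an annotation exactly.
    """
    pred = Counter(predicted)
    gold = Counter(annotated)
    # Multiset intersection: each annotation can be matched at most once.
    true_positives = sum((pred & gold).values())
    precision = true_positives / sum(pred.values()) if pred else 0.0
    recall = true_positives / sum(gold.values()) if gold else 0.0
    return precision, recall


# Hypothetical test document: three annotated entities, two predictions,
# one of which (the total) is wrong.
annotated = [("invoice_id", "INV-001"), ("total", "120.00"), ("due_date", "2024-05-01")]
predicted = [("invoice_id", "INV-001"), ("total", "125.00")]

p, r = precision_recall(predicted, annotated)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.33
```

Precision answers "how many of the returned entities were right," and recall answers "how many of the annotated entities were found." Low precision suggests noisy predictions, while low recall suggests missed fields.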
Create custom extractors that are specific to your documents and are trained and evaluated with your data for higher accuracy and performance. Once you create an initial model, use it to auto-label documents so you can train a production-ready model faster. With a deployed model, extract structured data from documents to automate processes and unearth insights. Use generative AI to quickly improve custom models through prompts. Use this feature to deploy new models or auto-label datasets, saving time and cost. For example, quickly add a new field by prompting a foundation model to extract it, instead of labeling data and training a new model. You can also use the same approach to auto-label new datasets.
Create custom classifiers that identify a document type from a set of document classes. Classifying document types helps businesses save time, effort, and money. For example, you can validate whether users submit the right documents within an application. You can also classify documents to automate downstream processes, such as choosing the best model to extract data from a document.
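The idea of using a classification result to choose a downstream extractor can be sketched as follows. This is a hypothetical routing helper, not part of any Document AI client library: the function `choose_extractor`, the extractor names, the score dictionary, and the 0.8 confidence threshold are all assumptions chosen for illustration.

```python
def choose_extractor(class_scores, extractors, min_confidence=0.8):
    """Pick the extractor for the highest-scoring document class.

    class_scores maps a document class to a confidence score (assumed 0..1).
    Returns None when nothing is confident enough or no extractor is
    registered for the class, so the document can go to manual review.
    """
    if not class_scores:
        return None
    doc_type, score = max(class_scores.items(), key=lambda item: item[1])
    if score < min_confidence:
        return None
    return extractors.get(doc_type)


# Hypothetical mapping from document class to a deployed extractor name.
extractors = {"invoice": "invoice-extractor", "w2": "w2-extractor"}

# Hypothetical classifier output for one uploaded file.
scores = {"invoice": 0.93, "w2": 0.05, "bank_statement": 0.02}

print(choose_extractor(scores, extractors))  # invoice-extractor
```

Returning `None` instead of guessing keeps low-confidence or unsupported documents out of the automated path, which is one way to handle the "did the user submit the right document" check mentioned above.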
Create custom splitters to split and classify multiple documents within a single file. A custom splitter helps users sort and classify documents so they can validate whether they have all the needed documents from an applicant. For example, a custom splitter can classify and identify the page numbers of a driver’s license, paystub, W-2, bank statement, and other documents within a single file. Furthermore, individually classified documents let businesses better automate downstream processes, such as selecting the proper storage, analysis, or processing steps (for example, data extraction) based on the document type.
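As a rough illustration of what a splitter's result makes possible, the sketch below turns per-page document labels into page ranges and checks whether any required document is missing. The per-page label list, the function names, and the rule that consecutive pages with the same label form one document are assumptions for this example; a real splitter can also separate two adjacent documents of the same type.

```python
from itertools import groupby


def split_by_type(page_labels):
    """Group consecutive pages that share a label.

    Returns a list of (label, first_page, last_page) with 1-indexed pages.
    """
    segments = []
    page = 1
    for label, run in groupby(page_labels):
        length = len(list(run))
        segments.append((label, page, page + length - 1))
        page += length
    return segments


def missing_documents(segments, required):
    """Return the required document types that do not appear in the file."""
    found = {label for label, _, _ in segments}
    return sorted(set(required) - found)


# Hypothetical per-page labels for a seven-page applicant file.
labels = [
    "drivers_license",
    "paystub", "paystub",
    "w2",
    "bank_statement", "bank_statement", "bank_statement",
]

segments = split_by_type(labels)
for segment in segments:
    print(segment)

print(missing_documents(segments, ["drivers_license", "paystub", "w2", "tax_return"]))
```

Each resulting segment can then be stored or processed according to its document type, and the missing-document check corresponds to validating that an applicant submitted everything required.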
Use the document summarizer to create summaries for both short and long documents. You can customize summaries to your preference. For example, you can choose whether summaries are long or short and whether they are written as paragraphs or bullet points. No model training is required to use this function. However, you can fine-tune the underlying models to improve the quality of the generated summaries.
Workbench can handle documents with printed or handwritten text, tables and other nested entities, checkboxes, and more. Workbench can use document images whether they were professionally scanned or captured in a quick photo. You can import data in multiple formats, such as PDFs, common image formats, and JSON documents.
Instead of having to pay to spin up servers and wait while models are trained, you can create and evaluate ML models for free. You simply pay as you go once processors are deployed and used to extract data from documents.
With one click, train a model via uptraining or from scratch. If you are working with a document type similar in layout and schema to an existing document processor, then uptrain the relevant processor to get accurate results faster. If there is no relevant processor available for the document you’re trying to process, then create a model from scratch.
With Document AI Workbench, you pay only for hosting and prediction; there is no cost for importing data or training.