Sample processor output

This page contains detailed information on output produced by processors offered by Document AI.

The files on this page are sample input documents in a variety of structures, together with the raw output that the Document AI API returns for each of them in the Document format.
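
The Document JSON files linked on this page store the full document text once, and layout elements point into it with text anchors. As a rough illustration of how one of those files might be read, the following sketch resolves paragraph anchors back to text. The `SAMPLE` dictionary is a small hand-written stand-in rather than real processor output, and the helper names (`load_document`, `anchor_text`, `paragraph_texts`) are hypothetical. The sketch assumes the JSON field names `text`, `pages`, `paragraphs`, `layout`, `textAnchor`, `textSegments`, `startIndex` and `endIndex`, and it slices the Python string directly with the offsets.

```python
import json


def load_document(path: str) -> dict:
    """Load a downloaded Document JSON file into a plain dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def anchor_text(document: dict, text_anchor: dict) -> str:
    """Return the text that a textAnchor points at inside document["text"]."""
    full_text = document.get("text", "")
    pieces = []
    for segment in text_anchor.get("textSegments", []):
        # Offsets may arrive as strings in JSON, and a zero startIndex may be
        # omitted entirely, so default to 0 and convert defensively.
        start = int(segment.get("startIndex", 0))
        end = int(segment.get("endIndex", 0))
        pieces.append(full_text[start:end])
    return "".join(pieces)


def paragraph_texts(document: dict) -> list:
    """Collect the text of every paragraph on every page, in order."""
    texts = []
    for page in document.get("pages", []):
        for paragraph in page.get("paragraphs", []):
            anchor = paragraph.get("layout", {}).get("textAnchor", {})
            texts.append(anchor_text(document, anchor))
    return texts


# Hand-written stand-in for a Document JSON file (not real processor output).
SAMPLE = {
    "text": "Invoice\nTotal: 42\n",
    "pages": [
        {
            "paragraphs": [
                {"layout": {"textAnchor": {"textSegments": [{"endIndex": "8"}]}}},
                {
                    "layout": {
                        "textAnchor": {
                            "textSegments": [{"startIndex": "8", "endIndex": "18"}]
                        }
                    }
                },
            ]
        }
    ],
}

if __name__ == "__main__":
    for text in paragraph_texts(SAMPLE):
        print(repr(text))
```

To try it on one of the sample outputs, download a Document JSON file and pass the result of `load_document("path/to/output.json")` to `paragraph_texts`. Which keys are present depends on the processor and version, which is why every lookup uses `.get` with a default.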

To limit the fields returned in the response, specify a FieldMask when making a processing request.
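
As a minimal sketch of what a field mask looks like in practice, the snippet below builds a JSON body for a REST processing request using only the standard library. It assumes the request body accepts a `rawDocument` object (`content`, `mimeType`) and a `fieldMask` string of comma-separated field paths; check those names against the current API reference before relying on them. The helper `build_process_request` and the placeholder bytes are hypothetical, and no request is actually sent.

```python
import base64
import json


def build_process_request(content: bytes, mime_type: str, fields: list) -> dict:
    """Build a process-request body that asks for only the given fields."""
    return {
        "rawDocument": {
            # Binary content is carried as base64 text in JSON.
            "content": base64.b64encode(content).decode("ascii"),
            "mimeType": mime_type,
        },
        # Assumption: the mask is a single comma-separated string of paths
        # into the Document, for example "text,entities,pages.pageNumber".
        "fieldMask": ",".join(fields),
    }


# Placeholder bytes standing in for a real PDF file.
body = build_process_request(
    b"%PDF-1.4 placeholder",
    "application/pdf",
    ["text", "entities", "pages.pageNumber"],
)

if __name__ == "__main__":
    print(json.dumps(body, indent=2))
```

With a mask like this one, the response would be expected to carry only the document text, the entities, and each page's number, which can shrink the payload considerably compared with the full sample outputs listed below.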

Digitize text

Processors Output samples

Enterprise Document OCR (Optical Character Recognition)

Category: Digitize
Solution type: General
Functions: OCR, Quality Analysis
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
Sample input file
pretrained-ocr-v2.0-2023-06-02:
  Output Document JSON
  Checkbox Extraction - Document JSON
  Font Detection - Document JSON
  Math OCR - Document JSON
pretrained-ocr-v1.2-2022-11-10:
  Output Document JSON
pretrained-ocr-v1.1-2022-09-12:
  Output Document JSON
pretrained-ocr-v1.0-2020-09-23:
  Output Document JSON

Summarizer

Category: Digitize
Solution type: Generative AI
Functions: Summarize
Release stage: Preview
Access status: Public
Full processor details: Detailed entry
Sample input file
pretrained-foundation-model-v1.0-2023-08-22:
  Output Document JSON

Extract documents

Processors Output samples

Custom Extractor

Category: Extract
Solution type: Custom
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
Sample input file
Output Document JSON

Form Parser

Category: Extract
Solution type: General
Functions: OCR, Form Parsing, Entity Extraction
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
Sample input file
pretrained-form-parser-v1.0-2020-09-23:
  Output Document JSON
pretrained-form-parser-v2.0-2022-11-10:
  Output Document JSON
pretrained-form-parser-v2.1-2023-06-26:
  Output Document JSON

Classify documents

Processors Output samples

Custom Classifier

Category: Classify
Solution type: Custom
Functions: OCR, Classification
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
Sample input file
Output Document JSON

Custom Splitter

Category: Classify
Solution type: Custom
Functions: OCR, Classification, Splitting
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
Sample input file
Output Document JSON

Explore legacy processors

Processors Output samples

Bank Statement Parser

Category: Legacy
Solution type: Lending
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Limited
Full processor details: Detailed entry
Sample input file
pretrained-bankstatement-v1.0-2021-08-08:
  Output Document JSON
pretrained-bankstatement-v1.1-2021-08-13:
  Output Document JSON
pretrained-bankstatement-v2.0-2021-12-10:
  Output Document JSON
pretrained-bankstatement-v3.0-2022-05-16:
  Output Document JSON
pretrained-bankstatement-v4.0-2023-07-31:
  Output Document JSON
pretrained-bankstatement-v5.0-2023-12-06:
  Output Document JSON

W2 Parser

Category: Legacy
Solution type: Lending
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Limited
Full processor details: Detailed entry
Sample input file
pretrained-w2-v1.0-2020-10-01:
  Output Document JSON
pretrained-w2-v1.1-2022-01-27:
  Output Document JSON
pretrained-w2-v1.2-2022-01-28:
  Output Document JSON
pretrained-w2-v2.0-2022-03-30:
  Output Document JSON
pretrained-w2-v2.1-2022-06-08:
  Output Document JSON

US Passport Parser

Category: Legacy
Solution type: Identity
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
Sample input file
pretrained-us-passport-v1.0-2021-06-14:
  Output Document JSON

Utility Parser

Category: Legacy
Solution type: Procurement
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Limited
Full processor details: Detailed entry
Sample input file
pretrained-utility-v1.1-2021-04-09:
  Output Document JSON
pretrained-utility-v1.2-2022-12-15:
  Output Document JSON

Identity Document Proofing Parser

Category: Legacy
Solution type: Identity
Functions: OCR, Quality Analysis
Release stage: Preview
Access status: Public
Full processor details: Detailed entry
Sample input file
pretrained-id-proofing-v1.0-2022-10-03:
  Output Document JSON
pretrained-id-proofing-v1.1-2023-05-18:
  Output Document JSON
pretrained-id-proofing-v1.2-2023-10-04:
  Output Document JSON

US Driver License Parser

Category: Legacy
Solution type: Identity
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
Sample input file
pretrained-us-driver-license-v1.0-2021-06-14:
  Output Document JSON

Expense Parser

Category: Legacy
Solution type: Procurement
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
Sample input file
pretrained-expense-v1.1-2021-04-09:
  Output Document JSON
pretrained-expense-v1.2-2022-02-18:
  Output Document JSON
pretrained-expense-v1.3-2022-07-15:
  Output Document JSON
pretrained-expense-v1.4-2022-11-18:
  Output Document JSON

Invoice Parser

Category: Legacy
Solution type: Procurement
Functions: OCR, Entity Extraction
Release stage: General availability
Access status: Public
Full processor details: Detailed entry
Sample input file
pretrained-invoice-v1.1-2021-04-09:
  Output Document JSON
pretrained-invoice-v1.2-2022-02-18:
  Output Document JSON
pretrained-invoice-v1.3-2022-07-15:
  Output Document JSON
pretrained-invoice-v1.4-2022-10-21:
  Output Document JSON
pretrained-invoice-v1.5-2023-09-15:
  Output Document JSON
pretrained-invoice-v2.0-2023-12-06:
  Output Document JSON
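
The version IDs listed on this page share a visible pattern: a `pretrained-` prefix, a processor family, a `v<major>.<minor>` number, and a trailing date. The sketch below splits an ID into those parts and builds a processor version resource name from it. It assumes that the trailing date is a release or build date and that resource names follow the `projects/.../locations/.../processors/.../processorVersions/...` layout; the helper names and the project, location and processor values are placeholders.

```python
import re
from datetime import date

# Pattern inferred from the IDs on this page; treat it as an assumption.
VERSION_ID = re.compile(
    r"^pretrained-(?P<family>.+)-v(?P<version>\d+\.\d+)-(?P<stamp>\d{4}-\d{2}-\d{2})$"
)


def parse_version_id(version_id: str) -> tuple:
    """Split a version ID into (family, version, date stamp)."""
    match = VERSION_ID.match(version_id)
    if match is None:
        raise ValueError(f"unrecognized version ID: {version_id!r}")
    return (
        match["family"],
        match["version"],
        date.fromisoformat(match["stamp"]),
    )


def processor_version_name(
    project: str, location: str, processor: str, version_id: str
) -> str:
    """Build a processor version resource name (layout is an assumption)."""
    return (
        f"projects/{project}/locations/{location}"
        f"/processors/{processor}/processorVersions/{version_id}"
    )


INVOICE_VERSIONS = [
    "pretrained-invoice-v1.4-2022-10-21",
    "pretrained-invoice-v2.0-2023-12-06",
    "pretrained-invoice-v1.5-2023-09-15",
]

# Pick the ID with the most recent date stamp.
newest = max(INVOICE_VERSIONS, key=lambda vid: parse_version_id(vid)[2])

if __name__ == "__main__":
    print(newest)
    # "my-project", "us" and "my-processor-id" are placeholder values.
    print(processor_version_name("my-project", "us", "my-processor-id", newest))
```

Pinning a request to an explicit version ID in this way keeps results comparable with the sample output for that same version, since output can differ between versions of a processor.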