Document AI overview

Document AI is a document understanding solution that takes unstructured data (such as documents and forms) and makes it easier to understand, analyze, and consume by adding structure through content classification, entity extraction, advanced search, and more.

Document AI uses machine learning and Google Cloud to help you create a scalable, cloud-based document understanding solution.

Using Document AI, you can:

  • Convert images to text
  • Classify documents
  • Analyze and extract entities

See the announcement of the Document AI solution in the Google Cloud Blog.

Document AI is a Service covered by Google's obligations set forth in the Data Processing and Security Terms.

Document AI processors

Document AI offers a growing list of processors (also called parsers or splitters, depending on their functionality) to extract information from specific document types.

Document AI currently offers the following document processors (separated by processor type):

General processors

| Processor | Description | Public access | Limited access |
| --- | --- | --- | --- |
| Document OCR (Optical Character Recognition) | Identify and extract text in different types of documents. | | |
| Document Splitter | Programmatically split documents on logical boundaries. | | |
| Form Parser | Extract form elements such as text and checkboxes. | | |
| French Driver License Parser | Extract fields such as names, document ID, date of birth, etc. | | |
| French National ID Parser | Extract fields such as names, document ID, date of birth, etc. | | |
| Intelligent Document Quality Processor | Perform quality assessment of a document based on its readability and get a quality score. | | |
| US Driver License Parser | Extract fields such as names, document ID, date of birth, etc. | | |
| US Passport Parser | Extract fields such as names, document ID, date of birth, etc. | | |

Lending processors

| Processor | Description | Public access | Limited access |
| --- | --- | --- | --- |
| 1003 Parser | Extract over 50 fields from Fannie Mae Form 1003 (URLA). | | |
| 1040 Parser | Extract fields from Form 1040, including name, filing status, amounts, etc. | | |
| 1040 Schedule C Parser | Extract fields from Form 1040 Schedule C, including name, wages, etc. | | |
| 1040 Schedule E Parser | Extract fields from Form 1040 Schedule E, including name, expenses, etc. | | |
| 1099-DIV Parser | Extract fields from Form 1099-DIV, including account number, qualified dividends, federal income tax withheld, etc. | | |
| 1099-G Parser | Extract fields from Form 1099-G, including payer, recipient, etc. | | |
| 1099-INT Parser | Extract fields from Form 1099-INT, including payer, recipient, etc. | | |
| 1099-MISC Parser | Extract fields from Form 1099-MISC, including payer, recipient, amounts, etc. | | |
| 1099-NEC Parser | Extract fields from Form 1099-NEC, including payer, recipient, etc. | | |
| 1099-R Parser | Extract fields from Form 1099-R, including payer, recipient, etc. | | |
| SSA-1099 Parser | Extract fields from Form SSA-1099, including name, address, SSN, etc. | | |
| 1065 Parser | Extract fields from Form 1065, including partnership name, address, assets, etc. | | |
| 1120 Parser | Extract fields from Form 1120, including corporation name, address, assets, etc. | | |
| 1120S Parser | Extract fields from Form 1120S, including name, address, assets, etc. | | |
| Bank Statement Parser | Extract fields from bank statements, including name, account, transactions, etc. | | |
| Lending Document Splitter & Classifier | Identify documents in a large file and classify known lending document types. | | |
| Pay Slip Parser | Extract fields from pay slips, including name, business, amounts, etc. | | |
| W2 Parser | Extract fields from Form W2, including employee, employer, wages, etc. | | |
| W9 Parser | Extract fields from Form W9, including name, address, TIN, etc. | | |

Procurement processors

| Processor | Description | Public access | Limited access |
| --- | --- | --- | --- |
| Expense Parser | Extract text and values from expense documents, such as expense date, supplier name, total amount, and currency. | | |
| Invoice Parser | Extract text and values from invoices, such as invoice number, supplier name, invoice amount, tax amount, invoice date, and due date. | | |
| Procurement Document Splitter | Programmatically split combined procurement documents on logical boundaries. | | |
| Utility Parser | Extract text and values from utility bills, such as supplier name and previous paid amount. | | |

Using Document AI processors

Here are the major steps to use Document AI:

  1. Choose a processor that is suitable for your use case.

    For complete information on each processor, see the Full processor and detail list.

  2. Create a processor using the Cloud Console.

    Document AI creates a prediction endpoint where you can send your documents.

    For detailed instructions, see Creating a processor.

  3. Send your document(s) for processing.

    Document AI processes the document(s) and returns one or more Document objects, which contain the extracted, structured information.

    For detailed instructions, see Sending a processing request.
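
As a rough illustration of step 3, the sketch below builds a processing request for a single document and reads entities out of a response. It assumes the v1 REST endpoint shape (`{location}-documentai.googleapis.com`, the `:process` method, and a `rawDocument` body) and the `document.entities` response fields. The project, location, and processor IDs are hypothetical placeholders, the helper function names are made up for this example, and the sample response is hand-written rather than real service output. Check the API reference for your processor version before relying on any of these names.

```python
import base64
import json
import urllib.request


def build_process_request(project_id, location, processor_id, content, mime_type):
    """Return (url, body) for a `process` call on one document."""
    # Assumed endpoint shape for the Document AI v1 REST API.
    url = (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"processors/{processor_id}:process"
    )
    body = {
        "rawDocument": {
            # Raw file bytes are base64-encoded so they can travel in JSON.
            "content": base64.b64encode(content).decode("ascii"),
            "mimeType": mime_type,
        }
    }
    return url, body


def send_process_request(url, body, access_token):
    """POST the request and return the parsed JSON response.

    Requires network access and a valid OAuth 2.0 access token,
    so it is defined here but not called in this sketch.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize_entities(response):
    """Return (type, text, confidence) tuples from a process response."""
    document = response.get("document", {})
    return [
        (e.get("type"), e.get("mentionText"), e.get("confidence"))
        for e in document.get("entities", [])
    ]


# Hypothetical identifiers; replace with your own values.
url, body = build_process_request(
    "my-project", "us", "my-processor-id", b"%PDF-1.4 sample bytes", "application/pdf"
)

# Hand-written sample response for illustration only.
sample_response = {
    "document": {
        "text": "Invoice 1234 ...",
        "entities": [
            {"type": "invoice_id", "mentionText": "1234", "confidence": 0.98},
        ],
    }
}
entities = summarize_entities(sample_response)
```

In practice you would pass the result of `send_process_request` to `summarize_entities` instead of the sample dictionary. The official client libraries wrap the same request and are usually the more convenient choice.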