Document AI basics

Document AI is a document understanding solution that takes unstructured data, such as documents and emails, and makes it easier to understand, analyze, and consume. It adds structure through content classification, entity extraction, advanced searching, and more.

Document AI uses machine learning and Google Cloud to help you create a scalable, cloud-based document understanding solution.

Using Document AI, you can:

  • Convert images to text
  • Classify documents
  • Analyze and extract entities

See the announcement of the Document AI solution on the Google Cloud Blog.

Document AI is a Service covered by Google's obligations set forth in the Data Processing and Security Terms.

Document AI processors

Document AI offers a growing list of processors (also called parsers or splitters, depending on their functionality) to extract information from specific document types.

Starting with the v1beta3 API endpoint, you can create multiple processor types and use the same code to extract information. For more information on processor types available, see the processors overview.
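
As a rough illustration of using the same code across processor types, the sketch below builds (but does not send) a REST process request in which only the processor ID changes. The URL pattern and the rawDocument body shape are assumptions based on the public REST interface; the project, location, and processor IDs are hypothetical placeholders, and authentication is omitted.

```python
import base64
import json


def build_process_request(project_id, location, processor_id, content,
                          mime_type, api_version="v1beta3"):
    """Return (url, json_body) for a Document AI process call.

    Nothing is sent over the network. The URL pattern and body shape are
    assumptions; check them against the API reference before use.
    """
    name = (f"projects/{project_id}/locations/{location}"
            f"/processors/{processor_id}")
    url = (f"https://{location}-documentai.googleapis.com/"
           f"{api_version}/{name}:process")
    body = {
        "rawDocument": {
            # Raw file bytes are base64-encoded for JSON transport.
            "content": base64.b64encode(content).decode("ascii"),
            "mimeType": mime_type,
        }
    }
    return url, json.dumps(body)


# Hypothetical processor IDs: the request-building code is identical for
# both, only the processor being addressed differs.
for processor_id in ("my-ocr-processor", "my-form-processor"):
    url, body = build_process_request(
        "my-project", "us", processor_id, b"%PDF-1.4 ...", "application/pdf")
    print(url)
```

In practice you would POST the body to the URL with an OAuth 2.0 bearer token, or use a client library instead of raw REST.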

The following products share functionality with Document AI, but each performs a specific text or analysis function. Depending on your use case, one of these products may provide the functionality you need.

Convert images to text

You can convert content in images to text using Document AI's Document OCR, or the Cloud Vision and AutoML Vision products. You can use Document OCR or the Cloud Vision API to perform optical character recognition (OCR) and handwriting recognition on image, PDF, and TIFF files. You can use AutoML Vision Object Detection to convert sections of images into text documents.
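
For the Cloud Vision route, a minimal sketch of an OCR request body for a single image might look like the following. It assumes the images:annotate REST method with the DOCUMENT_TEXT_DETECTION and TEXT_DETECTION feature types; the helper name and the sample bytes are made up for illustration, and PDF and TIFF files go through a different (file-based) method that is not shown here.

```python
import base64
import json

# Assumed REST endpoint for annotating individual images.
VISION_ANNOTATE_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes, dense_text=True):
    """Build an images:annotate request body for OCR on one image.

    dense_text=True selects DOCUMENT_TEXT_DETECTION (dense text and
    handwriting); False selects TEXT_DETECTION (sparse text in photos).
    """
    feature = "DOCUMENT_TEXT_DETECTION" if dense_text else "TEXT_DETECTION"
    return {
        "requests": [
            {
                "image": {
                    "content": base64.b64encode(image_bytes).decode("ascii")
                },
                "features": [{"type": feature}],
            }
        ]
    }


# Placeholder bytes stand in for a real image file read from disk.
request_body = build_ocr_request(b"fake-image-bytes")
print(json.dumps(request_body, indent=2))
```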

For OCR information using other Google Cloud products, see:

  • Cloud Vision API
  • AutoML Vision Object Detection

Classify documents

You can categorize and label documents using the Cloud Natural Language API and AutoML Natural Language products. You can use the Natural Language API to classify content using a generalized list of categories. You can use AutoML Natural Language Classification to create a custom machine learning model to classify content with your own category labels.
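
To make the generalized-category case concrete, here is a hedged sketch of a Natural Language API classification request and of picking the most confident category from a response. It assumes the documents:classifyText REST method and a response containing a categories list with name and confidence fields; the sample response is hand-written for illustration and is not real API output.

```python
# Assumed REST endpoint for content classification.
CLASSIFY_URL = "https://language.googleapis.com/v1/documents:classifyText"


def build_classify_request(text):
    """Build a classifyText request body for plain-text content."""
    return {"document": {"type": "PLAIN_TEXT", "content": text}}


def top_category(response):
    """Return the name of the most confident category, or None if empty."""
    categories = response.get("categories", [])
    if not categories:
        return None
    best = max(categories, key=lambda c: c.get("confidence", 0.0))
    return best["name"]


request_body = build_classify_request(
    "Invoice for consulting services rendered in March, payable in 30 days.")

# Hand-written sample response (illustrative values only).
sample_response = {
    "categories": [
        {"name": "/Finance", "confidence": 0.62},
        {"name": "/Business & Industrial", "confidence": 0.81},
    ]
}
print(top_category(sample_response))
```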

If your document is an image, for example an image of an invoice, you can use AutoML Vision Classification to classify the image content.

For more information, see:

  • Natural Language API
  • AutoML Natural Language Classification
  • AutoML Vision Classification

Analyze and extract entities

Using Document AI's Form parser, or one of the specialized processors for your use case, you can identify known entities in documents (proper nouns such as public figures, company branding, and so on) and entities that follow common patterns, such as phone numbers and addresses.
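
Once a processor has returned a result, the extracted entities can be read from the response. The sketch below assumes a JSON response whose document object carries an entities list with type, mentionText, and confidence fields; the entity types and values in the sample are invented for illustration, and the exact fields returned vary by processor.

```python
def extract_entities(response, min_confidence=0.0):
    """Return (type, text, confidence) tuples from a process response.

    Entities below min_confidence are skipped. The field names are
    assumptions about the JSON form of the Document resource.
    """
    document = response.get("document", {})
    results = []
    for entity in document.get("entities", []):
        confidence = entity.get("confidence", 0.0)
        if confidence >= min_confidence:
            results.append(
                (entity.get("type", ""),
                 entity.get("mentionText", ""),
                 confidence))
    return results


# Hand-written sample response (hypothetical entity types and values).
sample_response = {
    "document": {
        "entities": [
            {"type": "supplier_phone", "mentionText": "(555) 010-0199",
             "confidence": 0.97},
            {"type": "supplier_address", "mentionText": "1 Example Road",
             "confidence": 0.42},
        ]
    }
}

for entity_type, text, confidence in extract_entities(sample_response, 0.5):
    print(entity_type, text, confidence)
```

Filtering on a confidence threshold is one simple way to decide which extracted values to accept automatically and which to route for review; the 0.5 cutoff here is arbitrary.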

You can also use the Natural Language API and AutoML Natural Language products to extract entities. You can identify common, public entities using the Natural Language API. You can create a custom machine learning model to identify entities specific to your company or use case using AutoML Natural Language Entity Extraction.

For more information, see:

  • Natural Language API
  • AutoML Natural Language Entity Extraction

Find a partner

Take advantage of our growing partner ecosystem to help you create and manage your document understanding solution. For a list of partners and the services that they provide, see Document AI Partners.