Cloud Document Understanding AI

A document understanding solution takes unstructured data, such as documents and emails, and gives it structure through content classification, entity extraction, advanced search, and more. That structure makes the data easier to understand, analyze, and consume.

Google's Document Understanding AI solution uses machine learning and the Google Cloud Platform (GCP) to help you create a scalable, cloud-based document understanding solution.

Document Understanding AI Workflow

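Using Google's Document Understanding AI solution, you can:

Convert images to text
Classify documents
Analyze and extract entities
Parse tables and extract key/value pairs from images (Alpha)
Create a knowledge base (Alpha)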

See the announcement of the Cloud Document Understanding AI solution in the Google Cloud Blog.

Convert images to text

You can convert content in images to text using Google's Vision API and AutoML Vision products. Use the Vision API to perform optical character recognition (OCR) and handwriting recognition on images and on PDF and TIFF files. Use AutoML Vision Object Detection to convert sections of images into text documents.
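As a rough illustration of the OCR step for a single image, the sketch below builds the JSON body for the Vision API's `images:annotate` REST method and reads the recognized text out of a parsed response. The helper names `build_ocr_request` and `extract_full_text` are hypothetical. Authentication and the HTTP call itself are left out, and PDF and TIFF input goes through a separate file-annotation method that is not shown here.

```python
import base64

# REST endpoint for synchronous image annotation (the request is not sent here).
VISION_ANNOTATE_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes: bytes, dense_text: bool = True) -> dict:
    """Build the JSON body for one OCR request (hypothetical helper)."""
    # DOCUMENT_TEXT_DETECTION is tuned for dense text and handwriting;
    # TEXT_DETECTION is tuned for sparse text such as signs in photos.
    feature_type = "DOCUMENT_TEXT_DETECTION" if dense_text else "TEXT_DETECTION"
    return {
        "requests": [
            {
                # Inline image content must be base64-encoded.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": feature_type}],
            }
        ]
    }


def extract_full_text(response: dict) -> str:
    """Return the recognized text from a parsed annotate response, or ''."""
    responses = response.get("responses") or [{}]
    return responses[0].get("fullTextAnnotation", {}).get("text", "")
```

To use it, serialize the returned dictionary as JSON, POST it to `VISION_ANNOTATE_URL` with your own credentials, and pass the decoded response to `extract_full_text`.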

For more information, see:

Vision API
AutoML Vision Object Detection

Classify documents

You can categorize and label documents using Google's Cloud Natural Language API and AutoML Natural Language products. Use the Natural Language API to classify content against a generalized list of categories. Use AutoML Natural Language Classification to create a custom machine learning model that classifies content with your own category labels.
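A minimal sketch of the general-purpose classification path, assuming the Natural Language API's `documents:classifyText` REST method: it builds the request body for plain text and filters the returned categories by confidence. The helper names and the `min_confidence` threshold are illustrative assumptions, not part of the API. Authentication and the HTTP call are omitted.

```python
# REST endpoint for content classification (the request is not sent here).
CLASSIFY_URL = "https://language.googleapis.com/v1/documents:classifyText"


def build_classify_request(text: str) -> dict:
    """Build the JSON body for classifying plain text (hypothetical helper)."""
    return {"document": {"type": "PLAIN_TEXT", "content": text}}


def top_categories(response: dict, min_confidence: float = 0.5) -> list:
    """Return (name, confidence) pairs at or above the threshold, best first.

    The threshold is an assumption made for this sketch; choose a value
    that suits your own documents.
    """
    kept = [
        (category.get("name", ""), category.get("confidence", 0.0))
        for category in response.get("categories", [])
        if category.get("confidence", 0.0) >= min_confidence
    ]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

A custom AutoML model would replace the generalized categories with your own labels; the filtering step stays the same in spirit.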

If your document is an image, such as a scanned invoice, you can use AutoML Vision Classification to classify the image content.

For more information, see:

Natural Language API
AutoML Natural Language Classification
AutoML Vision Classification

Analyze and extract entities

Using Google's Natural Language API and AutoML Natural Language products, you can identify known entities in documents (proper nouns such as public figures and company brands) as well as entities that follow common patterns, such as phone numbers and addresses. Use the Natural Language API to identify common, public entities. Use AutoML Natural Language Entity Extraction to create a custom machine learning model that identifies entities specific to your company or use case.
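The following sketch shows one way to work with the public-entity path, assuming the Natural Language API's `documents:analyzeEntities` REST method. It builds the request body and groups the returned entity names by type, so that proper nouns and pattern-based entities such as phone numbers and addresses end up in separate buckets. The helper names are hypothetical, and authentication and the HTTP call are omitted.

```python
from collections import defaultdict

# REST endpoint for entity analysis (the request is not sent here).
ANALYZE_ENTITIES_URL = "https://language.googleapis.com/v1/documents:analyzeEntities"


def build_entities_request(text: str) -> dict:
    """Build the JSON body for entity analysis of plain text (hypothetical helper)."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        # Controls how character offsets in the response are calculated.
        "encodingType": "UTF8",
    }


def group_entities_by_type(response: dict) -> dict:
    """Map each entity type (for example ORGANIZATION) to the names found."""
    grouped = defaultdict(list)
    for entity in response.get("entities", []):
        grouped[entity.get("type", "UNKNOWN")].append(entity.get("name", ""))
    return dict(grouped)
```

Grouping by type makes it easy to route, for instance, organization names to one downstream field and phone numbers to another.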

For more information, see:

Natural Language API
AutoML Natural Language Entity Extraction

Parse tables and extract key/value pairs from images (Alpha)

You can extract table data and key/value pairs from PDF and TIFF files using Google's Vision API. For example, you can extract metadata from invoice images and keep field names and values paired together.

These document parsing features are currently in private alpha. To request access to the private alpha, fill out the Request access form.

Create a knowledge base (Alpha)

You can create a knowledge base populated with your free-form or question/answer formatted text documents using Google's Document Understanding AI API. Once the knowledge base is populated, you can search its content using semantic matching and question-and-answer style searches. You can also create a knowledge graph customized for your company or use case.

The Document Understanding AI API is currently in private alpha. To request access to the private alpha, fill out the Request access form.

Find a partner

Take advantage of our growing partner ecosystem to help you create and manage your document understanding solution. For a list of partners and the services that they provide, see Cloud Document Understanding AI Partners.
