This page shows how to get started with the Cloud Client Libraries for the Document AI Toolbox API. Read more about the client libraries for Cloud APIs, including the older Google API Client Libraries, in Client Libraries Explained.
Install the client library
Python
For more information, see Setting Up a Python Development Environment.
pip install --upgrade google-cloud-documentai-toolbox
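After the install command finishes, a quick check from Python can confirm that the package is visible in the active environment. The following is a minimal sketch using only the standard library; installed_version is a hypothetical helper name, not part of the toolbox.

```python
from importlib import metadata
from typing import Optional

# Distribution name taken from the pip command above.
PACKAGE_NAME = "google-cloud-documentai-toolbox"


def installed_version(package_name: str = PACKAGE_NAME) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print(f"{PACKAGE_NAME} is not installed in this environment")
    else:
        print(f"{PACKAGE_NAME} {version} is installed")
```

If the check reports that the package is missing, the usual cause is that pip installed into a different interpreter or virtual environment than the one running the script.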
Set up authentication
When you use client libraries, you use Application Default Credentials (ADC) to authenticate. For information about setting up ADC, see Provide credentials for Application Default Credentials. For information about using ADC with client libraries, see Authenticate using client libraries.
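When debugging authentication locally, it can help to see whether a file-based credential is present at all. The sketch below is a rough diagnostic, not the lookup the client libraries perform: find_local_adc_file is a hypothetical helper, it assumes the GOOGLE_APPLICATION_CREDENTIALS environment variable and the default file written by gcloud auth application-default login, and it does not cover credentials supplied by the metadata server on Google Cloud compute environments.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_local_adc_file(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the first file-based ADC credential that exists, or None.

    Assumption: only the two local file sources are checked; the metadata
    server and any custom gcloud configuration directory are ignored.
    """
    # 1. An explicit credential file path set by the user.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit and Path(explicit).is_file():
        return Path(explicit)

    # 2. The default file written by `gcloud auth application-default login`.
    if os.name == "nt":
        base = Path(env.get("APPDATA", "")) / "gcloud"
    else:
        base = Path(env.get("HOME", str(Path.home()))) / ".config" / "gcloud"
    well_known = base / "application_default_credentials.json"
    return well_known if well_known.is_file() else None


if __name__ == "__main__":
    found = find_local_adc_file()
    print(found if found else "No file-based ADC credential found")
```

A result of None does not by itself mean authentication will fail, since ADC can also use a service account attached to the resource the code runs on.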
Use the client library
Document AI Toolbox is an SDK for Python that provides utility
functions for managing, manipulating, and extracting information from the document response.
It creates a "wrapped" document object from a processed document response, which can come from
JSON files in Cloud Storage, local JSON files, or output directly from the process_document()
method.
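The three sources named above can be sketched as a single loader. This is an illustration under assumptions, not the documented usage: choose_source and load_wrapped_document are hypothetical helpers, and the constructor names from_document_path, from_gcs, and from_documentai_document, along with their keyword arguments, are assumed from the toolbox's document module and should be checked against the API reference for the installed version.

```python
from typing import Any, Optional


def choose_source(
    local_path: Optional[str],
    gcs_bucket_name: Optional[str],
    gcs_prefix: Optional[str],
    documentai_document: Any,
) -> str:
    """Return "local", "gcs", or "response"; require exactly one source."""
    has_local = local_path is not None
    has_gcs = gcs_bucket_name is not None and gcs_prefix is not None
    has_response = documentai_document is not None
    if sum([has_local, has_gcs, has_response]) != 1:
        raise ValueError(
            "Provide exactly one source: a local path, a Cloud Storage "
            "bucket and prefix, or a process_document() result."
        )
    if has_local:
        return "local"
    if has_gcs:
        return "gcs"
    return "response"


def load_wrapped_document(
    local_path: Optional[str] = None,
    gcs_bucket_name: Optional[str] = None,
    gcs_prefix: Optional[str] = None,
    documentai_document: Any = None,
) -> Any:
    """Build a wrapped document from one of the three sources."""
    source = choose_source(local_path, gcs_bucket_name, gcs_prefix, documentai_document)

    # Imported lazily so that argument validation works without the package.
    from google.cloud.documentai_toolbox import document

    # Assumed constructor names and keyword arguments; verify before relying on them.
    if source == "local":
        return document.Document.from_document_path(document_path=local_path)
    if source == "gcs":
        return document.Document.from_gcs(
            gcs_bucket_name=gcs_bucket_name, gcs_prefix=gcs_prefix
        )
    return document.Document.from_documentai_document(
        documentai_document=documentai_document
    )
```

For example, load_wrapped_document(local_path="output/document.json") would read a local JSON file, where the path shown is a placeholder.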
It can perform the following actions:
- Combine fragmented Document JSON files from Batch Processing into a single "wrapped" document.
- Export shards as a unified Document.
- Get Document output from:
- Access text from Pages, Lines, Paragraphs, FormFields, and Tables without handling Layout information.
- Search for Pages containing a target string or matching a regular expression.
- Search for FormFields by name.
- Search for Entities by type.
- Convert Tables to a pandas DataFrame or CSV.
- Insert Entities and FormFields into a BigQuery table.
- Split a PDF file based on output from a Splitter/Classifier processor.
- Extract image Entities from Document bounding boxes.
- Convert Documents to and from commonly used formats:
  - Cloud Vision API AnnotateFileResponse
  - hOCR
  - Third-party document processing formats
- Create batches of documents for processing from a Cloud Storage folder.
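A few of the read-only actions in this list can be sketched against a wrapped document. The helpers below are hypothetical and written for illustration; they assume the wrapped document exposes pages (each with text and form_fields, whose items have field_name and field_value) and entities (each with type_ and mention_text). The toolbox provides its own methods for searching and conversion, which this sketch does not reproduce. The demo at the end uses a stand-in object so the code runs without credentials or the package installed.

```python
import re
from types import SimpleNamespace
from typing import Any, Dict, List


def page_texts(wrapped_document: Any) -> List[str]:
    """Collect the text of every page, in page order."""
    return [page.text for page in wrapped_document.pages]


def pages_matching(wrapped_document: Any, pattern: str) -> List[int]:
    """Return zero-based indexes of pages whose text matches a regular expression."""
    regex = re.compile(pattern)
    return [
        index
        for index, page in enumerate(wrapped_document.pages)
        if regex.search(page.text)
    ]


def form_fields_as_dict(wrapped_document: Any) -> Dict[str, str]:
    """Map form field names to values; later pages overwrite repeated names."""
    fields: Dict[str, str] = {}
    for page in wrapped_document.pages:
        for field in page.form_fields:
            fields[field.field_name] = field.field_value
    return fields


def entities_by_type(wrapped_document: Any) -> Dict[str, List[str]]:
    """Group entity mention text by entity type."""
    grouped: Dict[str, List[str]] = {}
    for entity in wrapped_document.entities:
        grouped.setdefault(entity.type_, []).append(entity.mention_text)
    return grouped


if __name__ == "__main__":
    # Stand-in object with the assumed attribute names, not a real wrapped document.
    demo = SimpleNamespace(
        pages=[
            SimpleNamespace(
                text="Invoice 42",
                form_fields=[SimpleNamespace(field_name="Total:", field_value="$10")],
            ),
            SimpleNamespace(text="Page two", form_fields=[]),
        ],
        entities=[SimpleNamespace(type_="invoice_id", mention_text="42")],
    )
    print(page_texts(demo))
    print(pages_matching(demo, r"Invoice \d+"))
    print(form_fields_as_dict(demo))
    print(entities_by_type(demo))
```

Because the helpers rely only on attribute names, they can be exercised in unit tests with stand-in objects before being pointed at real processor output.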
Code Samples
The following code samples demonstrate how to use Document AI Toolbox.