Document AI allows you to identify and extract text from documents in over 200 languages for printed text and 50 languages for handwritten text.
Before you can send a processing request for a document, you must first create a Document OCR processor. The type of processor you create and use for your request affects the output you receive*.
* This information applies to API version v1beta3 or higher.
Use the process method to request processing of a smaller file (<=5 pages for most processors), and the batchProcess method for larger files (files with a large number of pages). You can check the status of batch (asynchronous) requests using the operations resource.
Processor details
File types supported | PDF, TIFF, GIF |
Maximum number of pages (online/synchronous) | 10 |
Maximum number of pages (offline/asynchronous/batch) | 500 |
Maximum file size | 20 MB |
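The page limits in the table above can be expressed as a small routing rule. The following is a minimal sketch, not part of the Document AI API: the function name choose_method and its parameters are hypothetical, and the default limits are simply copied from the table (the limit for your processor may be lower, such as the 5-page guideline mentioned earlier).

```python
def choose_method(page_count, sync_limit=10, batch_limit=500):
    """Pick the request method for a document with `page_count` pages.

    Hypothetical helper: the default limits are copied from the
    "Processor details" table and may not match every processor.
    """
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if page_count <= sync_limit:
        # Small documents fit in an online (synchronous) request.
        return "process"
    if page_count <= batch_limit:
        # Larger documents need an offline (asynchronous) request.
        return "batchProcess"
    raise ValueError(
        f"{page_count} pages exceeds the batch limit of {batch_limit}"
    )


if __name__ == "__main__":
    for pages in (3, 10, 11, 500):
        print(pages, "->", choose_method(pages))
```

A document over the batch limit would have to be split before it is sent, which this sketch leaves to the caller.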
Small file online processing
Synchronous ("online") requests target documents with a small number of pages and a small file size. Synchronous requests immediately return a response inline.
The following code samples show you how to send a process request to a Document OCR processor.
v1beta3
REST & CMD LINE
This sample shows how to use the process method to request small document processing (<=5 pages). The example uses the access token for a service account set up for the project using the Cloud SDK. For instructions on installing the Cloud SDK, setting up a project with a service account, and obtaining an access token, see Before you begin.
Before using any of the request data below, make the following replacements:
- LOCATION: one of the following regional processing options:
  - us: United States
  - eu: European Union
- PROJECT_ID: your GCP project ID.
- PROCESSOR_ID: the ID of your custom processor.
- MIME_TYPE: one of the valid MIME type options:
  - application/pdf
  - image/gif
  - image/tiff
- IMAGE_CONTENT: inline document content, represented as a stream of bytes. For JSON representations, use the base64 encoding (ASCII string) of your binary image data. This string should look similar to the following string:
  /9j/4QAYRXhpZgAA...9tAVx/zDQDlGxn//2Q==
HTTP method and URL:
POST https://LOCATION-documentai.googleapis.com/v1beta3/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process
Request JSON body:
{ "document": { "mimeType": "MIME_TYPE", "content": "IMAGE_CONTENT" } }
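The URL and request body above can also be assembled programmatically. The following is a minimal sketch under stated assumptions: the helper names process_url and build_process_body are hypothetical, and the set of accepted MIME types is copied from the list above. It only builds the request data and does not send anything.

```python
import base64
import json

# MIME types copied from the replacement list above.
VALID_MIME_TYPES = {"application/pdf", "image/gif", "image/tiff"}


def process_url(location, project_id, processor_id):
    """Fill the LOCATION, PROJECT_ID and PROCESSOR_ID placeholders."""
    return (
        f"https://{location}-documentai.googleapis.com/v1beta3/"
        f"projects/{project_id}/locations/{location}/"
        f"processors/{processor_id}:process"
    )


def build_process_body(raw_bytes, mime_type):
    """Return the request body with the document content base64-encoded."""
    if mime_type not in VALID_MIME_TYPES:
        raise ValueError(f"unsupported MIME type: {mime_type}")
    # JSON cannot carry raw bytes, so encode them as an ASCII string.
    content = base64.b64encode(raw_bytes).decode("ascii")
    return {"document": {"mimeType": mime_type, "content": content}}


if __name__ == "__main__":
    # Placeholder bytes; a real call would read the file in binary mode.
    body = build_process_body(b"example document bytes", "application/pdf")
    print(process_url("us", "my-project", "my-processor"))
    print(json.dumps(body, indent=2))
```

The printed JSON has the same shape as request.json, so it could be written to that file and sent with either of the commands that follow.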
To send your request, choose one of these options:
curl
Save the request body in a file called request.json, and execute the following command:
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  "https://LOCATION-documentai.googleapis.com/v1beta3/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process"
PowerShell
Save the request body in a file called request.json, and execute the following command:
$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-documentai.googleapis.com/v1beta3/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. The response body contains an instance of Document.
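Once the JSON response is available, the recognized text can be read from the Document's text field. The following is a minimal sketch: the function name extract_text is hypothetical, and the handling of a wrapping "document" field is an assumption made so the sketch tolerates both a bare Document and a response object that contains one.

```python
import json


def extract_text(response_json):
    """Return the OCR text from a process response given as a JSON string."""
    payload = json.loads(response_json)
    # Assumption: the Document may be wrapped in a "document" field;
    # fall back to treating the payload itself as the Document.
    document = payload.get("document", payload)
    # "text" holds the full text of the document; default to empty.
    return document.get("text", "")


if __name__ == "__main__":
    sample = '{"document": {"mimeType": "application/pdf", "text": "Hello"}}'
    print(extract_text(sample))
```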
Large file offline processing
Asynchronous ("offline") requests target longer documents. These requests start a long-running operation. When the operation finishes, it stores the output as a JSON file in a specified Cloud Storage bucket.
The following code samples show you how to send a batch process request to a Document OCR processor.
v1beta3
REST & CMD LINE
This sample shows how to send a POST request to the batchProcess method for large document asynchronous processing. The example uses the access token for a service account set up for the project using the Cloud SDK. For instructions on installing the Cloud SDK, setting up a project with a service account, and obtaining an access token, see Before you begin.
A batchProcess request starts a long-running operation and stores results in a Cloud Storage bucket. This sample also shows how to get the status of this long-running operation after it has started.
Send the process request
Before using any of the request data below, make the following replacements:
- LOCATION: one of the following regional processing options:
  - us: United States
  - eu: European Union
- PROJECT_ID: your GCP project ID.
- PROCESSOR_ID: the ID of your custom processor.
- STORAGE_URI: the URI of the document you want to process stored in a Cloud Storage bucket, including the gs:// prefix. You must have at least read privileges to the file. Example: gs://cloud-samples-data/documentai/loan_form.pdf
- MIME_TYPE: one of the valid MIME type options:
  - application/pdf
  - image/gif
  - image/tiff
- OUTPUT_BUCKET: a Cloud Storage bucket/directory to save output files to, expressed in the following form: gs://bucket/directory/
HTTP method and URL:
POST https://LOCATION-documentai.googleapis.com/v1beta3/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:batchProcess
Request JSON body:
{ "inputConfigs": [ { "gcsSource": "STORAGE_URI", "mimeType": "MIME_TYPE" } ], "outputConfig": { "gcsDestination": "OUTPUT_BUCKET" } }
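The batch request body can be built and checked before it is saved. The following is a minimal sketch that mirrors the JSON shape shown above: the function name build_batch_body is hypothetical, and the gs:// check and the trailing-slash normalization are assumptions based on the replacement descriptions, not documented validation rules.

```python
import json


def build_batch_body(storage_uri, mime_type, output_bucket):
    """Return a batchProcess request body shaped like the sample above."""
    for label, uri in (("storage_uri", storage_uri),
                       ("output_bucket", output_bucket)):
        # Both values are described as Cloud Storage URIs.
        if not uri.startswith("gs://"):
            raise ValueError(f"{label} must start with gs://, got {uri!r}")
    # Assumption: normalize to the gs://bucket/directory/ form.
    if not output_bucket.endswith("/"):
        output_bucket += "/"
    return {
        "inputConfigs": [
            {"gcsSource": storage_uri, "mimeType": mime_type}
        ],
        "outputConfig": {"gcsDestination": output_bucket},
    }


if __name__ == "__main__":
    body = build_batch_body(
        "gs://cloud-samples-data/documentai/loan_form.pdf",
        "application/pdf",
        "gs://bucket/directory",
    )
    print(json.dumps(body, indent=2))
```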
To send your request, choose one of these options:
curl
Save the request body in a file called request.json, and execute the following command:
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  "https://LOCATION-documentai.googleapis.com/v1beta3/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:batchProcess"
PowerShell
Save the request body in a file called request.json, and execute the following command:
$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-documentai.googleapis.com/v1beta3/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:batchProcess" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{ "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID" }
If the request is successful, Document AI returns the name of your operation.
Get the results
To get the results of your request, you must send a GET request to the operations resource. The following shows how to send such a request.
Before using any of the request data below, make the following replacements:
- LOCATION: one of the following regional processing options:
  - us: United States
  - eu: European Union
- PROJECT_ID: your GCP project ID.
- OPERATION_ID: the ID of your operation. The ID is the last element of the name of your operation. For example:
  - operation name: projects/PROJECT_ID/locations/LOCATION/operations/bc4e1d412863e626
  - operation id: bc4e1d412863e626
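Because the operation ID is the last element of the operation name, it can be extracted with simple string handling. The following is a minimal sketch; the function names operation_id_from_name and operation_url are hypothetical, and the URL layout is taken from the GET request shown in this section.

```python
def operation_id_from_name(name):
    """Return the last path element of an operation name."""
    _, separator, operation_id = name.rpartition("/operations/")
    # Reject names that lack the /operations/ segment or a usable ID.
    if not separator or not operation_id or "/" in operation_id:
        raise ValueError(f"not an operation name: {name!r}")
    return operation_id


def operation_url(location, name):
    """Build the GET URL for an operation from its full name."""
    return f"https://{location}-documentai.googleapis.com/v1beta3/{name}"


if __name__ == "__main__":
    name = "projects/PROJECT_ID/locations/LOCATION/operations/bc4e1d412863e626"
    print(operation_id_from_name(name))
    print(operation_url("us", name))
```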
HTTP method and URL:
GET https://LOCATION-documentai.googleapis.com/v1beta3/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  "https://LOCATION-documentai.googleapis.com/v1beta3/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID"
PowerShell
Execute the following command:
$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-documentai.googleapis.com/v1beta3/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.documentai.v1beta3.OperationMetadata",
    "state": "SUCCEEDED",
    "createTime": "2019-11-19T00:36:37.310474834Z",
    "updateTime": "2019-11-19T00:37:10.682615795Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.documentai.v1beta3.BatchProcessDocumentsResponse",
    "responses": [
      {
        "inputConfig": {
          "gcsSource": { "uri": "gs://INPUT_FILE" },
          "mimeType": "application/pdf"
        },
        "outputConfig": {
          "gcsDestination": { "uri": "gs://OUTPUT_BUCKET/" }
        }
      }
    ]
  }
}
The response body contains an instance of Document in its standard format with any information relevant to batch processing (shardInfo).
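Checking the operation status is typically repeated until the operation reports that it is done. The following is a minimal polling sketch under stated assumptions: wait_for_operation and output_uris are hypothetical names, fetch is a caller-supplied function that performs the GET request and returns the parsed JSON, and the field layout is taken from the sample response shown in this section.

```python
import time


def wait_for_operation(fetch, interval_seconds=5.0, max_attempts=60,
                       sleep=time.sleep):
    """Call `fetch` until the returned operation has "done": true."""
    for attempt in range(max_attempts):
        operation = fetch()
        if operation.get("done"):
            return operation
        if attempt < max_attempts - 1:
            # Wait before asking again; skipped after the last attempt.
            sleep(interval_seconds)
    raise TimeoutError(f"operation not done after {max_attempts} attempts")


def output_uris(operation):
    """Return the Cloud Storage output URIs from a finished operation."""
    if "error" in operation:
        # A long-running operation reports failure in its "error" field.
        raise RuntimeError(f"operation failed: {operation['error']}")
    responses = operation.get("response", {}).get("responses", [])
    return [r["outputConfig"]["gcsDestination"]["uri"] for r in responses]


if __name__ == "__main__":
    # Simulated server replies stand in for real GET requests.
    replies = iter([
        {"done": False},
        {"done": True, "response": {"responses": [
            {"outputConfig": {"gcsDestination": {"uri": "gs://OUTPUT_BUCKET/"}}}
        ]}},
    ])
    finished = wait_for_operation(lambda: next(replies), interval_seconds=0)
    print(output_uris(finished))
```

Passing fetch and sleep as parameters keeps the loop independent of any HTTP client and makes it easy to exercise without network access.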