Parsing invoices

This page describes how to parse an invoice document with Document AI.

Document AI can detect and parse text in PDF, TIFF, and GIF files stored in Cloud Storage, including unstructured text that carries invoice data.

For a small file (<=5 pages), request invoice parsing with the process method; for a larger file, use the batchProcess method. You can check the status of batch (asynchronous) requests using the operations resource. Output from a batch request is written to a JSON file created in the specified Cloud Storage bucket.

Example invoice:

[Figure: invoice example with API annotations]

Small file online processing

Synchronous ("online") requests target a small document (<=5 pages, < 20MB) stored in Cloud Storage. A synchronous request returns its response inline, immediately.

The following code samples show you how to process an invoice document.

REST & CMD LINE

This sample shows how to use the process method to request small document processing (<=5 pages, < 20MB). The example uses the access token for a service account set up for the project using the Cloud SDK. For instructions on installing the Cloud SDK, setting up a project with a service account, and obtaining an access token, see Before you begin.

The sample request body contains required fields (inputConfig) and the field to set invoice processing (documentType).

Before using any of the request data below, make the following replacements:

  • project-id: Your GCP project ID.
  • input-storage-file: The URI of the document you want to process, stored in a Cloud Storage bucket, including the gs:// prefix. You must have at least read access to the file. Example:
    • gs://cloud-samples-data/documentai/invoice.pdf

HTTP method and URL:

POST https://us-documentai.googleapis.com/v1beta2/projects/project-id/locations/us/documents:process

Request JSON body:

{
   "inputConfig":{
      "gcsSource":{
         "uri":"input-storage-file"
      },
      "mimeType":"application/pdf"
   },
   "documentType":"invoice"
}

To send your request, choose one of these options:

curl

Save the request body in a file called request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://us-documentai.googleapis.com/v1beta2/projects/project-id/locations/us/documents:process

PowerShell

Save the request body in a file called request.json, and execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-documentai.googleapis.com/v1beta2/projects/project-id/locations/us/documents:process" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. The response body contains an instance of Document, with invoice entities and entityRelations identified.
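For callers who prefer a script to curl, the same synchronous request can be assembled with Python's standard library. This is a minimal sketch, not an official client: the function name build_process_request, the project ID my-project, and the ACCESS_TOKEN placeholder are assumptions for illustration, and the request is only built, not sent.

```python
import json
import urllib.request

API_ROOT = "https://us-documentai.googleapis.com/v1beta2"


def build_process_request(project_id, gcs_uri, access_token,
                          mime_type="application/pdf"):
    """Return an unsent urllib Request for the documents:process method."""
    if not gcs_uri.startswith("gs://"):
        raise ValueError("input URI must include the gs:// prefix")
    # Same body as the request JSON shown earlier on this page.
    body = {
        "inputConfig": {
            "gcsSource": {"uri": gcs_uri},
            "mimeType": mime_type,
        },
        "documentType": "invoice",
    }
    url = f"{API_ROOT}/projects/{project_id}/locations/us/documents:process"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


request = build_process_request(
    "my-project",  # hypothetical project ID
    "gs://cloud-samples-data/documentai/invoice.pdf",
    "ACCESS_TOKEN",  # placeholder; obtain a real token with gcloud
)
```

To send the request, pass it to urllib.request.urlopen and decode the JSON response body.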

Large file offline processing

Asynchronous ("offline") requests target longer documents and allow you to set the number of pages in each output file. The request starts a long-running operation. When the operation finishes, it stores the output as a JSON file in the specified Cloud Storage bucket.

Asynchronous invoice parsing accepts PDF, TIFF, and GIF files of up to 10 pages. Attempting to process larger files returns an error.
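The limits stated on this page (<=5 pages and < 20MB for process, up to 10 pages for batchProcess) can be turned into a small pre-flight check. The sketch below is illustrative only: the function name choose_method is hypothetical, "20MB" is assumed to mean 20 * 1024 * 1024 bytes, and no size limit is assumed for batch requests because this page does not state one.

```python
SYNC_MAX_PAGES = 5                  # stated limit for the process method
SYNC_MAX_BYTES = 20 * 1024 * 1024   # "< 20MB", assuming binary megabytes
BATCH_MAX_PAGES = 10                # stated limit for batchProcess


def choose_method(page_count, size_bytes):
    """Pick the API method name for a document of the given size."""
    if page_count < 1:
        raise ValueError("a document needs at least one page")
    if page_count <= SYNC_MAX_PAGES and size_bytes < SYNC_MAX_BYTES:
        return "process"
    if page_count <= BATCH_MAX_PAGES:
        return "batchProcess"
    raise ValueError(
        f"{page_count} pages exceeds the {BATCH_MAX_PAGES}-page limit"
    )


small = choose_method(page_count=3, size_bytes=1_000_000)
large = choose_method(page_count=8, size_bytes=1_000_000)
```

Checking limits before sending avoids a round trip that would end in an error response.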

The following code samples show you how to parse a large invoice document asynchronously.

REST & CMD LINE

This sample shows how to send a POST request to the batchProcess method for large document asynchronous processing. The example uses the access token for a service account set up for the project using the Cloud SDK. For instructions on installing the Cloud SDK, setting up a project with a service account, and obtaining an access token, see Before you begin.

The sample request body contains required fields (inputConfig, outputConfig) and the field to set invoice processing (documentType).

A batchProcess request starts a long-running operation and stores results in a Cloud Storage bucket. This sample also shows how to get the status of this long-running operation after it has started.

Send the process request

Before using any of the request data below, make the following replacements:

  • project-id: Your GCP project ID.
  • input-storage-file: The URI of the document you want to process, stored in a Cloud Storage bucket, including the gs:// prefix. You must have at least read access to the file. Example:
    • gs://cloud-samples-data/documentai/invoice.pdf
  • output-storage-bucket: A Cloud Storage bucket/directory to save output files to, expressed in the following form:
    • gs://bucket/directory/
    The requesting user must have write permission to the bucket.

HTTP method and URL:

POST https://us-documentai.googleapis.com/v1beta2/projects/project-id/locations/us/documents:batchProcess

Request JSON body:

{
  "requests": [
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "input-storage-file"
        },
        "mimeType": "application/pdf"
      },
      "outputConfig": {
        "pagesPerShard": 1,
        "gcsDestination": {
          "uri": "output-storage-bucket"
        }
      },
      "documentType": "invoice"
    }
  ]
}
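The same batch request body can be generated in code, which helps when the input and output URIs vary between runs. This is a sketch under stated assumptions: build_batch_request_body is a hypothetical helper name, and gs://bucket/directory/ is the placeholder form from the replacement list, not a real bucket.

```python
import json


def build_batch_request_body(input_uri, output_uri, pages_per_shard=1,
                             mime_type="application/pdf"):
    """Return the batchProcess request body as a Python dict."""
    for uri in (input_uri, output_uri):
        if not uri.startswith("gs://"):
            raise ValueError(f"expected a gs:// URI, got {uri!r}")
    if pages_per_shard < 1:
        raise ValueError("pagesPerShard must be at least 1")
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": input_uri},
                    "mimeType": mime_type,
                },
                "outputConfig": {
                    "pagesPerShard": pages_per_shard,
                    "gcsDestination": {"uri": output_uri},
                },
                "documentType": "invoice",
            }
        ]
    }


body = build_batch_request_body(
    "gs://cloud-samples-data/documentai/invoice.pdf",
    "gs://bucket/directory/",  # placeholder output location
)
request_json = json.dumps(body, indent=2)
```

Writing request_json to a file named request.json produces input suitable for the curl and PowerShell commands on this page.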

To send your request, choose one of these options:

curl

Save the request body in a file called request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://us-documentai.googleapis.com/v1beta2/projects/project-id/locations/us/documents:batchProcess

PowerShell

Save the request body in a file called request.json, and execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-documentai.googleapis.com/v1beta2/projects/project-id/locations/us/documents:batchProcess" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/project-id/operations/operation-id"
}

If the request is successful, Document AI returns the name of your operation.
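The returned name carries both the project ID and the operation ID needed for the status request. A minimal sketch of splitting it apart follows; parse_operation_name is a hypothetical helper, the name is assumed to have exactly the projects/project-id/operations/operation-id shape shown above, and my-project is an illustrative project ID.

```python
API_ROOT = "https://us-documentai.googleapis.com/v1beta2"


def parse_operation_name(name):
    """Split an operation name and build the URL used to check its status."""
    parts = name.split("/")
    well_formed = (
        len(parts) == 4
        and parts[0] == "projects"
        and parts[2] == "operations"
        and parts[1] != ""
        and parts[3] != ""
    )
    if not well_formed:
        raise ValueError(f"unexpected operation name: {name!r}")
    return {
        "project_id": parts[1],
        "operation_id": parts[3],
        "status_url": f"{API_ROOT}/{name}",
    }


info = parse_operation_name("projects/my-project/operations/4e2b314779b999b5")
```

Validating the shape up front keeps a malformed name from turning into a confusing 404 on the status request.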

Get the results

To get the results of your request, you must send a GET request to the operations resource. The following shows how to send such a request.

Before using any of the request data below, make the following replacements:

  • project-id: Your GCP project ID.
  • operation-id: The ID of the operation returned by Document AI.

HTTP method and URL:

GET https://us-documentai.googleapis.com/v1beta2/projects/project-id/operations/operation-id

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
https://us-documentai.googleapis.com/v1beta2/projects/project-id/operations/operation-id

PowerShell

Execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://us-documentai.googleapis.com/v1beta2/projects/project-id/operations/operation-id" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/bucket-id/operations/4e2b314779b999b5",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.documentai.v1beta2.OperationMetadata",
    "state": "SUCCEEDED",
    "createTime": "2019-11-19T00:36:37.310474834Z",
    "updateTime": "2019-11-19T00:37:10.682615795Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.documentai.v1beta2.BatchProcessDocumentsResponse",
    "responses": [
      {
        "inputConfig": {
          "gcsSource": {
            "uri": "gs://input-file"
          },
          "mimeType": "application/pdf"
        },
        "outputConfig": {
          "gcsDestination": {
            "uri": "gs://output-bucket/"
          }
        }
      }
    ]
  }
}
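Because the operation is long-running, a client usually repeats the GET request until done is true. The sketch below shows one way to do that over the JSON shape shown above. The names summarize_operation and wait_for_operation are hypothetical, fetch is a caller-supplied function that performs the GET and returns the decoded JSON, and the error field is the standard field of a long-running operation that reports failure.

```python
import time


def summarize_operation(operation):
    """Reduce an operations response to its state and output locations."""
    if "error" in operation:
        message = operation["error"].get("message", "operation failed")
        raise RuntimeError(message)
    outputs = [
        item["outputConfig"]["gcsDestination"]["uri"]
        for item in operation.get("response", {}).get("responses", [])
    ]
    return {
        "done": operation.get("done", False),
        "state": operation.get("metadata", {}).get("state"),
        "outputs": outputs,
    }


def wait_for_operation(fetch, attempts=10, delay_seconds=5.0,
                       sleep=time.sleep):
    """Call fetch() until the operation reports done or attempts run out."""
    for attempt in range(attempts):
        summary = summarize_operation(fetch())
        if summary["done"]:
            return summary
        if attempt < attempts - 1:
            sleep(delay_seconds)  # pause before asking again
    raise TimeoutError("operation did not finish in time")
```

Passing fetch and sleep as parameters keeps the loop independent of any HTTP library and makes it easy to exercise with canned responses.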

Processing output should look similar to the following example. The response body contains an instance of Document. In this Document format, the response contains information relevant to invoice processing (entities and entityRelations) and batch processing (shardInfo).

This output is for a publicly accessible PDF file (gs://cloud-samples-data/documentai/invoice.pdf), with one page per shard. The output file is stored in the specified Cloud Storage bucket.

output-page-1-to-1.json:

Annotation list

The following is a list of example annotations returned in the entities and entityRelations fields of a Document response object.

This example list may be expanded in current or future API versions:

Field Annotations
entities
  • invoice_id
  • invoice_date
  • purchase_order
  • total_amount
  • total_tax_amount
  • amount_due
  • due_date
  • payment_terms
  • supplier_name
  • currency
  • receiver_name
  • receiver_address
  • delivery_date
  • supplier_address
  • supplier_tax_id
  • customer_tax_id
  • carrier
  • ship_to_name
  • ship_to_address
  • ship_from_name
  • ship_from_address
entityRelations
  • line_item/amount
  • line_item/unit_price
  • line_item/quantity
  • line_item/unit
  • line_item/description
  • line_item/product_code
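Once a Document has been loaded from the response or from an output JSON file, the annotations listed above can be collected by name. The following is a sketch only. It assumes each entry in entities carries type and mentionText keys and each entry in entityRelations carries a relation key; those key names are assumptions about the v1beta2 Document JSON and should be checked against a real response. The sample values are invented for illustration.

```python
from collections import defaultdict


def entities_by_type(document):
    """Group the text of each detected entity under its annotation type."""
    grouped = defaultdict(list)
    for entity in document.get("entities", []):
        # "type" and "mentionText" are assumed key names.
        grouped[entity.get("type", "")].append(entity.get("mentionText", ""))
    return dict(grouped)


def relation_names(document):
    """Return the distinct relation labels found in entityRelations."""
    relations = document.get("entityRelations", [])
    # "relation" is an assumed key name.
    return sorted({item.get("relation", "") for item in relations})


# Hypothetical document fragment, not real API output.
sample_document = {
    "entities": [
        {"type": "invoice_id", "mentionText": "INV-001"},
        {"type": "total_amount", "mentionText": "120.00"},
        {"type": "currency", "mentionText": "USD"},
    ],
    "entityRelations": [
        {"relation": "line_item/amount"},
        {"relation": "line_item/quantity"},
    ],
}

fields = entities_by_type(sample_document)
relations = relation_names(sample_document)
```

Grouping into lists rather than single values allows for the same annotation type being detected more than once in a document.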