Quickstart: Large file asynchronous processing

This quickstart introduces you to Document AI. You set up your Google Cloud project and authorization, and then send a request for Document AI to process a table in a PDF document.

Before you begin

  1. Sign in to your Google Account.

    If you don't already have one, sign up for a new account.

  2. In the Cloud Console, on the project selector page, select or create a Cloud project.

    Go to the project selector page

  3. Make sure that billing is enabled for your Google Cloud project. Learn how to confirm billing is enabled for your project.

  4. Enable the Cloud Document AI API.

    Enable the API

  5. Set up authentication:
    1. In the Cloud Console, go to the Create service account key page.

      Go to the Create Service Account Key page
    2. From the Service account list, select New service account.
    3. In the Service account name field, enter a name.
    4. From the Role list, select Project > Owner.

    5. Click Create. A JSON file that contains your key downloads to your computer.
  6. Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON file that contains your service account key. This variable only applies to your current shell session, so if you open a new session, set the variable again.

  7. Install and initialize the Cloud SDK.
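Step 6 in the list above can be sketched for a Bash-compatible shell as follows. The key file path is a hypothetical placeholder, not a path from this quickstart; substitute the location where the JSON key was saved on your computer.

```shell
# Hypothetical path: replace with the location of your downloaded JSON key file.
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/keys/service-account-key.json"

# Confirm that the variable is set in the current shell session.
echo "Using credentials file: $GOOGLE_APPLICATION_CREDENTIALS"
```

Because export only affects the current session, you would repeat it in each new shell or add it to your shell profile.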

Create a Cloud Storage bucket

You must have a Cloud Storage bucket to store the output from processing the document. To create a new bucket in your project, run the following command, replacing bucket-name with a name for your new bucket.

gsutil mb gs://bucket-name/

Assign permissions to your service account

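Your service account must have permission to create objects in your Cloud Storage bucket. Use the following command to assign the roles/storage.objectAdmin role to your service account, replacing service-account-name with the email address of the service account and bucket-name with the name of your bucket.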

gsutil iam ch serviceAccount:service-account-name:roles/storage.objectAdmin gs://bucket-name
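The member in the command above is identified by the service account's email address. The following is a minimal sketch of how the binding argument might be assembled. It assumes the usual NAME@PROJECT_ID.iam.gserviceaccount.com form for a user-created service account; the project, account, and bucket names are hypothetical, and the sketch only prints the resulting command instead of running it.

```shell
# Hypothetical values: replace with your own project, service account, and bucket.
PROJECT_ID="my-project"
SA_NAME="docai-quickstart"
BUCKET_NAME="my-docai-output"

# Assumption: user-created service accounts typically use this email form.
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

# gsutil iam ch expects a binding of the form serviceAccount:EMAIL:ROLE.
BINDING="serviceAccount:${SA_EMAIL}:roles/storage.objectAdmin"

# Print the command so it can be reviewed before running it.
echo gsutil iam ch "$BINDING" "gs://${BUCKET_NAME}"
```

If the printed command looks right, remove the leading echo to run it.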

Send the request to Document AI API

The following code sample shows you how to parse a table contained in a simple PDF document (shown below).

A table with three columns and six rows

REST & CMD LINE

Send the batch process request

The following shows how to send a POST request to the batchProcess method. The example uses an access token for the service account whose key you downloaded previously; the gcloud command embedded in the request obtains that token.

Before using any of the request data below, make the following replacements:

  • PROJECT_ID: Your GCP project ID.
  • OUTPUT_BUCKET: A Cloud Storage bucket/directory to save output files to, expressed in the following form:
    • gs://bucket/directory/
    The requesting user must have write permission to the bucket.

HTTP method and URL:

POST https://us-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/us/documents:batchProcess

Request JSON body:

{
  "requests": [
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "gs://cloud-samples-data/documentai/table_parsing_small.pdf"
        },
        "mimeType": "application/pdf"
      },
      "outputConfig": {
        "pagesPerShard": 1,
        "gcsDestination": {
          "uri": "OUTPUT_BUCKET"
        }
      },
      "documentType": "general",
      "tableExtractionParams": {
        "enabled": true,
        "tableBoundHints": [
          {
            "boundingBox": {
              "normalizedVertices": [
                {"x":0,"y":0},
                {"x":1,"y":0},
                {"x":1,"y":1},
                {"x":0,"y":1}
              ]
            }
          }
        ]
      }
    }
  ]
}
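As one way to apply the output bucket replacement described above, the request body can be written to request.json from the shell with the destination filled in from a variable. This is a sketch only: the bucket path is a hypothetical example, and the rest of the body is copied from the request shown above.

```shell
# Hypothetical destination: replace with your own bucket and directory.
OUTPUT_BUCKET="gs://my-docai-output/results/"

# Write the request body; the unquoted heredoc expands $OUTPUT_BUCKET.
cat > request.json <<EOF
{
  "requests": [
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "gs://cloud-samples-data/documentai/table_parsing_small.pdf"
        },
        "mimeType": "application/pdf"
      },
      "outputConfig": {
        "pagesPerShard": 1,
        "gcsDestination": {
          "uri": "${OUTPUT_BUCKET}"
        }
      },
      "documentType": "general",
      "tableExtractionParams": {
        "enabled": true,
        "tableBoundHints": [
          {
            "boundingBox": {
              "normalizedVertices": [
                {"x":0,"y":0},
                {"x":1,"y":0},
                {"x":1,"y":1},
                {"x":0,"y":1}
              ]
            }
          }
        ]
      }
    }
  ]
}
EOF
```

The four normalizedVertices are the corners of the whole page (0 to 1 on both axes), so the hint tells the parser to look for the table anywhere on the page.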

To send your request, choose one of these options:

curl

Save the request body in a file called request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/us/documents:batchProcess"

PowerShell

Save the request body in a file called request.json, and execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/us/documents:batchProcess" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_ID/operations/OPERATION_ID"
}

If the request is successful, Document AI returns the name of your operation.

Get the results

Processing and parsing a table can take a long time. Because of this, the call to batchProcess is asynchronous and starts a long-running process. When you send your first request, the API returns an operation name to help you retrieve the results of the operation later.

To get the status or results of your request, send a GET request for the operation, using the operation name that batchProcess returned. The following shows how to send such a request.

Before using any of the request data below, make the following replacements:

  • LOCATION: one of the following regional processing options:
    • us - United States
    • eu - European Union
  • PROJECT_ID: Your GCP project ID.
  • OPERATION_ID: The ID of your operation. The ID is the last element of the name of your operation. For example:
    • operation name: projects/PROJECT_ID/locations/LOCATION/operations/bc4e1d412863e626
    • operation id: bc4e1d412863e626
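Taking the last element of the operation name, as described above, can be done with shell parameter expansion. The sketch below uses the example operation name from the list; the variable names are illustrative only.

```shell
# Example operation name, in the form returned by the batchProcess call.
OPERATION_NAME="projects/PROJECT_ID/locations/LOCATION/operations/bc4e1d412863e626"

# Remove the longest prefix ending in "/" to keep only the last path element.
OPERATION_ID="${OPERATION_NAME##*/}"

echo "$OPERATION_ID"
```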

HTTP method and URL:

GET https://LOCATION-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
"https://LOCATION-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID"

PowerShell

Execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.documentai.v1beta2.OperationMetadata",
    "state": "SUCCEEDED",
    "createTime": "2019-11-19T00:36:37.310474834Z",
    "updateTime": "2019-11-19T00:37:10.682615795Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.documentai.v1beta2.BatchProcessDocumentsResponse",
    "responses": [
      {
        "inputConfig": {
          "gcsSource": {
            "uri": "gs://INPUT_FILE"
          },
          "mimeType": "application/pdf"
        },
        "outputConfig": {
          "gcsDestination": {
            "uri": "gs://OUTPUT_BUCKET/"
          }
        }
      }
    ]
  }
}
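Because the operation is long-running, a single GET request may return before processing has finished. The following is a minimal polling sketch that is not part of the original quickstart. The names fetch_operation and wait_for_operation are hypothetical helpers, the default delay and attempt limit are arbitrary, and the check for "done": true is a plain text match on the response rather than a JSON parser. It assumes LOCATION, PROJECT_ID, and OPERATION_ID are set as shell variables.

```shell
# Hypothetical helper: wraps the GET request shown above.
# Assumes LOCATION, PROJECT_ID, and OPERATION_ID are set in the shell.
fetch_operation() {
  curl -s -X GET \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    "https://${LOCATION}-documentai.googleapis.com/v1beta2/projects/${PROJECT_ID}/locations/${LOCATION}/operations/${OPERATION_ID}"
}

# Hypothetical helper: polls until the response contains "done": true.
# Arguments: delay between attempts in seconds (default 10),
#            maximum number of attempts (default 30).
# Prints the final response and returns 0, or returns 1 on timeout.
wait_for_operation() {
  local delay max_attempts attempt response
  delay="${1:-10}"
  max_attempts="${2:-30}"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    response="$(fetch_operation)"
    # Simple text match; tolerates optional whitespace around the colon.
    if printf '%s' "$response" | grep -Eq '"done"[[:space:]]*:[[:space:]]*true'; then
      printf '%s\n' "$response"
      return 0
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  echo "Operation not done after ${max_attempts} attempts" >&2
  return 1
}
```

With the three variables set, running wait_for_operation 10 30 would check every 10 seconds for up to 30 attempts. A finished operation can still have failed, so inspect the final response for an error field as well as the response field.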

The output from processing the document is written as JSON to the Cloud Storage location that you specified in gcsDestination.

Clean up

To avoid unnecessary Google Cloud charges, use the Cloud Console to delete your project if you do not need it.

What's next