Quickstart: Large file asynchronous processing

This quickstart introduces you to Document AI. In this quickstart, you set up your Google Cloud project and authorization and then make a request for Document AI to process a table in a PDF document.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.

  3. Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.

  4. Enable the Cloud Document AI API.

  5. Create a service account:

    1. In the Cloud Console, go to the Create service account page.
    2. Select a project.
    3. In the Service account name field, enter a name. The Cloud Console fills in the Service account ID field based on this name.

      In the Service account description field, enter a description. For example, Service account for quickstart.

    4. Click Create.
    5. Click the Select a role field.

      Under Quick access, click Basic, then click Owner.

    6. Click Continue.
    7. Click Done to finish creating the service account.

      Do not close your browser window. You will use it in the next step.

  6. Create a service account key:

    1. In the Cloud Console, click the email address for the service account that you created.
    2. Click Keys.
    3. Click Add key, then click Create new key.
    4. Click Create. A JSON key file is downloaded to your computer.
    5. Click Close.
  7. Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON file that contains your service account key. This variable only applies to your current shell session, so if you open a new session, set the variable again.

  8. Install and initialize the Cloud SDK.
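
Step 7 above can be done in a POSIX-style shell roughly as follows. This is a minimal sketch: the key file path is a hypothetical example, so replace it with the location where your browser saved the JSON key.

```shell
# Hypothetical path (an assumption for illustration); point this at your own key file.
KEY_PATH="${HOME}/Downloads/service-account-key.json"

# Export the variable so that tools started from this shell session can read it.
export GOOGLE_APPLICATION_CREDENTIALS="${KEY_PATH}"

# Warn early if the file is missing, because later commands would fail less clearly.
if [ ! -f "${GOOGLE_APPLICATION_CREDENTIALS}" ]; then
  echo "warning: key file not found at ${GOOGLE_APPLICATION_CREDENTIALS}" >&2
fi
```

Because the variable applies only to the current session, you would run these lines again in each new shell.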

Create a Cloud Storage bucket

You must have a Cloud Storage bucket to store the output from processing the document. To create a new bucket in your project, run the following command, replacing bucket-name with a name for your new bucket.

gsutil mb gs://bucket-name/

Assign permissions to your service account

Your service account must have permission to create objects in your Cloud Storage bucket. Use the following command to assign the roles/storage.objectAdmin role to your service account.

gsutil iam ch serviceAccount:service-account-name:roles/storage.objectAdmin gs://bucket-name
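
The two gsutil commands above can be combined into one small script. The following is a sketch under stated assumptions: the bucket name and service account email are hypothetical values, the service account is identified by its email address, and the run helper is an illustrative wrapper (not part of gsutil) that prints each command unless DRY_RUN=0 is set.

```shell
# Hypothetical values (assumptions); substitute your own bucket name and
# the email address of the service account you created.
BUCKET_NAME="my-docai-output-bucket"
SA_EMAIL="quickstart-sa@my-project-id.iam.gserviceaccount.com"

# Illustrative helper: print the command instead of running it unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Build the member-and-role string passed to `gsutil iam ch`.
iam_binding() {
  printf 'serviceAccount:%s:roles/storage.objectAdmin' "$1"
}

setup_bucket() {
  # Create the bucket, then grant the service account the object admin role.
  run gsutil mb "gs://${BUCKET_NAME}/"
  run gsutil iam ch "$(iam_binding "${SA_EMAIL}")" "gs://${BUCKET_NAME}"
  # Print the bucket's IAM policy so the new binding can be checked by eye.
  run gsutil iam get "gs://${BUCKET_NAME}"
}
```

Calling setup_bucket prints the three commands for review. Running it with DRY_RUN=0 set executes them instead.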

Send the request to Document AI API

The following code sample shows you how to parse a table contained in a simple PDF document (shown below).

A table with three columns and six rows

REST & CMD LINE

Send the batch process request

The following shows how to send a POST request to the batchProcess method. The example authenticates with an access token obtained for the service account whose key file you downloaded previously.

Before using any of the request data below, make the following replacements:

  • PROJECT_ID: Your GCP project ID.
  • OUTPUT_BUCKET: A Cloud Storage bucket/directory to save output files to, expressed in the following form:
    • gs://bucket/directory/
    The requesting user must have write permission to the bucket.

HTTP method and URL:

POST https://us-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/us/documents:batchProcess

Request JSON body:

{
  "requests": [
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "gs://cloud-samples-data/documentai/table_parsing_small.pdf"
        },
        "mimeType": "application/pdf"
      },
      "outputConfig": {
        "pagesPerShard": 1,
        "gcsDestination": {
          "uri": "OUTPUT_BUCKET"
        }
      },
      "documentType": "general",
      "tableExtractionParams": {
        "enabled": true,
        "tableBoundHints": [
          {
            "boundingBox": {
              "normalizedVertices": [
                {"x":0,"y":0},
                {"x":1,"y":0},
                {"x":1,"y":1},
                {"x":0,"y":1}
              ]
            }
          }
        ]
      }
    }
  ]
}
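
If you prefer not to edit the body by hand, one possible approach is a small shell function that writes it to a file with your output location filled in. This is a sketch, not an official tool: write_request is a hypothetical helper name, and the body it writes is the same request shown above.

```shell
# Write the batch request body to a file, filling in the output location.
# $1: Cloud Storage destination, for example gs://bucket/directory/
# $2: file to write (defaults to request.json)
write_request() {
  cat > "${2:-request.json}" <<EOF
{
  "requests": [
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "gs://cloud-samples-data/documentai/table_parsing_small.pdf"
        },
        "mimeType": "application/pdf"
      },
      "outputConfig": {
        "pagesPerShard": 1,
        "gcsDestination": {
          "uri": "$1"
        }
      },
      "documentType": "general",
      "tableExtractionParams": {
        "enabled": true,
        "tableBoundHints": [
          {
            "boundingBox": {
              "normalizedVertices": [
                {"x": 0, "y": 0},
                {"x": 1, "y": 0},
                {"x": 1, "y": 1},
                {"x": 0, "y": 1}
              ]
            }
          }
        ]
      }
    }
  ]
}
EOF
}
```

For example, write_request "gs://bucket/directory/" would create request.json in the current directory, with the destination URI replaced by your own.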

To send your request, choose one of these options:

curl

Save the request body in a file called request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://us-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/us/documents:batchProcess

PowerShell

Save the request body in a file called request.json, and execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/us/documents:batchProcess" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/project-id/operations/operation-id"
}

If the request is successful, Document AI returns the name of your operation.

Get the results

Processing and parsing a table can take a long time. Because of this, the call to batchProcess is asynchronous and starts a long-running operation. When you send the batchProcess request, the API returns an operation name that you use to retrieve the results later.

To get the status or results of your request, send a GET request to the operations resource, using the operation name that batchProcess returned. The following shows how to send such a request.

Before using any of the request data below, make the following replacements:

  • LOCATION: one of the following regional processing options:
    • us - United States
    • eu - European Union
  • PROJECT_ID: Your GCP project ID.
  • OPERATION_ID: The ID of your operation. The ID is the last element of the name of your operation. For example:
    • operation name: projects/PROJECT_ID/locations/LOCATION/operations/bc4e1d412863e626
    • operation id: bc4e1d412863e626
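
The ID extraction described above can be done with shell parameter expansion. The following sketch uses two hypothetical helper names, operation_id and operation_url, and assembles the request URL in the form this quickstart uses.

```shell
# Print the last path element of an operation name.
operation_id() {
  printf '%s\n' "${1##*/}"
}

# Build the GET URL for an operation.
# $1: location (us or eu), $2: project ID, $3: operation ID
operation_url() {
  printf 'https://%s-documentai.googleapis.com/v1beta2/projects/%s/locations/%s/operations/%s\n' \
    "$1" "$2" "$1" "$3"
}
```

For example, operation_id "projects/PROJECT_ID/locations/LOCATION/operations/bc4e1d412863e626" prints bc4e1d412863e626.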

HTTP method and URL:

GET https://LOCATION-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
https://LOCATION-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID

PowerShell

Execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.documentai.v1beta2.OperationMetadata",
    "state": "SUCCEEDED",
    "createTime": "2019-11-19T00:36:37.310474834Z",
    "updateTime": "2019-11-19T00:37:10.682615795Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.documentai.v1beta2.BatchProcessDocumentsResponse",
    "responses": [
      {
        "inputConfig": {
          "gcsSource": {
            "uri": "gs://INPUT_FILE"
          },
          "mimeType": "application/pdf"
        },
        "outputConfig": {
          "gcsDestination": {
            "uri": "gs://OUTPUT_BUCKET/"
          }
        }
      }
    ]
  }
}
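
Because the operation is long-running, a single GET request may return before "done" is true. One way to wait for completion is to repeat the request at an interval. The following is a sketch under stated assumptions: is_done, poll_operation, and fetch_operation are hypothetical helper names, the "done" check is a simple text match rather than a full JSON parse, and the attempt count and interval are arbitrary.

```shell
# Succeed if an operation response contains "done": true.
is_done() {
  printf '%s' "$1" | grep -Eq '"done"[[:space:]]*:[[:space:]]*true'
}

# Poll until the operation is done, then print the final response.
# $1: command that prints the operation JSON
# $2: maximum number of attempts
# $3: seconds to sleep between attempts
poll_operation() {
  attempt=1
  while [ "${attempt}" -le "$2" ]; do
    response="$("$1")"
    if is_done "${response}"; then
      printf '%s\n' "${response}"
      return 0
    fi
    sleep "$3"
    attempt=$((attempt + 1))
  done
  echo "operation not done after $2 attempts" >&2
  return 1
}

# Fetch function wrapping the curl command shown earlier.
# Replace LOCATION, PROJECT_ID, and OPERATION_ID before use.
fetch_operation() {
  curl -s -X GET \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    "https://LOCATION-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID"
}
```

For example, poll_operation fetch_operation 30 10 would check up to 30 times, 10 seconds apart. A finished operation can also report a failure, so check the state field in the final response before relying on the output.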

Document AI writes the output from processing the document to the Cloud Storage location shown in gcsDestination.

Clean up

To avoid unnecessary Google Cloud charges, use the Cloud Console to delete your project if you do not need it.

What's next