Managing long-running operations (LROs)

This page describes how to manage the lifecycle of a Document AI API Long-Running Operation (LRO).

Long-Running Operations are returned when method calls might take a long time to complete. The Document AI API creates an LRO every time you call projects.locations.processors.batchProcess through the API or Client Libraries. The LRO tracks the status of the processing job.

You can use the operations methods that the Document AI API provides to check the status of LROs. You can also list, poll, or cancel LROs.

LROs are managed at the Google Cloud project and location level. When making a request to the API, include the Google Cloud project and the location in which the LRO is running.

The record of an LRO is kept for approximately 30 days after the LRO finishes, meaning that you cannot view or list an LRO after that point.

Getting details about a long-running operation

The following samples show how to get details about an LRO.

To get the status of and view details about an LRO, call the projects.locations.operations.get method.

REST & CMD LINE

Suppose that you receive the following response after calling projects.locations.processors.batchProcess:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/operations/OPERATION_ID"
}

The name value in the response shows that the Document AI API created an LRO named projects/PROJECT_NUMBER/locations/LOCATION/operations/OPERATION_ID.

You can also retrieve the LRO name by listing long-running operations.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your Google Cloud project ID.
  • LOCATION: the location where the LRO is running, for example:
    • us - United States
    • eu - European Union
  • OPERATION_ID: The ID of your operation. The ID is the last element of the name of your operation. For example:
    • Operation name: projects/PROJECT_ID/locations/LOCATION/operations/bc4e1d412863e626
    • Operation id: bc4e1d412863e626

HTTP method and URL:

GET https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID"

PowerShell

Execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.documentai.v1.BatchProcessMetadata",
    "state": "SUCCEEDED",
    "stateMessage": "Processed 1 document(s) successfully",
    "createTime": "TIMESTAMP",
    "updateTime": "TIMESTAMP",
    "individualProcessStatuses": [
      {
        "inputGcsSource": "INPUT_BUCKET_FOLDER/DOCUMENT1.ext",
        "status": {},
        "outputGcsDestination": "OUTPUT_BUCKET_FOLDER/OPERATION_ID/0",
        "humanReviewStatus": {
          "state": "ERROR",
          "stateMessage": "Sharded document protos are not supported for human review."
        }
      }
    ]
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.documentai.v1.BatchProcessResponse"
  }
}

Listing long-running operations

The following samples show how to list the LROs in a Google Cloud project and location.

To list the LROs in a Google Cloud project and location, call the projects.locations.operations.list method.

REST & CMD LINE

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your Google Cloud project ID.
  • LOCATION: the location where one or more LROs are running, for example:
    • us - United States
    • eu - European Union
  • FILTER: (Required) Query for LROs to return. Options:
    • TYPE: (Required) LRO type to list. Options:
      • BATCH_PROCESS_DOCUMENTS
      • CREATE_PROCESSOR_VERSION
      • DELETE_PROCESSOR
      • ENABLE_PROCESSOR
      • DISABLE_PROCESSOR
      • UPDATE_HUMAN_REVIEW_CONFIG
      • HUMAN_REVIEW_EVENT
      • CREATE_LABELER_POOL
      • UPDATE_LABELER_POOL
      • DELETE_LABELER_POOL
      • DEPLOY_PROCESSOR_VERSION
      • UNDEPLOY_PROCESSOR_VERSION
      • DELETE_PROCESSOR_VERSION
      • SET_DEFAULT_PROCESSOR_VERSION
      • EVALUATE_PROCESSOR_VERSION
      • EXPORT_PROCESSOR_VERSION
      • UPDATE_DATASET
      • IMPORT_DOCUMENTS
      • ANALYZE_HITL_DATA
      • BATCH_MOVE_DOCUMENTS
      • RESYNC_DATASET
      • BATCH_DELETE_DOCUMENTS

HTTP method and URL:

GET https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations?filter=TYPE=TYPE

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations?filter=TYPE=TYPE"

PowerShell

Execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations?filter=TYPE=TYPE" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "operations": [
    {
      "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID",
      "metadata": {
        "@type": "type.googleapis.com/google.cloud.documentai.v1.BatchProcessMetadata",
        "state": "SUCCEEDED",
        "stateMessage": "Processed 1 document(s) successfully",
        "createTime": "TIMESTAMP",
        "updateTime": "TIMESTAMP",
        "individualProcessStatuses": [
          {
            "inputGcsSource": "INPUT_BUCKET_FOLDER/DOCUMENT1.ext",
            "status": {},
            "outputGcsDestination": "OUTPUT_BUCKET_FOLDER/OPERATION_ID/0",
            "humanReviewStatus": {
              "state": "ERROR",
              "stateMessage": "Sharded document protos are not supported for human review."
            }
          }
        ]
      },
      "done": true,
      "response": {
        "@type": "type.googleapis.com/google.cloud.documentai.v1.BatchProcessResponse"
      }
    },
    ...
  ]
}

Polling a long-running operation

The following samples show how to poll the status of an LRO.

REST & CMD LINE

To poll an LRO, repeatedly call the projects.locations.operations.get method until the operation finishes. Use a backoff between each poll request, such as 10 seconds.

Before using any of the request data below, make the following replacements:

  • PROJECT_ID: your Google Cloud project ID
  • LOCATION: the location where the LRO is running
  • OPERATION_ID: the identifier for the LRO

HTTP method and URL:

GET https://documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID

To send your request, choose one of these options:

curl

Execute the following command to poll for the status of an LRO every 10 seconds:

while true; \
    do curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    "https://documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID"; \
    sleep 10; \
    done

You should receive a JSON response every 10 seconds. While the operation is running, the response will contain "state": "RUNNING". When the operation finishes, the response will contain "state": "SUCCEEDED" and "done": true.

PowerShell

Execute the following command to poll for the status of an LRO every ten seconds:

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Do {
  Invoke-WebRequest `
    -Method Get `
    -Headers $headers `
    -Uri "https://documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID" | Select-Object -Expand Content
sleep 10
}

while ($true)

You should receive a JSON response every 10 seconds. While the operation is running, the response will contain "state": "RUNNING". When the operation finishes, the response will contain "state": "SUCCEEDED" and "done": true.

Cancelling a long-running operation

The following samples show how to cancel an LRO while it is running.

To cancel an LRO, call the projects.locations.operations.cancel method.

REST & CMD LINE

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your Google Cloud project ID.
  • LOCATION: the location where the LRO is running, for example:
    • us - United States
    • eu - European Union
  • OPERATION_ID: The ID of your operation. The ID is the last element of the name of your operation. For example:
    • Operation name: projects/PROJECT_ID/locations/LOCATION/operations/bc4e1d412863e626
    • Operation id: bc4e1d412863e626

HTTP method and URL:

POST https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID:cancel

To send your request, choose one of these options:

curl

Execute the following command:

curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
-d "" \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID:cancel "

PowerShell

Execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID:cancel " | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{}

If you attempt to cancel an operation that has already completed, you should receive the following error message:

    "error": {
        "code": 400,
        "message": "Operation has completed and cannot be cancelled: 'PROJECT_ID/locations/LOCATION/operations/OPERATION_ID'.",
        "status": "FAILED_PRECONDITION"
    }