Managing long-running operations (LROs)
Batch processing method calls return long-running operations (LROs) because they take longer to complete than is appropriate for an API response. This way, the calling thread is not held open while many documents are processed. The Document AI API creates an LRO every time you call projects.locations.processors.batchProcess through the API or the client libraries. The LRO tracks the status of the processing job.
You can use the operations methods that the Document AI API provides to check the status of LROs. You can also list, poll, or cancel LROs. Client libraries that call asynchronous methods poll internally, which enables callbacks. (For Python, await is enabled.) They also feature a timeout parameter.
Within the main LRO returned by .batchProcess, an LRO is created for each document, because batch page-count limits are much higher than those of the sync process call and a document can take significant time to process. When the main LRO ends, the detailed status of each document LRO is provided.
LROs are managed at the Google Cloud project and location level. When making a request to the API, include the Google Cloud project and the location in which the LRO is running.
The record of an LRO is kept for approximately 30 days after the LRO finishes, meaning that you cannot view or list an LRO after that point.
Getting details about a long-running operation
The following samples show how to get details about an LRO.
REST
To get the status of and view details about an LRO, call the projects.locations.operations.get method.
Suppose that you receive the following response after calling projects.locations.processors.batchProcess:
{ "name": "projects/PROJECT_NUMBER/locations/LOCATION/operations/OPERATION_ID" }
The name value in the response shows that the Document AI API created an LRO named projects/PROJECT_NUMBER/locations/LOCATION/operations/OPERATION_ID.
You can also retrieve the LRO name by listing long-running operations.
Before using any of the request data, make the following replacements:
- PROJECT_ID: your Google Cloud project ID.
- LOCATION: the location where the LRO is running, for example:
  - us: United States
  - eu: European Union
- OPERATION_ID: the ID of your operation. The ID is the last element of the name of your operation. For example:
  - Operation name: projects/PROJECT_ID/locations/LOCATION/operations/bc4e1d412863e626
  - Operation ID: bc4e1d412863e626
HTTP method and URL:
GET https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.documentai.v1.BatchProcessMetadata",
    "state": "SUCCEEDED",
    "stateMessage": "Processed 1 document(s) successfully",
    "createTime": "TIMESTAMP",
    "updateTime": "TIMESTAMP",
    "individualProcessStatuses": [
      {
        "inputGcsSource": "INPUT_BUCKET_FOLDER/DOCUMENT1.ext",
        "status": {},
        "outputGcsDestination": "OUTPUT_BUCKET_FOLDER/OPERATION_ID/0",
        "humanReviewStatus": {
          "state": "ERROR",
          "stateMessage": "Sharded document protos are not supported for human review."
        }
      }
    ]
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.documentai.v1.BatchProcessResponse"
  }
}
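Once a response like this is parsed, the overall state and the per-document results can be pulled out with a few lines of Python. The following is a minimal sketch, not an official sample: summarize_operation is a hypothetical helper name, it reads only the fields shown in the sample response, and it assumes that each status object follows the usual google.rpc.Status convention in which a missing or zero code means success.

```python
from typing import Any


def summarize_operation(operation: dict[str, Any]) -> dict[str, Any]:
    """Collect the overall state and per-document results of a batch LRO.

    Hypothetical helper: it reads only done, metadata.state, and
    metadata.individualProcessStatuses from the parsed JSON response.
    """
    metadata = operation.get("metadata", {})
    documents = []
    for item in metadata.get("individualProcessStatuses", []):
        status = item.get("status", {})
        documents.append(
            {
                "input": item.get("inputGcsSource"),
                "output": item.get("outputGcsDestination"),
                # Assumption: an absent or zero status code means the
                # document was processed without an error.
                "ok": status.get("code", 0) == 0,
            }
        )
    return {
        "done": operation.get("done", False),
        "state": metadata.get("state"),
        "documents": documents,
    }


if __name__ == "__main__":
    sample = {
        "metadata": {
            "state": "SUCCEEDED",
            "individualProcessStatuses": [
                {
                    "inputGcsSource": "INPUT_BUCKET_FOLDER/DOCUMENT1.ext",
                    "status": {},
                    "outputGcsDestination": "OUTPUT_BUCKET_FOLDER/OPERATION_ID/0",
                }
            ],
        },
        "done": True,
    }
    print(summarize_operation(sample))
```

Keeping the summary as a plain dict makes it easy to log or to check in a script before reading the output files.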
Go
For more information, see the Document AI Go API reference documentation.
To authenticate to Document AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
For more information, see the Document AI Python API reference documentation.
To authenticate to Document AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
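As a rough illustration, the REST request described earlier can also be sent from Python with only the standard library. This is a sketch under stated assumptions rather than the client-library sample: it mirrors the curl command, obtains the token from the gcloud CLI in the same way, and all function names (operation_id_from_name, operation_url, gcloud_access_token, get_operation) are hypothetical.

```python
import json
import subprocess
import urllib.request


def operation_id_from_name(name: str) -> str:
    """Return the last path element of an operation name."""
    return name.rstrip("/").split("/")[-1]


def operation_url(project_id: str, location: str, operation_id: str) -> str:
    """Build the regional operations.get URL used in the REST sample."""
    return (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/operations/{operation_id}"
    )


def gcloud_access_token() -> str:
    """Fetch a token the way the curl sample does (requires the gcloud CLI)."""
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def get_operation(
    project_id: str, location: str, operation_id: str, token: str
) -> dict:
    """Send the GET request and return the parsed JSON body."""
    request = urllib.request.Request(
        operation_url(project_id, location, operation_id),
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A call would look like get_operation("PROJECT_ID", "us", operation_id_from_name(name), gcloud_access_token()), where name is the value returned by batchProcess. The network call is kept in its own function so that the URL and ID helpers can be checked without credentials.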
Listing long-running operations
The following samples show how to list the LROs in a Google Cloud project and location.
REST
To list the LROs in a Google Cloud project and location, call the projects.locations.operations.list method.
Before using any of the request data, make the following replacements:
- PROJECT_ID: your Google Cloud project ID.
- LOCATION: the location where one or more LROs are running, for example:
  - us: United States
  - eu: European Union
- FILTER: (Required) Query for LROs to return. Options:
- TYPE: (Required) LRO type to list. Options:
BATCH_PROCESS_DOCUMENTS
CREATE_PROCESSOR_VERSION
DELETE_PROCESSOR
ENABLE_PROCESSOR
DISABLE_PROCESSOR
UPDATE_HUMAN_REVIEW_CONFIG
HUMAN_REVIEW_EVENT
CREATE_LABELER_POOL
UPDATE_LABELER_POOL
DELETE_LABELER_POOL
DEPLOY_PROCESSOR_VERSION
UNDEPLOY_PROCESSOR_VERSION
DELETE_PROCESSOR_VERSION
SET_DEFAULT_PROCESSOR_VERSION
EVALUATE_PROCESSOR_VERSION
EXPORT_PROCESSOR_VERSION
UPDATE_DATASET
IMPORT_DOCUMENTS
ANALYZE_HITL_DATA
BATCH_MOVE_DOCUMENTS
RESYNC_DATASET
BATCH_DELETE_DOCUMENTS
DELETE_DATA_LABELING_JOB
EXPORT_DOCUMENTS
HTTP method and URL:
GET https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations?filter=TYPE=TYPE
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations?filter=TYPE=TYPE"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations?filter=TYPE=TYPE" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "operations": [
    {
      "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID",
      "metadata": {
        "@type": "type.googleapis.com/google.cloud.documentai.v1.BatchProcessMetadata",
        "state": "SUCCEEDED",
        "stateMessage": "Processed 1 document(s) successfully",
        "createTime": "TIMESTAMP",
        "updateTime": "TIMESTAMP",
        "individualProcessStatuses": [
          {
            "inputGcsSource": "INPUT_BUCKET_FOLDER/DOCUMENT1.ext",
            "status": {},
            "outputGcsDestination": "OUTPUT_BUCKET_FOLDER/OPERATION_ID/0",
            "humanReviewStatus": {
              "state": "ERROR",
              "stateMessage": "Sharded document protos are not supported for human review."
            }
          }
        ]
      },
      "done": true,
      "response": {
        "@type": "type.googleapis.com/google.cloud.documentai.v1.BatchProcessResponse"
      }
    },
    ...
  ]
}
Go
For more information, see the Document AI Go API reference documentation.
To authenticate to Document AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
For more information, see the Document AI Python API reference documentation.
To authenticate to Document AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
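The list request from the REST section can be sketched in Python with the standard library as well. This is an illustrative sketch, not the client-library sample: the function names are hypothetical, the token is assumed to come from gcloud auth print-access-token, and the helper returns only the operations contained in a single response. Note that urlencode percent-encodes the "=" inside the filter value, which is ordinary query-string encoding of the TYPE=TYPE filter shown in the URL.

```python
import json
import urllib.parse
import urllib.request


def list_operations_url(project_id: str, location: str, lro_type: str) -> str:
    """Build the operations.list URL with a TYPE filter."""
    base = (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/operations"
    )
    # The filter value "TYPE=<type>" is percent-encoded as "TYPE%3D<type>".
    query = urllib.parse.urlencode({"filter": f"TYPE={lro_type}"})
    return f"{base}?{query}"


def list_operations(
    project_id: str, location: str, lro_type: str, token: str
) -> list[dict]:
    """Send the GET request and return the operations in the response."""
    request = urllib.request.Request(
        list_operations_url(project_id, location, lro_type),
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return body.get("operations", [])


def names_and_states(operations: list[dict]) -> list[tuple]:
    """Pair each operation name with its metadata state, if present."""
    return [
        (op.get("name"), op.get("metadata", {}).get("state"))
        for op in operations
    ]
```

For example, names_and_states(list_operations("PROJECT_ID", "us", "BATCH_PROCESS_DOCUMENTS", token)) gives a compact overview of batch jobs and their states.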
Polling a long-running operation
The following samples show how to poll the status of an LRO.
REST
To poll an LRO, repeatedly call the projects.locations.operations.get method until the operation finishes. Use a backoff between each poll request, such as 10 seconds.
Before using any of the request data below, make the following replacements:
- PROJECT_ID: your Google Cloud project ID
- LOCATION: the location where the LRO is running
- OPERATION_ID: the identifier for the LRO
HTTP method and URL:
GET https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID
To send your request, choose one of these options:
curl
Execute the following command to poll for the status of an LRO every 10 seconds:
while true; do
  curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID"
  sleep 10
done
You should receive a JSON response every 10 seconds.
While the operation is running, the response contains "state": "RUNNING".
When the operation finishes, the response contains "state": "SUCCEEDED" and "done": true.
PowerShell
Execute the following command to poll for the status of an LRO every 10 seconds:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }
Do {
  Invoke-WebRequest `
    -Method Get `
    -Headers $headers `
    -Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID" | Select-Object -Expand Content
  sleep 10
} while ($true)
You should receive a JSON response every 10 seconds.
While the operation is running, the response contains "state": "RUNNING".
When the operation finishes, the response contains "state": "SUCCEEDED" and "done": true.
Python
For more information, see the Document AI Python API reference documentation.
To authenticate to Document AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
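The polling loop itself is independent of how each status request is sent, so it can be sketched as a small function that takes the request as a callable. This is a minimal sketch, not the client-library behavior: poll_operation and its parameters are hypothetical names, the 10-second starting interval comes from the text above, and the backoff factor, interval cap, and timeout are assumed values chosen for illustration.

```python
import time
from typing import Callable


def poll_operation(
    fetch: Callable[[], dict],
    interval_seconds: float = 10.0,
    backoff_factor: float = 1.5,
    max_interval_seconds: float = 60.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch() until the operation reports "done": true.

    fetch is any callable that returns the parsed operations.get response.
    The wait between calls starts at interval_seconds and grows by
    backoff_factor up to max_interval_seconds.
    """
    deadline = clock() + timeout_seconds
    delay = interval_seconds
    while True:
        operation = fetch()
        if operation.get("done"):
            return operation
        # Stop before sleeping if the next wait would pass the deadline.
        if clock() + delay > deadline:
            raise TimeoutError("operation did not finish before the timeout")
        sleep(delay)
        delay = min(delay * backoff_factor, max_interval_seconds)
```

The function returns as soon as the operation is done, whatever its final state, so the caller should still inspect the state in the returned metadata. Passing sleep and clock as parameters is a design choice that lets the loop be exercised in tests without real waiting.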
Cancelling a long-running operation
The following samples show how to cancel an LRO while it is running.
REST
To cancel an LRO, call the projects.locations.operations.cancel method.
Before using any of the request data, make the following replacements:
- PROJECT_ID: your Google Cloud project ID.
- LOCATION: the location where the LRO is running, for example:
  - us: United States
  - eu: European Union
- OPERATION_ID: the ID of your operation. The ID is the last element of the name of your operation. For example:
  - Operation name: projects/PROJECT_ID/locations/LOCATION/operations/bc4e1d412863e626
  - Operation ID: bc4e1d412863e626
HTTP method and URL:
POST https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID:cancel
To send your request, choose one of these options:
curl
Execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d "" \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID:cancel"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID:cancel" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{}
If the operation has already completed, the request fails with an error similar to the following:
{
  "error": {
    "code": 400,
    "message": "Operation has completed and cannot be cancelled: 'PROJECT_ID/locations/LOCATION/operations/OPERATION_ID'.",
    "status": "FAILED_PRECONDITION"
  }
}
Go
For more information, see the Document AI Go API reference documentation.
To authenticate to Document AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
For more information, see the Document AI Python API reference documentation.
To authenticate to Document AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
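For illustration, the cancel request and its two possible outcomes can be handled in Python with the standard library. This is a sketch that mirrors the curl command rather than the client-library sample: the function names are hypothetical, the token is assumed to come from gcloud auth print-access-token, and the error body is assumed to be JSON with an "error" object.

```python
import json
import urllib.error
import urllib.request


def cancel_url(project_id: str, location: str, operation_id: str) -> str:
    """Build the regional operations.cancel URL used in the REST sample."""
    return (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}"
        f"/operations/{operation_id}:cancel"
    )


def describe_cancel_response(body: dict) -> str:
    """Turn a cancel response body into a short human-readable result."""
    error = body.get("error")
    if error is None:
        # An empty JSON object means the cancel request was accepted.
        return "cancel request accepted"
    return f"{error.get('status')}: {error.get('message')}"


def cancel_operation(
    project_id: str, location: str, operation_id: str, token: str
) -> str:
    """Send the POST request with an empty body and describe the outcome."""
    request = urllib.request.Request(
        cancel_url(project_id, location, operation_id),
        data=b"",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            body = json.load(response)
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx and 5xx; the error body can still be read.
        body = json.load(err)
    return describe_cancel_response(body)
```

Separating describe_cancel_response from the network call keeps the handling of the FAILED_PRECONDITION case easy to check on its own.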