Creating and managing processors
Before you can send documents to be processed, you must first create your own instance of a processor. This page provides details about creating and managing processors.
Available processors
To create a processor instance using the API, you need the name of the processor type. Because the processor types available to your project can change, retrieve this list dynamically, as described under List processor types.
The publicly available processor types are:
Digitize processors
OCR_PROCESSOR
FORM_PARSER_PROCESSOR
LAYOUT_PARSER_PROCESSOR
Pretrained processors
BANK_STATEMENT_PROCESSOR
EXPENSE_PROCESSOR
FORM_W2_PROCESSOR
ID_PROOFING_PROCESSOR
INVOICE_PROCESSOR
PAYSTUB_PROCESSOR
US_DRIVER_LICENSE_PROCESSOR
US_PASSPORT_PROCESSOR
UTILITY_PROCESSOR
Extract / classify / split processors
CUSTOM_EXTRACTION_PROCESSOR
CUSTOM_CLASSIFICATION_PROCESSOR
CUSTOM_SPLITTING_PROCESSOR
SUMMARIZER_PROCESSOR
List processor types
Web UI
In the Google Cloud console, in the Document AI section, go to the Processor Gallery page.
View or search the list of processor types.
REST
This sample shows how to list the available processor types for your project using the fetchProcessorTypes method.
Before using any of the request data, make the following replacements:
- LOCATION: your processor's location, for example:
  - us (United States)
  - eu (European Union)
- PROJECT_ID: Your Google Cloud project ID.
HTTP method and URL:
GET https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION:fetchProcessorTypes
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION:fetchProcessorTypes"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION:fetchProcessorTypes" | Select-Object -Expand Content
The response is a list of ProcessorType resources, showing each available processor type along with its category and available locations.
{ "processorTypes": [ ... { "name": "projects/PROJECT_ID/locations/LOCATION/processorTypes/FORM_PARSER_PROCESSOR", "type": "FORM_PARSER_PROCESSOR", "category": "GENERAL", "availableLocations": [ { "locationId": "eu" }, { "locationId": "us" } ], "allowCreation": true, "launchStage": "GA" }, { "name": "projects/PROJECT_ID/locations/LOCATION/processorTypes/OCR_PROCESSOR", "type": "OCR_PROCESSOR", "category": "GENERAL", "availableLocations": [ { "locationId": "eu" }, { "locationId": "us" } ], "allowCreation": true, "launchStage": "GA" }, { "name": "projects/PROJECT_ID/locations/LOCATION/processorTypes/INVOICE_PROCESSOR", "type": "INVOICE_PROCESSOR", "category": "SPECIALIZED", "availableLocations": [ { "locationId": "eu" }, { "locationId": "us" } ], "allowCreation": true, "launchStage": "GA" }, { "name": "projects/PROJECT_ID/locations/LOCATION/processorTypes/US_DRIVER_LICENSE_PROCESSOR", "type": "US_DRIVER_LICENSE_PROCESSOR", "category": "SPECIALIZED", "availableLocations": [ { "locationId": "us" }, { "locationId": "eu" } ], "allowCreation": true, "launchStage": "GA" }, ... ] }
Python, Go, Java, Ruby, C#
For more information, see the Document AI API reference documentation for your language.
To authenticate to Document AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
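The fetchProcessorTypes request and response described above can be sketched in Python with only the standard library. This is a minimal illustration, not the official client library: the helper names fetch_processor_types_url and creatable_types are hypothetical, and the sample payload is a trimmed-down stand-in shaped like the response shown above. No network call is made.

```python
import json


def fetch_processor_types_url(project_id: str, location: str) -> str:
    """Build the fetchProcessorTypes URL for a project and location."""
    return (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}:fetchProcessorTypes"
    )


def creatable_types(response_body: str, location: str) -> list[str]:
    """Return the processor types that allow creation in the given location."""
    payload = json.loads(response_body)
    result = []
    for entry in payload.get("processorTypes", []):
        locations = {loc["locationId"] for loc in entry.get("availableLocations", [])}
        if entry.get("allowCreation") and location in locations:
            result.append(entry["type"])
    return result


# Hypothetical, trimmed-down payload shaped like the response above.
sample = json.dumps({
    "processorTypes": [
        {"type": "OCR_PROCESSOR", "allowCreation": True,
         "availableLocations": [{"locationId": "eu"}, {"locationId": "us"}]},
        {"type": "INVOICE_PROCESSOR", "allowCreation": True,
         "availableLocations": [{"locationId": "us"}]},
    ]
})

print(fetch_processor_types_url("my-project", "eu"))
print(creatable_types(sample, "eu"))
```

Filtering on allowCreation and availableLocations before creating a processor avoids requesting a type that the project cannot create in the chosen region.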
Create a processor
Web UI
In the Google Cloud console, in the Document AI section, go to the Processor Gallery page.
View or search the processor gallery to find the processor you want to create.
From the list, click the processor type that you want to create.
In the Create processor side panel, specify a processor name.
Select your Region from the list.
Click Create to create your processor.
The processor's Overview tab opens, showing information such as Name, ID, Type, and Prediction endpoint. Use this endpoint to send requests.
REST
This sample shows how to create a new processor using the processors.create method.
Before using any of the request data, make the following replacements:
- LOCATION: your processor's location, for example:
  - us (United States)
  - eu (European Union)
- PROJECT_ID: Your Google Cloud project ID.
- PROCESSOR_TYPE: Type of the processor, for example:
  - OCR_PROCESSOR
  - FORM_PARSER_PROCESSOR
  - INVOICE_PROCESSOR
  - US_DRIVER_LICENSE_PROCESSOR
- DISPLAY_NAME: Display name for the processor.
HTTP method and URL:
POST https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors
Request JSON body:
{ "type": "PROCESSOR_TYPE", "displayName": "DISPLAY_NAME" }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. The response contains information about the newly created processor, such as the processEndpoint and the full processor name.
Both of these strings contain the unique processor ID (for example, aa22ec60216f6ccc) needed to send requests.
{ "name": "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID", "type": "PROCESSOR_TYPE", "displayName": "DISPLAY_NAME", "state": "ENABLED", "processEndpoint": "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process", "createTime": "2022-03-02T22:50:31.395849Z", "defaultProcessorVersion": "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions/pretrained" }
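As a rough Python equivalent of the processors.create request above, the following sketch prepares the POST request with the standard library and pulls the processor ID out of the returned name. The function names are hypothetical, the access token is a placeholder, and the request is only constructed, not sent.

```python
import json
import urllib.request


def build_create_request(project_id, location, processor_type, display_name, access_token):
    """Prepare (but do not send) the processors.create POST request."""
    url = (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/processors"
    )
    body = json.dumps({"type": processor_type, "displayName": display_name})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )


def processor_id_from_name(name: str) -> str:
    """Return the last path segment of a full processor name."""
    return name.rsplit("/", 1)[-1]


request = build_create_request("my-project", "us", "OCR_PROCESSOR", "my-ocr", "TOKEN")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request), then json.load() the result.
print(processor_id_from_name("projects/my-project/locations/us/processors/aa22ec60216f6ccc"))
```

Keeping request construction separate from sending makes it possible to inspect the URL and body before any processor is created.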
Get a list of processors
Web UI
In the Google Cloud console, in the Document AI section, go to the Processors page.
The Processors page lists all of the processors along with their Name, Status, Region, and Type.
REST
This sample shows how to list existing processors using the processors.list method.
Before using any of the request data, make the following replacements:
- LOCATION: your processor's location, for example:
  - us (United States)
  - eu (European Union)
- PROJECT_ID: Your Google Cloud project ID.
HTTP method and URL:
GET https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors" | Select-Object -Expand Content
The response is a list of Processors, which contains information about each processor, such as its name, type, state, and other details.
{ "processors": [ { "name": "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID", "type": "FORM_PARSER_PROCESSOR", "displayName": "DISPLAY_NAME", "state": "ENABLED", "processEndpoint": "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process", "createTime": "2022-03-02T22:33:54.938593Z", "defaultProcessorVersion": "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions/pretrained" } ] }
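Once the processors.list response is in hand, a small amount of Python is enough to reduce it to the fields you usually need. The sketch below assumes a response shaped like the one above. The helper names and the sample data (including the DISABLED entry) are illustrative only.

```python
import json


def summarize_processors(response_body: str) -> list[tuple[str, str, str]]:
    """Return (processor_id, type, state) for each processor in the response."""
    payload = json.loads(response_body)
    rows = []
    for processor in payload.get("processors", []):
        processor_id = processor["name"].rsplit("/", 1)[-1]
        rows.append((processor_id, processor["type"], processor["state"]))
    return rows


def enabled_ids(response_body: str) -> list[str]:
    """Return the IDs of processors whose state is ENABLED."""
    return [pid for pid, _type, state in summarize_processors(response_body)
            if state == "ENABLED"]


# Hypothetical sample shaped like the response above.
sample = json.dumps({
    "processors": [
        {"name": "projects/p/locations/us/processors/aaa111",
         "type": "FORM_PARSER_PROCESSOR", "state": "ENABLED"},
        {"name": "projects/p/locations/us/processors/bbb222",
         "type": "OCR_PROCESSOR", "state": "DISABLED"},
    ]
})

for row in summarize_processors(sample):
    print(row)
print(enabled_ids(sample))
```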
View details about a processor
Web UI
In the Google Cloud console, in the Document AI section, go to the Processors page.
From the list of processors, click the name of the processor whose details you want to view.
The processor's Overview tab opens, showing information such as Name, ID, Type, and Prediction endpoint.
REST
This sample shows how to get details about an existing Processor using the processors.get method.
Before using any of the request data, make the following replacements:
- LOCATION: your processor's location, for example:
  - us (United States)
  - eu (European Union)
- PROJECT_ID: Your Google Cloud project ID.
- PROCESSOR_ID: the ID of your custom processor.
HTTP method and URL:
GET https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID" | Select-Object -Expand Content
The response is a Processor, which contains information about the processor, such as its name, type, state, and other details.
{ "name": "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID", "type": "OCR_PROCESSOR", "displayName": "DISPLAY_NAME", "state": "ENABLED", "processEndpoint": "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process", "createTime": "2022-03-02T22:33:54.938593Z", "defaultProcessorVersion": "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions/pretrained" }
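The processor name returned by processors.get encodes the project, location, and processor ID, and the processEndpoint in the sample responses on this page follows from it. The Python sketch below splits a name into its parts and rebuilds the endpoint. The regular expression and function names are illustrative assumptions based only on the name format shown above.

```python
import re

# Pattern for names like projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID
PROCESSOR_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/processors/(?P<processor_id>[^/]+)$"
)


def parse_processor_name(name: str) -> dict[str, str]:
    """Split a full processor name into project, location, and processor_id."""
    match = PROCESSOR_NAME.match(name)
    if match is None:
        raise ValueError(f"not a processor resource name: {name!r}")
    return match.groupdict()


def process_endpoint_for(name: str) -> str:
    """Rebuild the :process endpoint in the form shown in the sample responses."""
    location = parse_processor_name(name)["location"]
    return f"https://{location}-documentai.googleapis.com/v1/{name}:process"


name = "projects/my-project/locations/eu/processors/aa22ec60216f6ccc"
print(parse_processor_name(name))
print(process_endpoint_for(name))
```

In practice, prefer the processEndpoint value returned by the API. Rebuilding it is mainly useful as a sanity check on stored configuration.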
Enable a processor
Web UI
In the Google Cloud console, in the Document AI section, go to the Processors page.
Next to your processor, in the Action menu, click Enable processor.
REST
This sample shows how to enable an existing Processor using the processors.enable method.
Before using any of the request data, make the following replacements:
- LOCATION: your processor's location, for example:
  - us (United States)
  - eu (European Union)
- PROJECT_ID: Your Google Cloud project ID.
- PROCESSOR_ID: the ID of your custom processor.
HTTP method and URL:
POST https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:enable
To send your request, choose one of these options:
curl
Execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d "" \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:enable"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:enable" | Select-Object -Expand Content
The response is a long-running operation:
{ "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION", "metadata": { "@type": "type.googleapis.com/google.cloud.documentai.v1.EnableProcessorMetadata", "commonMetadata": { "state": "RUNNING", "createTime": "2022-03-02T22:52:49.957096Z", "updateTime": "2022-03-02T22:52:50.175976Z", "resource": "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID" } } }
To poll the long-running operation, call operations.get:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "X-Goog-User-Project: PROJECT_ID" \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION"
The EnableProcessorMetadata in the response indicates the state of the operation:
{ "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION", "metadata": { "@type": "type.googleapis.com/google.cloud.documentai.v1.EnableProcessorMetadata", "commonMetadata": { "state": "SUCCEEDED", "createTime": "2022-03-02T22:52:49.957096Z", "updateTime": "2022-03-02T22:52:50.175976Z", "resource": "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID" } }, "done": true, "response": { "@type": "type.googleapis.com/google.cloud.documentai.v1.EnableProcessorResponse" } }
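Polling the long-running operation until it reports "done": true can be sketched as a small loop in Python. In this illustration, fetch is a hypothetical callable that wraps one operations.get call and returns the decoded JSON. Here it is replaced by canned responses so the loop can be exercised without network access. The interval and attempt limits are arbitrary defaults, not recommended values.

```python
import time
from typing import Callable


def wait_for_operation(
    fetch: Callable[[], dict],
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the operation reports "done": true, then return it."""
    for attempt in range(max_attempts):
        operation = fetch()
        if operation.get("done"):
            return operation
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before the next poll
    raise TimeoutError(f"operation not done after {max_attempts} polls")


# Stand-in for repeated operations.get calls: RUNNING twice, then SUCCEEDED.
canned = iter([
    {"metadata": {"commonMetadata": {"state": "RUNNING"}}},
    {"metadata": {"commonMetadata": {"state": "RUNNING"}}},
    {"metadata": {"commonMetadata": {"state": "SUCCEEDED"}}, "done": True},
])
final = wait_for_operation(lambda: next(canned), sleep=lambda _seconds: None)
print(final["metadata"]["commonMetadata"]["state"])
```

Passing the sleep function as a parameter keeps the loop testable. A real caller would leave the default time.sleep in place.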
Disable a processor
Web UI
In the Google Cloud console, in the Document AI section, go to the Processors page.
Next to your processor, in the Action menu, click Disable processor.
REST
This sample shows how to disable an existing Processor using the processors.disable method.
Before using any of the request data, make the following replacements:
- LOCATION: your processor's location, for example:
  - us (United States)
  - eu (European Union)
- PROJECT_ID: Your Google Cloud project ID.
- PROCESSOR_ID: the ID of your custom processor.
HTTP method and URL:
POST https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:disable
To send your request, choose one of these options:
curl
Execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d "" \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:disable"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:disable" | Select-Object -Expand Content
The response is a long-running operation:
{ "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION", "metadata": { "@type": "type.googleapis.com/google.cloud.documentai.v1.DisableProcessorMetadata", "commonMetadata": { "state": "RUNNING", "createTime": "2022-03-02T22:52:49.957096Z", "updateTime": "2022-03-02T22:52:50.175976Z", "resource": "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID" } } }
To poll the long-running operation, call operations.get:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "X-Goog-User-Project: PROJECT_ID" \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION"
The DisableProcessorMetadata in the response indicates the state of the operation:
{ "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION", "metadata": { "@type": "type.googleapis.com/google.cloud.documentai.v1.DisableProcessorMetadata", "commonMetadata": { "state": "SUCCEEDED", "createTime": "2022-03-02T22:52:49.957096Z", "updateTime": "2022-03-02T22:52:50.175976Z", "resource": "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID" } }, "done": true, "response": { "@type": "type.googleapis.com/google.cloud.documentai.v1.DisableProcessorResponse" } }
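The enable and disable requests differ only in the action suffix on the URL, so one helper can prepare both. The following Python sketch mirrors the curl commands above (POST with an empty body). The function name and the ALLOWED_ACTIONS tuple are hypothetical, the token is a placeholder, and the request is built but not sent.

```python
import urllib.request

ALLOWED_ACTIONS = ("enable", "disable")


def build_action_request(project_id, location, processor_id, action, access_token):
    """Prepare (but do not send) a processors.enable or processors.disable request."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"action must be one of {ALLOWED_ACTIONS}, got {action!r}")
    url = (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}"
        f"/processors/{processor_id}:{action}"
    )
    return urllib.request.Request(
        url,
        data=b"",  # empty body, as in the curl example (-d "")
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )


request = build_action_request("my-project", "us", "aa22ec60216f6ccc", "disable", "TOKEN")
print(request.get_method(), request.full_url)
```

Rejecting unknown actions up front keeps a typo from turning into a confusing HTTP error from the service.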
Delete a processor
Web UI
In the Google Cloud console, in the Document AI section, go to the Processors page.
Next to your processor, in the Action menu, click Delete processor.
REST
This sample shows how to delete an existing Processor using the processors.delete method.
Before using any of the request data, make the following replacements:
- LOCATION: your processor's location, for example:
  - us (United States)
  - eu (European Union)
- PROJECT_ID: Your Google Cloud project ID.
- PROCESSOR_ID: the ID of your custom processor.
HTTP method and URL:
DELETE https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID
To send your request, choose one of these options:
curl
Execute the following command:
curl -X DELETE \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method DELETE `
-Headers $headers `
-Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID" | Select-Object -Expand Content
The response is a long-running operation:
{ "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION", "metadata": { "@type": "type.googleapis.com/google.cloud.documentai.v1.DeleteProcessorMetadata", "commonMetadata": { "state": "RUNNING", "createTime": "2022-03-02T22:52:49.957096Z", "updateTime": "2022-03-02T22:52:50.175976Z", "resource": "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID" } } }
To poll the long-running operation, call operations.get:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "X-Goog-User-Project: PROJECT_ID" \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION"
The DeleteProcessorMetadata in the response indicates the state of the operation:
{ "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION", "metadata": { "@type": "type.googleapis.com/google.cloud.documentai.v1.DeleteProcessorMetadata", "commonMetadata": { "state": "SUCCEEDED", "createTime": "2022-03-02T22:52:49.957096Z", "updateTime": "2022-03-02T22:52:50.175976Z", "resource": "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID" } }, "done": true, "response": { "@type": "type.googleapis.com/google.protobuf.Empty" } }
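After a delete request, the operation responses shown above can be classified with a few lines of Python. This sketch prepares the DELETE request and interprets an operation document. The function names are hypothetical. The "error" check is an assumption drawn from the general long-running operation shape, since this page shows only the successful case.

```python
import urllib.request


def build_delete_request(project_id, location, processor_id, access_token):
    """Prepare (but do not send) the processors.delete request."""
    url = (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/processors/{processor_id}"
    )
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"Bearer {access_token}"},
    )


def operation_outcome(operation: dict) -> str:
    """Classify an operation as "running", "failed", or "succeeded"."""
    if not operation.get("done"):
        return "running"
    # Assumption: a finished but failed operation carries an "error" object.
    if "error" in operation:
        return "failed"
    return "succeeded"


running = {"metadata": {"commonMetadata": {"state": "RUNNING"}}}
finished = {
    "metadata": {"commonMetadata": {"state": "SUCCEEDED"}},
    "done": True,
    "response": {"@type": "type.googleapis.com/google.protobuf.Empty"},
}
request = build_delete_request("my-project", "us", "aa22ec60216f6ccc", "TOKEN")
print(request.get_method(), request.full_url)
print(operation_outcome(running), operation_outcome(finished))
```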