Creating and managing processors

Before you can send documents for processing, you must create a processor. This page describes how to create and manage processors.

Create a processor

Console

  1. In the Google Cloud console, in the Document AI section, go to the Processors page.


  2. Select Create processor.

  3. From the list, click the processor type that you want to create.

  4. In the Create processor side window, specify a processor name.

  5. Select your Region from the list.

  6. Click Create to create your processor.

    The console opens the processor's Overview tab, which shows information such as Name, ID, Type, and Prediction endpoint. Use the Prediction endpoint to send requests.

REST & CMD LINE

  1. List the available processor types for your project using fetchProcessorTypes. For example:

    curl -X GET https://documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION:fetchProcessorTypes \
         -H 'Authorization: Bearer '$(gcloud auth print-access-token)
    
  2. The response is a list of ProcessorType objects. Each entry shows an available processor type along with its category and available locations.

    [
      ...
      {
        "name": "projects/<project_id>/locations/<location>/processorTypes/FORM_PARSER_PROCESSOR",
        "type": "FORM_PARSER_PROCESSOR",
        "category": "GENERAL",
        "availableLocations": [
          {
            "locationId": "eu"
          },
          {
            "locationId": "us"
          }
        ],
        "allowCreation": true,
        "launchStage": "GA"
      },
      {
        "name": "projects/<project_id>/locations/<location>/processorTypes/OCR_PROCESSOR",
        "type": "OCR_PROCESSOR",
        "category": "GENERAL",
        "availableLocations": [
          {
            "locationId": "eu"
          },
          {
            "locationId": "us"
          }
        ],
        "allowCreation": true,
        "launchStage": "GA"
      },
      {
        "name": "projects/<project_id>/locations/<location>/processorTypes/INVOICE_PROCESSOR",
        "type": "INVOICE_PROCESSOR",
        "category": "SPECIALIZED",
        "availableLocations": [
          {
            "locationId": "eu"
          },
          {
            "locationId": "us"
          }
        ],
        "allowCreation": true,
        "launchStage": "GA"
      },
      {
        "name": "projects/<project_id>/locations/<location>/processorTypes/US_DRIVER_LICENSE_PROCESSOR",
        "type": "US_DRIVER_LICENSE_PROCESSOR",
        "category": "SPECIALIZED",
        "availableLocations": [
          {
            "locationId": "us"
          },
          {
            "locationId": "eu"
          }
        ],
        "launchStage": "EARLY_ACCESS"
      },
      ...
    ]
    
  3. Create a file called create.json and populate it with a processor. For example:

    {
      "name": "processor-name",
      "type": "OCR_PROCESSOR",
      "displayName": "processor display name"
    }
    
  4. Call processors.create. For example:

    curl -X POST https://documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors \
         -H 'Authorization: Bearer '$(gcloud auth print-access-token) \
         -H 'Content-Type: application/json' --data @create.json
    

    The response contains information about the newly created processor, such as the processEndpoint. You can use this endpoint to send requests.

    {
      "name": "projects/<project_id>/locations/<location>/processors/<processor_id>",
      "type": "OCR_PROCESSOR",
      "displayName": "processor display name",
      "state": "ENABLED",
      "processEndpoint": "https://<location>-documentai.googleapis.com/v1/projects/<project_id>/locations/<location>/processors/<processor_id>:process",
      "createTime": "2022-03-02T22:50:31.395849Z",
      "defaultProcessorVersion": "projects/<project_id>/locations/<location>/processors/<processor_id>/processorVersions/pretrained"
    }
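
The REST steps above can also be scripted. The following is a minimal Python sketch, not an official sample: it assumes the fetchProcessorTypes response has already been parsed into a list shaped like the example in step 2, and the helper names `creatable_types` and `build_create_request` are hypothetical. It uses only the standard library and builds the request without sending it. The body sets only `type` and `displayName`; add other fields from create.json as needed.

```python
import json

# Sample data shaped like the fetchProcessorTypes response shown above.
PROCESSOR_TYPES = [
    {
        "type": "OCR_PROCESSOR",
        "category": "GENERAL",
        "availableLocations": [{"locationId": "eu"}, {"locationId": "us"}],
        "allowCreation": True,
        "launchStage": "GA",
    },
    {
        "type": "US_DRIVER_LICENSE_PROCESSOR",
        "category": "SPECIALIZED",
        "availableLocations": [{"locationId": "us"}, {"locationId": "eu"}],
        "launchStage": "EARLY_ACCESS",
    },
]


def creatable_types(processor_types, location):
    """Return the type names that allow creation in the given location."""
    result = []
    for entry in processor_types:
        locations = {loc["locationId"] for loc in entry.get("availableLocations", [])}
        # allowCreation is absent from some entries; this sketch treats
        # absence as "not creatable" (an assumption).
        if entry.get("allowCreation", False) and location in locations:
            result.append(entry["type"])
    return result


def build_create_request(project_id, location, processor_type, display_name):
    """Build the URL and JSON body for a processors.create call."""
    url = (
        "https://documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/processors"
    )
    body = json.dumps({"type": processor_type, "displayName": display_name})
    return url, body


print(creatable_types(PROCESSOR_TYPES, "us"))
url, body = build_create_request("my-project", "us", "OCR_PROCESSOR", "my ocr")
print(url)
print(body)
```

The returned URL and body correspond to the curl arguments in step 4; the project ID and display name here are placeholders.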
    

Python

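from google.cloud import documentai_v1beta3


def sample_create_processor():
    # Create a client
    client = documentai_v1beta3.DocumentProcessorServiceClient()

    # Initialize request argument(s).
    # parent has the form projects/PROJECT_ID/locations/LOCATION.
    request = documentai_v1beta3.CreateProcessorRequest(
        parent="projects/PROJECT_ID/locations/LOCATION",
        # The processor to create. type_ is one of the values returned by
        # fetchProcessorTypes, for example "OCR_PROCESSOR".
        processor=documentai_v1beta3.Processor(
            display_name="processor display name",
            type_="OCR_PROCESSOR",
        ),
    )

    # Make the request
    response = client.create_processor(request=request)

    # Handle the response: the newly created Processor
    print(response)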

Get a list of processors

Console

In the Google Cloud console, in the Document AI section, go to the Processors page.


The Processors page lists all of the processors along with their Name, Status, Region, and Type.

REST & CMD LINE

  1. Call processors.list. For example:

    curl -X GET https://documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors \
         -H 'Authorization: Bearer '$(gcloud auth print-access-token)
    
  2. The response lists the processors, with details for each one such as its name, type, and state.

    {
      "processors": [
        {
          "name": "projects/<project_id>/locations/<location>/processors/<processor_id>",
          "type": "FORM_PARSER_PROCESSOR",
          "displayName": "<processor_name>",
          "state": "ENABLED",
          "processEndpoint": "https://<location>-documentai.googleapis.com/v1/projects/<project_id>/locations/<location>/processors/<processor_id>:process",
          "createTime": "2022-03-02T22:33:54.938593Z",
          "defaultProcessorVersion": "projects/<project_id>/locations/<location>/processors/<processor_id>/processorVersions/pretrained"
        }
      ]
    }
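
To work with the list programmatically, the response can be reduced to a few fields per processor. The following is a minimal Python sketch under stated assumptions: the sample data mirrors the response shape above with placeholder IDs, the `DISABLED` state value is assumed for illustration, and `summarize_processors` is a hypothetical helper rather than part of any client library.

```python
import json

# Sample text shaped like the processors.list response above (placeholder IDs).
RESPONSE_TEXT = """
{
  "processors": [
    {
      "name": "projects/123/locations/us/processors/abc",
      "type": "FORM_PARSER_PROCESSOR",
      "displayName": "forms",
      "state": "ENABLED"
    },
    {
      "name": "projects/123/locations/us/processors/def",
      "type": "OCR_PROCESSOR",
      "displayName": "ocr",
      "state": "DISABLED"
    }
  ]
}
"""


def summarize_processors(response):
    """Return (processor_id, displayName, type, state) tuples from a list response."""
    rows = []
    for proc in response.get("processors", []):
        # The processor ID is the last segment of the resource name.
        processor_id = proc["name"].rsplit("/", 1)[-1]
        rows.append((processor_id, proc.get("displayName", ""), proc["type"], proc["state"]))
    return rows


rows = summarize_processors(json.loads(RESPONSE_TEXT))
enabled_ids = [row[0] for row in rows if row[3] == "ENABLED"]
for row in rows:
    print("\t".join(row))
```

An empty project returns a response without a `processors` key, which this sketch treats as an empty list.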
    

View details about a processor

Console

  1. In the Google Cloud console, in the Document AI section, go to the Processors page.


  2. From the list of processors, click the name of the processor whose details you want to view.

    The console opens the processor's Overview tab, which shows information such as Name, ID, Type, and Prediction endpoint.

REST & CMD LINE

  1. Call processors.get with the processor ID. For example:

    curl -X GET https://documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID \
         -H 'Authorization: Bearer '$(gcloud auth print-access-token)
    
  2. The response is a Processor instance, which contains details about the processor such as its name, type, and state.

    {
      "name": "projects/<project_id>/locations/<location>/processors/<processor_id>",
      "type": "OCR_PROCESSOR",
      "displayName": "<processor_name>",
      "state": "ENABLED",
      "processEndpoint": "https://<location>-documentai.googleapis.com/v1/projects/<project_id>/locations/<location>/processors/<processor_id>:process",
      "createTime": "2022-03-02T22:51:45.593858Z",
      "defaultProcessorVersion": "projects/<project_id>/locations/<location>/processors/<processor_id>/processorVersions/pretrained"
    }
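
The `name` field in these responses follows a fixed pattern, so it can be split into its parts and reused to build the processors.get URL. The following is a minimal Python sketch, assuming names always match the `projects/.../locations/.../processors/...` shape shown in the sample responses; `parse_processor_name` and `get_url` are hypothetical helpers.

```python
import re

# Pattern inferred from the sample responses above (an assumption, not a spec).
NAME_PATTERN = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/processors/(?P<processor>[^/]+)$"
)


def parse_processor_name(name):
    """Split a processor resource name into project, location, and processor ID."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a processor resource name: {name!r}")
    return match.groupdict()


def get_url(name):
    """Build the processors.get URL for a processor resource name."""
    parse_processor_name(name)  # Raises ValueError if the name is malformed.
    return f"https://documentai.googleapis.com/v1/{name}"


parts = parse_processor_name("projects/123/locations/us/processors/abc")
print(parts)
print(get_url("projects/123/locations/us/processors/abc"))
```

Validating the name before building the URL catches mistakes such as passing a processor version name where a processor name is expected.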
    

Enable a processor

Console

  1. In the Google Cloud console, in the Document AI section, go to the Processors page.


  2. Next to your processor, in the Action menu, click Enable processor.

REST & CMD LINE

  1. Call processors.enable with the processor ID. For example:

    curl -X POST https://documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:enable \
         -H 'Authorization: Bearer '$(gcloud auth print-access-token)
    
  2. The response is a long-running operation:

    {
      "name": "projects/<project_id>/locations/<location>/operations/<operation>",
      "metadata": {
        "@type": "type.googleapis.com/google.cloud.documentai.v1.EnableProcessorMetadata",
        "commonMetadata": {
          "state": "RUNNING",
          "createTime": "2022-03-02T22:52:49.957096Z",
          "updateTime": "2022-03-02T22:52:50.175976Z",
          "resource": "projects/<project_id>/locations/<location>/processors/<processor_id>"
        }
      }
    }
    
  3. To poll the long-running operation, call operations.get:

    curl -X GET https://documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION \
         -H "Authorization: Bearer "$(gcloud auth print-access-token) \
         -H "X-Goog-User-Project: PROJECT_ID"
    
  4. The EnableProcessorMetadata in the response indicates the state of the operation:

    {
      "name": "projects/<project_id>/locations/<location>/operations/<operation>",
      "metadata": {
        "@type": "type.googleapis.com/google.cloud.documentai.v1.EnableProcessorMetadata",
        "commonMetadata": {
          "state": "SUCCEEDED",
          "createTime": "2022-03-02T22:52:49.957096Z",
          "updateTime": "2022-03-02T22:52:50.175976Z",
          "resource": "projects/<project_id>/locations/<location>/processors/<processor_id>"
        }
      },
      "done": true,
      "response": {
        "@type": "type.googleapis.com/google.cloud.documentai.v1.EnableProcessorResponse"
      }
    }
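
The polling in steps 3 and 4 is usually done in a loop. The following is a minimal Python sketch of such a loop, assuming the caller supplies a `fetch` function that performs the operations.get call and returns the parsed JSON; `wait_for_operation`, its parameters, and the simulated responses are all hypothetical. The loop relies only on the `done` field shown in the completed response above.

```python
import time


def wait_for_operation(fetch, interval_seconds=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch() until the operation reports done, then return it."""
    for attempt in range(max_attempts):
        operation = fetch()
        # "done" is absent while the operation is still running.
        if operation.get("done", False):
            return operation
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"operation not done after {max_attempts} attempts")


# Simulated operations.get responses: first RUNNING, then SUCCEEDED.
_responses = iter([
    {
        "name": "operations/op-1",
        "metadata": {"commonMetadata": {"state": "RUNNING"}},
    },
    {
        "name": "operations/op-1",
        "metadata": {"commonMetadata": {"state": "SUCCEEDED"}},
        "done": True,
    },
])

# A no-op sleep keeps this demonstration instant.
final = wait_for_operation(lambda: next(_responses), sleep=lambda seconds: None)
print(final["metadata"]["commonMetadata"]["state"])
```

Passing `fetch` and `sleep` as arguments keeps the loop independent of any HTTP library and makes it easy to exercise without network access.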
    

Disable a processor

Console

  1. In the Google Cloud console, in the Document AI section, go to the Processors page.


  2. Next to your processor, in the Action menu, click Disable processor.

REST & CMD LINE

  1. Call processors.disable with the processor ID. For example:

    curl -X POST https://documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:disable \
         -H 'Authorization: Bearer '$(gcloud auth print-access-token)
    
  2. The response is a long-running operation:

    {
      "name": "projects/<project_id>/locations/<location>/operations/<operation>",
      "metadata": {
        "@type": "type.googleapis.com/google.cloud.documentai.v1.DisableProcessorMetadata",
        "commonMetadata": {
          "state": "RUNNING",
          "createTime": "2022-03-02T22:52:49.957096Z",
          "updateTime": "2022-03-02T22:52:50.175976Z",
          "resource": "projects/<project_id>/locations/<location>/processors/<processor_id>"
        }
      }
    }
    
  3. To poll the long-running operation, call operations.get:

    curl -X GET https://documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION \
         -H "Authorization: Bearer "$(gcloud auth print-access-token) \
         -H "X-Goog-User-Project: PROJECT_ID"
    
  4. The DisableProcessorMetadata in the response indicates the state of the operation:

    {
      "name": "projects/<project_id>/locations/<location>/operations/<operation>",
      "metadata": {
        "@type": "type.googleapis.com/google.cloud.documentai.v1.DisableProcessorMetadata",
        "commonMetadata": {
          "state": "SUCCEEDED",
          "createTime": "2022-03-02T22:52:49.957096Z",
          "updateTime": "2022-03-02T22:52:50.175976Z",
          "resource": "projects/<project_id>/locations/<location>/processors/<processor_id>"
        }
      },
      "done": true,
      "response": {
        "@type": "type.googleapis.com/google.cloud.documentai.v1.DisableProcessorResponse"
      }
    }
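
Because enabling and disabling differ only in the method suffix, a script can choose the call from the processor's current state. The following is a minimal Python sketch: `state_change_request` is a hypothetical helper, and it assumes the `state` field reads `ENABLED` or `DISABLED` when the processor is idle. Any other state is left alone.

```python
def state_change_request(project_id, location, processor_id, current_state, want_enabled):
    """Return (method, url) for the call needed, or None if no call is needed."""
    if want_enabled and current_state == "DISABLED":
        action = "enable"
    elif not want_enabled and current_state == "ENABLED":
        action = "disable"
    else:
        # Already in the desired state, or in a state this sketch does not handle.
        return None
    url = (
        "https://documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/processors/{processor_id}:{action}"
    )
    return "POST", url


print(state_change_request("my-project", "us", "abc", "ENABLED", False))
print(state_change_request("my-project", "us", "abc", "ENABLED", True))
```

The first call yields the `:disable` URL used in step 1; the second returns `None` because the processor is already enabled.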
    

Delete a processor

Console

  1. In the Google Cloud console, in the Document AI section, go to the Processors page.


  2. Next to your processor, in the Action menu, click Delete processor.

REST & CMD LINE

  1. Call processors.delete with the processor ID. For example:

    curl -X DELETE https://documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID \
         -H 'Authorization: Bearer '$(gcloud auth print-access-token)
    
  2. The response is a long-running operation:

    {
      "name": "projects/<project_id>/locations/<location>/operations/<operation>",
      "metadata": {
        "@type": "type.googleapis.com/google.cloud.documentai.v1.DeleteProcessorMetadata",
        "commonMetadata": {
          "state": "RUNNING",
          "createTime": "2022-03-02T22:52:49.957096Z",
          "updateTime": "2022-03-02T22:52:50.175976Z",
          "resource": "projects/<project_id>/locations/<location>/processors/<processor_id>"
        }
      }
    }
    
  3. To poll the long-running operation, call operations.get:

    curl -X GET https://documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION \
         -H "Authorization: Bearer "$(gcloud auth print-access-token) \
         -H "X-Goog-User-Project: PROJECT_ID"
    
  4. The DeleteProcessorMetadata in the response indicates the state of the operation:

    {
      "name": "projects/<project_id>/locations/<location>/operations/<operation>",
      "metadata": {
        "@type": "type.googleapis.com/google.cloud.documentai.v1.DeleteProcessorMetadata",
        "commonMetadata": {
          "state": "SUCCEEDED",
          "createTime": "2022-03-02T22:52:49.957096Z",
          "updateTime": "2022-03-02T22:52:50.175976Z",
          "resource": "projects/<project_id>/locations/<location>/processors/<processor_id>"
        }
      },
      "done": true,
      "response": {
        "@type": "type.googleapis.com/google.protobuf.Empty"
      }
    }
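
A finished operation carries either a `response`, as above, or an `error` if it failed. The following is a minimal Python sketch that classifies an operation accordingly. It assumes the standard long-running operation shape, in which a failure is reported in an `error` object with a `message`; `operation_outcome` is a hypothetical helper, and the failed sample below is invented for illustration.

```python
def operation_outcome(operation):
    """Return (status, error_message) for a parsed operation response."""
    if not operation.get("done", False):
        return "running", None
    if "error" in operation:
        return "failed", operation["error"].get("message", "")
    return "succeeded", None


running = {"name": "operations/op-1"}
succeeded = {
    "name": "operations/op-1",
    "done": True,
    "response": {"@type": "type.googleapis.com/google.protobuf.Empty"},
}
# Invented failure payload, for illustration only.
failed = {
    "name": "operations/op-1",
    "done": True,
    "error": {"code": 5, "message": "processor not found"},
}

for op in (running, succeeded, failed):
    print(operation_outcome(op))
```

Checking `done` before `error` matters, because neither `response` nor `error` is present while the operation is still running.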
    

Python

from google.cloud import documentai_v1beta3


def sample_delete_processor():
    # Create a client
    client = documentai_v1beta3.DocumentProcessorServiceClient()

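    # Initialize request argument(s).
    # name has the form
    # projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID.
    request = documentai_v1beta3.DeleteProcessorRequest(
        name="projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID",
    )

    # Make the request; this returns a long-running operation
    operation = client.delete_processor(request=request)

    print("Waiting for operation to complete...")

    # Block until the operation finishes
    response = operation.result()

    # Handle the response
    print(response)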