Process documents

This quickstart shows you how to use the batch processing capability of the Document AI API, through the SAP BTP edition of ABAP SDK for Google Cloud, to process documents (invoices) from a source Cloud Storage bucket and store the processed output (a JSON file) in a target bucket.

Before you begin

Before you run this quickstart, make sure that you or your administrators have completed the following prerequisites:

  • Make sure the Document AI API is enabled in your Google Cloud project. You can enable it from the API library in the Google Cloud console.

  • In the Document AI Workbench, create a processor with type INVOICE_PROCESSOR. For more information, see Creating and managing processors.

  • In Cloud Storage, create a source bucket to store the invoices for processing and place the invoices in this bucket. For more information, see Create buckets.

  • In Cloud Storage, create a target bucket to store the processed files.

Create an ABAP class to process documents

  1. Create a package:

    1. In ADT, go to the Project Explorer.
    2. Right-click the package ZLOCAL, and select New > ABAP Package.
    3. Enter the following details for your package:

      • Name: enter ZABAPSDK_TEST.
      • Description: enter ABAP SDK Test Package.
    4. Click Next.

    5. In the Select a Transport Request dialog, select the Create a new request checkbox.

    6. Enter a description for the transport request.

    7. Click Finish.

  2. Create an ABAP class to call the Document AI API:

    1. Right-click your ABAP package and select New > ABAP Class.
    2. Enter the following details for your ABAP class:

      • Name: enter ZGOOG_CL_QS_DOCUMENT_AI.
      • Description: enter Quick start for Document AI API.
    3. Click Next.

    4. Select a transport request and click Finish.

  3. In the code editor, replace the default code with the following code snippet:

     CLASS zgoog_cl_qs_document_ai DEFINITION
       PUBLIC FINAL
       CREATE PUBLIC.

       PUBLIC SECTION.
         INTERFACES if_oo_adt_classrun.
     ENDCLASS.

     CLASS zgoog_cl_qs_document_ai IMPLEMENTATION.
       METHOD if_oo_adt_classrun~main.
         DATA lv_p_projects_id   TYPE string.
         DATA lv_p_locations_id  TYPE string.
         DATA lv_p_processors_id TYPE string.
         DATA ls_input           TYPE /goog/cl_documentai_v1=>ty_017.
         DATA lo_docai           TYPE REF TO /goog/cl_documentai_v1.

         TRY.
             " Open HTTP connection
             lo_docai = NEW #( iv_key_name = 'DEMO_DOC_PROCESSING' ).

             " Populate relevant parameters to be passed to API
             lv_p_projects_id   = 'PROJECT_ID'.
             lv_p_locations_id  = 'LOCATION_ID'.
             lv_p_processors_id = 'PROCESSOR_ID'.
             ls_input-input_documents-gcs_prefix-gcs_uri_prefix        = 'SOURCE_BUCKET_URI'.
             ls_input-document_output_config-gcs_output_config-gcs_uri = 'TARGET_BUCKET_URI'.

             " Call API method
             lo_docai->batch_process_processors( EXPORTING iv_p_projects_id   = lv_p_projects_id
                                                           iv_p_locations_id  = lv_p_locations_id
                                                           iv_p_processors_id = lv_p_processors_id
                                                           is_input           = ls_input
                                                 IMPORTING es_output          = DATA(ls_output)
                                                           ev_ret_code        = DATA(lv_ret_code)
                                                           ev_err_text        = DATA(lv_err_text)
                                                           es_err_resp        = DATA(ls_err_resp) ).

             IF lo_docai->is_success( lv_ret_code ) = abap_true.
               out->write( |API call successful| ).
             ELSE.
               out->write( |Error occurred during API call| ).
               out->write( lv_err_text ).
             ENDIF.

             " Close HTTP connection
             lo_docai->close( ).

           CATCH /goog/cx_sdk INTO DATA(lo_exception).
             " Report the SDK exception text on the console
             out->write( lo_exception->get_text( ) ).
         ENDTRY.
       ENDMETHOD.
     ENDCLASS.
    

    Replace the following:

    • DEMO_DOC_PROCESSING: the client key name.
    • PROJECT_ID: the ID of the Google Cloud project where your source and target buckets exist.
    • LOCATION_ID: the processor's location.
    • PROCESSOR_ID: the ID of the processor.
    • SOURCE_BUCKET_URI: the URI of the Cloud Storage bucket folder where source documents are kept for processing.
    • TARGET_BUCKET_URI: the URI of the Cloud Storage bucket where the processed document (JSON file) is stored.
  4. Save and activate the changes.

  5. Run your application:

    1. Select the ABAP class ZGOOG_CL_QS_DOCUMENT_AI.
    2. Click Run > Run As > ABAP Application (Console). Alternatively, press F9.
  6. To validate the results, follow these steps:

    1. In the Google Cloud console, go to the Cloud Storage Buckets page.

    2. Open the target bucket. The processed document is stored as a JSON file. Because batch processing runs asynchronously, the file might take a short time to appear.
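As an illustration of the SOURCE_BUCKET_URI and TARGET_BUCKET_URI placeholders, the following minimal sketch fills the request structure with Cloud Storage URIs and checks that both use the gs:// scheme before any API call is made. The class name zcl_qs_docai_input_demo and the bucket and folder names are hypothetical and only serve the example; the structure type and field paths are the ones used in the quickstart class.

```abap
CLASS zcl_qs_docai_input_demo DEFINITION
  PUBLIC FINAL
  CREATE PUBLIC.

  PUBLIC SECTION.
    INTERFACES if_oo_adt_classrun.
ENDCLASS.

CLASS zcl_qs_docai_input_demo IMPLEMENTATION.
  METHOD if_oo_adt_classrun~main.
    " Same request structure as in the quickstart class
    DATA ls_input TYPE /goog/cl_documentai_v1=>ty_017.

    " Hypothetical bucket and folder names; replace them with your own
    ls_input-input_documents-gcs_prefix-gcs_uri_prefix        = 'gs://my-invoice-source/incoming/'.
    ls_input-document_output_config-gcs_output_config-gcs_uri = 'gs://my-invoice-target/processed/'.

    " Sanity check: both values must be Cloud Storage URIs (gs:// scheme)
    IF ls_input-input_documents-gcs_prefix-gcs_uri_prefix NP 'gs://*'
       OR ls_input-document_output_config-gcs_output_config-gcs_uri NP 'gs://*'.
      out->write( |Bucket URIs must start with gs://| ).
      RETURN.
    ENDIF.

    " Show the values that would be passed to the batch process call
    out->write( |Source prefix: { ls_input-input_documents-gcs_prefix-gcs_uri_prefix }| ).
    out->write( |Target URI: { ls_input-document_output_config-gcs_output_config-gcs_uri }| ).
  ENDMETHOD.
ENDCLASS.
```

This sketch makes no API call. It only shows the expected shape of the two URI values, so a check like this could be placed ahead of the batch_process_processors call in the quickstart class if early validation is wanted.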

What's next