Process using an AutoML model

You can use a trained AutoML Natural Language Classification or Entity Extraction model to process your documents. For information about building your own custom AutoML Natural Language model, see the AutoML Natural Language documentation.

Form processing with an AutoML model

You can use a custom AutoML model with any Document AI feature. The following example shows how to send an online (synchronous) request with a small form file.

v1beta2

REST & CMD LINE

This sample shows how to use the process method to process a small document (at most 5 pages and less than 20 MB) with an AutoML model. The example uses the access token for a service account set up for the project using the Cloud SDK. For instructions on installing the Cloud SDK, setting up a project with a service account, and obtaining an access token, see Before you begin.

The sample request body contains the required inputConfig field plus optional fields, including automlParams, which specifies the AutoML Natural Language model to process the document with.

Before using any of the request data, make the following replacements:

  • LOCATION: one of the following regional processing options:
    • us - United States
    • eu - European Union
  • PROJECT_ID: Your GCP project ID.
  • STORAGE_URI: the Cloud Storage URI of the document you want to process, including the gs:// prefix. You must have at least read access to the file. Example:
    • gs://cloud-samples-data/documentai/loan_form.pdf
  • MODEL_RESOURCE: an AutoML Natural Language model resource string. This only applies to Classification and Entity Extraction models. Format:
    • projects/PROJECT_ID/locations/LOCATION/models/MODEL_ID
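The replacements above can be assembled programmatically. The following is a minimal Python sketch, not part of the official samples: the helper names (model_resource_name, process_url, check_storage_uri) are hypothetical, and the 'us'/'eu' check simply mirrors the regional options listed above.

```python
def model_resource_name(project_id: str, location: str, model_id: str) -> str:
    """Format an AutoML model resource string in the documented layout."""
    return f"projects/{project_id}/locations/{location}/models/{model_id}"


def process_url(project_id: str, location: str) -> str:
    """Build the v1beta2 documents:process URL for a regional endpoint."""
    # Assumption: only the two regional options listed above are accepted.
    if location not in ("us", "eu"):
        raise ValueError("location must be 'us' or 'eu'")
    return (
        f"https://{location}-documentai.googleapis.com/v1beta2/"
        f"projects/{project_id}/locations/{location}/documents:process"
    )


def check_storage_uri(uri: str) -> str:
    """Return the URI unchanged if it carries the required gs:// prefix."""
    if not uri.startswith("gs://"):
        raise ValueError("STORAGE_URI must include the gs:// prefix")
    return uri
```

Validating the values before sending the request surfaces placeholder mistakes locally instead of as an API error.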

HTTP method and URL:

POST https://LOCATION-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/LOCATION/documents:process

Request JSON body:

{
   "inputConfig":{
      "gcsSource":{
         "uri":"STORAGE_URI"
      },
      "mimeType":"application/pdf"
   },
   "documentType":"general",
   "automlParams": {
     "model": "MODEL_RESOURCE"
   }
}
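If you prefer to generate the request body rather than edit it by hand, a sketch along these lines may help. It reproduces the JSON structure shown above; the function names build_request_body and write_request_file are illustrative and do not come from any Google library.

```python
import json


def build_request_body(storage_uri, model_resource, mime_type="application/pdf"):
    """Return the request body as a dict with the same fields as the JSON above."""
    return {
        "inputConfig": {
            "gcsSource": {"uri": storage_uri},
            "mimeType": mime_type,
        },
        "documentType": "general",
        "automlParams": {"model": model_resource},
    }


def write_request_file(body, path="request.json"):
    """Serialize the body to a file that can be passed to curl with -d @request.json."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(body, f, indent=2)
```

For example, write_request_file(build_request_body(STORAGE_URI, MODEL_RESOURCE)) would produce a request.json with your own values substituted for the two placeholders.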

To send your request, choose one of these options:

curl

Save the request body in a file called request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/LOCATION/documents:process"

PowerShell

Save the request body in a file called request.json, and execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-documentai.googleapis.com/v1beta2/projects/PROJECT_ID/locations/LOCATION/documents:process" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. The response body contains an instance of Document with applicable fields.
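To read the classification result from the raw JSON response, something like the following sketch could be used. It assumes the JSON response exposes a labels array whose items carry name and confidence, mirroring the fields that the client library samples on this page read; the sample payload is invented for illustration and is not real API output.

```python
import json


def extract_labels(response_text):
    """Return (name, confidence) pairs from a Document JSON response.

    Assumption: labels appear under a top-level "labels" key with
    "name" and "confidence" fields. Missing keys fall back to defaults.
    """
    document = json.loads(response_text)
    return [
        (label.get("name", ""), label.get("confidence", 0.0))
        for label in document.get("labels", [])
    ]


# Hypothetical payload, for illustration only.
sample = '{"labels": [{"name": "invoice", "confidence": 0.97}]}'
for name, confidence in extract_labels(sample):
    print(f"Label detected: {name}")
    print(f"Confidence: {confidence}")
```

Using .get with defaults keeps the parser from failing on a response that has no labels, for example when the model is an Entity Extraction model rather than a Classification model.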

Java


import com.google.cloud.documentai.v1beta2.AutoMlParams;
import com.google.cloud.documentai.v1beta2.Document;
import com.google.cloud.documentai.v1beta2.DocumentUnderstandingServiceClient;
import com.google.cloud.documentai.v1beta2.GcsSource;
import com.google.cloud.documentai.v1beta2.InputConfig;
import com.google.cloud.documentai.v1beta2.ProcessDocumentRequest;
import java.io.IOException;

public class ParseWithModelBeta {

  public static void parseWithModel() throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String location = "your-project-location"; // Format is "us" or "eu".
    // AutoML model name formatted as:
    //   "projects/[PROJECT_ID]/locations/[LOCATION]/models/[MODEL_ID]"
    String autoMlModel = "your-full-resource-model-name";
    String gcsUri = "gs://your-gcs-bucket/path/to/input/file.pdf";
    parseWithModel(projectId, location, autoMlModel, gcsUri);
  }

  public static void parseWithModel(
      String projectId, String location, String autoMlModel, String gcsUri) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DocumentUnderstandingServiceClient client = DocumentUnderstandingServiceClient.create()) {
      // Configure the request for processing the PDF
      String parent = String.format("projects/%s/locations/%s", projectId, location);

      AutoMlParams params = AutoMlParams.newBuilder().setModel(autoMlModel).build();

      GcsSource uri = GcsSource.newBuilder().setUri(gcsUri).build();

      // mime_type can be application/pdf, image/tiff,
      // image/gif, or application/json
      InputConfig config =
          InputConfig.newBuilder().setGcsSource(uri).setMimeType("application/pdf").build();

      ProcessDocumentRequest request =
          ProcessDocumentRequest.newBuilder()
              .setParent(parent)
              .setAutomlParams(params)
              .setInputConfig(config)
              .build();

      // Recognizes text entities in the PDF document
      Document response = client.processDocument(request);

      // Process the output
      for (Document.Label label : response.getLabelsList()) {
        System.out.printf("Label detected: %s%n", label.getName());
        System.out.printf("Confidence: %s%n", label.getConfidence());
      }
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION'; // Format is 'us' or 'eu'
// const autoMLModel = 'Full resource name of AutoML Natural Language model';
// const gcsInputUri = 'YOUR_SOURCE_PDF';

const {DocumentUnderstandingServiceClient} =
  require('@google-cloud/documentai').v1beta2;
const client = new DocumentUnderstandingServiceClient();

async function parseWithModel() {
  // Configure the request for processing the PDF
  const parent = `projects/${projectId}/locations/${location}`;
  const request = {
    parent,
    inputConfig: {
      gcsSource: {
        uri: gcsInputUri,
      },
      mimeType: 'application/pdf',
    },
    automlParams: {
      model: autoMLModel,
    },
  };

  // Recognizes text entities in the PDF document
  const [result] = await client.processDocument(request);

  for (const label of result.labels) {
    console.log(`Label detected: ${label.name}`);
    console.log(`Confidence: ${label.confidence}`);
  }
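}

// Run the sample and report any error instead of leaving the rejection unhandled.
parseWithModel().catch(console.error);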

Python

from google.cloud import documentai_v1beta2 as documentai


def parse_with_model(
    project_id="YOUR_PROJECT_ID",
    input_uri="gs://cloud-samples-data/documentai/invoice.pdf",
    automl_model_name="YOUR_AUTOML_MODEL_NAME",
):
    """Process a single document with the Document AI API.

    Args:
        project_id: your Google Cloud project id
        input_uri: the Cloud Storage URI of your input PDF
        automl_model_name: the AutoML model name formatted as:
            `projects/[PROJECT_ID]/locations/[LOCATION]/models/[MODEL_ID]`
            where LOCATION is a Compute Engine region, e.g. `us-central1`
    """

    client = documentai.DocumentUnderstandingServiceClient()

    gcs_source = documentai.types.GcsSource(uri=input_uri)

    # mime_type can be application/pdf, image/tiff,
    # image/gif, or application/json
    input_config = documentai.types.InputConfig(
        gcs_source=gcs_source, mime_type="application/pdf"
    )

    automl_params = documentai.types.AutoMlParams(model=automl_model_name)

    # Location can be 'us' or 'eu'
    parent = "projects/{}/locations/us".format(project_id)
    request = documentai.types.ProcessDocumentRequest(
        parent=parent, input_config=input_config, automl_params=automl_params
    )

    document = client.process_document(request=request)

    for label in document.labels:
        print("Label detected: {}".format(label.name))
        print("Confidence: {}".format(label.confidence))