此页面由 Cloud Translation API 翻译。

发送处理请求

设置 Google Cloud 账号并创建处理器后，您就可以向 Document AI 处理器发送请求了。

用于发送请求的代码对所有处理器都是相同的。您会在每个处理器输出的信息中看到处理器运作方式的差异。

在 Document AI 的 v1 API 版本中或在 Google Cloud 控制台中使用时，您可以向该特定处理器版本发送处理请求。如果您未指定处理器版本，则系统会使用默认版本。如需了解详情，请参阅管理处理器版本。

在线处理

借助在线（同步）请求，您可以发送单个文档以进行处理。Document AI 会立即处理请求并返回 document。

向处理器发送请求

以下代码示例展示了如何向处理器发送请求。

REST

此示例展示了如何在 rawDocument 对象中提供文档内容（通过 base64 编码字符串提供的字节形式的原始文档内容）。

或者，您也可以指定 inlineDocument，这与 Document AI 返回的 Document JSON 格式相同。这样，您就可以通过来回传递相同的格式来串联请求（例如，如果您对文档进行分类，然后提取其内容）。

在使用任何请求数据之前，请先进行以下替换：

LOCATION：处理器的位置，例如：
- us - 美国
- eu - 欧盟
PROJECT_ID：您的 Google Cloud 项目 ID。
PROCESSOR_ID：自定义处理器的 ID。
skipHumanReview：用于停用人工审核的布尔值（仅人机协同处理器支持）。
- true - 跳过人工审核
- false - 启用人工审核（默认）
MIME_TYPE^†：有效的 MIME 类型选项之一。
IMAGE_CONTENT^†：有效的嵌入式文档内容之一，表示为字节流。对于 JSON 表示法，二进制图片数据的 base64 编码（ASCII 字符串）。此字符串应类似于以下字符串：
- /9j/4QAYRXhpZgAA...9tAVx/zDQDlGxn//2Q==
如需了解详情，请参阅 Base64 编码主题。
FIELD_MASK：指定要包含在 Document 输出中的字段。这是以 FieldMask 格式表示的完全限定字段名称的逗号分隔列表。
- 示例：text,entities,pages.pageNumber
INDIVIDUAL_PAGES：要处理的各个网页的列表。
- 或者，您也可以提供 fromStart 或 fromEnd 字段，以处理文档开头或结尾的特定数量的页面。

† 您还可以在 inlineDocument 对象中使用 base64 编码的内容指定此内容。

HTTP 方法和网址：

POST https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process

请求 JSON 正文：

{
  "skipHumanReview": skipHumanReview,
  "rawDocument": {
    "mimeType": "MIME_TYPE",
    "content": "IMAGE_CONTENT"
  },
  "fieldMask": "FIELD_MASK",
  "processOptions": {
    "individualPageSelector" {
      "pages": [INDIVIDUAL_PAGES]
    }
  }
}

如需发送请求，请选择以下方式之一：

curl

注意：以下命令假定您已使用您的用户账号通过运行 gcloud init 或 gcloud auth login 登录 gcloud CLI，或者使用了 Cloud Shell，这会使您自动登录 gcloud CLI。您可以运行 gcloud auth list 来检查当前活跃的账号。

将请求正文保存在名为 request.json 的文件中，然后执行以下命令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process"

PowerShell

注意：以下命令假定您已使用您的用户账号通过运行 gcloud init 或 gcloud auth login 登录 gcloud CLI。您可以运行 gcloud auth list 来检查当前活跃的账号。

将请求正文保存在名为 request.json 的文件中，然后执行以下命令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process" | Select-Object -Expand Content

如果请求成功，服务器将返回一个 200 OK HTTP 状态代码以及 JSON 格式的响应。响应正文包含一个 Document 实例。

向处理器版本发送请求

在使用任何请求数据之前，请先进行以下替换：

LOCATION：处理器的位置，例如：
- us - 美国
- eu - 欧盟
PROJECT_ID：您的 Google Cloud 项目 ID。
PROCESSOR_ID：自定义处理器的 ID。
PROCESSOR_VERSION：处理器版本标识符。如需了解详情，请参阅选择处理器版本。例如：
- pretrained-TYPE-vX.X-YYYY-MM-DD
- stable
- rc
skipHumanReview：用于停用人工审核的布尔值（仅人机协同处理器支持）。
- true - 跳过人工审核
- false - 启用人工审核（默认）
MIME_TYPE^†：有效的 MIME 类型选项之一。
IMAGE_CONTENT^†：有效的嵌入式文档内容之一，表示为字节流。对于 JSON 表示法，二进制图片数据的 base64 编码（ASCII 字符串）。此字符串应类似于以下字符串：
- /9j/4QAYRXhpZgAA...9tAVx/zDQDlGxn//2Q==
如需了解详情，请参阅 Base64 编码主题。
FIELD_MASK：指定要包含在 Document 输出中的字段。这是以 FieldMask 格式表示的完全限定字段名称的逗号分隔列表。
- 示例：text,entities,pages.pageNumber

† 您还可以在 inlineDocument 对象中使用 base64 编码的内容指定此内容。

HTTP 方法和网址：

POST https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions/PROCESSOR_VERSION:process

请求 JSON 正文：

{
  "skipHumanReview": skipHumanReview,
  "rawDocument": {
    "mimeType": "MIME_TYPE",
    "content": "IMAGE_CONTENT"
  },
  "fieldMask": "FIELD_MASK"
}

如需发送请求，请选择以下方式之一：

curl

将请求正文保存在名为 request.json 的文件中，然后执行以下命令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions/PROCESSOR_VERSION:process"

PowerShell

注意：以下命令假定您已使用您的用户账号通过运行 gcloud init 或 gcloud auth login 登录 gcloud CLI。您可以运行 gcloud auth list 来检查当前活跃的账号。

将请求正文保存在名为 request.json 的文件中，然后执行以下命令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions/PROCESSOR_VERSION:process" | Select-Object -Expand Content

如果请求成功，服务器将返回一个 200 OK HTTP 状态代码以及 JSON 格式的响应。响应正文包含一个 Document 实例。

C#

如需了解详情，请参阅 Document AI C# API 参考文档。

如需向 Document AI 进行身份验证，请设置应用默认凭据。如需了解详情，请参阅为本地开发环境设置身份验证。


using Google.Cloud.DocumentAI.V1;
using Google.Protobuf;
using System;
using System.IO;

public class QuickstartSample
{
    public Document Quickstart(
        string projectId = "your-project-id",
        string locationId = "your-processor-location",
        string processorId = "your-processor-id",
        string localPath = "my-local-path/my-file-name",
        string mimeType = "application/pdf"
    )
    {
        // Create client
        var client = new DocumentProcessorServiceClientBuilder
        {
            Endpoint = $"{locationId}-documentai.googleapis.com"
        }.Build();

        // Read in local file
        using var fileStream = File.OpenRead(localPath);
        var rawDocument = new RawDocument
        {
            Content = ByteString.FromStream(fileStream),
            MimeType = mimeType
        };

        // Initialize request argument(s)
        var request = new ProcessRequest
        {
            Name = ProcessorName.FromProjectLocationProcessor(projectId, locationId, processorId).ToString(),
            RawDocument = rawDocument
        };

        // Make the request
        var response = client.ProcessDocument(request);

        var document = response.Document;
        Console.WriteLine(document.Text);
        return document;
    }
}

Java

如需了解详情，请参阅 Document AI Java API 参考文档。

如需向 Document AI 进行身份验证，请设置应用默认凭据。如需了解详情，请参阅为本地开发环境设置身份验证。


import com.google.cloud.documentai.v1.Document;
import com.google.cloud.documentai.v1.DocumentProcessorServiceClient;
import com.google.cloud.documentai.v1.DocumentProcessorServiceSettings;
import com.google.cloud.documentai.v1.ProcessRequest;
import com.google.cloud.documentai.v1.ProcessResponse;
import com.google.cloud.documentai.v1.RawDocument;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;

public class ProcessDocument {
  public static void processDocument()
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String location = "your-project-location"; // Format is "us" or "eu".
    String processerId = "your-processor-id";
    String filePath = "path/to/input/file.pdf";
    processDocument(projectId, location, processerId, filePath);
  }

  public static void processDocument(
      String projectId, String location, String processorId, String filePath)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs
    // to be created
    // once, and can be reused for multiple requests. After completing all of your
    // requests, call
    // the "close" method on the client to safely clean up any remaining background
    // resources.
    String endpoint = String.format("%s-documentai.googleapis.com:443", location);
    DocumentProcessorServiceSettings settings =
        DocumentProcessorServiceSettings.newBuilder().setEndpoint(endpoint).build();
    try (DocumentProcessorServiceClient client = DocumentProcessorServiceClient.create(settings)) {
      // The full resource name of the processor, e.g.:
      // projects/project-id/locations/location/processor/processor-id
      // You must create new processors in the Cloud Console first
      String name =
          String.format("projects/%s/locations/%s/processors/%s", projectId, location, processorId);

      // Read the file.
      byte[] imageFileData = Files.readAllBytes(Paths.get(filePath));

      // Convert the image data to a Buffer and base64 encode it.
      ByteString content = ByteString.copyFrom(imageFileData);

      RawDocument document =
          RawDocument.newBuilder().setContent(content).setMimeType("application/pdf").build();

      // Configure the process request.
      ProcessRequest request =
          ProcessRequest.newBuilder().setName(name).setRawDocument(document).build();

      // Recognizes text entities in the PDF document
      ProcessResponse result = client.processDocument(request);
      Document documentResponse = result.getDocument();

      // Get all of the document text as one big string
      String text = documentResponse.getText();

      // Read the text recognition output from the processor
      System.out.println("The document contains the following paragraphs:");
      Document.Page firstPage = documentResponse.getPages(0);
      List<Document.Page.Paragraph> paragraphs = firstPage.getParagraphsList();

      for (Document.Page.Paragraph paragraph : paragraphs) {
        String paragraphText = getText(paragraph.getLayout().getTextAnchor(), text);
        System.out.printf("Paragraph text:\n%s\n", paragraphText);
      }

      // Form parsing provides additional output about
      // form-formatted PDFs. You must create a form
      // processor in the Cloud Console to see full field details.
      System.out.println("The following form key/value pairs were detected:");

      for (Document.Page.FormField field : firstPage.getFormFieldsList()) {
        String fieldName = getText(field.getFieldName().getTextAnchor(), text);
        String fieldValue = getText(field.getFieldValue().getTextAnchor(), text);

        System.out.println("Extracted form fields pair:");
        System.out.printf("\t(%s, %s))\n", fieldName, fieldValue);
      }
    }
  }

  // Extract shards from the text field
  private static String getText(Document.TextAnchor textAnchor, String text) {
    if (textAnchor.getTextSegmentsList().size() > 0) {
      int startIdx = (int) textAnchor.getTextSegments(0).getStartIndex();
      int endIdx = (int) textAnchor.getTextSegments(0).getEndIndex();
      return text.substring(startIdx, endIdx);
    }
    return "[NO TEXT]";
  }
}

Node.js

如需了解详情，请参阅 Document AI Node.js API 参考文档。

如需向 Document AI 进行身份验证，请设置应用默认凭据。如需了解详情，请参阅为本地开发环境设置身份验证。

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION'; // Format is 'us' or 'eu'
// const processorId = 'YOUR_PROCESSOR_ID'; // Create processor in Cloud Console
// const filePath = '/path/to/local/pdf';

const {DocumentProcessorServiceClient} =
  require('@google-cloud/documentai').v1;

// Instantiates a client
const client = new DocumentProcessorServiceClient();

async function processDocument() {
  // The full resource name of the processor, e.g.:
  // projects/project-id/locations/location/processor/processor-id
  // You must create new processors in the Cloud Console first
  const name = `projects/${projectId}/locations/${location}/processors/${processorId}`;

  // Read the file into memory.
  const fs = require('fs').promises;
  const imageFile = await fs.readFile(filePath);

  // Convert the image data to a Buffer and base64 encode it.
  const encodedImage = Buffer.from(imageFile).toString('base64');

  const request = {
    name,
    rawDocument: {
      content: encodedImage,
      mimeType: 'application/pdf',
    },
  };

  // Recognizes text entities in the PDF document
  const [result] = await client.processDocument(request);
  const {document} = result;

  // Get all of the document text as one big string
  const {text} = document;

  // Extract shards from the text field
  const getText = textAnchor => {
    if (!textAnchor.textSegments || textAnchor.textSegments.length === 0) {
      return '';
    }

    // First shard in document doesn't have startIndex property
    const startIndex = textAnchor.textSegments[0].startIndex || 0;
    const endIndex = textAnchor.textSegments[0].endIndex;

    return text.substring(startIndex, endIndex);
  };

  // Read the text recognition output from the processor
  console.log('The document contains the following paragraphs:');
  const [page1] = document.pages;
  const {paragraphs} = page1;

  for (const paragraph of paragraphs) {
    const paragraphText = getText(paragraph.layout.textAnchor);
    console.log(`Paragraph text:\n${paragraphText}`);
  }

  // Form parsing provides additional output about
  // form-formatted PDFs. You  must create a form
  // processor in the Cloud Console to see full field details.
  console.log('\nThe following form key/value pairs were detected:');

  const {formFields} = page1;
  for (const field of formFields) {
    const fieldName = getText(field.fieldName.textAnchor);
    const fieldValue = getText(field.fieldValue.textAnchor);

    console.log('Extracted key value pair:');
    console.log(`\t(${fieldName}, ${fieldValue})`);
  }
}

Python

如需了解详情，请参阅 Document AI Python API 参考文档。

如需向 Document AI 进行身份验证，请设置应用默认凭据。如需了解详情，请参阅为本地开发环境设置身份验证。

from typing import Optional

from google.api_core.client_options import ClientOptions
from google.cloud import documentai  # type: ignore

# TODO(developer): Uncomment these variables before running the sample.
# project_id = "YOUR_PROJECT_ID"
# location = "YOUR_PROCESSOR_LOCATION" # Format is "us" or "eu"
# processor_id = "YOUR_PROCESSOR_ID" # Create processor before running sample
# file_path = "/path/to/local/pdf"
# mime_type = "application/pdf" # Refer to https://cloud.google.com/document-ai/docs/file-types for supported file types
# field_mask = "text,entities,pages.pageNumber"  # Optional. The fields to return in the Document object.
# processor_version_id = "YOUR_PROCESSOR_VERSION_ID" # Optional. Processor version to use


def process_document_sample(
    project_id: str,
    location: str,
    processor_id: str,
    file_path: str,
    mime_type: str,
    field_mask: Optional[str] = None,
    processor_version_id: Optional[str] = None,
) -> None:
    # You must set the `api_endpoint` if you use a location other than "us".
    opts = ClientOptions(api_endpoint=f"{location}-documentai.googleapis.com")

    client = documentai.DocumentProcessorServiceClient(client_options=opts)

    if processor_version_id:
        # The full resource name of the processor version, e.g.:
        # `projects/{project_id}/locations/{location}/processors/{processor_id}/processorVersions/{processor_version_id}`
        name = client.processor_version_path(
            project_id, location, processor_id, processor_version_id
        )
    else:
        # The full resource name of the processor, e.g.:
        # `projects/{project_id}/locations/{location}/processors/{processor_id}`
        name = client.processor_path(project_id, location, processor_id)

    # Read the file into memory
    with open(file_path, "rb") as image:
        image_content = image.read()

    # Load binary data
    raw_document = documentai.RawDocument(content=image_content, mime_type=mime_type)

    # For more information: https://cloud.google.com/document-ai/docs/reference/rest/v1/ProcessOptions
    # Optional: Additional configurations for processing.
    process_options = documentai.ProcessOptions(
        # Process only specific pages
        individual_page_selector=documentai.ProcessOptions.IndividualPageSelector(
            pages=[1]
        )
    )

    # Configure the process request
    request = documentai.ProcessRequest(
        name=name,
        raw_document=raw_document,
        field_mask=field_mask,
        process_options=process_options,
    )

    result = client.process_document(request=request)

    # For a full list of `Document` object attributes, reference this page:
    # https://cloud.google.com/document-ai/docs/reference/rest/v1/Document
    document = result.document

    # Read the text recognition output from the processor
    print("The document contains the following text:")
    print(document.text)

批处理

借助批量（异步）请求，您可以在单个请求中发送多个文档。Document AI 会返回 operation，您可以通过该返回值轮询请求状态。此操作完成后，其中会包含一个 BatchProcessMetadata，指向存储处理结果的 Cloud Storage 存储分区。

如果您要访问的输入文件位于其他项目的存储分区中，则必须先提供对该存储分区的访问权限，然后才能访问这些文件。请参阅设置文件访问权限。

向处理器发送请求

以下代码示例展示了如何向处理器发送批处理请求。

REST

此示例展示了如何向 batchProcess 方法发送 POST 请求，以进行大型文档异步处理。该示例使用通过 Google Cloud CLI 为项目设置的服务账号的访问令牌。如需了解有关安装 Google Cloud CLI、使用服务账号设置项目以及获取访问令牌的说明，请参阅开始前须知。

batchProcess 请求会启动长时间运行的操作，并将结果存储在 Cloud Storage 存储分区中。此示例还展示了如何在长时间运行的操作启动后获取其状态。