Document AI Client Libraries

This page shows how to get started with the Cloud Client Libraries for the Document AI API. Read more about the client libraries for Cloud APIs, including the older Google APIs Client Libraries, in Client Libraries Explained.

Installing the client library

Java

For more information, see Setting Up a Java Development Environment.

Se você estiver usando o Maven, adicione o código abaixo ao arquivo pom.xml. Para mais informações sobre BOMs, consulte BOM das bibliotecas do Google Cloud Platform.

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>16.1.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-document-ai</artifactId>
    <version>0.3.5</version>
  </dependency>

Se você estiver usando o Gradle, adicione isto às dependências:

compile 'com.google.cloud:google-cloud-document-ai:0.3.5'

Se você estiver usando o sbt, adicione o seguinte às suas dependências:

libraryDependencies += "com.google.cloud" % "google-cloud-document-ai" % "0.3.5"

Caso você esteja usando o IntelliJ ou o Eclipse, poderá adicionar bibliotecas de cliente ao seu projeto usando estes plug-ins de ambiente de desenvolvimento integrado:

Os plug-ins também oferecem outras funcionalidades, como gerenciamento de chaves de contas de serviço. Consulte a documentação de cada plug-in para mais detalhes.

Node.js

For more information, see Setting Up a Node.js Development Environment.

npm install @google-cloud/documentai

Python

For more information, see Setting Up a Python Development Environment.

pip install --upgrade google-cloud-documentai

Setting up authentication

To run the client library, you must first set up authentication by creating a service account and setting an environment variable. Complete the following steps to set up authentication. For other ways to authenticate, see the GCP authentication documentation.

Console do Cloud

  1. No Console do Cloud, acesse a página Criar chave da conta de serviço.

    Acessar página "Criar chave da conta de serviço"
  2. Na lista Conta de serviço, selecione Nova conta de serviço.
  3. No campo Nome da conta de serviço, insira um nome.
  4. Não selecione um valor na lista Papel. Não é necessário ter um papel para acessar esse serviço.
  5. Clique em Criar. Uma nota aparecerá informando que esta conta de serviço não tem papel.
  6. Clique em Criar sem papel. O download de um arquivo JSON que contém sua chave é feito no seu computador.

Linha de comando

É possível executar os seguintes comandos usando o SDK do Cloud na máquina local ou no Cloud Shell.

  1. Crie a conta de serviço. Substitua NAME por um nome para a conta de serviço.

    gcloud iam service-accounts create NAME
  2. Gere o arquivo de chave. Substitua FILE_NAME pelo nome do arquivo de chave.

    gcloud iam service-accounts keys create FILE_NAME.json --iam-account=NAME@PROJECT_ID.iam.gserviceaccount.com

Forneça credenciais de autenticação ao código do aplicativo definindo a variável de ambiente GOOGLE_APPLICATION_CREDENTIALS. Substitua [PATH] pelo caminho do arquivo JSON que contém sua chave da conta de serviço. Essa variável só se aplica à sessão de shell atual. Assim, se você abrir uma nova sessão, precisará definir a variável novamente.

Linux ou macOS

export GOOGLE_APPLICATION_CREDENTIALS="[PATH]"

Exemplo:

export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/my-key.json"

Windows

Com o PowerShell:

$env:GOOGLE_APPLICATION_CREDENTIALS="[PATH]"

Exemplo:

$env:GOOGLE_APPLICATION_CREDENTIALS="C:\Users\username\Downloads\my-key.json"

Com prompt de comando:

set GOOGLE_APPLICATION_CREDENTIALS=[PATH]

Using the client library

The following example shows how to use the client library.

Java


import com.google.cloud.documentai.v1beta2.Document;
import com.google.cloud.documentai.v1beta2.DocumentUnderstandingServiceClient;
import com.google.cloud.documentai.v1beta2.GcsSource;
import com.google.cloud.documentai.v1beta2.InputConfig;
import com.google.cloud.documentai.v1beta2.ProcessDocumentRequest;
import java.io.IOException;

public class QuickStart {

  public static void quickStart() throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String location = "your-project-location"; // Format is "us" or "eu".
    String inputGcsUri = "gs://your-gcs-bucket/path/to/input/file.json";
    quickStart(projectId, location, inputGcsUri);
  }

  public static void quickStart(String projectId, String location, String inputGcsUri)
      throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DocumentUnderstandingServiceClient client = DocumentUnderstandingServiceClient.create()) {
      // Configure the request for processing a single document
      String parent = String.format("projects/%s/locations/%s", projectId, location);

      GcsSource uri = GcsSource.newBuilder().setUri(inputGcsUri).build();

      // mime_type can be application/pdf, image/tiff,
      // and image/gif, or application/json
      InputConfig config =
          InputConfig.newBuilder().setGcsSource(uri).setMimeType("application/pdf").build();

      ProcessDocumentRequest request =
          ProcessDocumentRequest.newBuilder().setParent(parent).setInputConfig(config).build();

      // Recognizes text entities in the PDF document
      Document response = client.processDocument(request);

      // Get all of the document text as one big string
      String text = response.getText();

      // Process the output
      for (Document.Entity entity : response.getEntitiesList()) {
        System.out.printf("Entity text: %s\n", getText(entity, text));
        System.out.printf("Entity type: %s\n", entity.getType());
        System.out.printf("Entity mention text: %s\n", entity.getMentionText());
      }
    }
  }

  private static String getText(Document.Entity entity, String text) {
    int startIdx = (int) entity.getTextAnchor().getTextSegments(0).getStartIndex();
    int endIdx = (int) entity.getTextAnchor().getTextSegments(0).getEndIndex();
    return text.substring(startIdx, endIdx);
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION'; // Format is 'us' or 'eu'
// const processorId = 'YOUR_PROCESSOR_ID'; // Create processor in Cloud Console
// const filePath = '/path/to/local/pdf';

const {
  DocumentProcessorServiceClient,
} = require('@google-cloud/documentai').v1beta3;

// Instantiates a client
const client = new DocumentProcessorServiceClient();

async function quickstart() {
  // The full resource name of the processor, e.g.:
  // projects/project-id/locations/location/processor/processor-id
  // You must create new processors in the Cloud Console first
  const name = `projects/${projectId}/locations/${location}/processors/${processorId}`;

  // Read the file into memory.
  const fs = require('fs').promises;
  const imageFile = await fs.readFile(filePath);

  // Convert the image data to a Buffer and base64 encode it.
  const encodedImage = Buffer.from(imageFile).toString('base64');

  const request = {
    name,
    document: {
      content: encodedImage,
      mimeType: 'application/pdf',
    },
  };

  // Recognizes text entities in the PDF document
  const [result] = await client.processDocument(request);
  const {document} = result;

  // Get all of the document text as one big string
  const {text} = document;

  // Extract shards from the text field
  const getText = textAnchor => {
    if (!textAnchor.textSegments || textAnchor.textSegments.length === 0) {
      return '';
    }

    // First shard in document doesn't have startIndex property
    const startIndex = textAnchor.textSegments[0].startIndex || 0;
    const endIndex = textAnchor.textSegments[0].endIndex;

    return text.substring(startIndex, endIndex);
  };

  // Read the text recognition output from the processor
  console.log('The document contains the following paragraphs:');
  const [page1] = document.pages;
  const {paragraphs} = page1;

  for (const paragraph of paragraphs) {
    const paragraphText = getText(paragraph.layout.textAnchor);
    console.log(`Paragraph text:\n${paragraphText}`);
  }
}

Python

from google.cloud import documentai_v1beta2 as documentai


def main(
    project_id="YOUR_PROJECT_ID",
    input_uri="gs://cloud-samples-data/documentai/invoice.pdf",
):
    """Process a single document with the Document AI API, including
    text extraction and entity extraction."""

    client = documentai.DocumentUnderstandingServiceClient()

    gcs_source = documentai.types.GcsSource(uri=input_uri)

    # mime_type can be application/pdf, image/tiff,
    # and image/gif, or application/json
    input_config = documentai.types.InputConfig(
        gcs_source=gcs_source, mime_type="application/pdf"
    )

    # Location can be 'us' or 'eu'
    parent = "projects/{}/locations/us".format(project_id)
    request = documentai.types.ProcessDocumentRequest(
        parent=parent, input_config=input_config
    )

    document = client.process_document(request=request)

    # All text extracted from the document
    print("Document Text: {}".format(document.text))

    def _get_text(el):
        """Convert text offset indexes into text snippets."""
        response = ""
        # If a text segment spans several lines, it will
        # be stored in different text segments.
        for segment in el.text_anchor.text_segments:
            start_index = segment.start_index
            end_index = segment.end_index
            response += document.text[start_index:end_index]
        return response

    for entity in document.entities:
        print("Entity type: {}".format(entity.type_))
        print("Text: {}".format(_get_text(entity)))
        print("Mention text: {}\n".format(entity.mention_text))

Additional resources