Optical Character Recognition (OCR) tutorial (1st gen)


Learn how to perform optical character recognition (OCR) on Google Cloud. This tutorial demonstrates how to upload image files to Cloud Storage, extract text from the images using the Cloud Vision API, translate the text using the Google Cloud Translation API, and save the translations back to Cloud Storage. Pub/Sub is used to queue the various tasks and trigger the right Cloud Run functions to carry them out.

For more information about sending a text detection (OCR) request, see Detect text in images, Detect handwriting in images, or Detect text in files (PDF/TIFF).

Objectives

Costs

In this document, you use the following billable components of Google Cloud:

  • Cloud Run functions
  • Pub/Sub
  • Cloud Storage
  • Cloud Translation API
  • Cloud Vision

To generate a cost estimate based on your projected usage, use the pricing calculator.

New Google Cloud users might be eligible for a free trial.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the Cloud Functions, Cloud Build, Cloud Pub/Sub, Cloud Storage, Cloud Translation, and Cloud Vision APIs.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

    Enable the APIs

  5. Install the Google Cloud CLI.

  6. If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

  7. To initialize the gcloud CLI, run the following command:

    gcloud init
  8. If you already have the gcloud CLI installed, update it by running the following command:

    gcloud components update
  9. Prepare your development environment.

    Visualizing the flow of data

    The flow of data in the OCR tutorial application involves several steps:

    1. An image that contains text in any language is uploaded to Cloud Storage.
    2. A Cloud Run function is triggered, which uses the Vision API to extract the text and detect the source language.
    3. The text is queued for translation by publishing a message to a Pub/Sub topic. A translation is queued for each target language that differs from the source language.
    4. If a target language matches the source language, the translation queue is skipped and the text is sent to the result queue, which is a different Pub/Sub topic.
    5. A Cloud Run function uses the Translation API to translate the text in the translation queue. The translated result is sent to the result queue.
    6. Another Cloud Run function saves the translated text from the result queue to Cloud Storage.
    7. The results are found in Cloud Storage as text files for each translation.

    It may help to visualize the steps:
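
As a rough illustration of steps 3 and 4, the routing rule can be written as a small pure function. This is a minimal sketch rather than the tutorial's code: the function name `route_messages` and the topic names are hypothetical, and the special case for `"und"` (an undetermined source language) is taken from the Python and Go samples later on this page.

```python
from typing import Dict, List


def route_messages(
    src_lang: str,
    target_langs: List[str],
    translate_topic: str,
    result_topic: str,
) -> Dict[str, str]:
    """Map each target language to the Pub/Sub topic its message should go to."""
    routes = {}
    for target in target_langs:
        # Text that is already in the target language, or whose language
        # could not be determined ("und"), skips the translation queue.
        if src_lang == target or src_lang == "und":
            routes[target] = result_topic
        else:
            routes[target] = translate_topic
    return routes


# Hypothetical topic names, for illustration only.
print(route_messages("en", ["en", "fr", "es"], "translate-topic", "result-topic"))
# prints {'en': 'result-topic', 'fr': 'translate-topic', 'es': 'translate-topic'}
```

Keeping the decision separate from the publishing call makes it easy to check the routing without touching Pub/Sub.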

    Preparing the application

    1. Create a Cloud Storage bucket to upload images, where YOUR_IMAGE_BUCKET_NAME is a globally unique bucket name:

      gcloud storage buckets create gs://YOUR_IMAGE_BUCKET_NAME
    2. Create a Cloud Storage bucket to save the text translations, where YOUR_RESULT_BUCKET_NAME is a globally unique bucket name:

      gcloud storage buckets create gs://YOUR_RESULT_BUCKET_NAME
    3. Create a Pub/Sub topic to publish translation requests to, where YOUR_TRANSLATE_TOPIC_NAME is the name of your translation request topic:

      gcloud pubsub topics create YOUR_TRANSLATE_TOPIC_NAME
    4. Create a Pub/Sub topic to publish finished translation results to, where YOUR_RESULT_TOPIC_NAME is the name of your translation result topic:

      gcloud pubsub topics create YOUR_RESULT_TOPIC_NAME
    5. Clone the sample app repository to your local machine:

      Node.js

      git clone https://github.com/GoogleCloudPlatform/nodejs-docs-samples.git

      Alternatively, you can download the sample as a ZIP file and extract it.

      Python

      git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git

      Alternatively, you can download the sample as a ZIP file and extract it.

      Go

      git clone https://github.com/GoogleCloudPlatform/golang-samples.git

      Alternatively, you can download the sample as a ZIP file and extract it.

      Java

      git clone https://github.com/GoogleCloudPlatform/java-docs-samples.git

      Alternatively, you can download the sample as a ZIP file and extract it.

    6. Change to the directory that contains the Cloud Run functions sample code:

      Node.js

      cd nodejs-docs-samples/functions/ocr/app/

      Python

      cd python-docs-samples/functions/ocr/app/

      Go

      cd golang-samples/functions/ocr/app/

      Java

      cd java-docs-samples/functions/ocr/ocr-process-image/

    Understanding the code

    Importing dependencies

    The application must import several dependencies in order to communicate with Google Cloud services:

    Node.js

    // Get a reference to the Pub/Sub component
    const {PubSub} = require('@google-cloud/pubsub');
    const pubsub = new PubSub();
    // Get a reference to the Cloud Storage component
    const {Storage} = require('@google-cloud/storage');
    const storage = new Storage();
    
    // Get a reference to the Cloud Vision API component
    const Vision = require('@google-cloud/vision');
    const vision = new Vision.ImageAnnotatorClient();
    
    // Get a reference to the Translate API component
    const {Translate} = require('@google-cloud/translate').v2;
    const translate = new Translate();
    

    Python

    import base64
    import json
    import os
    from typing import Dict, TypeVar
    
    from google.cloud import pubsub_v1
    from google.cloud import storage
    from google.cloud import translate_v2 as translate
    from google.cloud import vision
    
    vision_client = vision.ImageAnnotatorClient()
    translate_client = translate.Client()
    publisher = pubsub_v1.PublisherClient()
    storage_client = storage.Client()
    
    project_id = os.environ["GCP_PROJECT"]

    Go

    
    // Package ocr contains Go samples for creating OCR
    // (Optical Character Recognition) Cloud functions.
    package ocr
    
    import (
    	"context"
    	"fmt"
    	"os"
    	"strings"
    	"time"
    
    	"cloud.google.com/go/pubsub"
    	"cloud.google.com/go/storage"
    	"cloud.google.com/go/translate"
    	vision "cloud.google.com/go/vision/apiv1"
    	"golang.org/x/text/language"
    )
    
    type ocrMessage struct {
    	Text     string       `json:"text"`
    	FileName string       `json:"fileName"`
    	Lang     language.Tag `json:"lang"`
    	SrcLang  language.Tag `json:"srcLang"`
    }
    
    // GCSEvent is the payload of a GCS event.
    type GCSEvent struct {
    	Bucket         string    `json:"bucket"`
    	Name           string    `json:"name"`
    	Metageneration string    `json:"metageneration"`
    	ResourceState  string    `json:"resourceState"`
    	TimeCreated    time.Time `json:"timeCreated"`
    	Updated        time.Time `json:"updated"`
    }
    
    // PubSubMessage is the payload of a Pub/Sub event.
    // See the documentation for more details:
    // https://cloud.google.com/pubsub/docs/reference/rest/v1/PubsubMessage
    type PubSubMessage struct {
    	Data []byte `json:"data"`
    }
    
    var (
    	visionClient    *vision.ImageAnnotatorClient
    	translateClient *translate.Client
    	pubsubClient    *pubsub.Client
    	storageClient   *storage.Client
    
    	projectID      string
    	resultBucket   string
    	resultTopic    string
    	toLang         []string
    	translateTopic string
    )
    
    func setup(ctx context.Context) error {
    	projectID = os.Getenv("GCP_PROJECT")
    	resultBucket = os.Getenv("RESULT_BUCKET")
    	resultTopic = os.Getenv("RESULT_TOPIC")
    	toLang = strings.Split(os.Getenv("TO_LANG"), ",")
    	translateTopic = os.Getenv("TRANSLATE_TOPIC")
    
    	var err error // Prevent shadowing clients with :=.
    
    	if visionClient == nil {
    		visionClient, err = vision.NewImageAnnotatorClient(ctx)
    		if err != nil {
    			return fmt.Errorf("vision.NewImageAnnotatorClient: %w", err)
    		}
    	}
    
    	if translateClient == nil {
    		translateClient, err = translate.NewClient(ctx)
    		if err != nil {
    			return fmt.Errorf("translate.NewClient: %w", err)
    		}
    	}
    
    	if pubsubClient == nil {
    		pubsubClient, err = pubsub.NewClient(ctx, projectID)
    		if err != nil {
    			return fmt.Errorf("pubsub.NewClient: %w", err)
    		}
    	}
    
    	if storageClient == nil {
    		storageClient, err = storage.NewClient(ctx)
    		if err != nil {
    			return fmt.Errorf("storage.NewClient: %w", err)
    		}
    	}
    	return nil
    }
    

    Java

    public class OcrProcessImage implements BackgroundFunction<GcsEvent> {
      // TODO<developer> set these environment variables
      private static final String PROJECT_ID = System.getenv("GCP_PROJECT");
      private static final String TRANSLATE_TOPIC_NAME = System.getenv("TRANSLATE_TOPIC");
      private static final String[] TO_LANGS = System.getenv("TO_LANG").split(",");
    
      private static final Logger logger = Logger.getLogger(OcrProcessImage.class.getName());
      private static final String LOCATION_NAME = LocationName.of(PROJECT_ID, "global").toString();
      private Publisher publisher;
    
      public OcrProcessImage() throws IOException {
        publisher = Publisher.newBuilder(
            ProjectTopicName.of(PROJECT_ID, TRANSLATE_TOPIC_NAME)).build();
      }
    }
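
The samples above read their settings from environment variables (`GCP_PROJECT`, `TRANSLATE_TOPIC`, `RESULT_TOPIC`, `RESULT_BUCKET`, and `TO_LANG`, as in the Go `setup` function). The following is a minimal sketch of loading and validating them in one place. The `OcrConfig` class and `load_config` function are hypothetical names, and requiring all five variables is a simplification, since each individual function uses only some of them.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping

REQUIRED_VARS = ["GCP_PROJECT", "TRANSLATE_TOPIC", "RESULT_TOPIC", "RESULT_BUCKET", "TO_LANG"]


@dataclass(frozen=True)
class OcrConfig:
    project_id: str
    translate_topic: str
    result_topic: str
    result_bucket: str
    to_langs: List[str]


def load_config(env: Mapping[str, str] = os.environ) -> OcrConfig:
    """Build the configuration from a mapping of environment variables."""
    # Fail early with one clear message instead of a KeyError deep in a handler.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return OcrConfig(
        project_id=env["GCP_PROJECT"],
        translate_topic=env["TRANSLATE_TOPIC"],
        result_topic=env["RESULT_TOPIC"],
        result_bucket=env["RESULT_BUCKET"],
        # TO_LANG is a comma-separated list such as "en,fr,es".
        to_langs=[lang.strip() for lang in env["TO_LANG"].split(",") if lang.strip()],
    )


# Example with placeholder values passed explicitly instead of os.environ.
example = load_config({
    "GCP_PROJECT": "my-project",
    "TRANSLATE_TOPIC": "translate-topic",
    "RESULT_TOPIC": "result-topic",
    "RESULT_BUCKET": "result-bucket",
    "TO_LANG": "en,fr,es",
})
print(example.to_langs)  # prints ['en', 'fr', 'es']
```

Accepting the mapping as a parameter keeps the loader testable without modifying the real process environment.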

    Processing images

    The following function reads an uploaded image file from Cloud Storage and calls a function to detect whether the image contains text:

    Node.js

    /**
     * This function is exported by index.js, and is executed when
     * a file is uploaded to the Cloud Storage bucket you created
     * for uploading images.
     *
     * @param {object} event A Google Cloud Storage File object.
     */
    exports.processImage = async event => {
      const {bucket, name} = event;
    
      if (!bucket) {
        throw new Error(
          'Bucket not provided. Make sure you have a "bucket" property in your request'
        );
      }
      if (!name) {
        throw new Error(
          'Filename not provided. Make sure you have a "name" property in your request'
        );
      }
    
      await detectText(bucket, name);
      console.log(`File ${name} processed.`);
    };

    Python

    def process_image(file_info: dict, context: dict) -> None:
        """Cloud Function triggered by Cloud Storage when a file is changed.
    
        Args:
            file_info: Metadata of the changed file, provided by the
                triggering Cloud Storage event.
            context: a dictionary containing metadata about the event.
    
        Returns:
            None; the output is written to stdout and Stackdriver Logging.
        """
        bucket = validate_message(file_info, "bucket")
        name = validate_message(file_info, "name")
    
        detect_text(bucket, name)
    
        print(f"File '{file_info['name']}' processed.")

    Go

    
    package ocr
    
    import (
    	"context"
    	"fmt"
    	"log"
    )
    
    // ProcessImage is executed when a file is uploaded to the Cloud Storage bucket you
    // created for uploading images. It runs detectText, which processes the image for text.
    func ProcessImage(ctx context.Context, event GCSEvent) error {
    	if err := setup(ctx); err != nil {
    		return fmt.Errorf("ProcessImage: %w", err)
    	}
    	if event.Bucket == "" {
    		return fmt.Errorf("empty file.Bucket")
    	}
    	if event.Name == "" {
    		return fmt.Errorf("empty file.Name")
    	}
    	if err := detectText(ctx, event.Bucket, event.Name); err != nil {
    		return fmt.Errorf("detectText: %w", err)
    	}
    	log.Printf("File %s processed.", event.Name)
    	return nil
    }
    

    Java

    
    import com.google.cloud.functions.BackgroundFunction;
    import com.google.cloud.functions.Context;
    import com.google.cloud.pubsub.v1.Publisher;
    import com.google.cloud.translate.v3.DetectLanguageRequest;
    import com.google.cloud.translate.v3.DetectLanguageResponse;
    import com.google.cloud.translate.v3.LocationName;
    import com.google.cloud.translate.v3.TranslationServiceClient;
    import com.google.cloud.vision.v1.AnnotateImageRequest;
    import com.google.cloud.vision.v1.AnnotateImageResponse;
    import com.google.cloud.vision.v1.Feature;
    import com.google.cloud.vision.v1.Image;
    import com.google.cloud.vision.v1.ImageAnnotatorClient;
    import com.google.cloud.vision.v1.ImageSource;
    import com.google.protobuf.ByteString;
    import com.google.pubsub.v1.ProjectTopicName;
    import com.google.pubsub.v1.PubsubMessage;
    import functions.eventpojos.GcsEvent;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutionException;
    import java.util.logging.Level;
    import java.util.logging.Logger;
    
      @Override
      public void accept(GcsEvent gcsEvent, Context context) {
    
        // Validate parameters
        String bucket = gcsEvent.getBucket();
        if (bucket == null) {
          throw new IllegalArgumentException("Missing bucket parameter");
        }
        String filename = gcsEvent.getName();
        if (filename == null) {
          throw new IllegalArgumentException("Missing name parameter");
        }
    
        detectText(bucket, filename);
      }
    }
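
The Python sample above calls a `validate_message` helper that is not shown in this section. The following is a minimal sketch of what such a helper might look like, assuming it only needs to return a required field or fail with a descriptive error; the exact error wording is an assumption.

```python
from typing import Any, Mapping


def validate_message(message: Mapping[str, Any], param: str) -> Any:
    """Return message[param], or raise ValueError if it is missing or empty."""
    value = message.get(param)
    if not value:
        raise ValueError(
            f"{param} is not provided. "
            f"Make sure you have a property {param} in the request."
        )
    return value


# Hypothetical event metadata, for illustration only.
file_info = {"bucket": "my-image-bucket", "name": "sign.png"}
print(validate_message(file_info, "bucket"))  # prints my-image-bucket
```

Raising an exception, rather than returning `None`, makes the function invocation fail visibly in the logs when an event is malformed.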

    The following function extracts text from the image using the Vision API and queues the text for translation:

    Node.js

    /**
     * Detects the text in an image using the Google Vision API.
     *
     * @param {string} bucketName Cloud Storage bucket name.
     * @param {string} filename Cloud Storage file name.
     * @returns {Promise}
     */
    const detectText = async (bucketName, filename) => {
      console.log(`Looking for text in image ${filename}`);
      const [textDetections] = await vision.textDetection(
        `gs://${bucketName}/${filename}`
      );
      const [annotation] = textDetections.textAnnotations;
      const text = annotation ? annotation.description.trim() : '';
      console.log('Extracted text from image:', text);
    
      let [translateDetection] = await translate.detect(text);
      if (Array.isArray(translateDetection)) {
        [translateDetection] = translateDetection;
      }
      console.log(
        `Detected language "${translateDetection.language}" for ${filename}`
      );
    
      // Submit a message to the bus for each language we're going to translate to
      const TO_LANGS = process.env.TO_LANG.split(',');
      const topicName = process.env.TRANSLATE_TOPIC;
    
      const tasks = TO_LANGS.map(lang => {
        const messageData = {
          text: text,
          filename: filename,
          lang: lang,
        };
    
        // Helper function that publishes translation result to a Pub/Sub topic
        // For more information on publishing Pub/Sub messages, see this page:
        //   https://cloud.google.com/pubsub/docs/publisher
        return publishResult(topicName, messageData);
      });
    
      return Promise.all(tasks);
    };

    Python

    def detect_text(bucket: str, filename: str) -> None:
        """
        Extract the text from an image uploaded to Cloud Storage.
    
        Extract the text from an image uploaded to Cloud Storage, then
        publish messages requesting subscribing services translate the text
        to each target language and save the result.
    
        Args:
            bucket: name of GCS bucket in which the file is stored.
            filename: name of the file to be read.
    
        Returns:
            None; the output is written to stdout and Stackdriver Logging.
        """
        print("Looking for text in image {}".format(filename))
    
        futures = []
    
        image = vision.Image(
            source=vision.ImageSource(gcs_image_uri=f"gs://{bucket}/{filename}")
        )
        text_detection_response = vision_client.text_detection(image=image)
        annotations = text_detection_response.text_annotations
    
        if len(annotations) > 0:
            text = annotations[0].description
        else:
            text = ""
    
        print(f"Extracted text {text} from image ({len(text)} chars).")
    
        detect_language_response = translate_client.detect_language(text)
        src_lang = detect_language_response["language"]
        print(f"Detected language {src_lang} for text {text}.")
    
        # Submit a message to the bus for each target language
        to_langs = os.environ["TO_LANG"].split(",")
        for target_lang in to_langs:
            topic_name = os.environ["TRANSLATE_TOPIC"]
            if src_lang == target_lang or src_lang == "und":
                topic_name = os.environ["RESULT_TOPIC"]
            message = {
                "text": text,
                "filename": filename,
                "lang": target_lang,
                "src_lang": src_lang,
            }
            message_data = json.dumps(message).encode("utf-8")
            topic_path = publisher.topic_path(project_id, topic_name)
            future = publisher.publish(topic_path, data=message_data)
            futures.append(future)
        for future in futures:
            future.result()

    Go

    
    package ocr
    
    import (
    	"context"
    	"encoding/json"
    	"fmt"
    	"log"
    
    	"cloud.google.com/go/pubsub"
    	"cloud.google.com/go/vision/v2/apiv1/visionpb"
    	"golang.org/x/text/language"
    )
    
    // detectText detects the text in an image using the Google Vision API.
    func detectText(ctx context.Context, bucketName, fileName string) error {
    	log.Printf("Looking for text in image %v", fileName)
    	maxResults := 1
    	image := &visionpb.Image{
    		Source: &visionpb.ImageSource{
    			GcsImageUri: fmt.Sprintf("gs://%s/%s", bucketName, fileName),
    		},
    	}
    	annotations, err := visionClient.DetectTexts(ctx, image, &visionpb.ImageContext{}, maxResults)
    	if err != nil {
    		return fmt.Errorf("DetectTexts: %w", err)
    	}
    	text := ""
    	if len(annotations) > 0 {
    		text = annotations[0].Description
    	}
    	if len(annotations) == 0 || len(text) == 0 {
    		log.Printf("No text detected in image %q. Returning early.", fileName)
    		return nil
    	}
    	log.Printf("Extracted text %q from image (%d chars).", text, len(text))
    
    	detectResponse, err := translateClient.DetectLanguage(ctx, []string{text})
    	if err != nil {
    		return fmt.Errorf("DetectLanguage: %w", err)
    	}
    	if len(detectResponse) == 0 || len(detectResponse[0]) == 0 {
    		return fmt.Errorf("DetectLanguage gave empty response")
    	}
    	srcLang := detectResponse[0][0].Language.String()
    	log.Printf("Detected language %q for text %q.", srcLang, text)
    
    	// Submit a message to the bus for each target language
    	for _, targetLang := range toLang {
    		topicName := translateTopic
    		if srcLang == targetLang || srcLang == "und" { // detection returns "und" for undefined language
    			topicName = resultTopic
    		}
    		targetTag, err := language.Parse(targetLang)
    		if err != nil {
    			return fmt.Errorf("language.Parse: %w", err)
    		}
    		srcTag, err := language.Parse(srcLang)
    		if err != nil {
    			return fmt.Errorf("language.Parse: %w", err)
    		}
    		message, err := json.Marshal(ocrMessage{
    			Text:     text,
    			FileName: fileName,
    			Lang:     targetTag,
    			SrcLang:  srcTag,
    		})
    		if err != nil {
    			return fmt.Errorf("json.Marshal: %w", err)
    		}
    		topic := pubsubClient.Topic(topicName)
    		ok, err := topic.Exists(ctx)
    		if err != nil {
    			return fmt.Errorf("Exists: %w", err)
    		}
    		if !ok {
    			topic, err = pubsubClient.CreateTopic(ctx, topicName)
    			if err != nil {
    				return fmt.Errorf("CreateTopic: %w", err)
    			}
    		}
    		msg := &pubsub.Message{
    			Data: []byte(message),
    		}
    		if _, err = topic.Publish(ctx, msg).Get(ctx); err != nil {
    			return fmt.Errorf("Get: %w", err)
    		}
    	}
    	return nil
    }
    

    Java

    private void detectText(String bucket, String filename) {
      logger.info("Looking for text in image " + filename);
    
      List<AnnotateImageRequest> visionRequests = new ArrayList<>();
      String gcsPath = String.format("gs://%s/%s", bucket, filename);
    
      ImageSource imgSource = ImageSource.newBuilder().setGcsImageUri(gcsPath).build();
      Image img = Image.newBuilder().setSource(imgSource).build();
    
      Feature textFeature = Feature.newBuilder().setType(Feature.Type.TEXT_DETECTION).build();
      AnnotateImageRequest visionRequest =
          AnnotateImageRequest.newBuilder().addFeatures(textFeature).setImage(img).build();
      visionRequests.add(visionRequest);
    
      // Detect text in an image using the Cloud Vision API
      AnnotateImageResponse visionResponse;
      try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
        visionResponse = client.batchAnnotateImages(visionRequests).getResponses(0);
        if (visionResponse == null || !visionResponse.hasFullTextAnnotation()) {
          logger.info(String.format("Image %s contains no text", filename));
          return;
        }
    
        if (visionResponse.hasError()) {
          // Log error
          logger.log(
              Level.SEVERE, "Error in vision API call: " + visionResponse.getError().getMessage());
          return;
        }
      } catch (IOException e) {
        // Log error (since IOException cannot be thrown by a Cloud Function)
        logger.log(Level.SEVERE, "Error detecting text: " + e.getMessage(), e);
        return;
      }
    
      String text = visionResponse.getFullTextAnnotation().getText();
      logger.info("Extracted text from image: " + text);
    
      // Detect language using the Cloud Translation API
      DetectLanguageRequest languageRequest =
          DetectLanguageRequest.newBuilder()
              .setParent(LOCATION_NAME)
              .setMimeType("text/plain")
              .setContent(text)
              .build();
      DetectLanguageResponse languageResponse;
      try (TranslationServiceClient client = TranslationServiceClient.create()) {
        languageResponse = client.detectLanguage(languageRequest);
      } catch (IOException e) {
        // Log error (since IOException cannot be thrown by a function)
        logger.log(Level.SEVERE, "Error detecting language: " + e.getMessage(), e);
        return;
      }
    
      if (languageResponse.getLanguagesCount() == 0) {
        logger.info("No languages were detected for text: " + text);
        return;
      }
    
      String languageCode = languageResponse.getLanguages(0).getLanguageCode();
      logger.info(String.format("Detected language %s for file %s", languageCode, filename));
    
      // Send a Pub/Sub translation request for every language we're going to translate to
      for (String targetLanguage : TO_LANGS) {
        logger.info("Sending translation request for language " + targetLanguage);
        OcrTranslateApiMessage message = new OcrTranslateApiMessage(text, filename, targetLanguage);
        ByteString byteStr = ByteString.copyFrom(message.toPubsubData());
        PubsubMessage pubsubApiMessage = PubsubMessage.newBuilder().setData(byteStr).build();
        try {
          publisher.publish(pubsubApiMessage).get();
        } catch (InterruptedException | ExecutionException e) {
          // Log error
          logger.log(Level.SEVERE, "Error publishing translation request: " + e.getMessage(), e);
          return;
        }
      }
    }
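
The samples above publish each request as a JSON payload, and a Pub/Sub-triggered background function receives that payload base64-encoded in the event's `data` field. The following sketch shows the round trip without calling Pub/Sub. The helper names `encode_message` and `decode_event` and the example values are hypothetical; the field names follow the Python sample.

```python
import base64
import json


def encode_message(text: str, filename: str, lang: str, src_lang: str) -> bytes:
    """Serialize a translation request the way the publisher does."""
    message = {"text": text, "filename": filename, "lang": lang, "src_lang": src_lang}
    return json.dumps(message).encode("utf-8")


def decode_event(event: dict) -> dict:
    """Recover the JSON message from a Pub/Sub background-function event."""
    if not event.get("data"):
        raise ValueError("Pub/Sub event has no data field.")
    return json.loads(base64.b64decode(event["data"]).decode("utf-8"))


payload = encode_message("Bonjour", "sign.png", "en", "fr")
# Simulate delivery: the function sees the payload as a base64 string.
event = {"data": base64.b64encode(payload).decode("ascii")}
print(decode_event(event)["text"])  # prints Bonjour
```

Because the payload is UTF-8 encoded before publishing, non-ASCII text extracted from images survives the round trip unchanged.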

    Translating text

    The following function translates the extracted text and queues the translated text to be saved back to Cloud Storage:

    Node.js

    /**
     * This function is exported by index.js, and is executed when
     * a message is published to the Cloud Pub/Sub topic specified
     * by the TRANSLATE_TOPIC environment variable. The function
     * translates text using the Google Translate API.
     *
     * @param {object} event The Cloud Pub/Sub Message object.
     * @param {string} {messageObject}.data The "data" property of the Cloud Pub/Sub
     * Message. This property will be a base64-encoded string that you must decode.
     */
    exports.translateText = async event => {
      const pubsubData = event.data;
      const jsonStr = Buffer.from(pubsubData, 'base64').toString();
      const {text, filename, lang} = JSON.parse(jsonStr);
    
      if (!text) {
        throw new Error(
          'Text not provided. Make sure you have a "text" property in your request'
        );
      }
      if (!filename) {
        throw new Error(
          'Filename not provided. Make sure you have a "filename" property in your request'
        );
      }
      if (!lang) {
        throw new Error(
          'Language not provided. Make sure you have a "lang" property in your request'
        );
      }
    
      console.log(`Translating text into ${lang}`);
      const [translation] = await translate.translate(text, lang);
    
      console.log('Translated text:', translation);
    
      const messageData = {
        text: translation,
        filename: filename,
        lang: lang,
      };
    
      await publishResult(process.env.RESULT_TOPIC, messageData);
      console.log(`Text translated to ${lang}`);
    };

    Python

    def translate_text(event: dict, context: dict) -> None:
        """Cloud Function triggered by PubSub when a message is received from
        a subscription.
    
        Translates the text in the message from the specified source language
        to the requested target language, then sends a message requesting another
        service save the result.
    
        Args:
            event: dictionary containing the PubSub event.
            context: a dictionary containing metadata about the event.
    
        Returns:
            None; the output is written to stdout and Stackdriver Logging.
        """
        if event.get("data"):
            message_data = base64.b64decode(event["data"]).decode("utf-8")
            message = json.loads(message_data)
        else:
            raise ValueError("Data sector is missing in the Pub/Sub message.")
    
        text = validate_message(message, "text")
        filename = validate_message(message, "filename")
        target_lang = validate_message(message, "lang")
        src_lang = validate_message(message, "src_lang")
    
        print(f"Translating text into {target_lang}.")
        translated_text = translate_client.translate(
            text, target_language=target_lang, source_language=src_lang
        )
        topic_name = os.environ["RESULT_TOPIC"]
        message = {
            "text": translated_text["translatedText"],
            "filename": filename,
            "lang": target_lang,
        }
        encoded_message = json.dumps(message).encode("utf-8")
        topic_path = publisher.topic_path(project_id, topic_name)
        future = publisher.publish(topic_path, data=encoded_message)
        future.result()

    Go

    
    package ocr
    
    import (
    	"context"
    	"encoding/json"
    	"fmt"
    	"log"
    
    	"cloud.google.com/go/pubsub"
    	"cloud.google.com/go/translate"
    )
    
    // TranslateText is executed when a message is published to the Cloud Pub/Sub
    // topic specified by the TRANSLATE_TOPIC environment variable, and translates
    // the text using the Google Translate API.
    func TranslateText(ctx context.Context, event PubSubMessage) error {
    	if err := setup(ctx); err != nil {
    		return fmt.Errorf("setup: %w", err)
    	}
    	if event.Data == nil {
    		return fmt.Errorf("empty data")
    	}
    	var message ocrMessage
    	if err := json.Unmarshal(event.Data, &message); err != nil {
    		return fmt.Errorf("json.Unmarshal: %w", err)
    	}
    
    	log.Printf("Translating text into %s.", message.Lang.String())
    	opts := translate.Options{
    		Source: message.SrcLang,
    	}
    	translateResponse, err := translateClient.Translate(ctx, []string{message.Text}, message.Lang, &opts)
    	if err != nil {
    		return fmt.Errorf("Translate: %w", err)
    	}
    	if len(translateResponse) == 0 {
    		return fmt.Errorf("Empty Translate response")
    	}
    	translatedText := translateResponse[0]
    
    	messageData, err := json.Marshal(ocrMessage{
    		Text:     translatedText.Text,
    		FileName: message.FileName,
    		Lang:     message.Lang,
    		SrcLang:  message.SrcLang,
    	})
    	if err != nil {
    		return fmt.Errorf("json.Marshal: %w", err)
    	}
    
    	topic := pubsubClient.Topic(resultTopic)
    	ok, err := topic.Exists(ctx)
    	if err != nil {
    		return fmt.Errorf("Exists: %w", err)
    	}
    	if !ok {
    		topic, err = pubsubClient.CreateTopic(ctx, resultTopic)
    		if err != nil {
    			return fmt.Errorf("CreateTopic: %w", err)
    		}
    	}
    	msg := &pubsub.Message{
    		Data: messageData,
    	}
    	if _, err = topic.Publish(ctx, msg).Get(ctx); err != nil {
    		return fmt.Errorf("Get: %w", err)
    	}
    	log.Printf("Sent translation: %q", translatedText.Text)
    	return nil
    }
    

    Java

    
    import com.google.cloud.functions.BackgroundFunction;
    import com.google.cloud.functions.Context;
    import com.google.cloud.pubsub.v1.Publisher;
    import com.google.cloud.translate.v3.LocationName;
    import com.google.cloud.translate.v3.TranslateTextRequest;
    import com.google.cloud.translate.v3.TranslateTextResponse;
    import com.google.cloud.translate.v3.TranslationServiceClient;
    import com.google.protobuf.ByteString;
    import com.google.pubsub.v1.ProjectTopicName;
    import com.google.pubsub.v1.PubsubMessage;
    import functions.eventpojos.Message;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.util.concurrent.ExecutionException;
    import java.util.logging.Level;
    import java.util.logging.Logger;
    
    public class OcrTranslateText implements BackgroundFunction<Message> {
      private static final Logger logger = Logger.getLogger(OcrTranslateText.class.getName());
    
      // TODO<developer> set these environment variables
      private static final String PROJECT_ID = getenv("GCP_PROJECT");
      private static final String RESULTS_TOPIC_NAME = getenv("RESULT_TOPIC");
      private static final String LOCATION_NAME = LocationName.of(PROJECT_ID, "global").toString();
    
      private Publisher publisher;
    
      public OcrTranslateText() throws IOException {
        publisher = Publisher.newBuilder(
            ProjectTopicName.of(PROJECT_ID, RESULTS_TOPIC_NAME)).build();
      }
    
      @Override
      public void accept(Message pubSubMessage, Context context) {
        OcrTranslateApiMessage ocrMessage = OcrTranslateApiMessage.fromPubsubData(
            pubSubMessage.getData().getBytes(StandardCharsets.UTF_8));
    
        String targetLang = ocrMessage.getLang();
        logger.info("Translating text into " + targetLang);
    
        // Translate text to target language
        String text = ocrMessage.getText();
        TranslateTextRequest request =
            TranslateTextRequest.newBuilder()
                .setParent(LOCATION_NAME)
                .setMimeType("text/plain")
                .setTargetLanguageCode(targetLang)
                .addContents(text)
                .build();
    
        TranslateTextResponse response;
        try (TranslationServiceClient client = TranslationServiceClient.create()) {
          response = client.translateText(request);
        } catch (IOException e) {
          // Log error (since IOException cannot be thrown by a function)
          logger.log(Level.SEVERE, "Error translating text: " + e.getMessage(), e);
          return;
        }
        if (response.getTranslationsCount() == 0) {
          return;
        }
    
        String translatedText = response.getTranslations(0).getTranslatedText();
        logger.info("Translated text: " + translatedText);
    
        // Send translated text to (subsequent) Pub/Sub topic
        String filename = ocrMessage.getFilename();
        OcrTranslateApiMessage translateMessage = new OcrTranslateApiMessage(
            translatedText, filename, targetLang);
        try {
          ByteString byteStr = ByteString.copyFrom(translateMessage.toPubsubData());
          PubsubMessage pubsubApiMessage = PubsubMessage.newBuilder().setData(byteStr).build();
    
          publisher.publish(pubsubApiMessage).get();
          logger.info("Text translated to " + targetLang);
        } catch (InterruptedException | ExecutionException e) {
          // Log error (since these exception types cannot be thrown by a function)
          logger.log(Level.SEVERE, "Error publishing translation save request: " + e.getMessage(), e);
        }
      }
    
      // Avoid ungraceful deployment failures due to unset environment variables.
      // If you get this warning you should redeploy with the variable set.
      private static String getenv(String name) {
        String value = System.getenv(name);
        if (value == null) {
          logger.warning("Environment variable " + name + " was not set");
          value = "MISSING";
        }
        return value;
      }
    }

    Saving the translations

    Finally, the following function receives the translated text and saves it back to Cloud Storage:

    Node.js

    /**
     * This function is exported by index.js, and is executed when
     * a message is published to the Cloud Pub/Sub topic specified
     * by the RESULT_TOPIC environment variable. The function saves
     * the data packet to a file in GCS.
     *
     * @param {object} event The Cloud Pub/Sub Message object.
     * @param {string} {messageObject}.data The "data" property of the Cloud Pub/Sub
     * Message. This property will be a base64-encoded string that you must decode.
     */
    exports.saveResult = async event => {
      const pubsubData = event.data;
      const jsonStr = Buffer.from(pubsubData, 'base64').toString();
      const {text, filename, lang} = JSON.parse(jsonStr);
    
      if (!text) {
        throw new Error(
          'Text not provided. Make sure you have a "text" property in your request'
        );
      }
      if (!filename) {
        throw new Error(
          'Filename not provided. Make sure you have a "filename" property in your request'
        );
      }
      if (!lang) {
        throw new Error(
          'Language not provided. Make sure you have a "lang" property in your request'
        );
      }
    
      console.log(`Received request to save file ${filename}`);
    
      const bucketName = process.env.RESULT_BUCKET;
      const newFilename = renameImageForSave(filename, lang);
      const file = storage.bucket(bucketName).file(newFilename);
    
      console.log(`Saving result to ${newFilename} in bucket ${bucketName}`);
    
      await file.save(text);
      console.log('File saved.');
    };

    Python

    def save_result(event: dict, context: dict) -> None:
        """
        Cloud Function triggered by PubSub when a message is received from
        a subscription.
    
        Args:
            event: dictionary containing the PubSub event.
            context: a dictionary containing metadata about the event.
    
        Returns:
            None; the output is written to stdout and Stackdriver Logging.
        """
        if event.get("data"):
            message_data = base64.b64decode(event["data"]).decode("utf-8")
            message = json.loads(message_data)
        else:
            raise ValueError("Data sector is missing in the Pub/Sub message.")
    
        text = validate_message(message, "text")
        filename = validate_message(message, "filename")
        lang = validate_message(message, "lang")
    
        print(f"Received request to save file {filename}.")
    
        bucket_name = os.environ["RESULT_BUCKET"]
        result_filename = f"{filename}_{lang}.txt"
        bucket = storage_client.get_bucket(bucket_name)
        blob = bucket.blob(result_filename)
    
        print(f"Saving result to {result_filename} in bucket {bucket_name}.")
    
        blob.upload_from_string(text)
    
        print("File saved.")

    Go

    
    package ocr
    
    import (
    	"context"
    	"encoding/json"
    	"fmt"
    	"log"
    )
    
    // SaveResult is executed when a message is published to the Cloud Pub/Sub topic
    // specified by the RESULT_TOPIC environment variable, and saves the data packet
    // to a file in GCS.
    func SaveResult(ctx context.Context, event PubSubMessage) error {
    	if err := setup(ctx); err != nil {
    		return fmt.Errorf("setup: %w", err)
    	}
    	var message ocrMessage
    	if event.Data == nil {
    		return fmt.Errorf("Empty data")
    	}
    	if err := json.Unmarshal(event.Data, &message); err != nil {
    		return fmt.Errorf("json.Unmarshal: %w", err)
    	}
    	log.Printf("Received request to save file %q.", message.FileName)
    
    	resultFilename := fmt.Sprintf("%s_%s.txt", message.FileName, message.Lang)
    	bucket := storageClient.Bucket(resultBucket)
    
    	log.Printf("Saving result to %q in bucket %q.", resultFilename, resultBucket)
    
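    	w := bucket.Object(resultFilename).NewWriter(ctx)
    	if _, err := fmt.Fprint(w, message.Text); err != nil {
    		w.Close()
    		return fmt.Errorf("Fprint: %w", err)
    	}
    	// Close flushes the upload, so its error must be checked before reporting success.
    	if err := w.Close(); err != nil {
    		return fmt.Errorf("Close: %w", err)
    	}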
    
    	log.Printf("File saved.")
    	return nil
    }
    

    Java

    
    import com.google.cloud.functions.BackgroundFunction;
    import com.google.cloud.functions.Context;
    import com.google.cloud.storage.BlobId;
    import com.google.cloud.storage.BlobInfo;
    import com.google.cloud.storage.Storage;
    import com.google.cloud.storage.StorageOptions;
    import functions.eventpojos.PubsubMessage;
    import java.nio.charset.StandardCharsets;
    import java.util.logging.Logger;
    
    public class OcrSaveResult implements BackgroundFunction<PubsubMessage> {
      // TODO<developer> set this environment variable
      private static final String RESULT_BUCKET = System.getenv("RESULT_BUCKET");
    
      private static final Storage STORAGE = StorageOptions.getDefaultInstance().getService();
      private static final Logger logger = Logger.getLogger(OcrSaveResult.class.getName());
    
      @Override
      public void accept(PubsubMessage pubSubMessage, Context context) {
        OcrTranslateApiMessage ocrMessage = OcrTranslateApiMessage.fromPubsubData(
            pubSubMessage.getData().getBytes(StandardCharsets.UTF_8));
    
        logger.info("Received request to save file " +  ocrMessage.getFilename());
    
        String newFileName = String.format(
            "%s_to_%s.txt", ocrMessage.getFilename(), ocrMessage.getLang());
    
        // Save file to RESULT_BUCKET with name newFileName
        logger.info(String.format("Saving result to %s in bucket %s", newFileName, RESULT_BUCKET));
        BlobInfo blobInfo = BlobInfo.newBuilder(BlobId.of(RESULT_BUCKET, newFileName)).build();
        STORAGE.create(blobInfo, ocrMessage.getText().getBytes(StandardCharsets.UTF_8));
        logger.info("File saved");
      }
    }
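
To try the save step without deploying anything, it can help to reproduce the message format locally. The sketch below builds a dictionary shaped like the event a Pub/Sub-triggered function receives (a base64-encoded JSON payload in the "data" field) and decodes it the same way the Python sample above does. The helper names make_event and decode_event, and the sample values, are hypothetical and used only for illustration.

```python
import base64
import json


def make_event(text: str, filename: str, lang: str) -> dict:
    """Build a dict shaped like the event a Pub/Sub-triggered function receives."""
    payload = json.dumps({"text": text, "filename": filename, "lang": lang})
    return {"data": base64.b64encode(payload.encode("utf-8")).decode("ascii")}


def decode_event(event: dict) -> dict:
    """Decode the base64 JSON payload and reject messages with missing fields."""
    if not event.get("data"):
        raise ValueError("Pub/Sub message has no data.")
    message = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    for field in ("text", "filename", "lang"):
        if not message.get(field):
            raise ValueError(f"Missing required field: {field}")
    return message


event = make_event("Hola, mundo", "sign.png", "es")
message = decode_event(event)
print(message["filename"], message["lang"], message["text"])
```

An event built this way could be passed to a local copy of the save function, provided the storage client and the RESULT_BUCKET environment variable are set up first.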

    Deploying the functions

    1. To deploy the image processing function with a Cloud Storage trigger, run the following command in the directory that contains the sample code (or, in the case of Java, the pom.xml file):

      Node.js

      gcloud functions deploy ocr-extract \
      --runtime nodejs20 \
      --trigger-bucket YOUR_IMAGE_BUCKET_NAME \
      --entry-point processImage \
      --set-env-vars "^:^GCP_PROJECT=YOUR_GCP_PROJECT_ID:TRANSLATE_TOPIC=YOUR_TRANSLATE_TOPIC_NAME:RESULT_TOPIC=YOUR_RESULT_TOPIC_NAME:TO_LANG=es,en,fr,ja"

      Use the --runtime flag to specify the runtime ID of a supported Node.js version to run your function.

      Python

      gcloud functions deploy ocr-extract \
      --runtime python312 \
      --trigger-bucket YOUR_IMAGE_BUCKET_NAME \
      --entry-point process_image \
      --set-env-vars "^:^GCP_PROJECT=YOUR_GCP_PROJECT_ID:TRANSLATE_TOPIC=YOUR_TRANSLATE_TOPIC_NAME:RESULT_TOPIC=YOUR_RESULT_TOPIC_NAME:TO_LANG=es,en,fr,ja"

      Use the --runtime flag to specify the runtime ID of a supported Python version to run your function.

      Go

      gcloud functions deploy ocr-extract \
      --runtime go121 \
      --trigger-bucket YOUR_IMAGE_BUCKET_NAME \
      --entry-point ProcessImage \
      --set-env-vars "^:^GCP_PROJECT=YOUR_GCP_PROJECT_ID:TRANSLATE_TOPIC=YOUR_TRANSLATE_TOPIC_NAME:RESULT_TOPIC=YOUR_RESULT_TOPIC_NAME:TO_LANG=es,en,fr,ja"

      Use the --runtime flag to specify the runtime ID of a supported Go version to run your function.

      Java

      gcloud functions deploy ocr-extract \
      --entry-point functions.OcrProcessImage \
      --runtime java17 \
      --memory 512MB \
      --trigger-bucket YOUR_IMAGE_BUCKET_NAME \
      --set-env-vars "^:^GCP_PROJECT=YOUR_GCP_PROJECT_ID:TRANSLATE_TOPIC=YOUR_TRANSLATE_TOPIC_NAME:RESULT_TOPIC=YOUR_RESULT_TOPIC_NAME:TO_LANG=es,en,fr,ja"

      Use the --runtime flag to specify the runtime ID of a supported Java version to run your function.

      where YOUR_IMAGE_BUCKET_NAME is the name of the Cloud Storage bucket to which you upload images.

    2. To deploy the text translation function with a Pub/Sub trigger, run the following command in the directory that contains the sample code (or, in the case of Java, the pom.xml file):

      Node.js

      gcloud functions deploy ocr-translate \
      --runtime nodejs20 \
      --trigger-topic YOUR_TRANSLATE_TOPIC_NAME \
      --entry-point translateText \
      --set-env-vars "GCP_PROJECT=YOUR_GCP_PROJECT_ID,RESULT_TOPIC=YOUR_RESULT_TOPIC_NAME"

      Use the --runtime flag to specify the runtime ID of a supported Node.js version to run your function.

      Python

      gcloud functions deploy ocr-translate \
      --runtime python312 \
      --trigger-topic YOUR_TRANSLATE_TOPIC_NAME \
      --entry-point translate_text \
      --set-env-vars "GCP_PROJECT=YOUR_GCP_PROJECT_ID,RESULT_TOPIC=YOUR_RESULT_TOPIC_NAME"

      Use the --runtime flag to specify the runtime ID of a supported Python version to run your function.

      Go

      gcloud functions deploy ocr-translate \
      --runtime go121 \
      --trigger-topic YOUR_TRANSLATE_TOPIC_NAME \
      --entry-point TranslateText \
      --set-env-vars "GCP_PROJECT=YOUR_GCP_PROJECT_ID,RESULT_TOPIC=YOUR_RESULT_TOPIC_NAME"

      Use the --runtime flag to specify the runtime ID of a supported Go version to run your function.

      Java

      gcloud functions deploy ocr-translate \
      --entry-point functions.OcrTranslateText \
      --runtime java17 \
      --memory 512MB \
      --trigger-topic YOUR_TRANSLATE_TOPIC_NAME \
      --set-env-vars "GCP_PROJECT=YOUR_GCP_PROJECT_ID,RESULT_TOPIC=YOUR_RESULT_TOPIC_NAME"

      Use the --runtime flag to specify the runtime ID of a supported Java version to run your function.

    3. To deploy the function that saves results to Cloud Storage with a Pub/Sub trigger, run the following command in the directory that contains the sample code (or, in the case of Java, the pom.xml file):

      Node.js

      gcloud functions deploy ocr-save \
      --runtime nodejs20 \
      --trigger-topic YOUR_RESULT_TOPIC_NAME \
      --entry-point saveResult \
      --set-env-vars "GCP_PROJECT=YOUR_GCP_PROJECT_ID,RESULT_BUCKET=YOUR_RESULT_BUCKET_NAME"

      Use the --runtime flag to specify the runtime ID of a supported Node.js version to run your function.

      Python

      gcloud functions deploy ocr-save \
      --runtime python312 \
      --trigger-topic YOUR_RESULT_TOPIC_NAME \
      --entry-point save_result \
      --set-env-vars "GCP_PROJECT=YOUR_GCP_PROJECT_ID,RESULT_BUCKET=YOUR_RESULT_BUCKET_NAME"

      Use the --runtime flag to specify the runtime ID of a supported Python version to run your function.

      Go

      gcloud functions deploy ocr-save \
      --runtime go121 \
      --trigger-topic YOUR_RESULT_TOPIC_NAME \
      --entry-point SaveResult \
      --set-env-vars "GCP_PROJECT=YOUR_GCP_PROJECT_ID,RESULT_BUCKET=YOUR_RESULT_BUCKET_NAME"

      Use the --runtime flag to specify the runtime ID of a supported Go version to run your function.

      Java

      gcloud functions deploy ocr-save \
      --entry-point functions.OcrSaveResult \
      --runtime java17 \
      --memory 512MB \
      --trigger-topic YOUR_RESULT_TOPIC_NAME \
      --set-env-vars "GCP_PROJECT=YOUR_GCP_PROJECT_ID,RESULT_BUCKET=YOUR_RESULT_BUCKET_NAME"

      Use the --runtime flag to specify the runtime ID of a supported Java version to run your function.
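
The commands in step 1 prefix the --set-env-vars value with ^:^. In gcloud's escaping syntax, that prefix switches the list delimiter from a comma to a colon, which lets TO_LANG keep the commas in es,en,fr,ja. The Python sketch below is an illustrative model of that behavior; parse_env_vars is a hypothetical helper, not gcloud's actual parser, and my-project is a placeholder value.

```python
def parse_env_vars(spec: str) -> dict:
    """Split a KEY=VALUE list, honoring an optional ^DELIM^ prefix.

    Illustrative model only; this is not gcloud's actual parser.
    """
    delimiter = ","
    if spec.startswith("^"):
        # The text between the first two carets is the custom delimiter.
        end = spec.index("^", 1)
        delimiter, spec = spec[1:end], spec[end + 1:]
    pairs = (item.split("=", 1) for item in spec.split(delimiter))
    return {key: value for key, value in pairs}


env = parse_env_vars("^:^GCP_PROJECT=my-project:TO_LANG=es,en,fr,ja")
print(env["TO_LANG"].split(","))
```

Steps 2 and 3 keep the default comma delimiter because none of their values contain commas.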

    Uploading an image

    1. Upload an image to your image Cloud Storage bucket:

      gcloud storage cp PATH_TO_IMAGE gs://YOUR_IMAGE_BUCKET_NAME

      where

      • PATH_TO_IMAGE is a path to an image file (that contains text) on your local system.
      • YOUR_IMAGE_BUCKET_NAME is the name of the bucket where you are uploading images.

      You can download one of the images from the sample project.

    2. Watch the logs to make sure the executions have completed:

      gcloud functions logs read --limit 100

    3. You can view the saved translations in the Cloud Storage bucket that you used for YOUR_RESULT_BUCKET_NAME.
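
As a rough guide to what to look for in the result bucket, the sketch below derives object names from the naming pattern used by the Python and Go save functions above (file name, underscore, language code, .txt). It assumes that one result is written per language listed in TO_LANG; expected_result_names and sign.png are hypothetical names used only for illustration.

```python
def expected_result_names(image_name: str, to_lang: str) -> list[str]:
    """Return the object names the Python and Go samples would write."""
    return [f"{image_name}_{lang}.txt" for lang in to_lang.split(",")]


print(expected_result_names("sign.png", "es,en,fr,ja"))
```

The Java sample uses a _to_ separator instead, and the Node.js sample delegates naming to renameImageForSave, so the names differ for those runtimes.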

    Cleaning up

    To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

    Deleting the project

    The easiest way to eliminate billing is to delete the project that you created for the tutorial.

    To delete the project:

    1. In the Google Cloud console, go to the Manage resources page.

      Go to Manage resources

    2. In the project list, select the project that you want to delete, and then click Delete.
    3. In the dialog, type the project ID, and then click Shut down to delete the project.

    Deleting the functions

    Deleting Cloud Run functions does not remove any resources stored in Cloud Storage.

    To delete the Cloud Run functions that you created in this tutorial, run the following commands:

    gcloud functions delete ocr-extract
    gcloud functions delete ocr-translate
    gcloud functions delete ocr-save

    You can also delete Cloud Run functions from the Google Cloud console.