Optical Character Recognition (OCR) Tutorial (2nd gen)

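Learn how to perform optical character recognition (OCR) on Google Cloud Platform. This tutorial demonstrates how to upload image files to Cloud Storage, extract text from the images using Cloud Vision, translate the text using the Cloud Translation API, and save the translations back to Cloud Storage. Pub/Sub is used to queue the individual tasks and to trigger the Cloud Functions that carry them out.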

For more information about sending a text detection (OCR) request, see Detect text in images, Detect handwriting in images, and Detect text in files (PDF/TIFF).


  • Write and deploy several event-driven functions.
  • Upload images to Cloud Storage.
  • Extract, translate, and save the text contained in the uploaded images.


In this document, you use the following billable components of Google Cloud:

  • Cloud Functions
  • Cloud Build
  • Pub/Sub
  • Artifact Registry
  • Eventarc
  • Cloud Run
  • Cloud Logging
  • Cloud Storage
  • Cloud Translation API
  • Cloud Vision

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.


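  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.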
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

  3. Make sure that billing is enabled for your Google Cloud project.

  4. Enable the Cloud Functions, Cloud Build, Cloud Run, Artifact Registry, Eventarc, Logging, Pub/Sub, Cloud Storage, Cloud Translation, and Cloud Vision APIs.

  5. Install the Google Cloud CLI.
  6. To initialize the gcloud CLI, run the following command:

    gcloud init
  7. Prepare your development environment.


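The flow of data in the OCR tutorial application involves the following steps:

  1. An image that contains text in any language is uploaded to Cloud Storage.
  2. A Cloud Function is triggered. It uses the Vision API to extract the text and detect the source language.
  3. The text is queued for translation by publishing a message to a Pub/Sub topic. One translation is queued for each target language that differs from the source language.
  4. If a target language matches the source language, the translation queue is skipped and the text is sent directly to the result queue, which is another Pub/Sub topic.
  5. A Cloud Function uses the Translation API to translate the text in the translation queue. The translated result is sent to the result queue.
  6. Another Cloud Function saves the translated text from the result queue to Cloud Storage.
  7. You can find the result of each translation in Cloud Storage as a text file.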
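The routing decision in steps 3 and 4 can be sketched in a few lines of Python. This is a minimal illustration rather than the sample's own code: the function name `route_requests` and the topic names in the usage example are hypothetical, and the `"und"` check mirrors how the Python and Go samples in this tutorial treat an undetermined source language.

```python
from typing import List, Tuple

# Language code the samples treat as "source language could not be determined".
UNDETERMINED = "und"


def route_requests(
    src_lang: str,
    to_langs: List[str],
    translate_topic: str,
    result_topic: str,
) -> List[Tuple[str, str]]:
    """Return one (target_lang, topic) pair per target language.

    Text that is already in the target language (or whose language is
    undetermined) skips the translation queue and goes to the result queue.
    """
    routes = []
    for target_lang in to_langs:
        if src_lang == target_lang or src_lang == UNDETERMINED:
            routes.append((target_lang, result_topic))
        else:
            routes.append((target_lang, translate_topic))
    return routes


if __name__ == "__main__":
    # Hypothetical topic names, used only for illustration.
    for lang, topic in route_requests(
        "en", ["en", "es", "fr"], "translate-topic", "result-topic"
    ):
        print(lang, "->", topic)
```

With an English source text and the targets `en`, `es`, and `fr`, only the Spanish and French requests pass through the translation queue.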



  1. Create a Cloud Storage bucket to upload images to, where YOUR_IMAGE_BUCKET_NAME is a globally unique bucket name:

    gsutil mb gs://YOUR_IMAGE_BUCKET_NAME
  2. Create a Cloud Storage bucket to save the text translations to, where YOUR_RESULT_BUCKET_NAME is a globally unique bucket name:

    gsutil mb gs://YOUR_RESULT_BUCKET_NAME
  3. Create a Pub/Sub topic to publish translation requests to, where YOUR_TRANSLATE_TOPIC_NAME is the name of the translation request topic:

    gcloud pubsub topics create YOUR_TRANSLATE_TOPIC_NAME
  4. Create a Pub/Sub topic to publish finished translation results to, where YOUR_RESULT_TOPIC_NAME is the name of the translation result topic:

    gcloud pubsub topics create YOUR_RESULT_TOPIC_NAME
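  5. Clone the sample app repository to your local machine:


    git clone https://github.com/GoogleCloudPlatform/nodejs-docs-samples.git

    Alternatively, you can download the sample as a zip file and extract it.


    git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git

    Alternatively, you can download the sample as a zip file and extract it.


    git clone https://github.com/GoogleCloudPlatform/golang-samples.git

    Alternatively, you can download the sample as a zip file and extract it.


    git clone https://github.com/GoogleCloudPlatform/java-docs-samples.git

    Alternatively, you can download the sample as a zip file and extract it.

  6. Change to the directory that contains the Cloud Functions sample code: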


    cd nodejs-docs-samples/functions/v2/ocr/app/


    cd python-docs-samples/functions/v2/ocr/


    cd golang-samples/functions/functionsv2/ocr/app/


    cd java-docs-samples/functions/v2/ocr/ocr-process-image/


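This section describes the dependencies and functions that make up the OCR sample.


The application must import several dependencies to communicate with Google Cloud Platform services: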


// Get a reference to the Pub/Sub component
const {PubSub} = require('@google-cloud/pubsub');
const pubsub = new PubSub();

// Get a reference to the Cloud Storage component
const {Storage} = require('@google-cloud/storage');
const storage = new Storage();

// Get a reference to the Cloud Vision API component
const Vision = require('@google-cloud/vision');
const vision = new Vision.ImageAnnotatorClient();

// Get a reference to the Translate API component
const {Translate} = require('@google-cloud/translate').v2;
const translate = new Translate();

const functions = require('@google-cloud/functions-framework');


import base64
import json
import os

from cloudevents.http import CloudEvent

import functions_framework

from google.cloud import pubsub_v1
from google.cloud import storage
from google.cloud import translate_v2 as translate
from google.cloud import vision

vision_client = vision.ImageAnnotatorClient()
translate_client = translate.Client()
publisher = pubsub_v1.PublisherClient()
storage_client = storage.Client()

project_id = os.environ.get("GCP_PROJECT")


// Package ocr contains Go samples for creating OCR
// (Optical Character Recognition) Cloud functions.
package ocr

import (
	"context"
	"fmt"
	"os"
	"strings"

	"cloud.google.com/go/pubsub"
	"cloud.google.com/go/storage"
	"cloud.google.com/go/translate"
	vision "cloud.google.com/go/vision/apiv1"
	"golang.org/x/text/language"
)

type ocrMessage struct {
	Text     string       `json:"text"`
	FileName string       `json:"fileName"`
	Lang     language.Tag `json:"lang"`
	SrcLang  language.Tag `json:"srcLang"`
}

// Eventarc sends a MessagePublishedData object.
// See the documentation for additional fields and more details:
// https://cloud.google.com/eventarc/docs/cloudevents#pubsub_1
type MessagePublishedData struct {
	Message PubSubMessage
}

// PubSubMessage is the payload of a Pub/Sub event.
// See the documentation for additional fields and more details:
// https://cloud.google.com/pubsub/docs/reference/rest/v1/PubsubMessage
type PubSubMessage struct {
	Data []byte `json:"data"`
}

var (
	visionClient    *vision.ImageAnnotatorClient
	translateClient *translate.Client
	pubsubClient    *pubsub.Client
	storageClient   *storage.Client

	projectID      string
	resultBucket   string
	resultTopic    string
	toLang         []string
	translateTopic string
)

// setup reads the configuration from environment variables and lazily
// creates the API clients so that they are reused across invocations.
func setup(ctx context.Context) error {
	projectID = os.Getenv("GCP_PROJECT")
	resultBucket = os.Getenv("RESULT_BUCKET")
	resultTopic = os.Getenv("RESULT_TOPIC")
	toLang = strings.Split(os.Getenv("TO_LANG"), ",")
	translateTopic = os.Getenv("TRANSLATE_TOPIC")

	var err error // Prevent shadowing clients with :=.

	if visionClient == nil {
		visionClient, err = vision.NewImageAnnotatorClient(ctx)
		if err != nil {
			return fmt.Errorf("vision.NewImageAnnotatorClient: %w", err)
		}
	}

	if translateClient == nil {
		translateClient, err = translate.NewClient(ctx)
		if err != nil {
			return fmt.Errorf("translate.NewClient: %w", err)
		}
	}

	if pubsubClient == nil {
		pubsubClient, err = pubsub.NewClient(ctx, projectID)
		if err != nil {
			return fmt.Errorf("pubsub.NewClient: %w", err)
		}
	}

	if storageClient == nil {
		storageClient, err = storage.NewClient(ctx)
		if err != nil {
			return fmt.Errorf("storage.NewClient: %w", err)
		}
	}
	return nil
}


public class OcrProcessImage implements CloudEventsFunction {
  // TODO<developer> set these environment variables
  private static final String PROJECT_ID = System.getenv("GCP_PROJECT");
  private static final String TRANSLATE_TOPIC_NAME = System.getenv("TRANSLATE_TOPIC");
  private static final String[] TO_LANGS = System.getenv("TO_LANG") == null ? new String[] { "es" }
      : System.getenv("TO_LANG").split(",");

  private static final Logger logger = Logger.getLogger(OcrProcessImage.class.getName());
  private static final String LOCATION_NAME = LocationName.of(PROJECT_ID, "global").toString();
  private Publisher publisher;

  public OcrProcessImage() throws IOException {
    publisher = Publisher.newBuilder(ProjectTopicName.of(PROJECT_ID, TRANSLATE_TOPIC_NAME)).build();
  }
}



The following function reads an uploaded image file from Cloud Storage and calls a function to detect whether the image contains text:


/**
 * This function is exported by index.js, and is executed when
 * a file is uploaded to the Cloud Storage bucket you created
 * for uploading images.
 *
 * @param {object} cloudEvent A CloudEvent containing the Cloud Storage File object.
 * https://cloud.google.com/storage/docs/json_api/v1/objects
 */
functions.cloudEvent('processImage', async cloudEvent => {
  const {bucket, name} = cloudEvent.data;

  if (!bucket) {
    throw new Error(
      'Bucket not provided. Make sure you have a "bucket" property in your request'
    );
  }
  if (!name) {
    throw new Error(
      'Filename not provided. Make sure you have a "name" property in your request'
    );
  }

  await detectText(bucket, name);
  console.log(`File ${name} processed.`);
});


@functions_framework.cloud_event
def process_image(cloud_event: CloudEvent) -> None:
    """Cloud Function triggered by Cloud Storage when a file is changed.

    Gets the names of the newly created object and its bucket, then calls
    detect_text to find text in that image.

    detect_text finishes by sending Pub/Sub messages that ask other functions
    to complete processing by translating the text and saving the translations.
    """

    # Check that the received event is of the expected type, return error if not
    expected_type = "google.cloud.storage.object.v1.finalized"
    received_type = cloud_event["type"]
    if received_type != expected_type:
        raise ValueError(f"Expected {expected_type} but received {received_type}")

    # Extract the bucket and file names of the uploaded image for processing
    data = cloud_event.data
    bucket = data["bucket"]
    filename = data["name"]

    # Process the information in the new image
    detect_text(bucket, filename)

    print(f"File {filename} processed.")


package ocr

import (
	"context"
	"fmt"
	"log"

	"github.com/GoogleCloudPlatform/functions-framework-go/functions"
	"github.com/cloudevents/sdk-go/v2/event"
	"github.com/googleapis/google-cloudevents-go/cloud/storagedata"
	"google.golang.org/protobuf/encoding/protojson"
)

func init() {
	functions.CloudEvent("process-image", ProcessImage)
}

// ProcessImage is executed when a file is uploaded to the Cloud Storage bucket you
// created for uploading images. It runs detectText, which processes the image for text.
func ProcessImage(ctx context.Context, cloudevent event.Event) error {
	if err := setup(ctx); err != nil {
		return fmt.Errorf("ProcessImage: %w", err)
	}

	var data storagedata.StorageObjectData
	if err := protojson.Unmarshal(cloudevent.Data(), &data); err != nil {
		return fmt.Errorf("protojson.Unmarshal: Failed to parse CloudEvent data: %w", err)
	}
	if data.GetBucket() == "" {
		return fmt.Errorf("empty file.Bucket")
	}
	if data.GetName() == "" {
		return fmt.Errorf("empty file.Name")
	}
	if err := detectText(ctx, data.GetBucket(), data.GetName()); err != nil {
		return fmt.Errorf("detectText: %w", err)
	}
	log.Printf("File %s processed.", data.GetName())
	return nil
}


import com.google.cloud.functions.CloudEventsFunction;
import com.google.cloud.pubsub.v1.Publisher;
import com.google.cloud.translate.v3.DetectLanguageRequest;
import com.google.cloud.translate.v3.DetectLanguageResponse;
import com.google.cloud.translate.v3.LocationName;
import com.google.cloud.translate.v3.TranslationServiceClient;
import com.google.cloud.vision.v1.AnnotateImageRequest;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.Image;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.ImageSource;
import com.google.events.cloud.storage.v1.StorageObjectData;
import com.google.protobuf.ByteString;
import com.google.protobuf.InvalidProtocolBufferException;
import com.google.protobuf.util.JsonFormat;
import com.google.pubsub.v1.ProjectTopicName;
import com.google.pubsub.v1.PubsubMessage;
import io.cloudevents.CloudEvent;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.logging.Level;
import java.util.logging.Logger;

  @Override
  public void accept(CloudEvent event) throws InvalidProtocolBufferException {
    // Unmarshal data from CloudEvent
    String cloudEventData = new String(event.getData().toBytes(), StandardCharsets.UTF_8);
    StorageObjectData.Builder builder = StorageObjectData.newBuilder();
    JsonFormat.parser().merge(cloudEventData, builder);
    StorageObjectData gcsEvent = builder.build();

    String bucket = gcsEvent.getBucket();
    if (bucket.isEmpty()) {
      throw new IllegalArgumentException("Missing bucket parameter");
    }
    String filename = gcsEvent.getName();
    if (filename.isEmpty()) {
      throw new IllegalArgumentException("Missing name parameter");
    }

    detectText(bucket, filename);
  }
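
All four entry points above apply the same checks before any API call is made. The following Python sketch isolates that validation so it can be exercised without a deployed function. The helper name `parse_storage_event` is hypothetical and not part of the sample; the event type string and the `bucket` and `name` fields are the ones used by the functions above.

```python
from typing import Mapping, Tuple

# Event type emitted when an object finishes uploading to Cloud Storage.
EXPECTED_TYPE = "google.cloud.storage.object.v1.finalized"


def parse_storage_event(
    event_type: str, data: Mapping[str, str]
) -> Tuple[str, str, str]:
    """Validate a storage event and return (bucket, name, gs:// URI)."""
    if event_type != EXPECTED_TYPE:
        raise ValueError(f"Expected {EXPECTED_TYPE} but received {event_type}")

    bucket = data.get("bucket", "")
    name = data.get("name", "")
    if not bucket:
        raise ValueError("Bucket not provided")
    if not name:
        raise ValueError("Filename not provided")

    # The Vision API reads the image directly from this Cloud Storage URI.
    return bucket, name, f"gs://{bucket}/{name}"


if __name__ == "__main__":
    # Hypothetical bucket and object names, used only for illustration.
    print(parse_storage_event(EXPECTED_TYPE, {"bucket": "my-images", "name": "menu.jpg"}))
```

Rejecting malformed events early keeps failures visible in the logs of the first function instead of surfacing later as an opaque Vision API error.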

The following function extracts text from the image using the Cloud Vision API and queues the text for translation:


/**
 * Detects the text in an image using the Google Vision API.
 *
 * @param {string} bucketName Cloud Storage bucket name.
 * @param {string} filename Cloud Storage file name.
 * @returns {Promise}
 */
const detectText = async (bucketName, filename) => {
  console.log(`Looking for text in image ${filename}`);
  const [textDetections] = await vision.textDetection(
    `gs://${bucketName}/${filename}`
  );
  const [annotation] = textDetections.textAnnotations;
  const text = annotation ? annotation.description.trim() : '';
  console.log('Extracted text from image:', text);

  let [translateDetection] = await translate.detect(text);
  if (Array.isArray(translateDetection)) {
    [translateDetection] = translateDetection;
  }
  console.log(
    `Detected language "${translateDetection.language}" for ${filename}`
  );

  // Submit a message to the bus for each language we're going to translate to
  const TO_LANGS = process.env.TO_LANG.split(',');
  const topicName = process.env.TRANSLATE_TOPIC;

  const tasks = TO_LANGS.map(lang => {
    const messageData = {
      text: text,
      filename: filename,
      lang: lang,
    };

    // Helper function that publishes translation result to a Pub/Sub topic
    // For more information on publishing Pub/Sub messages, see this page:
    //   https://cloud.google.com/pubsub/docs/publisher
    return publishResult(topicName, messageData);
  });

  return Promise.all(tasks);
};


def detect_text(bucket: str, filename: str) -> None:
    """Extract the text from an image uploaded to Cloud Storage, then
    publish messages requesting subscribing services translate the text
    to each target language and save the result.

    Args:
        bucket: name of GCS bucket in which the file is stored.
        filename: name of the file to be read.
    """

    print(f"Looking for text in image {filename}")

    # Use the Vision API to extract text from the image
    image = vision.Image(
        source=vision.ImageSource(gcs_image_uri=f"gs://{bucket}/{filename}")
    )
    text_detection_response = vision_client.text_detection(image=image)
    annotations = text_detection_response.text_annotations

    if annotations:
        text = annotations[0].description
    else:
        text = ""
    print(f"Extracted text {text} from image ({len(text)} chars).")

    detect_language_response = translate_client.detect_language(text)
    src_lang = detect_language_response["language"]
    print(f"Detected language {src_lang} for text {text}.")

    # Submit a message to the bus for each target language
    futures = []  # Asynchronous publish request statuses

    to_langs = os.environ.get("TO_LANG", "").split(",")
    for target_lang in to_langs:
        topic_name = os.environ.get("TRANSLATE_TOPIC")
        # Skip the translation queue when no translation is needed or possible
        if src_lang == target_lang or src_lang == "und":
            topic_name = os.environ.get("RESULT_TOPIC")

        message = {
            "text": text,
            "filename": filename,
            "lang": target_lang,
            "src_lang": src_lang,
        }

        message_data = json.dumps(message).encode("utf-8")
        topic_path = publisher.topic_path(project_id, topic_name)
        future = publisher.publish(topic_path, data=message_data)
        futures.append(future)

    # Wait for each publish request to be completed before exiting
    for future in futures:
        future.result()


package ocr

import (
	"context"
	"encoding/json"
	"fmt"
	"log"

	"cloud.google.com/go/pubsub"
	"golang.org/x/text/language"
	visionpb "google.golang.org/genproto/googleapis/cloud/vision/v1"
)

// detectText detects the text in an image using the Google Vision API.
func detectText(ctx context.Context, bucketName, fileName string) error {
	log.Printf("Looking for text in image %v", fileName)
	maxResults := 1
	image := &visionpb.Image{
		Source: &visionpb.ImageSource{
			GcsImageUri: fmt.Sprintf("gs://%s/%s", bucketName, fileName),
		},
	}
	annotations, err := visionClient.DetectTexts(ctx, image, &visionpb.ImageContext{}, maxResults)
	if err != nil {
		return fmt.Errorf("DetectTexts: %w", err)
	}
	text := ""
	if len(annotations) > 0 {
		text = annotations[0].Description
	}
	if len(annotations) == 0 || len(text) == 0 {
		log.Printf("No text detected in image %q. Returning early.", fileName)
		return nil
	}
	log.Printf("Extracted text %q from image (%d chars).", text, len(text))

	detectResponse, err := translateClient.DetectLanguage(ctx, []string{text})
	if err != nil {
		return fmt.Errorf("DetectLanguage: %w", err)
	}
	if len(detectResponse) == 0 || len(detectResponse[0]) == 0 {
		return fmt.Errorf("DetectLanguage gave empty response")
	}
	srcLang := detectResponse[0][0].Language.String()
	log.Printf("Detected language %q for text %q.", srcLang, text)

	// Submit a message to the bus for each target language
	for _, targetLang := range toLang {
		topicName := translateTopic
		if srcLang == targetLang || srcLang == "und" { // detection returns "und" for undefined language
			topicName = resultTopic
		}
		targetTag, err := language.Parse(targetLang)
		if err != nil {
			return fmt.Errorf("language.Parse: %w", err)
		}
		srcTag, err := language.Parse(srcLang)
		if err != nil {
			return fmt.Errorf("language.Parse: %w", err)
		}
		message, err := json.Marshal(ocrMessage{
			Text:     text,
			FileName: fileName,
			Lang:     targetTag,
			SrcLang:  srcTag,
		})
		if err != nil {
			return fmt.Errorf("json.Marshal: %w", err)
		}
		topic := pubsubClient.Topic(topicName)
		ok, err := topic.Exists(ctx)
		if err != nil {
			return fmt.Errorf("Exists: %w", err)
		}
		if !ok {
			topic, err = pubsubClient.CreateTopic(ctx, topicName)
			if err != nil {
				return fmt.Errorf("CreateTopic: %w", err)
			}
		}
		msg := &pubsub.Message{
			Data: []byte(message),
		}
		log.Printf("Sending pubsub message: %s", message)
		if _, err = topic.Publish(ctx, msg).Get(ctx); err != nil {
			return fmt.Errorf("Get: %w", err)
		}
	}
	return nil
}


private void detectText(String bucket, String filename) {
  logger.info("Looking for text in image " + filename);

  List<AnnotateImageRequest> visionRequests = new ArrayList<>();
  String gcsPath = String.format("gs://%s/%s", bucket, filename);

  ImageSource imgSource = ImageSource.newBuilder().setGcsImageUri(gcsPath).build();
  Image img = Image.newBuilder().setSource(imgSource).build();

  Feature textFeature = Feature.newBuilder().setType(Feature.Type.TEXT_DETECTION).build();
  AnnotateImageRequest visionRequest = AnnotateImageRequest.newBuilder()
      .addFeatures(textFeature)
      .setImage(img)
      .build();
  visionRequests.add(visionRequest);

  // Detect text in an image using the Cloud Vision API
  AnnotateImageResponse visionResponse;
  try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
    visionResponse = client.batchAnnotateImages(visionRequests).getResponses(0);
    if (visionResponse == null || !visionResponse.hasFullTextAnnotation()) {
      logger.info(String.format("Image %s contains no text", filename));
      return;
    }

    if (visionResponse.hasError()) {
      // Log error
      logger.log(
          Level.SEVERE, "Error in vision API call: " + visionResponse.getError().getMessage());
      return;
    }
  } catch (IOException e) {
    // Log error (since IOException cannot be thrown by a Cloud Function)
    logger.log(Level.SEVERE, "Error detecting text: " + e.getMessage(), e);
    return;
  }

  String text = visionResponse.getFullTextAnnotation().getText();
  logger.info("Extracted text from image: " + text);

  // Detect language using the Cloud Translation API
  DetectLanguageRequest languageRequest = DetectLanguageRequest.newBuilder()
      .setParent(LOCATION_NAME)
      .setMimeType("text/plain")
      .setContent(text)
      .build();
  DetectLanguageResponse languageResponse;
  try (TranslationServiceClient client = TranslationServiceClient.create()) {
    languageResponse = client.detectLanguage(languageRequest);
  } catch (IOException e) {
    // Log error (since IOException cannot be thrown by a function)
    logger.log(Level.SEVERE, "Error detecting language: " + e.getMessage(), e);
    return;
  }

  if (languageResponse.getLanguagesCount() == 0) {
    logger.info("No languages were detected for text: " + text);
    return;
  }

  String languageCode = languageResponse.getLanguages(0).getLanguageCode();
  logger.info(String.format("Detected language %s for file %s", languageCode, filename));

  // Send a Pub/Sub translation request for every language we're going to
  // translate to
  for (String targetLanguage : TO_LANGS) {
    logger.info("Sending translation request for language " + targetLanguage);
    OcrTranslateApiMessage message = new OcrTranslateApiMessage(text, filename, targetLanguage);
    ByteString byteStr = ByteString.copyFrom(message.toPubsubData());
    PubsubMessage pubsubApiMessage = PubsubMessage.newBuilder().setData(byteStr).build();
    try {
      // Block until the message has been accepted by Pub/Sub
      publisher.publish(pubsubApiMessage).get();
    } catch (InterruptedException | ExecutionException e) {
      // Log error
      logger.log(Level.SEVERE, "Error publishing translation request: " + e.getMessage(), e);
      return;
    }
  }
}
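
The publishing functions send a JSON payload as the Pub/Sub message body, and the subscribing functions receive it base64-encoded under `message.data` in the CloudEvent. The following Python sketch shows that round trip without any network calls. The helper names are hypothetical, the field names follow the Python sample above, and the wrapper only mimics the relevant part of the event body.

```python
import base64
import json
from typing import Any, Dict


def encode_request(text: str, filename: str, lang: str, src_lang: str) -> bytes:
    """Serialize a translation request the way the publisher does."""
    message = {
        "text": text,
        "filename": filename,
        "lang": lang,
        "src_lang": src_lang,
    }
    return json.dumps(message).encode("utf-8")


def wrap_as_event_data(payload: bytes) -> Dict[str, Any]:
    """Mimic the part of a Pub/Sub CloudEvent body that carries the payload."""
    return {"message": {"data": base64.b64encode(payload).decode("ascii")}}


def decode_event_data(event_data: Dict[str, Any]) -> Dict[str, str]:
    """Recover the original request from the event body."""
    raw = base64.b64decode(event_data["message"]["data"])
    return json.loads(raw)


if __name__ == "__main__":
    payload = encode_request("Bonjour", "sign.png", "en", "fr")
    event_data = wrap_as_event_data(payload)
    print(decode_event_data(event_data))
```

Keeping the encode and decode steps symmetric makes it easier to write local tests for the subscribing functions by constructing event bodies by hand.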


The following function translates the extracted text and queues the translated text to be saved back to Cloud Storage:


/**
 * This function is exported by index.js, and is executed when
 * a message is published to the Cloud Pub/Sub topic specified
 * by the TRANSLATE_TOPIC environment variable. The function
 * translates text using the Google Translate API.
 *
 * @param {object} cloudEvent The CloudEvent containing the Pub/Sub Message object
 * https://cloud.google.com/pubsub/docs/reference/rest/v1/PubsubMessage
 */
functions.cloudEvent('translateText', async cloudEvent => {
  const pubsubData = cloudEvent.data;
  // The payload is base64-encoded in the "data" field of the Pub/Sub message
  const jsonStr = Buffer.from(pubsubData.message.data, 'base64').toString();
  const {text, filename, lang} = JSON.parse(jsonStr);

  if (!text) {
    throw new Error(
      'Text not provided. Make sure you have a "text" property in your request'
    );
  }
  if (!filename) {
    throw new Error(
      'Filename not provided. Make sure you have a "filename" property in your request'
    );
  }
  if (!lang) {
    throw new Error(
      'Language not provided. Make sure you have a "lang" property in your request'
    );
  }

  console.log(`Translating text into ${lang}`);
  const [translation] = await translate.translate(text, lang);

  console.log('Translated text:', translation);

  const messageData = {
    text: translation,
    filename: filename,
    lang: lang,
  };

  await publishResult(process.env.RESULT_TOPIC, messageData);
  console.log(`Text translated to ${lang}`);
});


@functions_framework.cloud_event
def translate_text(cloud_event: CloudEvent) -> None:
    """Cloud Function triggered by PubSub when a message is received from
    a subscription.

    Translates the text in the message from the specified source language
    to the requested target language, then sends a message requesting another
    service save the result.
    """

    # Check that the received event is of the expected type, return error if not
    expected_type = "google.cloud.pubsub.topic.v1.messagePublished"
    received_type = cloud_event["type"]
    if received_type != expected_type:
        raise ValueError(f"Expected {expected_type} but received {received_type}")

    # Extract the message body, expected to be a JSON representation of a
    # dictionary, and extract the fields from that dictionary.
    data = cloud_event.data["message"]["data"]
    try:
        message_data = base64.b64decode(data)
        message = json.loads(message_data)

        text = message["text"]
        filename = message["filename"]
        target_lang = message["lang"]
        src_lang = message["src_lang"]
    except Exception as e:
        raise ValueError(f"Missing or malformed PubSub message {data}: {e}.")

    # Translate the text and publish a message with the translation
    print(f"Translating text into {target_lang}.")

    translated_text = translate_client.translate(
        text, target_language=target_lang, source_language=src_lang
    )

    topic_name = os.environ["RESULT_TOPIC"]
    message = {
        "text": translated_text["translatedText"],
        "filename": filename,
        "lang": target_lang,
    }
    message_data = json.dumps(message).encode("utf-8")
    topic_path = publisher.topic_path(project_id, topic_name)
    future = publisher.publish(topic_path, data=message_data)
    future.result()  # Wait for operation to complete


package ocr

import (
	"context"
	"encoding/json"
	"fmt"
	"log"

	"cloud.google.com/go/pubsub"
	"cloud.google.com/go/translate"
	"github.com/GoogleCloudPlatform/functions-framework-go/functions"
	"github.com/cloudevents/sdk-go/v2/event"
)

func init() {
	functions.CloudEvent("translate-text", TranslateText)
}

// TranslateText is executed when a message is published to the Cloud Pub/Sub
// topic specified by the TRANSLATE_TOPIC environment variable, and translates
// the text using the Google Translate API.
func TranslateText(ctx context.Context, cloudevent event.Event) error {
	var event MessagePublishedData
	if err := setup(ctx); err != nil {
		return fmt.Errorf("setup: %w", err)
	}
	if err := cloudevent.DataAs(&event); err != nil {
		return fmt.Errorf("Failed to parse CloudEvent data: %w", err)
	}
	if event.Message.Data == nil {
		log.Printf("event: %+v", event)
		return fmt.Errorf("empty data")
	}
	var message ocrMessage
	if err := json.Unmarshal(event.Message.Data, &message); err != nil {
		return fmt.Errorf("json.Unmarshal: %w", err)
	}

	log.Printf("Translating text into %s.", message.Lang.String())
	opts := translate.Options{
		Source: message.SrcLang,
	}
	translateResponse, err := translateClient.Translate(ctx, []string{message.Text}, message.Lang, &opts)
	if err != nil {
		return fmt.Errorf("Translate: %w", err)
	}
	if len(translateResponse) == 0 {
		return fmt.Errorf("Empty Translate response")
	}
	translatedText := translateResponse[0]

	messageData, err := json.Marshal(ocrMessage{
		Text:     translatedText.Text,
		FileName: message.FileName,
		Lang:     message.Lang,
		SrcLang:  message.SrcLang,
	})
	if err != nil {
		return fmt.Errorf("json.Marshal: %w", err)
	}

	topic := pubsubClient.Topic(resultTopic)
	ok, err := topic.Exists(ctx)
	if err != nil {
		return fmt.Errorf("Exists: %w", err)
	}
	if !ok {
		topic, err = pubsubClient.CreateTopic(ctx, resultTopic)
		if err != nil {
			return fmt.Errorf("CreateTopic: %w", err)
		}
	}
	msg := &pubsub.Message{
		Data: messageData,
	}
	if _, err = topic.Publish(ctx, msg).Get(ctx); err != nil {
		return fmt.Errorf("Get: %w", err)
	}
	log.Printf("Sent translation: %q", translatedText.Text)
	return nil
}


import com.google.cloud.functions.CloudEventsFunction;
import com.google.cloud.pubsub.v1.Publisher;
import com.google.cloud.translate.v3.LocationName;
import com.google.cloud.translate.v3.TranslateTextRequest;
import com.google.cloud.translate.v3.TranslateTextResponse;
import com.google.cloud.translate.v3.TranslationServiceClient;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonDeserializationContext;
import com.google.gson.JsonDeserializer;
import com.google.gson.JsonElement;
import com.google.gson.JsonParseException;
import com.google.protobuf.ByteString;
import com.google.pubsub.v1.ProjectTopicName;
import com.google.pubsub.v1.PubsubMessage;
import functions.eventpojos.MessagePublishedData;
import io.cloudevents.CloudEvent;
import java.io.IOException;
import java.lang.reflect.Type;
import java.nio.charset.StandardCharsets;
import java.time.OffsetDateTime;
import java.util.concurrent.ExecutionException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class OcrTranslateText implements CloudEventsFunction {
  private static final Logger logger = Logger.getLogger(OcrTranslateText.class.getName());

  // TODO<developer> set these environment variables
  private static final String PROJECT_ID = getenv("GCP_PROJECT");
  private static final String RESULTS_TOPIC_NAME = getenv("RESULT_TOPIC");
  private static final String LOCATION_NAME = LocationName.of(PROJECT_ID, "global").toString();

  private Publisher publisher;

  public OcrTranslateText() throws IOException {
    publisher = Publisher.newBuilder(ProjectTopicName.of(PROJECT_ID, RESULTS_TOPIC_NAME)).build();
  }

  // Create custom deserializer to handle timestamps in event data
  class DateDeserializer implements JsonDeserializer<OffsetDateTime> {
    @Override
    public OffsetDateTime deserialize(
        JsonElement json, Type typeOfT, JsonDeserializationContext context)
        throws JsonParseException {
      return OffsetDateTime.parse(json.getAsString());
    }
  }

  Gson gson =
      new GsonBuilder().registerTypeAdapter(OffsetDateTime.class, new DateDeserializer()).create();

  @Override
  public void accept(CloudEvent event) throws InterruptedException, IOException {
    MessagePublishedData data =
        gson.fromJson(
            new String(event.getData().toBytes(), StandardCharsets.UTF_8),
            MessagePublishedData.class);
    // Assumption: OcrTranslateApiMessage (not shown here) offers a
    // fromPubsubData(byte[]) helper that mirrors toPubsubData().
    OcrTranslateApiMessage ocrMessage =
        OcrTranslateApiMessage.fromPubsubData(
            data.getMessage().getData().getBytes(StandardCharsets.UTF_8));

    String targetLang = ocrMessage.getLang();
    logger.info("Translating text into " + targetLang);

    // Translate text to target language
    String text = ocrMessage.getText();
    TranslateTextRequest request =
        TranslateTextRequest.newBuilder()
            .setParent(LOCATION_NAME)
            .setMimeType("text/plain")
            .setTargetLanguageCode(targetLang)
            .addContents(text)
            .build();

    TranslateTextResponse response;
    try (TranslationServiceClient client = TranslationServiceClient.create()) {
      response = client.translateText(request);
    } catch (IOException e) {
      // Log error (since IOException cannot be thrown by a function)
      logger.log(Level.SEVERE, "Error translating text: " + e.getMessage(), e);
      return;
    }
    if (response.getTranslationsCount() == 0) {
      return;
    }

    String translatedText = response.getTranslations(0).getTranslatedText();
    logger.info("Translated text: " + translatedText);

    // Send translated text to (subsequent) Pub/Sub topic
    String filename = ocrMessage.getFilename();
    OcrTranslateApiMessage translateMessage =
        new OcrTranslateApiMessage(translatedText, filename, targetLang);
    try {
      ByteString byteStr = ByteString.copyFrom(translateMessage.toPubsubData());
      PubsubMessage pubsubApiMessage = PubsubMessage.newBuilder().setData(byteStr).build();

      // Block until the message has been accepted by Pub/Sub
      publisher.publish(pubsubApiMessage).get();
      logger.info("Text translated to " + targetLang);
    } catch (InterruptedException | ExecutionException e) {
      // Log error (since these exception types cannot be thrown by a function)
      logger.log(Level.SEVERE, "Error publishing translation save request: " + e.getMessage(), e);
    }
  }

  // Avoid ungraceful deployment failures due to unset environment variables.
  // If you get this warning you should redeploy with the variable set.
  private static String getenv(String name) {
    String value = System.getenv(name);
    if (value == null) {
      logger.warning("Environment variable " + name + " was not set");
      value = "MISSING";
    }
    return value;
  }
}
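
The translation step is easier to reason about when the API call is separated from the message handling. The following Python sketch takes the translator as a parameter so that the message logic can be tested with a stand-in. The function `build_result_message` and the fake translator are hypothetical; the required fields follow the Python sample above, which includes `src_lang` in each request.

```python
from typing import Callable, Dict

# Signature assumed for the injected translator: (text, target, source) -> text.
Translator = Callable[[str, str, str], str]

REQUIRED_FIELDS = ("text", "filename", "lang", "src_lang")


def build_result_message(
    message: Dict[str, str], translate_fn: Translator
) -> Dict[str, str]:
    """Validate a translation request and build the result-queue message."""
    for field in REQUIRED_FIELDS:
        if not message.get(field):
            raise ValueError(f'Missing "{field}" property in request')

    translated = translate_fn(message["text"], message["lang"], message["src_lang"])
    return {
        "text": translated,
        "filename": message["filename"],
        "lang": message["lang"],
    }


def fake_translate(text: str, target: str, source: str) -> str:
    """Stand-in for the Translation API, used only for local testing."""
    return f"[{source}->{target}] {text}"


if __name__ == "__main__":
    request = {"text": "Hola", "filename": "sign.png", "lang": "en", "src_lang": "es"}
    print(build_result_message(request, fake_translate))
```

In a deployed function the injected translator would wrap the real client call, while tests can keep using the stand-in.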


Finally, the following function receives the translated text and saves it back to Cloud Storage:


/**
 * This function is exported by index.js, and is executed when
 * a message is published to the Cloud Pub/Sub topic specified
 * by the RESULT_TOPIC environment variable. The function saves
 * the data packet to a file in GCS.
 *
 * @param {object} cloudEvent The CloudEvent containing the Pub/Sub Message object.
 * https://cloud.google.com/pubsub/docs/reference/rest/v1/PubsubMessage
 */
functions.cloudEvent('saveResult', async cloudEvent => {
  const pubsubData = cloudEvent.data;
  // The payload is base64-encoded in the "data" field of the Pub/Sub message
  const jsonStr = Buffer.from(pubsubData.message.data, 'base64').toString();
  const {text, filename, lang} = JSON.parse(jsonStr);

  if (!text) {
    throw new Error(
      'Text not provided. Make sure you have a "text" property in your request'
    );
  }
  if (!filename) {
    throw new Error(
      'Filename not provided. Make sure you have a "filename" property in your request'
    );
  }
  if (!lang) {
    throw new Error(
      'Language not provided. Make sure you have a "lang" property in your request'
    );
  }

  console.log(`Received request to save file ${filename}`);

  const bucketName = process.env.RESULT_BUCKET;
  const newFilename = renameImageForSave(filename, lang);
  const file = storage.bucket(bucketName).file(newFilename);

  console.log(`Saving result to ${newFilename} in bucket ${bucketName}`);

  await file.save(text);
  console.log('File saved.');
});


@functions_framework.cloud_event
def save_result(cloud_event: CloudEvent) -> None:
    """Cloud Function triggered by PubSub when a message is received from
    a subscription.

    Saves translated text to a Cloud Storage object as requested.
    """
    # Check that the received event is of the expected type, return error if not
    expected_type = "google.cloud.pubsub.topic.v1.messagePublished"
    received_type = cloud_event["type"]
    if received_type != expected_type:
        raise ValueError(f"Expected {expected_type} but received {received_type}")

    # Extract the message body, expected to be a JSON representation of a
    # dictionary, and extract the fields from that dictionary.
    data = cloud_event.data["message"]["data"]
    try:
        message_data = base64.b64decode(data)
        message = json.loads(message_data)

        text = message["text"]
        filename = message["filename"]
        lang = message["lang"]
    except Exception as e:
        raise ValueError(f"Missing or malformed PubSub message {data}: {e}.")

    print(f"Received request to save file {filename}.")

    # Save the translation in RESULT_BUCKET
    bucket_name = os.environ["RESULT_BUCKET"]
    result_filename = f"{filename}_{lang}.txt"
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(result_filename)

    print(f"Saving result to {result_filename} in bucket {bucket_name}.")

    blob.upload_from_string(text)

    print("File saved.")


package ocr

import (
	"context"
	"encoding/json"
	"fmt"
	"log"

	"github.com/GoogleCloudPlatform/functions-framework-go/functions"
	"github.com/cloudevents/sdk-go/v2/event"
)

func init() {
	functions.CloudEvent("save-result", SaveResult)
}

// SaveResult is executed when a message is published to the Cloud Pub/Sub topic
// specified by the RESULT_TOPIC environment variable, and saves the data packet
// to a file in GCS.
func SaveResult(ctx context.Context, cloudevent event.Event) error {
	var event MessagePublishedData
	if err := setup(ctx); err != nil {
		return fmt.Errorf("setup: %w", err)
	}
	if err := cloudevent.DataAs(&event); err != nil {
		return fmt.Errorf("Failed to parse CloudEvent data: %w", err)
	}
	var message ocrMessage
	if event.Message.Data == nil {
		return fmt.Errorf("Empty data")
	}
	if err := json.Unmarshal(event.Message.Data, &message); err != nil {
		return fmt.Errorf("json.Unmarshal: %w", err)
	}
	log.Printf("Received request to save file %q.", message.FileName)

	resultFilename := fmt.Sprintf("%s_%s.txt", message.FileName, message.Lang)
	bucket := storageClient.Bucket(resultBucket)

	log.Printf("Saving result to %q in bucket %q.", resultFilename, resultBucket)

	w := bucket.Object(resultFilename).NewWriter(ctx)
	if _, err := fmt.Fprint(w, message.Text); err != nil {
		w.Close()
		return fmt.Errorf("Fprint: %w", err)
	}
	// The object is committed only when Close succeeds, so check its error.
	if err := w.Close(); err != nil {
		return fmt.Errorf("Close: %w", err)
	}

	log.Printf("File saved.")
	return nil
}


package functions;

import com.google.cloud.functions.CloudEventsFunction;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonDeserializationContext;
import com.google.gson.JsonDeserializer;
import com.google.gson.JsonElement;
import com.google.gson.JsonParseException;
import functions.eventpojos.MessagePublishedData;
import io.cloudevents.CloudEvent;
import java.lang.reflect.Type;
import java.nio.charset.StandardCharsets;
import java.time.OffsetDateTime;
import java.util.logging.Logger;

public class OcrSaveResult implements CloudEventsFunction {
  // TODO<developer> set this environment variable
  private static final String RESULT_BUCKET = System.getenv("RESULT_BUCKET");

  private static final Storage STORAGE = StorageOptions.getDefaultInstance().getService();
  private static final Logger logger = Logger.getLogger(OcrSaveResult.class.getName());

  // Configure Gson with custom deserializer to handle timestamps in event data
  class DateDeserializer implements JsonDeserializer<OffsetDateTime> {
    @Override
    public OffsetDateTime deserialize(
        JsonElement json, Type typeOfT, JsonDeserializationContext context)
        throws JsonParseException {
      return OffsetDateTime.parse(json.getAsString());
    }
  }

  Gson gson =
      new GsonBuilder().registerTypeAdapter(OffsetDateTime.class, new DateDeserializer()).create();

  @Override
  public void accept(CloudEvent event) {
    // Unmarshal data from CloudEvent
    MessagePublishedData data =
        gson.fromJson(
            new String(event.getData().toBytes(), StandardCharsets.UTF_8),
            MessagePublishedData.class);
    // Decode the Pub/Sub payload. fromPubsubData is assumed to be the static helper
    // defined on the sample's OcrTranslateApiMessage class.
    OcrTranslateApiMessage ocrMessage =
        OcrTranslateApiMessage.fromPubsubData(
            data.getMessage().getData().getBytes(StandardCharsets.UTF_8));

    logger.info("Received request to save file " + ocrMessage.getFilename());

    String newFileName =
        String.format("%s_to_%s.txt", ocrMessage.getFilename(), ocrMessage.getLang());

    // Save file to RESULT_BUCKET with name newFileName
    logger.info(String.format("Saving result to %s in bucket %s", newFileName, RESULT_BUCKET));
    BlobInfo blobInfo = BlobInfo.newBuilder(BlobId.of(RESULT_BUCKET, newFileName)).build();
    STORAGE.create(blobInfo, ocrMessage.getText().getBytes(StandardCharsets.UTF_8));
    logger.info("File saved");
  }
}

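  1. To deploy the image processing function with a Cloud Storage trigger, run the following command in the directory that contains the sample code (or, for Java, the pom.xml file):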

    gcloud functions deploy ocr-extract \
    --gen2 \
    --runtime=nodejs20 \
    --region=REGION \
    --source=. \
    --entry-point=processImage \
    --trigger-bucket YOUR_IMAGE_BUCKET_NAME \

    Use the --runtime flag to specify the runtime ID of a supported Node.js version to run your function.


    gcloud functions deploy ocr-extract \
    --gen2 \
    --runtime=python312 \
    --region=REGION \
    --source=. \
    --entry-point=process_image \
    --trigger-bucket YOUR_IMAGE_BUCKET_NAME \

    Use the --runtime flag to specify the runtime ID of a supported Python version to run your function.


    gcloud functions deploy ocr-extract \
    --gen2 \
    --runtime=go121 \
    --region=REGION \
    --source=. \
    --entry-point=process-image \
    --trigger-bucket YOUR_IMAGE_BUCKET_NAME \

    Use the --runtime flag to specify the runtime ID of a supported Go version to run your function.


    gcloud functions deploy ocr-extract \
    --gen2 \
    --runtime=java17 \
    --region=REGION \
    --source=. \
    --entry-point=functions.OcrProcessImage \
    --memory=512MB \
    --trigger-bucket YOUR_IMAGE_BUCKET_NAME \

    Use the --runtime flag to specify the runtime ID of a supported Java version to run your function.


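    • REGION: the name of the Google Cloud region where you want to deploy your function (for example, us-west1).
    • YOUR_IMAGE_BUCKET_NAME: the name of the Cloud Storage bucket to which you upload images. When deploying a 2nd gen function, specify only the bucket name, without the leading gs://; for example, --trigger-bucket my-bucket (or, in the event-filter form, --trigger-event-filters="bucket=my-bucket").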
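  2. To deploy the text translation function with a Cloud Pub/Sub trigger, run the following command in the directory that contains the sample code (or, for Java, the pom.xml file):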

    gcloud functions deploy ocr-translate \
    --gen2 \
    --runtime=nodejs20 \
    --region=REGION \
    --source=. \
    --entry-point=translateText \
    --trigger-topic YOUR_TRANSLATE_TOPIC_NAME \

    Use the --runtime flag to specify the runtime ID of a supported Node.js version to run your function.


    gcloud functions deploy ocr-translate \
    --gen2 \
    --runtime=python312 \
    --region=REGION \
    --source=. \
    --entry-point=translate_text \
    --trigger-topic YOUR_TRANSLATE_TOPIC_NAME \

    Use the --runtime flag to specify the runtime ID of a supported Python version to run your function.


    gcloud functions deploy ocr-translate \
    --gen2 \
    --runtime=go121 \
    --region=REGION \
    --source=. \
    --entry-point=translate-text \
    --trigger-topic YOUR_TRANSLATE_TOPIC_NAME \

    Use the --runtime flag to specify the runtime ID of a supported Go version to run your function.


    gcloud functions deploy ocr-translate \
    --gen2 \
    --runtime=java17 \
    --region=REGION \
    --source=. \
    --entry-point=functions.OcrTranslateText \
    --memory=512MB \
    --trigger-topic YOUR_TRANSLATE_TOPIC_NAME \

    Use the --runtime flag to specify the runtime ID of a supported Java version to run your function.

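  3. To deploy the function that saves results to Cloud Storage with a Cloud Pub/Sub trigger, run the following command in the directory that contains the sample code (or, for Java, the pom.xml file):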

    gcloud functions deploy ocr-save \
    --gen2 \
    --runtime=nodejs20 \
    --region=REGION \
    --source=. \
    --entry-point=saveResult \
    --trigger-topic YOUR_RESULT_TOPIC_NAME \
    --set-env-vars "RESULT_BUCKET=YOUR_RESULT_BUCKET_NAME"

    Use the --runtime flag to specify the runtime ID of a supported Node.js version to run your function.


    gcloud functions deploy ocr-save \
    --gen2 \
    --runtime=python312 \
    --region=REGION \
    --source=. \
    --entry-point=save_result \
    --trigger-topic YOUR_RESULT_TOPIC_NAME \
    --set-env-vars "RESULT_BUCKET=YOUR_RESULT_BUCKET_NAME"

    Use the --runtime flag to specify the runtime ID of a supported Python version to run your function.


    gcloud functions deploy ocr-save \
    --gen2 \
    --runtime=go121 \
    --region=REGION \
    --source=. \
    --entry-point=save-result \
    --trigger-topic YOUR_RESULT_TOPIC_NAME \
    --set-env-vars "RESULT_BUCKET=YOUR_RESULT_BUCKET_NAME"

    Use the --runtime flag to specify the runtime ID of a supported Go version to run your function.


    gcloud functions deploy ocr-save \
    --gen2 \
    --runtime=java17 \
    --region=REGION \
    --source=. \
    --entry-point=functions.OcrSaveResult \
    --memory=512MB \
    --trigger-topic YOUR_RESULT_TOPIC_NAME \
    --set-env-vars "RESULT_BUCKET=YOUR_RESULT_BUCKET_NAME"

    Use the --runtime flag to specify the runtime ID of a supported Java version to run your function.
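
The three deployments differ only in the function name, entry point, trigger, and environment variables. As a rough sketch, the following Python helper assembles the gcloud argument list from those values so that the shared flags stay consistent. The helper name build_deploy_command is hypothetical, the region and resource names are placeholders, and the code only builds and prints the command; it does not run gcloud. The RESULT_BUCKET variable is the one the save function reads.

```python
import shlex
from typing import Dict, List, Optional


def build_deploy_command(
    name: str,
    entry_point: str,
    trigger_flag: str,
    trigger_value: str,
    region: str,
    runtime: str = "python312",
    env_vars: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Return the gcloud argument list for one 2nd gen function deployment."""
    if trigger_flag not in ("--trigger-bucket", "--trigger-topic"):
        raise ValueError(f"Unsupported trigger flag: {trigger_flag}")
    command = [
        "gcloud", "functions", "deploy", name,
        "--gen2",
        f"--runtime={runtime}",
        f"--region={region}",
        "--source=.",
        f"--entry-point={entry_point}",
        trigger_flag, trigger_value,
    ]
    if env_vars:
        # Plain KEY=VALUE,KEY=VALUE form; values containing commas are not handled here.
        pairs = ",".join(f"{key}={value}" for key, value in sorted(env_vars.items()))
        command += ["--set-env-vars", pairs]
    return command


# Example: print the command for the Python save function (placeholder values).
save_command = build_deploy_command(
    "ocr-save", "save_result", "--trigger-topic", "YOUR_RESULT_TOPIC_NAME",
    region="us-west1", env_vars={"RESULT_BUCKET": "YOUR_RESULT_BUCKET_NAME"},
)
print(shlex.join(save_command))
```

Returning a list rather than a single string means the result could be passed to subprocess.run without shell quoting concerns, if you choose to execute it.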


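  1. Upload an image to your image Cloud Storage bucket:

    gsutil cp PATH_TO_IMAGE gs://YOUR_IMAGE_BUCKET_NAME

    • PATH_TO_IMAGE is the path to an image file (containing text) on your local system.
    • YOUR_IMAGE_BUCKET_NAME is the name of the bucket to which you upload images.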


  2. Check the logs to make sure the executions have completed:

    gcloud functions logs read --limit 100
  3. You can view the saved translations in the Cloud Storage bucket that you used for YOUR_RESULT_BUCKET_NAME.
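
To check the results from a script instead of the console, a minimal sketch along the following lines could read every translation back from the result bucket. It assumes the google-cloud-storage client library and that result objects end in .txt, as in the samples above; the function name read_translations is hypothetical. The client is passed in as a parameter so the function can be exercised with a stand-in object in tests.

```python
from typing import Dict


def read_translations(client, bucket_name: str) -> Dict[str, str]:
    """Return {object name: translated text} for each .txt object in the bucket.

    `client` is expected to behave like google.cloud.storage.Client:
    list_blobs() yields blobs that have a name and a download_as_text() method.
    """
    results = {}
    for blob in client.list_blobs(bucket_name):
        if blob.name.endswith(".txt"):
            results[blob.name] = blob.download_as_text()
    return results


# Usage with the real client (requires credentials and google-cloud-storage):
#
#   from google.cloud import storage
#   translations = read_translations(storage.Client(), "YOUR_RESULT_BUCKET_NAME")
#   for name, text in translations.items():
#       print(name, text)
```

If no objects appear, the logs from the previous step are the first place to look, because each function logs when it receives a request and when it finishes.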


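To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

Delete the project
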
  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Delete the Cloud Functions

Deleting Cloud Functions does not remove any resources stored in Cloud Storage.

To delete the Cloud Functions that you created in this tutorial, run the following commands:

gcloud functions delete ocr-extract
gcloud functions delete ocr-translate
gcloud functions delete ocr-save

You can also delete Cloud Functions from the Google Cloud console.