
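Translating text from a photo

This page shows how to detect text in an image, customize translations, and generate synthetic speech from text. The tutorial uses Cloud Vision to detect text in an image file. It then shows how to use Cloud Translation to provide a custom translation of the detected text. Finally, it uses Text-to-Speech to generate machine dictation of the translated text.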

Objectives

Pass text recognized by the Cloud Vision API to the Cloud Translation API.
Create a Cloud Translation glossary and use it to customize Cloud Translation API translations.
Use the Text-to-Speech API to create an audio representation of the translated text.

Costs

Each Google Cloud API has its own pricing structure.

For pricing details, see the Cloud Vision pricing guide, the Cloud Translation pricing guide, and the Text-to-Speech pricing guide.

Before you begin

Make sure that you have the following:

A project in the Google Cloud console with the Vision API, the Cloud Translation API, and the Text-to-Speech API enabled
Basic familiarity with Python or Node.js programming

Setting up client libraries

This tutorial uses the Vision, Translation, and Text-to-Speech client libraries.

To install the relevant client libraries, run the following commands from the terminal.

Python

  pip install --upgrade google-cloud-vision
  pip install --upgrade google-cloud-translate
  pip install --upgrade google-cloud-texttospeech

Node.js

  npm install --save @google-cloud/vision
  npm install --save @google-cloud/translate
  npm install --save @google-cloud/text-to-speech
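
As a quick check after installing the Python packages, the following minimal sketch reports which of the modules imported later in this tutorial cannot be found. The helper name `missing_modules` is hypothetical; the module names are the ones this tutorial imports.

```python
import importlib.util


def missing_modules(module_names):
    """Return the subset of module_names that cannot be found for import."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ImportError:
            # Raised when a parent package (for example "google") is absent
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    required = [
        "google.cloud.vision",
        "google.cloud.translate_v3beta1",
        "google.cloud.texttospeech",
    ]
    print("Missing modules:", missing_modules(required) or "none")
```

An empty result only means the modules can be located; it does not verify credentials or API enablement.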

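Setting up permissions for glossary creation

Creating a translation glossary requires a service account key with the "Cloud Translation API Editor" role.

To set up a service account key with the Cloud Translation API Editor role, do the following:

Create a service account:
1. In the Google Cloud console, go to the [Service accounts] page.
  
  Go to [Service accounts]
2. Select your project.
3. Click [Create service account].
4. In the [Service account name] field, enter a name. The Google Cloud console fills in the [Service account ID] field based on this name.
5. Optional: In the [Service account description] field, enter a description for the service account.
6. Click [Create and continue].
7. Click the [Select a role] field and select [Cloud Translation] > [Cloud Translation API Editor].
8. Click [Done] to finish creating the service account.
  
  Do not close your browser window. You will use it in the next step.
Download a JSON key for the service account that you created:
1. In the Google Cloud console, click the email address of the service account that you created.
2. Click [Keys].
3. Click [Add key], and then click [Create new key].
4. Click [Create]. A JSON key file is downloaded to your computer.
  
  Store the key file securely, because it can be used to authenticate as your service account. You can move and rename this file however you like.
5. Click [Close].
In the terminal, set the GOOGLE_APPLICATION_CREDENTIALS variable with the following command. Replace path_to_key with the path to the downloaded JSON file that contains your new service account key.
Linux or macOS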
```
export GOOGLE_APPLICATION_CREDENTIALS=path_to_key
```
Windows
```
set GOOGLE_APPLICATION_CREDENTIALS=path_to_key
```
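
Before running the rest of the tutorial, it can help to confirm that the variable points at a usable key file. The following is a minimal sketch, not part of the tutorial's sample code: `check_credentials_file` is a hypothetical helper, and it assumes only that a service account key is a JSON object whose `type` field is `service_account`.

```python
import json
import os


def check_credentials_file(environ=os.environ):
    """Return "ok" or a short description of what looks wrong."""
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return f"No file found at {path}"
    try:
        with open(path, encoding="utf-8") as key_file:
            key = json.load(key_file)
    except (OSError, ValueError):
        # ValueError covers both invalid JSON and invalid UTF-8
        return f"{path} is not readable JSON"
    # Service account key files carry a "type" field set to "service_account"
    if not isinstance(key, dict) or key.get("type") != "service_account":
        return f"{path} does not look like a service account key"
    return "ok"


if __name__ == "__main__":
    print(check_credentials_file())
```

The check is local only. It does not confirm that the service account has the Cloud Translation API Editor role.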

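Importing libraries

This tutorial uses the following system imports and client library imports.

Python

Before trying this sample, follow the Python setup instructions in the Cloud Translation quickstart using client libraries. For more information, see the Cloud Translation Python API reference documentation.

To authenticate to Cloud Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.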

import html
import os

# Imports the Google Cloud client libraries
from google.api_core.exceptions import AlreadyExists
from google.cloud import texttospeech
from google.cloud import translate_v3beta1 as translate
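from google.cloud import vision

# Google Cloud project ID, read from the GCLOUD_PROJECT environment variable
PROJECT_ID = os.environ["GCLOUD_PROJECT"]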

Node.js

Before trying this sample, follow the Node.js setup instructions in the Cloud Translation quickstart using client libraries. For more information, see the Cloud Translation Node.js API reference documentation.

// Imports the Google Cloud client library
const textToSpeech = require('@google-cloud/text-to-speech');
const translate = require('@google-cloud/translate').v3beta1;
const vision = require('@google-cloud/vision');

// Import other required libraries
const fs = require('fs');
//const escape = require('escape-html');
const util = require('util');

Setting your project ID

Every request to a Google Cloud API must be associated with a Google Cloud project. Designate your project by setting the GCLOUD_PROJECT environment variable from the terminal.

In the following command, replace project-id with your Google Cloud project ID. Run the command from the terminal.

Linux or macOS

export GCLOUD_PROJECT=project-id

Windows

set GCLOUD_PROJECT=project-id
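
In Python, reading the variable with `os.environ["GCLOUD_PROJECT"]` raises a bare `KeyError` when it is unset. As a minimal sketch, a hypothetical helper such as `get_project_id` could fail with a clearer message instead:

```python
import os


def get_project_id(environ=os.environ):
    """Return the project ID from GCLOUD_PROJECT or raise a clear error."""
    project_id = environ.get("GCLOUD_PROJECT", "").strip()
    if not project_id:
        raise RuntimeError(
            "GCLOUD_PROJECT is not set. Set it to your Google Cloud project ID."
        )
    return project_id


if __name__ == "__main__":
    try:
        print("Using project:", get_project_id())
    except RuntimeError as error:
        print(error)
```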

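Using Vision to detect text from an image

Use the Vision API to detect and extract text from an image. The Vision API uses optical character recognition (OCR) and supports two text detection features: dense text detection (DOCUMENT_TEXT_DETECTION) and sparse text detection (TEXT_DETECTION).

The following code shows how to use the Vision API DOCUMENT_TEXT_DETECTION feature to detect text in a photo that contains dense text.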

Python

def pic_to_text(infile: str) -> str:
    """Detects text in an image file

    Args:
    infile: path to image file

    Returns:
    String of text detected in image
    """

    # Instantiates a client
    client = vision.ImageAnnotatorClient()

    # Opens the input image file
    with open(infile, "rb") as image_file:
        content = image_file.read()

    image = vision.Image(content=content)

    # For dense text, use document_text_detection
    # For less dense text, use text_detection
    response = client.document_text_detection(image=image)
    text = response.full_text_annotation.text
    print(f"Detected text: {text}")

    return text

Node.js

/**
 * Detects text in an image file
 *
 * ARGS
 * inputFile: path to image file
 * RETURNS
 * string of text detected in the input image
 **/
async function picToText(inputFile) {
  // Creates a client
  const client = new vision.ImageAnnotatorClient();

  // Performs dense text detection on the local file
  // (use client.textDetection instead for sparser text)
  const [result] = await client.documentTextDetection(inputFile);
  return result.fullTextAnnotation.text;
}
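
The choice between the two detection features comes down to a single feature type in the request. The following minimal sketch builds, but does not send, a request body in the layout of the Vision REST `images:annotate` method. The helper name `build_annotate_request` and its `dense_text` parameter are hypothetical; the tutorial's own code uses the client libraries rather than raw REST calls.

```python
import base64
import json


def build_annotate_request(image_bytes, dense_text=True):
    """Build an images:annotate style request body for one image."""
    feature = "DOCUMENT_TEXT_DETECTION" if dense_text else "TEXT_DETECTION"
    return {
        "requests": [
            {
                # Image bytes are sent as base64-encoded text
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": feature}],
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for the contents of a real image file
    body = build_annotate_request(b"placeholder image bytes", dense_text=True)
    print(json.dumps(body, indent=2))
```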

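Using Translation with glossaries

After you extract text from an image, use a translation glossary to customize the translation of the extracted text. The translations that you predefine in a glossary override the translations that the Cloud Translation API would otherwise produce for those terms.

Glossary use cases include the following:

Product names: For example, "Google Home" must translate to "Google Home".
Ambiguous words: For example, the word "bat" can mean a piece of sports equipment or an animal. If you know that you are translating content about sports, you can use a glossary to give the Cloud Translation API the sports translation of "bat" rather than the animal translation.
Borrowed words: For example, "bouillabaisse" in French translates to "bouillabaisse" in English, because English borrowed the word from French. An English speaker who lacks French cultural context might not know that bouillabaisse is a French fish stew. A glossary can override the translation so that "bouillabaisse" in French translates to "fish stew" in English.

Making a glossary file

The Cloud Translation API accepts glossary files in TSV, CSV, or TMX format. This tutorial uses a CSV file uploaded to Cloud Storage to define sets of equivalent terms.

To make a glossary CSV file:

Designate the language of each column by using either ISO-639 or BCP-47 language codes in the first row of the CSV file.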
```
fr,en,
```
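List a pair of equivalent terms in each row of the CSV file, separating the terms with commas. The following example defines English translations for several French culinary terms.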
```
fr,en,
chèvre,goat cheese,
crème brulée,crème brulée,
bouillabaisse,fish stew,
steak frites,steak with french fries,
```

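Define variants of a word. The Cloud Translation API is sensitive to case and to special characters such as accented letters. Make sure that your glossary handles variations on a word by explicitly defining its different spellings.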

fr,en,
chevre,goat cheese,
Chevre,Goat cheese,
chèvre,goat cheese,
Chèvre,Goat cheese,
crème brulée,crème brulée,
Crème brulée,Crème brulée,
Crème Brulée,Crème Brulée,
bouillabaisse,fish stew,
Bouillabaisse,Fish stew,
steak frites,steak with french fries,
Steak frites,Steak with french fries,
Steak Frites,Steak with French Fries,

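Upload the glossary to a Cloud Storage bucket. For this tutorial, you do not need to upload a glossary file to a Cloud Storage bucket, and you do not need to create a bucket. Instead, use the publicly available glossary file created for this tutorial to avoid incurring Cloud Storage costs. To create a glossary resource, you send the URI of a glossary file in Cloud Storage to the Cloud Translation API. The URI of the publicly available glossary file for this tutorial is gs://cloud-samples-data/translation/bistro_glossary.csv. To download the glossary, click the URI link above, but do not open it in a new tab.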
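
Writing every spelling variant by hand is error-prone, so the variant step above can be partly automated. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, only two kinds of variants are generated (accents stripped and first letter capitalized), and each row ends with a trailing comma to match the examples on this page. Title-case variants such as "Steak Frites" are not generated.

```python
import csv
import io
import unicodedata


def strip_accents(text):
    """Remove combining accent marks, for example 'chèvre' -> 'chevre'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def capitalize_first(text):
    """Uppercase only the first character and leave the rest untouched."""
    return text[:1].upper() + text[1:]


def glossary_variants(source, target):
    """Return ordered, de-duplicated (source, target) spelling variants."""
    variants = []
    for src in (strip_accents(source), source):
        for transform in (str, capitalize_first):
            pair = (transform(src), transform(target))
            if pair not in variants:
                variants.append(pair)
    return variants


def build_glossary_csv(languages, term_pairs):
    """Return glossary CSV text: a language header row, then variant rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # The empty last column reproduces the trailing comma used on this page
    writer.writerow(list(languages) + [""])
    seen = set()
    for source, target in term_pairs:
        for variant in glossary_variants(source, target):
            if variant not in seen:
                seen.add(variant)
                writer.writerow(list(variant) + [""])
    return buffer.getvalue()


if __name__ == "__main__":
    pairs = [("chèvre", "goat cheese"), ("bouillabaisse", "fish stew")]
    print(build_glossary_csv(["fr", "en"], pairs), end="")
```

Review the generated rows before uploading them, because an accent-stripped spelling is not always a variant that you want to map.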

Creating a glossary resource

To use a glossary, you must create a glossary resource with the Cloud Translation API. To create a glossary resource, send the URI of a glossary file in Cloud Storage to the Cloud Translation API.
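
The glossary file URI has the form `gs://bucket/object`. As a minimal sketch, a hypothetical helper such as `split_gcs_uri` can check that shape locally before the URI is sent to the API:

```python
from urllib.parse import urlparse


def split_gcs_uri(uri):
    """Split a gs://bucket/object URI into (bucket, object_name)."""
    parsed = urlparse(uri)
    object_name = parsed.path.lstrip("/")
    if parsed.scheme != "gs" or not parsed.netloc or not object_name:
        raise ValueError(f"not a gs://bucket/object URI: {uri!r}")
    return parsed.netloc, object_name


if __name__ == "__main__":
    uri = "gs://cloud-samples-data/translation/bistro_glossary.csv"
    bucket, object_name = split_gcs_uri(uri)
    print("bucket:", bucket)
    print("object:", object_name)
```

This validates only the form of the URI. It does not check that the object exists or is readable.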

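Make sure that you are using a service account key with the "Cloud Translation API Editor" role, and that you have set your project ID from the terminal.

The following function creates a glossary resource. In the next step of this tutorial, you use this glossary resource to customize a translation request.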

Python

def create_glossary(
    languages: list,
    project_id: str,
    glossary_name: str,
    glossary_uri: str,
) -> str:
    """Creates a GCP glossary resource
    Assumes you've already manually uploaded a glossary to Cloud Storage

    Args:
    languages: list of languages in the glossary
    project_id: GCP project id
    glossary_name: name you want to give this glossary resource
    glossary_uri: the uri of the glossary you uploaded to Cloud Storage

    Returns:
    name of the created or existing glossary
    """

    # Instantiates a client
    client = translate.TranslationServiceClient()

    # Designates the data center location that you want to use
    location = "us-central1"

    # Set glossary resource name
    name = client.glossary_path(project_id, location, glossary_name)

    # Set language codes
    language_codes_set = translate.Glossary.LanguageCodesSet(language_codes=languages)

    gcs_source = translate.GcsSource(input_uri=glossary_uri)

    input_config = translate.GlossaryInputConfig(gcs_source=gcs_source)

    # Set glossary resource information
    glossary = translate.Glossary(
        name=name, language_codes_set=language_codes_set, input_config=input_config
    )

    parent = f"projects/{project_id}/locations/{location}"

    # Create glossary resource
    # Handle exception for case in which a glossary
    #  with glossary_name already exists
    try:
        operation = client.create_glossary(parent=parent, glossary=glossary)
        operation.result(timeout=90)
        print("Created glossary " + glossary_name + ".")
    except AlreadyExists:
        print(
            "The glossary "
            + glossary_name
            + " already exists. No new glossary was created."
        )

    return glossary_name

Node.js

/** Creates a GCP glossary resource
 * Assumes you've already manually uploaded a glossary to Cloud Storage
 *
 * ARGS
 * languages: list of languages in the glossary
 * projectId: GCP project id
 * glossaryName: name you want to give this glossary resource
 * glossaryUri: the uri of the glossary you uploaded to Cloud Storage
 * RETURNS
 * nothing
 **/
async function createGlossary(
  languages,
  projectId,
  glossaryName,
  glossaryUri
) {
  // Instantiates a client
  const translationClient = new translate.TranslationServiceClient();

  // Construct glossary
  const glossary = {
    languageCodesSet: {
      languageCodes: languages,
    },
    inputConfig: {
      gcsSource: {
        inputUri: glossaryUri,
      },
    },
    name: translationClient.glossaryPath(
      projectId,
      'us-central1',
      glossaryName
    ),
  };

  // Construct request
  const request = {
    parent: translationClient.locationPath(projectId, 'us-central1'),
    glossary: glossary,
  };

  // Create glossary using a long-running operation.
  try {
    const [operation] = await translationClient.createGlossary(request);
    // Wait for operation to complete.
    await operation.promise();
    console.log('Created glossary ' + glossaryName + '.');
  } catch (AlreadyExists) {
    console.log(
      'The glossary ' +
        glossaryName +
        ' already exists. No new glossary was created.'
    );
  }
}
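
Both functions above build the glossary's full resource name from the project ID, the location, and the glossary name. The following minimal sketch reproduces that name as a plain string and rejects glossary names that use characters outside the set listed in the troubleshooting section of this tutorial (lowercase letters, digits, periods, colons, and hyphens). The helper `glossary_resource_name` is hypothetical; in the tutorial code the client's `glossary_path` method does this job.

```python
import re

# Allowed characters, as described in this tutorial's troubleshooting notes
_GLOSSARY_ID = re.compile(r"[a-z0-9.:-]+")


def glossary_resource_name(project_id, location, glossary_id):
    """Return the full resource name for a glossary, validating its ID."""
    if not _GLOSSARY_ID.fullmatch(glossary_id):
        raise ValueError(
            f"glossary name {glossary_id!r} may only contain lowercase "
            "letters, digits, periods, colons, and hyphens"
        )
    return f"projects/{project_id}/locations/{location}/glossaries/{glossary_id}"


if __name__ == "__main__":
    # "my-project" is a placeholder project ID
    print(glossary_resource_name("my-project", "us-central1", "bistro-glossary"))
```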

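Translating with glossaries

After you create a glossary resource, you can use it to customize translations of text that you send to the Cloud Translation API.

The following function uses the glossary resource that you created earlier to customize the translation of text.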

Python

def translate_text(
    text: str,
    source_language_code: str,
    target_language_code: str,
    project_id: str,
    glossary_name: str,
) -> str:
    """Translates text to a given language using a glossary

    Args:
    text: String of text to translate
    source_language_code: language of input text
    target_language_code: language of output text
    project_id: GCP project id
    glossary_name: name you gave your project's glossary
        resource when you created it

    Return:
    String of translated text
    """

    # Instantiates a client
    client = translate.TranslationServiceClient()

    # Designates the data center location that you want to use
    location = "us-central1"

    glossary = client.glossary_path(project_id, location, glossary_name)

    glossary_config = translate.TranslateTextGlossaryConfig(glossary=glossary)

    parent = f"projects/{project_id}/locations/{location}"

    result = client.translate_text(
        request={
            "parent": parent,
            "contents": [text],
            "mime_type": "text/plain",  # mime types: text/plain, text/html
            "source_language_code": source_language_code,
            "target_language_code": target_language_code,
            "glossary_config": glossary_config,
        }
    )

    # Extract translated text from API response
    return result.glossary_translations[0].translated_text

Node.js

/**
 * Translates text to a given language using a glossary
 *
 * ARGS
 * text: String of text to translate
 * sourceLanguageCode: language of input text
 * targetLanguageCode: language of output text
 * projectId: GCP project id
 * glossaryName: name you gave your project's glossary
 *     resource when you created it
 * RETURNS
 * String of translated text
 **/
async function translateText(
  text,
  sourceLanguageCode,
  targetLanguageCode,
  projectId,
  glossaryName
) {
  // Instantiates a client
  const translationClient = new translate.TranslationServiceClient();
  const glossary = translationClient.glossaryPath(
    projectId,
    'us-central1',
    glossaryName
  );
  const glossaryConfig = {
    glossary: glossary,
  };
  // Construct request
  const request = {
    parent: translationClient.locationPath(projectId, 'us-central1'),
    contents: [text],
    mimeType: 'text/plain', // mime types: text/plain, text/html
    sourceLanguageCode: sourceLanguageCode,
    targetLanguageCode: targetLanguageCode,
    glossaryConfig: glossaryConfig,
  };

  // Run request
  const [response] = await translationClient.translateText(request);
  // Extract the string of translated text
  return response.glossaryTranslations[0].translatedText;
}
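
The functions above read `glossary_translations[0]` directly. As a sketch, a more defensive variant could fall back to the `translations` field of the same response, which holds the default translation, when no glossary translation is present. `pick_translated_text` is a hypothetical helper, and the `SimpleNamespace` object below only stands in for a real API response.

```python
from types import SimpleNamespace


def pick_translated_text(response):
    """Prefer the glossary translation; fall back to the default one."""
    if response.glossary_translations:
        return response.glossary_translations[0].translated_text
    if response.translations:
        return response.translations[0].translated_text
    raise ValueError("response contains no translations")


if __name__ == "__main__":
    # Stand-in for a response object; not produced by a real API call
    fake_response = SimpleNamespace(
        translations=[SimpleNamespace(translated_text="bouillabaisse")],
        glossary_translations=[SimpleNamespace(translated_text="fish stew")],
    )
    print(pick_translated_text(fake_response))
```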

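Using Text-to-Speech with Speech Synthesis Markup Language

Now that you have customized the translation of the text detected in the image, you are ready to use the Text-to-Speech API. The Text-to-Speech API can create synthetic audio of your translated text.

The Text-to-Speech API generates synthetic audio either from a string of plain text or from a string of text marked up with Speech Synthesis Markup Language (SSML). SSML is a markup language that annotates text with SSML tags. You can use SSML tags to control how the Text-to-Speech API formats the synthetic speech that it creates.

The following function converts plain text to SSML and then generates an MP3 file of synthetic speech from the SSML.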

Python

def text_to_speech(text: str, outfile: str) -> None:
    """Converts plaintext to SSML and
    generates synthetic audio from SSML

    Args:
    text: text to synthesize
    outfile: filename to use to store synthetic audio

    Returns:
    None. The synthesized audio is written to outfile.
    """

    # Replace special characters with HTML Ampersand Character Codes
    # These Codes prevent the API from confusing text with
    # SSML commands
    # For example, '<' --> '&lt;' and '&' --> '&amp;'
    escaped_lines = html.escape(text)

    # Convert plaintext to SSML in order to wait two seconds
    #   between each line in synthetic speech
    ssml = "<speak>{}</speak>".format(
        escaped_lines.replace("\n", '\n<break time="2s"/>')
    )

    # Instantiates a client
    client = texttospeech.TextToSpeechClient()

    # Sets the text input to be synthesized
    synthesis_input = texttospeech.SynthesisInput(ssml=ssml)

    # Builds the voice request, selects the language code ("en-US") and
    # the SSML voice gender ("MALE")
    voice = texttospeech.VoiceSelectionParams(
        language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.MALE
    )

    # Selects the type of audio file to return
    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3
    )

    # Performs the text-to-speech request on the text input with the selected
    # voice parameters and audio file type

    request = texttospeech.SynthesizeSpeechRequest(
        input=synthesis_input, voice=voice, audio_config=audio_config
    )

    response = client.synthesize_speech(request=request)

    # Writes the synthetic audio to the output file.
    with open(outfile, "wb") as out:
        out.write(response.audio_content)
        print("Audio content written to file " + outfile)

Node.js

/**
 * Generates synthetic audio from plaintext tagged with SSML.
 *
 * Given the name of a text file and an output file name, this function
 * tags the text in the text file with SSML. This function then
 * calls the Text-to-Speech API. The API returns a synthetic audio
 * version of the text, formatted according to the SSML commands. This
 * function saves the synthetic audio to the designated output file.
 *
 * ARGS
 * text: String of plaintext
 * outFile: String name of file under which to save audio output
 * RETURNS
 * nothing
 *
 */
async function syntheticAudio(text, outFile) {
  // Replace special characters with HTML Ampersand Character Codes
  // These codes prevent the API from confusing text with SSML tags
  // For example, '<' --> '&lt;' and '&' --> '&amp;'
  let escapedLines = text.replace(/&/g, '&amp;');
  escapedLines = escapedLines.replace(/"/g, '&quot;');
  escapedLines = escapedLines.replace(/</g, '&lt;');
  escapedLines = escapedLines.replace(/>/g, '&gt;');

  // Convert plaintext to SSML
  // Tag SSML so that there is a 2 second pause between each line
  const expandedNewline = escapedLines.replace(/\n/g, '\n<break time="2s"/>');
  const ssmlText = '<speak>' + expandedNewline + '</speak>';

  // Creates a client
  const client = new textToSpeech.TextToSpeechClient();

  // Constructs the request
  const request = {
    // Select the text to synthesize
    input: {ssml: ssmlText},
    // Select the language and SSML Voice Gender (optional)
    voice: {languageCode: 'en-US', ssmlGender: 'MALE'},
    // Select the type of audio encoding
    audioConfig: {audioEncoding: 'MP3'},
  };

  // Performs the Text-to-Speech request
  const [response] = await client.synthesizeSpeech(request);
  // Write the binary audio content to a local file
  const writeFile = util.promisify(fs.writeFile);
  await writeFile(outFile, response.audioContent, 'binary');
  console.log('Audio content written to file ' + outFile);
}
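
The plaintext-to-SSML step in both functions can be tried on its own, without calling the API. The following minimal sketch isolates it; `text_to_ssml` and its `pause` parameter are hypothetical names, and the logic mirrors the escaping and `<break>` insertion shown above.

```python
import html


def text_to_ssml(text, pause="2s"):
    """Escape plain text and insert a pause at every line break."""
    # Escaping keeps characters such as '<' and '&' from being read as SSML
    escaped = html.escape(text)
    with_breaks = escaped.replace("\n", f'\n<break time="{pause}"/>')
    return f"<speak>{with_breaks}</speak>"


if __name__ == "__main__":
    print(text_to_ssml("Fish & chips\nSteak with french fries"))
```

For the two-line input above, the result is a single `<speak>` element in which `&` becomes `&amp;` and the second line is preceded by `<break time="2s"/>`.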

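Putting it all together

In the previous steps, you defined functions that use Vision, Translation, and Text-to-Speech in hybrid_tutorial.py (Python) or hybridGlossaries.js (Node.js). You are now ready to use these functions to generate synthetic speech of translated text from the example photo.

The following code calls those functions to do the following:

Create a Cloud Translation API glossary resource
Use the Vision API to detect text in the example photo
Translate the detected text with the Cloud Translation API glossary
Generate Text-to-Speech synthetic speech of the translated text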

Python

def main() -> None:
    """Runs the tutorial end to end. It creates a glossary, detects
    French text in a photo, translates the text to English with the
    glossary, and speaks the translated text.

    Args:
    None

    Returns:
    None
    """
    # Photo from which to extract text
    infile = "resources/example.png"
    # Name of file that will hold synthetic speech
    outfile = "resources/example.mp3"

    # Defines the languages in the glossary
    # This list must match the languages in the glossary
    #   Here, the glossary includes French and English
    glossary_langs = ["fr", "en"]
    # Name that will be assigned to your project's glossary resource
    glossary_name = "bistro-glossary"
    # uri of .csv file uploaded to Cloud Storage
    glossary_uri = "gs://cloud-samples-data/translation/bistro_glossary.csv"

    created_glossary_name = create_glossary(
        glossary_langs, PROJECT_ID, glossary_name, glossary_uri
    )

    # photo -> detected text
    text_to_translate = pic_to_text(infile)
    # detected text -> translated text
    text_to_speak = translate_text(
        text_to_translate, "fr", "en", PROJECT_ID, created_glossary_name
    )
    # translated text -> synthetic audio
    text_to_speech(text_to_speak, outfile)


if __name__ == "__main__":
    main()

Node.js

await createGlossary(glossaryLangs, projectId, glossaryName, glossaryUri);
const text = await picToText(inFile);
const translatedText = await translateText(
  text,
  'fr',
  'en',
  projectId,
  glossaryName
);
await syntheticAudio(translatedText, outFile);

Running the code

To run the code, enter the following command in the terminal, in the directory where your code is located:

Python

python hybrid_tutorial.py

Node.js

  node hybridGlossaries.js

The following output appears:

Created glossary bistro-glossary.
Audio content written to file resources/example.mp3

After running the code, navigate from the hybrid_glossaries directory into the resources directory. Check the resources directory for the example.mp3 file.

Listen to the following audio clip to check that your example.mp3 file sounds the same.
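
Before listening, you can run a rough programmatic check that the output file exists and resembles MP3 data. The following minimal sketch uses a hypothetical helper, `looks_like_mp3`, which assumes that an MP3 file starts either with an `ID3` tag or with an MPEG frame sync pattern. It does not decode the audio.

```python
import os


def looks_like_mp3(path):
    """Return True if the file exists and starts like MP3 data."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    with open(path, "rb") as audio_file:
        header = audio_file.read(3)
    if header == b"ID3":
        return True
    # An MPEG audio frame starts with 11 set bits (the frame sync)
    return len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0


if __name__ == "__main__":
    print(looks_like_mp3("resources/example.mp3"))
```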

Troubleshooting error messages

```
403 IAM permission 'cloudtranslate.glossaries.create' denied.
```
This exception occurs when you use a service account key that does not have the "Cloud Translation API Editor" role.
```
KeyError: 'GCLOUD_PROJECT'
```
This error occurs when the GCLOUD_PROJECT variable is not set.
```
400 Invalid resource name project id
```
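This exception occurs either when you use a glossary name that contains characters other than lowercase letters, digits, periods, colons, or hyphens, or when you use a service account key that does not have the "Cloud Translation API Editor" role.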
```
File filename was not found.
```
This exception occurs when the GOOGLE_APPLICATION_CREDENTIALS variable is set to an invalid file path.

Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application

This exception occurs when the GOOGLE_APPLICATION_CREDENTIALS variable is not set.

```
Forbidden: 403 POST API has not been used or is disabled
```
This warning occurs when you call the Cloud Translation API, the Cloud Vision API, or the Text-to-Speech API without having enabled that API.
```
AttributeError: 'module' object has no attribute 'escape'
```
Python 2.7.10 and earlier are not compatible with the html module that this tutorial uses. To fix this error, use a Python virtual environment that runs the newest version of Python.
```
UnicodeEncodeError
```
Python 2.7.10 and earlier are not compatible with HTML. To fix this error, use a Python virtual environment that runs the newest version of Python.
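
The last two errors both point to an outdated interpreter. The Python sample calls `html.escape`, which was added to the standard library in Python 3.2. As a minimal sketch, a hypothetical helper such as `supports_html_escape` can confirm that the running interpreter is recent enough:

```python
import sys


def supports_html_escape(version_info=sys.version_info):
    """Return True if this Python version provides html.escape."""
    # html.escape was added to the standard library in Python 3.2
    return tuple(version_info[:2]) >= (3, 2)


if __name__ == "__main__":
    if supports_html_escape():
        print("This interpreter provides html.escape.")
    else:
        print("Use a newer Python, for example inside a virtual environment.")
```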

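Cleaning up

If you no longer need your project, use the Google Cloud console to delete it. Deleting the project prevents further charges to your Cloud Billing account for the resources used in this tutorial.

Deleting your project

In the Google Cloud console, go to the [Projects] page.
In the project list, select the project that you want to delete and click [Delete].
In the dialog box, enter the project ID, and then click [Shut down] to delete the project.

What's next

Congratulations! You used Vision OCR to detect text in an image. You then created a translation glossary and performed a translation with that glossary. Finally, you used Text-to-Speech to generate synthetic audio of the translated text.

To build on your knowledge of Vision, Translation, and Text-to-Speech:

Make your own glossary. Learn how to create a Cloud Storage bucket and upload your glossary CSV file to the bucket.
Experiment with other ways to use translation glossaries.
Learn how to use Cloud Storage with Cloud Vision OCR.
Learn more about how to use SSML with Text-to-Speech.
Learn how to use the Vision API imageContext field to pass additional context about a photo when using Vision OCR.
Explore the community tutorials.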
