转录 Cloud Storage 中的多语言文件（Beta 版）

转录存储在 Cloud Storage 中包含多个语言的音频文件。

深入探索

如需查看包含此代码示例的详细文档，请参阅以下内容：

在 Speech-to-Text 中启用语言识别

代码示例

Java

如需了解如何安装和使用 Speech-to-Text 客户端库，请参阅 Speech-to-Text 客户端库。如需了解详情，请参阅 Speech-to-Text Java API 参考文档。

如需向 Speech-to-Text 进行身份验证，请设置应用默认凭证。如需了解详情，请参阅为本地开发环境设置身份验证。

/**
 * Transcribe a remote audio file with multi-language recognition
 *
 * @param gcsUri the path to the remote audio file
 */
public static void transcribeMultiLanguageGcs(String gcsUri) throws Exception {
  try (SpeechClient speechClient = SpeechClient.create()) {

    ArrayList<String> languageList = new ArrayList<>();
    languageList.add("es-ES");
    languageList.add("en-US");

    // Configure request to enable multiple languages
    RecognitionConfig config =
        RecognitionConfig.newBuilder()
            .setEncoding(AudioEncoding.LINEAR16)
            .setSampleRateHertz(16000)
            .setLanguageCode("ja-JP")
            .addAllAlternativeLanguageCodes(languageList)
            .build();

    // Set the remote path for the audio file
    RecognitionAudio audio = RecognitionAudio.newBuilder().setUri(gcsUri).build();

    // Use non-blocking call for getting file transcription
    OperationFuture<LongRunningRecognizeResponse, LongRunningRecognizeMetadata> response =
        speechClient.longRunningRecognizeAsync(config, audio);

    while (!response.isDone()) {
      System.out.println("Waiting for response...");
      Thread.sleep(10000);
    }

    for (SpeechRecognitionResult result : response.get().getResultsList()) {

      // There can be several alternative transcripts for a given chunk of speech. Just use the
      // first (most likely) one here.
      SpeechRecognitionAlternative alternative = result.getAlternativesList().get(0);

      // Print out the result
      System.out.printf("Transcript : %s\n\n", alternative.getTranscript());
    }
  }
}

Node.js

如需了解如何安装和使用 Speech-to-Text 客户端库，请参阅 Speech-to-Text 客户端库。如需了解详情，请参阅 Speech-to-Text Node.js API 参考文档。

如需向 Speech-to-Text 进行身份验证，请设置应用默认凭证。如需了解详情，请参阅为本地开发环境设置身份验证。

// Imports the Google Cloud client library
const speech = require('@google-cloud/speech').v1p1beta1;

// Creates a client
const client = new speech.SpeechClient();

/**
 * TODO(developer): Uncomment the following line before running the sample.
 */
// const uri = path to GCS audio file e.g. `gs:/bucket/audio.wav`;

const config = {
  encoding: 'LINEAR16',
  sampleRateHertz: 44100,
  languageCode: 'en-US',
  alternativeLanguageCodes: ['es-ES', 'en-US'],
};

const audio = {
  uri: gcsUri,
};

const request = {
  config: config,
  audio: audio,
};

const [operation] = await client.longRunningRecognize(request);
const [response] = await operation.promise();
const transcription = response.results
  .map(result => result.alternatives[0].transcript)
  .join('\n');
console.log(`Transcription: ${transcription}`);

Python

如需了解如何安装和使用 Speech-to-Text 客户端库，请参阅 Speech-to-Text 客户端库。如需了解详情，请参阅 Speech-to-Text Python API 参考文档。

如需向 Speech-to-Text 进行身份验证，请设置应用默认凭证。如需了解详情，请参阅为本地开发环境设置身份验证。


from google.cloud import speech_v1p1beta1 as speech


def transcribe_file_with_multilanguage_gcs(audio_uri: str) -> str:
    """Transcribe a remote audio file with multi-language recognition
    Args:
        audio_uri (str): The Google Cloud Storage path to an audio file.
            E.g., gs://[BUCKET]/[FILE]
    Returns:
        str: The generated transcript from the audio file provided.
    """

    client = speech.SpeechClient()

    first_language = "es-ES"
    alternate_languages = ["en-US", "fr-FR"]

    # Configure request to enable multiple languages
    recognition_config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
        sample_rate_hertz=44100,
        language_code=first_language,
        alternative_language_codes=alternate_languages,
    )

    # Set the remote path for the audio file
    audio = speech.RecognitionAudio(uri=audio_uri)

    # Use non-blocking call for getting file transcription
    response = client.long_running_recognize(
        config=recognition_config, audio=audio
    ).result(timeout=300)

    transcript_builder = []
    for i, result in enumerate(response.results):
        alternative = result.alternatives[0]
        transcript_builder.append("-" * 20 + "\n")
        transcript_builder.append(f"First alternative of result {i}: {alternative}")
        transcript_builder.append(f"Transcript: {alternative.transcript} \n")

    transcript = "".join(transcript_builder)
    print(transcript)

    return transcript

后续步骤

如需搜索和过滤其他 Google Cloud 产品的代码示例，请参阅Google Cloud 示例浏览器。