The Media Translation API is deprecated and will no longer be available on Google Cloud after July 1, 2024. You can replicate its functionality by combining other Google Cloud services, such as Cloud Speech-to-Text and the Cloud Translation API.

Translating streaming audio into text

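Media Translation translates an audio file or a stream of speech into text in another language. The code samples on this page show how to translate streaming audio into text by using the Media Translation client libraries.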

Set up your project

Before you can use Media Translation, you must set up a Google Cloud project and enable the Media Translation API for that project.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

Make sure that billing is enabled for your Google Cloud project.
Enable the Media Translation API.

Create a service account:
1. In the Google Cloud console, go to the Create service account page.
2. Select your project.
3. In the Service account name field, enter a name. The Google Cloud console fills in the Service account ID field based on this name.
  
  In the Service account description field, enter a description. For example, Service account for quickstart.
4. Click Create and continue.
5. Grant the Project > Owner role to the service account.
  
  To grant the role, find the Select a role list, then select Project > Owner.
  
  Note: The Role field affects which resources the service account can access in your project. You can revoke these roles or grant additional roles later. In production environments, do not grant the Owner, Editor, or Viewer roles. Instead, grant a predefined role or custom role that meets your needs.
6. Click Continue.
7. Click Done to finish creating the service account.
  
  Do not close your browser window. You will use it in the next step.
Create a service account key:
1. In the Google Cloud console, click the email address for the service account that you created.
2. Click Keys.
3. Click Add key, and then click Create new key.
4. Click Create. A JSON key file is downloaded to your computer.
5. Click Close.
Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON file that contains your credentials. This variable applies only to your current shell session, so if you open a new session, set the variable again.
Example: Linux or macOS
```
export GOOGLE_APPLICATION_CREDENTIALS="KEY_PATH"
```
Replace KEY_PATH with the path of the JSON file that contains your credentials.

For example:
```
export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/service-account-file.json"
```
Example: Windows

For PowerShell:
```
$env:GOOGLE_APPLICATION_CREDENTIALS="KEY_PATH"
```
Replace KEY_PATH with the path of the JSON file that contains your credentials.

For example:
```
$env:GOOGLE_APPLICATION_CREDENTIALS="C:\Users\username\Downloads\service-account-file.json"
```
For command prompt:
```
set GOOGLE_APPLICATION_CREDENTIALS=KEY_PATH
```
Replace KEY_PATH with the path of the JSON file that contains your credentials.
Install the Google Cloud CLI.
To initialize the gcloud CLI, run the following command:
```
gcloud init
```
Install the client library for your preferred language.
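
Before calling a client library, it can help to confirm that GOOGLE_APPLICATION_CREDENTIALS points at a readable service account key. The following is a minimal Python sketch that uses only the standard library. The function name check_credentials_env is hypothetical and is not part of any Google library, and the check assumes that the key file is a JSON object whose type field is service_account, as in keys downloaded from the console.

```python
import json
import os


def check_credentials_env(environ=os.environ):
    """Return the key file path if GOOGLE_APPLICATION_CREDENTIALS looks usable.

    Hypothetical helper for illustration; raises RuntimeError otherwise.
    """
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise RuntimeError(f"No file found at {path}")
    try:
        with open(path, encoding="utf-8") as key_file:
            key = json.load(key_file)
    except json.JSONDecodeError as exc:
        raise RuntimeError(f"{path} is not valid JSON") from exc
    # Assumption: service account keys carry "type": "service_account".
    if not isinstance(key, dict) or key.get("type") != "service_account":
        raise RuntimeError(f"{path} does not look like a service account key")
    return path
```

Calling check_credentials_env() with no arguments inspects the current process environment, so it reflects whatever was exported in the shell session that started the script.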

Translate speech

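The following code samples show how to translate speech from a file that contains up to five minutes of audio or from a live microphone. For recommendations on how to provide speech data for the best recognition accuracy, see Best practices.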

Regardless of the audio source, the main steps are the same:

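1. Initialize a SpeechTranslationServiceClient client to send requests to Media Translation.

  You can reuse the same client for multiple requests.
2. Create a StreamingTranslateSpeechConfig request object to specify how the audio is processed.

  The StreamingTranslateSpeechConfig object contains a TranslateSpeechConfig object, which provides information about the audio source, and a single_utterance flag, which specifies whether Media Translation continues translating after the speaker pauses.

  The TranslateSpeechConfig object gives the technical specifications of the audio source (such as its encoding and sample rate), sets the source and target languages for translation (as BCP-47 language codes), and defines which translation model Media Translation uses for the transcription.

  Note: Specifying a translation model is optional. Media Translation uses a basic default model for most requests. For some language pairs, enhanced models optimized for phone calls or video are available. For the available models, see the language support page.
3. Send a sequence of StreamingTranslateSpeechRequest request objects.

  You send one sequence of requests for each audio file to translate. The first request provides the StreamingTranslateSpeechConfig object, and subsequent requests stream the audio content.
4. Receive StreamingTranslateSpeechResult response objects.

  While text_translation_result.is_final is false, each new translation result overwrites the previous one.

  When Media Translation produces a final result, text_translation_result.is_final is set to true, and translation results received afterward are appended to that result rather than overwriting it. You can output the completed translation and start a new section for the next part of the transcript and its audio.

  If the single_utterance field of the StreamingTranslateSpeechConfig request object is set to true, then when the speaker stops speaking, Media Translation returns END_OF_SINGLE_UTTERANCE in the speech_event_type field of the response. The client stops sending requests but continues to receive responses until the translation finishes.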
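
The result handling described above can be sketched in Python as follows. This is an illustration only, not the client library's API: FakeResponse and Transcript are hypothetical stand-ins written with the standard library, and their attributes are flattened versions of the translation text, is_final, and speech_event_type fields mentioned in the text. Non-final results replace the interim text, a final result closes a section, and END_OF_SINGLE_UTTERANCE clears a flag that a sender could check before streaming more audio.

```python
from dataclasses import dataclass, field

END_OF_SINGLE_UTTERANCE = "END_OF_SINGLE_UTTERANCE"


@dataclass
class FakeResponse:
    """Stand-in for a streaming response; not the real client library type."""
    translation: str = ""
    is_final: bool = False
    speech_event_type: str = ""


@dataclass
class Transcript:
    finalized: list = field(default_factory=list)  # completed sections
    interim: str = ""                              # latest non-final result
    keep_sending: bool = True                      # cleared at end of utterance

    def update(self, response):
        if response.speech_event_type == END_OF_SINGLE_UTTERANCE:
            # Stop sending audio, but keep reading responses.
            self.keep_sending = False
        if not response.translation:
            return
        if response.is_final:
            # A final result closes the section; later results start a new one.
            self.finalized.append(response.translation)
            self.interim = ""
        else:
            # A non-final result overwrites the previous interim text.
            self.interim = response.translation

    def text(self):
        parts = self.finalized + ([self.interim] if self.interim else [])
        return " ".join(parts)


# Simulated response sequence (made-up data for illustration).
transcript = Transcript()
for response in [
    FakeResponse("Bon"),
    FakeResponse("Bonjour"),
    FakeResponse("Bonjour tout le monde", is_final=True),
    FakeResponse("Comment"),
    FakeResponse(speech_event_type=END_OF_SINGLE_UTTERANCE),
    FakeResponse("Comment allez-vous", is_final=True),
]:
    transcript.update(response)

print(transcript.text())  # Bonjour tout le monde Comment allez-vous
```

Keeping the interim text separate from the finalized sections means that a display can redraw only the last line while a result is still changing.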
Streaming is limited to 5 minutes. If you exceed this limit, an OUT_OF_RANGE error is returned.
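
One way to avoid hitting this limit is to check the audio duration on the client while building the request sequence. The Python sketch below sends a configuration request first and then audio chunks, as described in the steps above, and raises an error once the audio would pass five minutes. It makes several assumptions: requests are plain dicts standing in for StreamingTranslateSpeechRequest objects, the audio is uncompressed mono PCM so that duration can be derived from the byte count, and the function name request_stream and its parameters are hypothetical.

```python
import io

MAX_STREAM_SECONDS = 5 * 60  # streaming limit stated in the text


def request_stream(audio, config, sample_rate_hertz, bytes_per_sample=2,
                   chunk_size=4096):
    """Yield one config request, then audio chunk requests.

    Plain dicts are used as stand-ins for the real request messages.
    Assumes uncompressed mono PCM audio read from a binary file object.
    """
    max_bytes = MAX_STREAM_SECONDS * sample_rate_hertz * bytes_per_sample
    # The first request carries only the streaming configuration.
    yield {"streaming_config": config}
    sent = 0
    while True:
        chunk = audio.read(chunk_size)
        if not chunk:
            break
        sent += len(chunk)
        if sent > max_bytes:
            raise ValueError("audio exceeds the 5 minute streaming limit")
        # Every later request carries only audio content.
        yield {"audio_content": chunk}


# One second of 16 kHz, 16-bit silence as made-up input data.
audio = io.BytesIO(bytes(16000 * 2))
requests = list(request_stream(audio, {"single_utterance": True}, 16000))
print(len(requests))  # 9: one config request plus eight audio chunks
```

Failing locally with a clear message is a design choice for this sketch; a real client might instead close the stream and open a new one for the remaining audio.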