Media Translation API는 지원 중단되었으며 2024년 7월 1일 이후에는 Google Cloud에서 더 이상 사용할 수 없습니다. Cloud Speech-to-Text 및 Cloud Translation API와 같은 다른 Google Cloud 서비스의 조합을 통해 Media Translation API의 기능을 복제할 수 있습니다.

스트리밍 오디오를 텍스트로 번역

Media Translation은 오디오 파일 또는 음성 스트림을 다른 언어의 텍스트로 번역합니다. 이 페이지에서는 Media Translation 클라이언트 라이브러리를 사용하여 스트리밍 오디오를 텍스트로 번역하는 방법을 보여주는 코드 샘플을 제공합니다.

프로젝트 설정

Media Translation을 사용하려면 먼저 Google Cloud 프로젝트를 설정하고 해당 프로젝트에 Media Translation API를 사용 설정해야 합니다.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

Go to project selector
Make sure that billing is enabled for your Google Cloud project.
Enable the Media Translation API.
Enable the API

Create a service account:
1. In the Google Cloud console, go to the Create service account page.
  Go to Create service account
2. Select your project.
3. In the Service account name field, enter a name. The Google Cloud console fills in the Service account ID field based on this name.
  
  In the Service account description field, enter a description. For example, Service account for quickstart.
4. Click Create and continue.
5. Grant the Project > Owner role to the service account.
  
  To grant the role, find the Select a role list, then select Project > Owner.
  
  Note: The Role field affects which resources the service account can access in your project. You can revoke these roles or grant additional roles later. In production environments, do not grant the Owner, Editor, or Viewer roles. Instead, grant a predefined role or custom role that meets your needs.
6. Click Continue.
7. Click Done to finish creating the service account.
  
  Do not close your browser window. You will use it in the next step.
Create a service account key:
1. In the Google Cloud console, click the email address for the service account that you created.
2. Click Keys.
3. Click Add key, and then click Create new key.
4. Click Create. A JSON key file is downloaded to your computer.
5. Click Close.
Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON file that contains your credentials. This variable applies only to your current shell session, so if you open a new session, set the variable again.
Example: Linux or macOS
```
export GOOGLE_APPLICATION_CREDENTIALS="KEY_PATH"
```
Replace KEY_PATH with the path of the JSON file that contains your credentials.

For example:
```
export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/service-account-file.json"
```
Example: Windows

For PowerShell:
```
$env:GOOGLE_APPLICATION_CREDENTIALS="KEY_PATH"
```
Replace KEY_PATH with the path of the JSON file that contains your credentials.

For example:
```
$env:GOOGLE_APPLICATION_CREDENTIALS="C:\Users\username\Downloads\service-account-file.json"
```
For command prompt:
```
set GOOGLE_APPLICATION_CREDENTIALS=KEY_PATH
```
Replace KEY_PATH with the path of the JSON file that contains your credentials.
Install the Google Cloud CLI.
To initialize the gcloud CLI, run the following command:
```
gcloud init
```
원하는 언어의 클라이언트 라이브러리를 설치합니다.

음성 번역

아래의 코드 샘플은 최대 5분 길이의 오디오 파일에 포함된 음성이나 마이크를 통해 말한 음성을 인식하여 번역하는 방법을 보여줍니다. 가장 정확도가 높은 음성 데이터를 제공하는 방법을 알아보려면 권장사항을 참조하세요.

기본 단계는 오디오 소스에 관계없이 동일합니다.

Media Translation에 요청을 보내는 데 사용할 SpeechTranslationServiceClient 클라이언트를 초기화합니다.

동일한 요청을 여러 요청에 재사용할 수 있습니다.
오디오 처리 방법을 지정하는 StreamingTranslateSpeechConfig 요청 객체를 만듭니다.

StreamingTranslateSpeechConfig 객체는 오디오 소스 파일에 대한 정보를 제공하는 TranslateSpeechConfig 객체, 그리고 화자가 말을 잠시 멈출 때 Media Translation에서 번역을 계속할지 여부를 지정하는 single_utterance 플래그로 구성됩니다.

TranslateSpeechConfig 객체는 인코딩 및 샘플링 레이트와 같은 오디오 소스의 기술 사양을 제공하고, BCP-47 언어 코드를 사용하여 번역의 출발어와 도착어를 설정하며, Media Translation에서 텍스트 변환에 사용하는 번역 모델을 정의합니다.

참고: 번역 모델 지정은 선택사항입니다. Media Translation은 대부분의 요청에 기본 모델을 사용합니다. 일부 언어 조합에는 전화 통화 또는 동영상에 최적화된 고급 모델을 사용할 수 있습니다. 사용 가능한 모델은 언어 지원 페이지를 참조하세요.
StreamingTranslateSpeechRequest 요청 객체의 시퀀스를 보냅니다.

번역할 오디오 파일마다 요청을 순서대로 전송합니다. 첫 번째 요청은 요청의 StreamingTranslateSpeechConfig 객체를 제공하고 후속 요청은 스트리밍의 오디오 콘텐츠를 제공합니다.
StreamingTranslateSpeechResult 응답 객체를 수신합니다.

text_translation_result.is_final 값이 false인 응답이 수신되는 동안 최신 번역 결과는 이전 결과를 덮어씁니다.

Media Translation에 최종 결과가 있는 경우 text_translation_result.is_final 필드가 true로 설정되고 이후에 수신되는 번역 결과가 이전 결과에 추가됩니다. (이 경우 이전 결과를 덮어쓰지 않습니다.) 완료된 번역을 출력하고 텍스트 변환 및 해당 오디오의 다음 부분은 새 섹션에서 시작할 수 있습니다.

화자가 말을 멈췄을 때 StreamingTranslateSpeechConfig 요청 객체에서 single_utterance 필드가 true로 설정되어 있으면 Media Translation은 응답에서 speech_event_type에 대한 END_OF_SINGLE_UTTERANCE 이벤트를 반환합니다. 클라이언트는 요청 전송을 중지하지만 번역이 완료될 때까지 계속 응답을 수신합니다.
스트리밍에는 5분 제한이 적용됩니다. 이 한도를 초과하면 OUT_OF_RANGE 오류가 반환됩니다.