Media Translation translates an audio file or a stream of speech into text in another language. This page provides code samples that demonstrate how to translate streaming audio into text using the Media Translation client libraries.
Set up your project
Before you can use Media Translation, you need to set up a Google Cloud project and enable the Media Translation API for that project.
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project.
- Enable the Media Translation API.
- Create a service account:
  - In the Google Cloud console, go to the Create service account page.
  - Select your project.
  - In the Service account name field, enter a name. The Google Cloud console fills in the Service account ID field based on this name.
  - In the Service account description field, enter a description. For example, Service account for quickstart.
  - Click Create and continue.
  - Grant the Project > Owner role to the service account. To grant the role, find the Select a role list, then select Project > Owner.
  - Click Continue.
  - Click Done to finish creating the service account. Do not close your browser window. You will use it in the next step.
- Create a service account key:
  - In the Google Cloud console, click the email address for the service account that you created.
  - Click Keys.
  - Click Add key, and then click Create new key.
  - Click Create. A JSON key file is downloaded to your computer.
  - Click Close.
- Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON file that contains your credentials. This variable applies only to your current shell session, so if you open a new session, set the variable again.
- Install the Google Cloud CLI.
- To initialize the gcloud CLI, run the following command:
  gcloud init
- Install the client library for your preferred language.
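Because GOOGLE_APPLICATION_CREDENTIALS applies only to the current shell session, it can help to confirm that it is set before sending any requests. The following is a minimal sketch using only the Python standard library. The helper name `credentials_path` is hypothetical and is not part of any Google Cloud library.

```python
import os
from pathlib import Path


def credentials_path(env=None):
    """Return the key file path named by GOOGLE_APPLICATION_CREDENTIALS.

    Hypothetical helper for illustration: raises RuntimeError if the variable
    is missing from the session or does not point to an existing file.
    """
    env = os.environ if env is None else env
    value = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set in this session")
    path = Path(value)
    if not path.is_file():
        raise RuntimeError(f"Credentials file not found: {path}")
    return path
```

Calling `credentials_path()` with no arguments reads the real environment. Passing a dictionary makes the check easy to exercise without changing the shell.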
Translate speech
The code samples below demonstrate how to translate speech from a file containing up to five minutes of audio or from a live microphone. See Best practices for recommendations about how to provide speech data for the best accuracy in recognition.
The main steps are the same regardless of the audio source:
- Initialize a SpeechTranslationServiceClient client to use for sending requests to Media Translation. You can reuse the same client for multiple requests.
- Create a StreamingTranslateSpeechConfig request object that specifies how to process the audio.
  The StreamingTranslateSpeechConfig object consists of a TranslateSpeechConfig object that provides information about the audio source file and a single_utterance property that specifies whether or not Media Translation continues translating when the speaker pauses.
  The TranslateSpeechConfig object provides technical specifications for the audio source (such as its encoding and sample rate), sets the source and target languages for the translation (using their BCP-47 language codes), and defines which translation model Media Translation uses for transcription.
- Send a sequence of StreamingTranslateSpeechRequest request objects.
  You send a sequence of requests for each audio file you want to translate. The first request provides the StreamingTranslateSpeechConfig object for the request, and the following requests provide the streamed audio content.
- Receive the StreamingTranslateSpeechResult response object.
  While any response with a text_translation_result.is_final value of false is received, the latest translated result overwrites the previous result.
  When Media Translation has a final result, the text_translation_result.is_final field is set to true, and any subsequently received translation result is appended to the previous result. (In this instance, the previous result is not overwritten.) You can output the completed translation and start a new section for the next portion of the transcription and corresponding audio.
  When the speaker has stopped, if the single_utterance field is set to true in the StreamingTranslateSpeechConfig request object, Media Translation returns an END_OF_SINGLE_UTTERANCE event for the speech_event_type in the response. The client stops sending requests but continues to receive responses until the translation is finished.
  Streaming has a five-minute limit. Exceeding this limit returns an OUT_OF_RANGE error.
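The overwrite-then-append behavior for interim and final results can be sketched without calling the service. The class below is a hypothetical helper, not part of the client library. It assumes that each response has already been reduced to a translated string and its is_final flag (in the client library these would presumably come from the text_translation_result of each response).

```python
class TranslationAssembler:
    """Hypothetical helper that accumulates streaming translation results."""

    def __init__(self):
        self.finalized = []  # completed segments, in arrival order
        self.interim = ""    # latest non-final result for the current segment

    def update(self, translation, is_final):
        if is_final:
            # A final result closes the current segment; later results append.
            self.finalized.append(translation)
            self.interim = ""
        else:
            # A non-final result overwrites the previous interim result.
            self.interim = translation

    def text(self):
        parts = self.finalized + ([self.interim] if self.interim else [])
        return " ".join(parts)


assembler = TranslationAssembler()
for translation, is_final in [
    ("Bon", False),
    ("Bonjour", False),
    ("Bonjour à tous", True),
    ("Merci", False),
]:
    assembler.update(translation, is_final)

print(assembler.text())  # Bonjour à tous Merci
```

Keeping finalized segments separate from the interim text lets a user interface redraw only the last, still-changing portion of the translation.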
Code samples
Translating speech from an audio file
Java
To learn how to install and use the client library for Media Translation, see Media Translation client libraries. For more information, see the Media Translation Java API reference documentation.
To authenticate to Media Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Media Translation, see Media Translation client libraries. For more information, see the Media Translation Node.js API reference documentation.
To authenticate to Media Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Media Translation, see Media Translation client libraries. For more information, see the Media Translation Python API reference documentation.
To authenticate to Media Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
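As a rough illustration of the request ordering for an audio file, the sketch below yields one configuration request followed by one request per audio chunk. It is not the official sample. `request_sequence`, `make_request`, and `CHUNK_SIZE` are hypothetical names, and the dictionary field names (`audio_config`, `audio_encoding`, `source_language_code`, `target_language_code`, `streaming_config`, `audio_content`) are assumptions about the request shape that you should check against the API reference.

```python
import io

CHUNK_SIZE = 4096  # bytes per audio request; an illustrative value, not a documented limit


def request_sequence(audio, streaming_config, make_request=dict, chunk_size=CHUNK_SIZE):
    """Yield the configuration request first, then one request per audio chunk.

    `audio` is any binary file-like object. `make_request` builds a request
    from keyword arguments; it defaults to `dict` so the sketch runs without
    the client library installed.
    """
    yield make_request(streaming_config=streaming_config)
    while True:
        chunk = audio.read(chunk_size)
        if not chunk:
            break
        yield make_request(audio_content=chunk)


# Stand-in configuration; field names are assumptions (see the lead-in).
config = {
    "audio_config": {
        "audio_encoding": "linear16",
        "source_language_code": "en-US",
        "target_language_code": "fr-FR",
    },
    "single_utterance": True,
}

requests = list(request_sequence(io.BytesIO(b"\x00" * 10000), config))
print(len(requests))  # 4: one configuration request plus three audio chunks
```

With the client library, `make_request` would presumably wrap StreamingTranslateSpeechRequest, and the generator would be passed to the client's streaming call instead of being collected into a list.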
Translating speech from a microphone
Java
To learn how to install and use the client library for Media Translation, see Media Translation client libraries. For more information, see the Media Translation Java API reference documentation.
To authenticate to Media Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Media Translation, see Media Translation client libraries. For more information, see the Media Translation Node.js API reference documentation.
To authenticate to Media Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Media Translation, see Media Translation client libraries. For more information, see the Media Translation Python API reference documentation.
To authenticate to Media Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
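Microphone capture libraries typically deliver audio buffers through a callback on a separate thread, while the streaming call consumes a generator of requests. The sketch below shows one way to bridge the two with a standard-library queue. It is not the official sample, and it does not open a microphone. `AudioBuffer` is a hypothetical class, and the capture callback that would call `put` is assumed to come from whichever audio library you use.

```python
import queue


class AudioBuffer:
    """Hypothetical thread-safe bridge from a capture callback to a generator."""

    def __init__(self):
        self._chunks = queue.Queue()

    def put(self, chunk):
        # Called by the (assumed) microphone callback for each captured buffer.
        self._chunks.put(chunk)

    def close(self):
        # A None sentinel tells the consumer that no more audio will arrive.
        self._chunks.put(None)

    def chunks(self):
        # Blocks until audio is available; ends when the sentinel is read.
        while True:
            chunk = self._chunks.get()
            if chunk is None:
                return
            yield chunk


buffer = AudioBuffer()
buffer.put(b"first")
buffer.put(b"second")
buffer.close()
print(list(buffer.chunks()))  # [b'first', b'second']
```

When single_utterance is set to true, calling `close()` on receipt of an END_OF_SINGLE_UTTERANCE event is one way to stop sending audio while the response stream continues to deliver the remaining results.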