Media Translation (Beta)

Add real-time audio translation directly to your content and applications.


Scale quickly and globally with dynamic audio translation

Media Translation API delivers real-time speech translation to your content and applications directly from your audio data. Leveraging Google’s machine learning technologies, the API offers enhanced accuracy and simplified integration while equipping you with a comprehensive set of features to further refine your translation results. Improve user experience with low-latency streaming translation and scale quickly with straightforward internationalization.

What's new


Proven record of quality

Google Cloud’s translation and speech recognition technologies have been widely recognized for their quality, thanks to Google’s machine learning expertise. Bringing cutting-edge technologies together, Media Translation API provides you with state-of-the-art audio translation along with the features of our popular Translation API and Speech-to-Text API.


Seamless content translation

Translate content directly from your audio data. Media Translation API improves translation accuracy by optimizing the model integration from audio to text, and it removes the friction of chaining multiple API calls yourself. Simply make one API call, and Media Translation takes care of the rest.


Streaming translation at speed

Stream translation output as you supply audio from a microphone or a prerecorded audio file. Media Translation API minimizes the latency between input and translation results, improving the user experience and enabling real-time engagement across languages and regions.


Streaming translation

Real-time translation is available while audio streams in from a microphone or a prerecorded audio file, and the API optimizes the integration for reduced latency.
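To illustrate the client side of streaming, here is a minimal Python sketch of how prerecorded audio might be split into small chunks before being handed to a streaming request. It does not call the Media Translation API. The helper name iter_audio_chunks and the 4096-byte chunk size are assumptions made for this example, not values taken from the product documentation.

```python
import io
from typing import BinaryIO, Iterator


def iter_audio_chunks(stream: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield successive chunks of raw audio bytes until the stream is exhausted.

    Hypothetical helper: the chunk size is an illustrative assumption.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Stand-in for a prerecorded audio file: 10,000 bytes of silence.
fake_audio = io.BytesIO(b"\x00" * 10_000)
chunk_sizes = [len(chunk) for chunk in iter_audio_chunks(fake_audio)]
print(chunk_sizes)  # [4096, 4096, 1808]
```

Sending audio in small pieces, rather than as one large payload, is what allows translation results to start arriving before the whole input has been supplied.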

Automatic punctuation

The API accurately punctuates your translation results (e.g., commas, periods, question marks).

Enhanced models

Media Translation API comes with two enhanced models (video, phone call), so you can optimize accuracy for your specific audio use case.

Language support

Media Translation API supports 12 languages.

At OnePlus, we aim to share the best technology with the world, hand in hand with our users. One important feature for our product is face-to-face communication across countries, time zones, and even languages. With Google Cloud’s Media Translation API, we are now able to provide real-time streaming translation for video chat with a simple API integration and ensure our customers feel effortlessly connected with minimal latency.

Gary Chen, Head of Software Product, OnePlus



Media Translation API is billed monthly, based on the amount of audio successfully translated by the service and on the model used for translation. Usage is measured in 15-second increments, rounded up.
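As a rough illustration of the rounding rule, the Python sketch below computes the billable duration for a given amount of audio. The function name billable_seconds is hypothetical, and the sketch assumes the rounding applies to each measured duration passed in; see the pricing details for the actual rates and the exact unit that is rounded.

```python
import math

# Increment size stated in the pricing description above.
BILLING_INCREMENT_SECONDS = 15


def billable_seconds(audio_seconds: float) -> int:
    """Round an audio duration up to the next 15-second increment.

    Hypothetical helper for illustration only; it does not compute a price.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    increments = math.ceil(audio_seconds / BILLING_INCREMENT_SECONDS)
    return increments * BILLING_INCREMENT_SECONDS


print(billable_seconds(1))   # 15
print(billable_seconds(15))  # 15
print(billable_seconds(31))  # 45
```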

View pricing details

Take the next step

Start building on Google Cloud with $300 in free credits and 20+ always free products.


This product is in beta. For more information on our product launch stages, see here.

Cloud AI products comply with the SLA policies listed here. They may offer different latency or availability guarantees from other Google Cloud services.