Automatically detect language


This page describes how to set up a recognizer to automatically recognize the language spoken in an audio file, from a preset list of potential languages.

In some situations, you don't know for certain which language your audio recordings contain. For example, if you publish your service, app, or product in a country with multiple official languages, you can receive audio input from users in a variety of languages. This makes it difficult to specify a single language code for transcription requests.

Multiple language recognition

Speech-to-Text lets you specify a set of languages that your audio data might contain. When creating a Recognizer, you can list one or more candidate languages in the languageCodes field. When you then use the Recognizer in a transcription request, Speech-to-Text attempts to transcribe the audio using the best-fit language from the list you provided. Speech-to-Text then labels the transcription results with the predicted language code.
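As a rough illustration of how labeled results might be consumed, the following Python sketch groups transcripts by the predicted language. It assumes a JSON-style response in which each result carries a languageCode field and a list of alternatives with a transcript. The helper name transcripts_by_language, the "und" fallback, and the sample response are hypothetical and are not taken from a real API call.

```python
from collections import defaultdict


def transcripts_by_language(response):
    """Group the top transcript of each result by its predicted language code."""
    grouped = defaultdict(list)
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            # Skip results that carry no transcription hypothesis.
            continue
        # "und" (undetermined) is an assumed fallback for a missing label.
        language = result.get("languageCode", "und")
        grouped[language].append(alternatives[0]["transcript"])
    return dict(grouped)


# Hypothetical response, shaped after the description above.
sample_response = {
    "results": [
        {
            "alternatives": [{"transcript": "enciende las luces"}],
            "languageCode": "es-ES",
        },
        {
            "alternatives": [{"transcript": "turn on the lights"}],
            "languageCode": "en-US",
        },
    ]
}

for language, transcripts in transcripts_by_language(sample_response).items():
    print(language, transcripts)
```

Only the first alternative of each result is kept here, on the assumption that alternatives are ordered from most to least likely.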

This feature is ideal for apps that need to transcribe short statements like voice commands or search. You can list up to four languages for automatic language recognition.

Enable language recognition in audio transcription requests

Specifying multiple languages in your Recognizer resource works the same way as specifying a single language: add the additional language codes to the languageCodes field. Speech-to-Text supports alternative language codes for all speech recognition methods: Recognize and StreamingRecognize.
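The following Python sketch shows one way to assemble the languageCodes list before creating a Recognizer, enforcing the four-language limit stated on this page. The helper build_recognizer_body and the constant MAX_LANGUAGES are hypothetical names. The returned dictionary only illustrates the languageCodes field and omits any other fields a real Recognizer request might need.

```python
import json

MAX_LANGUAGES = 4  # limit stated on this page


def build_recognizer_body(language_codes):
    """Return a Recognizer-style body with a validated languageCodes list."""
    # Drop duplicates while keeping the caller's order.
    unique_codes = list(dict.fromkeys(language_codes))
    if not unique_codes:
        raise ValueError("at least one language code is required")
    if len(unique_codes) > MAX_LANGUAGES:
        raise ValueError(
            f"at most {MAX_LANGUAGES} language codes are allowed, "
            f"got {len(unique_codes)}"
        )
    # Only the languageCodes field is shown; other fields are out of scope.
    return {"languageCodes": unique_codes}


# Single language and multiple languages use the same field.
print(json.dumps(build_recognizer_body(["en-US"])))
print(json.dumps(build_recognizer_body(["en-US", "fr-FR", "es-ES"])))
```

Validating the list on the client side is a design choice for this sketch; it surfaces an oversized list before any request is sent.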