- Automatic Speech Recognition
- Automatic Speech Recognition (ASR), powered by deep learning neural networks, for applications such as voice search and speech transcription.
- Global Vocabulary
- Recognizes over 110 languages and variants with an extensive vocabulary.
- Streaming Recognition
- Returns recognition results while the user is still speaking.
- Word Hints
- Speech recognition can be customized to a specific context by providing a set of words and phrases that are likely to be spoken. This is especially useful for adding custom words and names to the vocabulary and for voice-control use cases.
- Real-time or Pre-recorded Audio Support
- Audio input can be captured by an application’s microphone or sent from a pre-recorded audio file. Multiple audio encodings are supported, including FLAC, AMR, PCMU and Linear-16.
- Noise Robustness
- Handles noisy audio from many environments without requiring additional noise cancellation.
- Inappropriate Content Filtering
- Filters inappropriate content in text results for some languages.
- Integrated API
- Audio can be uploaded directly in the request or referenced from a file stored in Google Cloud Storage.
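Several of the features above (language selection, word hints, content filtering, audio encoding, and inline versus Cloud Storage audio) come together in a single recognition request. The following is a minimal sketch of how such a request body might be assembled. It assumes the field names of the v1 REST API (`encoding`, `sampleRateHertz`, `languageCode`, `speechContexts`, `profanityFilter`, `content`, `uri`), which should be checked against the current API reference. The helper `build_recognize_request` and the bucket path are hypothetical and are not part of any Google library.

```python
import base64
import json


def build_recognize_request(language_code, encoding, sample_rate_hertz,
                            audio_bytes=None, gcs_uri=None,
                            phrases=(), profanity_filter=False):
    """Assemble a JSON-serializable body for a speech recognition request.

    Hypothetical helper: field names are assumed from the v1 REST API.
    """
    # Exactly one audio source is allowed: inline bytes or a Cloud Storage URI.
    if (audio_bytes is None) == (gcs_uri is None):
        raise ValueError("provide exactly one of audio_bytes or gcs_uri")

    config = {
        "encoding": encoding,                  # e.g. "FLAC" or "LINEAR16"
        "sampleRateHertz": sample_rate_hertz,
        "languageCode": language_code,         # e.g. "en-US"
        "profanityFilter": profanity_filter,   # inappropriate content filtering
    }
    if phrases:
        # Word hints: words and phrases likely to be spoken in this context.
        config["speechContexts"] = [{"phrases": list(phrases)}]

    if gcs_uri is not None:
        audio = {"uri": gcs_uri}
    else:
        # Inline audio is sent as base64-encoded text inside the JSON body.
        audio = {"content": base64.b64encode(audio_bytes).decode("ascii")}

    return {"config": config, "audio": audio}


request_body = build_recognize_request(
    language_code="en-US",
    encoding="FLAC",
    sample_rate_hertz=16000,
    gcs_uri="gs://example-bucket/meeting.flac",  # placeholder bucket and object
    phrases=["Kubernetes", "turn on the lights"],
    profanity_filter=True,
)
print(json.dumps(request_body, indent=2))
```

The sketch only builds the payload; sending it requires an authenticated HTTP call or an official client library, which is left out here.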
|Monthly Usage|Price Per 15 Seconds*|
|0 - 60 minutes|Free|
|61 - 1,000,000 minutes**|$0.006|
* This pricing is for applications on personal systems (e.g., phones, tablets, laptops, desktops). Please contact us for approval and pricing to use the Speech API on embedded devices (e.g., cars, TVs, appliances, or speakers).
** Monthly usage is capped at 1 million minutes.
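To make the pricing table concrete, here is a small sketch that estimates a monthly bill from total minutes of audio. It assumes usage is counted in whole minutes, that the first 60 minutes are free, and that every remaining minute is billed as four 15-second increments at $0.006 each. How individual requests are rounded is not stated above, so per-request rounding is not modeled. The function name `estimate_monthly_cost` is illustrative only.

```python
from decimal import Decimal

FREE_MINUTES = 60                       # free tier from the pricing table
MONTHLY_CAP_MINUTES = 1_000_000         # usage cap from the second footnote
PRICE_PER_INCREMENT = Decimal("0.006")  # USD per 15 seconds
INCREMENTS_PER_MINUTE = 4               # 60 seconds / 15 seconds


def estimate_monthly_cost(minutes_used):
    """Return the estimated monthly cost in USD for whole minutes of audio.

    Assumption: no per-request rounding; only the monthly total is priced.
    """
    if minutes_used < 0:
        raise ValueError("minutes_used must be non-negative")
    if minutes_used > MONTHLY_CAP_MINUTES:
        raise ValueError("monthly usage is capped at 1,000,000 minutes")

    billable_minutes = max(0, minutes_used - FREE_MINUTES)
    increments = billable_minutes * INCREMENTS_PER_MINUTE
    return increments * PRICE_PER_INCREMENT


for minutes in (45, 1_000, 1_000_000):
    print(minutes, "minutes ->", estimate_monthly_cost(minutes), "USD")
```

Under these assumptions, 1,000 minutes leaves 940 billable minutes, or 3,760 increments, for an estimated $22.56. `Decimal` is used so that the sub-cent unit price does not accumulate floating-point error.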