Cloud Speech API is now generally available
Dan Aharon
Product Manager, Speech
Last summer, we launched an open beta for Cloud Speech API, our Automatic Speech Recognition (ASR) service. Since then, thousands of customers have helped us improve the quality of the service, and we’re proud to announce that Cloud Speech API is generally available as of today.
Cloud Speech API is built on the core technology that powers speech recognition for other Google products (e.g., Google Search, Google Now, Google Assistant), but has been adapted to better fit the needs of Google Cloud customers. Cloud Speech API is one of several pre-trained machine-learning models available for common tasks like video analysis, image analysis, text analysis and dynamic translation.
Thanks to great feedback from customers and partners, we're happy to announce new features and performance improvements:
- Improved transcription accuracy for long-form audio
- Faster processing, typically 3x faster than the prior version for batch scenarios
- Expanded file format support, now including WAV, Opus and Speex
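For readers who want to try the formats listed above, here is a minimal sketch of how a synchronous recognition request body might be assembled. It assumes the v1 REST endpoint and the `config`/`audio` field names from the public API reference, and it assumes 16-bit PCM WAV audio maps to the `LINEAR16` encoding. The helper name `build_recognize_request` is hypothetical, and the sketch only builds the JSON payload; it does not authenticate or send anything.

```python
import base64
import json

# Assumed endpoint for synchronous recognition in the v1 REST API.
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes, encoding="LINEAR16",
                            sample_rate_hertz=16000, language_code="en-US"):
    """Return a JSON-serializable request body (hypothetical helper).

    The audio is base64-encoded inline; the config values are assumptions
    and should match how the audio was actually recorded.
    """
    return {
        "config": {
            "encoding": encoding,
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        "audio": {
            "content": base64.b64encode(audio_bytes).decode("ascii"),
        },
    }


if __name__ == "__main__":
    # Placeholder bytes standing in for real audio samples.
    placeholder_audio = b"\x00\x01" * 8
    body = build_recognize_request(placeholder_audio)
    print(json.dumps(body, indent=2))
```

The resulting body would be sent as an authenticated POST to `RECOGNIZE_URL`. Check the current API reference for the exact encoding names of the other formats before relying on them.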
Among early adopters of Cloud Speech API, we have seen two main use cases emerge: speech as a control method for applications and devices (voice search, voice commands and Interactive Voice Response, or IVR), and speech analytics. Speech analytics opens up an interesting set of capabilities for difficult problems, e.g., real-time insights from call centers.
Houston, Texas-based InteractiveTel is using Cloud Speech API in solutions that track, monitor and report on dealer-customer interactions by telephone.
“Google Cloud Speech API performs highly accurate speech-to-text transcription in near-real-time. The higher accuracy rates mean we can help dealers get the most out of phone interactions with their customers and increase sales.”
— Gary Graves, CTO and Co-Founder, InterActiveTel
Cloud Speech API is available today.