Google Cloud

Cloud Speech API is now generally available

April 18, 2017

Dan Aharon

Product Manager, Speech

Last summer, we launched an open beta for Cloud Speech API, our Automatic Speech Recognition (ASR) service. Since then, we’ve had thousands of customers help us improve the quality of service, and we’re proud to announce that as of today Cloud Speech API is now generally available.

Cloud Speech API is built on the core technology that powers speech recognition for other Google products (e.g., Google Search, Google Now, Google Assistant), but has been adapted to better fit the needs of Google Cloud customers. Cloud Speech API is one of several pre-trained machine-learning models available for common tasks like video analysis, image analysis, text analysis and dynamic translation.

With great feedback from customers and partners, we're happy to share that we have new features and performance to announce:

Improved transcription accuracy for long-form audio
Faster processing, typically 3x faster than the prior version for batch scenarios
Expanded file format support, now including WAV, Opus and Speex

Among early adopters of Cloud Speech API, we have seen two main use cases emerge: speech as a control method for applications and devices like voice search, voice commands and Interactive Voice Response (IVR); and also in speech analytics. Speech analytics opens up a hugely interesting set of capabilities around difficult problems e.g., real-time insights from call centers.

Houston, Texas based InteractiveTel is using Cloud Speech API in solutions that track, monitor and report on dealer-customer interactions by telephone.