Release Notes

This page documents production updates to Cloud Speech API. You can periodically check this page for announcements about new or updated features, bug fixes, known issues, and deprecated functionality.

January 16, 2018

Support for the OGG_OPUS audio encoding has been expanded to include sample rates of 8000 Hz, 12000 Hz, 16000 Hz, 24000 Hz, and 48000 Hz.
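
As an illustration, a request for OGG_OPUS audio sampled at 16000 Hz might be configured as in the following sketch. The snippet uses the google-cloud-speech Python client library, which is not part of this note; the bucket URI and language code are illustrative.

    from google.cloud import speech

    client = speech.SpeechClient()

    # OGG_OPUS audio sampled at 16000 Hz, one of the newly supported rates.
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.OGG_OPUS,
        sample_rate_hertz=16000,
        language_code="en-US",
    )
    audio = speech.RecognitionAudio(uri="gs://my-bucket/audio.ogg")  # illustrative URI

    response = client.recognize(config=config, audio=audio)
    for result in response.results:
        print(result.alternatives[0].transcript)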

August 10, 2017

Time offsets (timestamps) are now available. Set the enableWordTimeOffsets parameter to true in your request configuration, and the Cloud Speech API will include time offset values for the beginning and end of each spoken word that is recognized in the audio for your request. For more information, see Time offsets (timestamps).
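
The following sketch shows one way to request and read time offsets with the google-cloud-speech Python client library (a recent 2.x version is assumed; the client uses the snake_case field name enable_word_time_offsets, and the audio URI is illustrative).

    from google.cloud import speech

    client = speech.SpeechClient()

    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
        enable_word_time_offsets=True,  # REST field: enableWordTimeOffsets
    )
    audio = speech.RecognitionAudio(uri="gs://my-bucket/audio.raw")  # illustrative URI

    response = client.recognize(config=config, audio=audio)
    for result in response.results:
        for word in result.alternatives[0].words:
            # start_time and end_time are offsets from the start of the audio.
            print(word.word, word.start_time.total_seconds(), word.end_time.total_seconds())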

The Cloud Speech API has added recognition support for 30 new languages. For a complete list of all supported languages, see Language Support.

The limit on the length of audio that you can send with an asynchronous recognition request has been increased from ~80 minutes to ~180 minutes. For information on Cloud Speech API limits, see Quotas & Limits. For information on asynchronous recognition requests, see LongRunningRecognize.
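
Long audio is typically referenced from Cloud Storage and sent with an asynchronous (long-running) request, roughly as in this sketch with the google-cloud-speech Python client library; the bucket URI and timeout are illustrative.

    from google.cloud import speech

    client = speech.SpeechClient()

    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
        language_code="en-US",
    )
    # Audio up to roughly 180 minutes should be referenced from Cloud Storage.
    audio = speech.RecognitionAudio(uri="gs://my-bucket/long-recording.flac")  # illustrative URI

    operation = client.long_running_recognize(config=config, audio=audio)
    response = operation.result(timeout=3600)  # wait for the operation; adjust as needed

    for result in response.results:
        print(result.alternatives[0].transcript)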

April 18, 2017

Release of the Cloud Speech API v1.

The v1beta1 release of the Cloud Speech API has been deprecated. v1beta1 continues to be available for a period of time as defined in the terms of service. To avoid being impacted when v1beta1 is discontinued, replace references to v1beta1 in your code with v1 and update your code to use valid v1 API names and values.

A language_code is now required with requests to the Cloud Speech API. Requests with a missing or invalid language_code will return an error. (Pre-release versions of the API used en-US if the language_code was omitted from the request.)
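
A minimal v1 request therefore always carries a language code, as in this sketch with the google-cloud-speech Python client library (the language code and audio URI are illustrative).

    from google.cloud import speech

    client = speech.SpeechClient()

    # language_code is required in v1; omitting it causes the request to fail.
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
    )
    audio = speech.RecognitionAudio(uri="gs://my-bucket/audio.raw")  # illustrative URI

    response = client.recognize(config=config, audio=audio)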

SyncRecognize is renamed to Recognize. v1beta1/speech:syncrecognize is renamed to v1/speech:recognize. The behavior is unchanged.

AsyncRecognize is renamed to LongRunningRecognize. v1beta1/speech:asyncrecognize is renamed to v1/speech:longrunningrecognize. The behavior is unchanged except that the LongRunningRecognize method now supports all of the AudioEncoding enum values. (Pre-release versions only supported the LINEAR16 audio encoding.)

The sample_rate field has been renamed to sample_rate_hertz. The behavior is unchanged.

The EndpointerType enum has been renamed to SpeechEventType.

The following SpeechEventType values have been removed.

  • START_OF_SPEECH
  • END_OF_SPEECH
  • END_OF_AUDIO

The END_OF_UTTERANCE value has been renamed to END_OF_SINGLE_UTTERANCE. The behavior is unchanged.

The result_index field has been removed.

The speech_context field has been replaced by the speech_contexts field, which is a repeated field. However, you can specify at most one speech context. The behavior is unchanged.
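
For example, a single speech context is now passed as the only element of the repeated field, as in this configuration sketch with the google-cloud-speech Python client library (the phrases are illustrative).

    from google.cloud import speech

    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
        # speech_contexts is repeated, but at most one context may be specified.
        speech_contexts=[speech.SpeechContext(phrases=["weather", "forecast"])],
    )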

The SPEEX_WITH_HEADER_BYTE and OGG_OPUS codecs have been added to support audio encoder implementations for legacy applications. We do not recommend using lossy codecs, as they result in a lower-quality speech transcription. If you must use a low-bitrate encoder, OGG_OPUS is preferred.

You are no longer required to specify the encoding and sample rate for WAV or FLAC files. If omitted, the Cloud Speech API automatically determines the encoding and sample rate for WAV or FLAC files based on the file header. If you specify an encoding or sample rate value that does not match the value in the file header, then the Cloud Speech API will return an error. This change is backwards-compatible and will not invalidate any currently valid requests.
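
For example, a FLAC file can now be recognized with only a language code in the configuration; the encoding and sample rate are read from the file header. The sketch below uses the google-cloud-speech Python client library, and the audio URI is illustrative.

    from google.cloud import speech

    client = speech.SpeechClient()

    # encoding and sample_rate_hertz are omitted; they are read from the FLAC header.
    config = speech.RecognitionConfig(language_code="en-US")
    audio = speech.RecognitionAudio(uri="gs://my-bucket/audio.flac")  # illustrative URI

    response = client.recognize(config=config, audio=audio)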
