This section demonstrates how to transcribe streaming audio, like the input from a microphone, to text.
Streaming speech recognition allows you to stream audio to Speech-to-Text and receive a stream of speech recognition results in real time as the audio is processed. See also the audio limits for streaming speech recognition requests. Streaming speech recognition is available via gRPC only.
For more information about recognizers and sending recognition requests, see the reference documentation.
Perform streaming speech recognition on a local file
Below is an example of performing streaming speech recognition on a local audio file. There is a 10 MB limit on all streaming requests sent to the API. This limit applies to both the initial StreamingRecognize request and the size of each individual message in the stream. Exceeding this limit results in an error.
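One way to stay under the per-message limit is to split the audio into fixed-size chunks before sending it. The following is a minimal sketch in plain Python, not part of the client library: the helper name `iter_audio_chunks`, the default chunk size of 8 KiB, and the reading of "10 MB" as 10,000,000 bytes are all assumptions made for illustration.

```python
from typing import Iterator

# Limit stated in the text above, read conservatively as decimal megabytes
# (assumption: the documentation does not say whether MB is decimal or binary).
STREAM_LIMIT_BYTES = 10_000_000


def iter_audio_chunks(audio: bytes, chunk_bytes: int = 8 * 1024) -> Iterator[bytes]:
    """Yield consecutive slices of `audio`, each at most `chunk_bytes` long.

    Hypothetical helper: the default chunk size is an arbitrary value chosen
    to sit far below the stated limit.
    """
    if not 0 < chunk_bytes < STREAM_LIMIT_BYTES:
        raise ValueError("chunk_bytes must be positive and below the stream limit")
    for start in range(0, len(audio), chunk_bytes):
        yield audio[start:start + chunk_bytes]
```

Each yielded chunk can then be placed in its own message in the stream. The final chunk may be shorter than the others.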
Python
To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
While you can stream a local audio file to the Speech-to-Text API, it is recommended that you perform synchronous audio recognition.
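The following sketch shows roughly what streaming a local file can look like with the `google-cloud-speech` Python client. It assumes the v2 API (`google.cloud.speech_v2`), the default recognizer path `recognizers/_` in the `global` location, the `long` model, and US English audio. The function name `transcribe_streaming`, its parameters, and the 8 KiB chunk size are illustrative choices, not values taken from this page.

```python
from typing import List


def transcribe_streaming(project_id: str, audio_path: str, chunk_bytes: int = 8 * 1024) -> List[str]:
    """Stream a local audio file and return the transcripts received.

    Illustrative sketch under the assumptions listed in the paragraph above.
    """
    # Imported here so this module loads even if the client library is absent.
    from google.cloud.speech_v2 import SpeechClient
    from google.cloud.speech_v2.types import cloud_speech

    client = SpeechClient()  # picks up Application Default Credentials

    recognition_config = cloud_speech.RecognitionConfig(
        auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
        language_codes=["en-US"],  # assumption: US English audio
        model="long",              # assumption: general long-form model
    )
    streaming_config = cloud_speech.StreamingRecognitionConfig(config=recognition_config)

    def requests():
        # The first message carries only the recognizer and configuration.
        yield cloud_speech.StreamingRecognizeRequest(
            recognizer=f"projects/{project_id}/locations/global/recognizers/_",
            streaming_config=streaming_config,
        )
        # Every later message carries one chunk of audio bytes.
        with open(audio_path, "rb") as audio_file:
            while chunk := audio_file.read(chunk_bytes):
                yield cloud_speech.StreamingRecognizeRequest(audio=chunk)

    transcripts = []
    for response in client.streaming_recognize(requests=requests()):
        for result in response.results:
            if result.alternatives:
                transcripts.append(result.alternatives[0].transcript)
    return transcripts
```

A call such as `transcribe_streaming("my-project", "audio.wav")` (both arguments are placeholders) would return one transcript string per result received from the stream.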