Transcribe speech to text by using the gcloud CLI
This page shows you how to send a speech recognition request to Speech-to-Text using the gcloud tool from the command line.
Speech-to-Text enables easy integration of Google speech recognition technologies into developer applications. You can send audio data to the Speech-to-Text API, which then returns a text transcription of that audio file. For more information about the service, see Speech-to-Text basics.
Before you begin
Before you can send a request to the Speech-to-Text API, you must complete the following actions. See the before you begin page for details.
- Enable Speech-to-Text on a Google Cloud project.
- Make sure billing is enabled for Speech-to-Text.
- Install the Google Cloud CLI, then initialize it by running the following command:
  gcloud init
- (Optional) Create a new Google Cloud Storage bucket to store your audio data.
Make an audio transcription request
Now you can use Speech-to-Text to transcribe an audio file to text. Use the following command to send a recognize request to the Speech-to-Text API.
Open a command-line shell and run the following command:
gcloud ml speech recognize gs://cloud-samples-tests/speech/brooklyn.flac \
    --language-code=en-US
This command requests that Speech-to-Text transcribe the audio contained in a FLAC file hosted at a publicly accessible Cloud Storage location.
If the request is successful, the server returns a response in JSON format:
{
  "results": [
    {
      "alternatives": [
        {
          "confidence": 0.9840146,
          "transcript": "how old is the Brooklyn Bridge"
        }
      ]
    }
  ]
}
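If you only need the transcript text, one option is to filter the JSON response in the shell. The following is a minimal sketch, not an official method: it hard-codes the sample response from this page in a variable named response (an illustrative name) so that it runs without a network call, and it uses sed rather than a JSON parser, which assumes the transcript contains no escaped double quotes.

```shell
# Sample response copied from this page. In practice you could capture it with:
#   response="$(gcloud ml speech recognize gs://cloud-samples-tests/speech/brooklyn.flac --language-code=en-US)"
response='{ "results": [ { "alternatives": [ { "confidence": 0.9840146, "transcript": "how old is the Brooklyn Bridge" } ] } ] }'

# Print the value of the first "transcript" field.
# Assumption: the transcript text contains no escaped double quotes.
transcript="$(printf '%s\n' "$response" \
  | sed -n 's/.*"transcript": *"\([^"]*\)".*/\1/p' \
  | head -n 1)"

echo "$transcript"
```

For anything beyond a quick check, a real JSON parser is a safer choice than a regular expression.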
Congratulations! You've sent your first request to Speech-to-Text.
If you receive an error or an empty response from Speech-to-Text, take a look at the troubleshooting and error mitigation steps.
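Some errors can be caught before the request is sent. The sketch below wraps the command from this page in a shell function that checks the audio location first. The function name transcribe and its argument order are hypothetical, chosen for illustration. It assumes that gcloud ml speech recognize accepts either a Cloud Storage URI or a local file path, and it defaults the language code to en-US as in the example above.

```shell
# Hypothetical helper (not part of gcloud): validate the audio location,
# then send the same recognize request shown on this page.
transcribe() {
  audio="$1"
  language="${2:-en-US}"   # default language code, as in the example above

  case "$audio" in
    gs://*)
      # Cloud Storage URI: pass it through unchanged.
      ;;
    *)
      # Anything else is treated as a local path and must exist.
      if [ ! -f "$audio" ]; then
        echo "error: no such local file: $audio" >&2
        return 1
      fi
      ;;
  esac

  gcloud ml speech recognize "$audio" --language-code="$language"
}
```

With the function defined, running transcribe gs://cloud-samples-tests/speech/brooklyn.flac sends the same request as the command above, while a mistyped local path fails immediately with a non-zero exit status instead of reaching the API.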
Clean up
To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.
- Use the Google Cloud console to delete your project if you do not need it.
What's next
- Practice transcribing short audio files.
- Learn how to batch long audio files for speech recognition.
- Learn how to transcribe streaming audio, such as audio from a microphone.
- Get started with Speech-to-Text in your language of choice by using a Speech-to-Text client library.
- Work through the sample applications.
- For best performance, accuracy, and other tips, see the best practices documentation.