This page shows you how to send a speech recognition request to Speech-to-Text using the REST interface and the curl command.
Speech-to-Text enables easy integration of Google speech recognition technologies into developer applications. You can send audio data to the Speech-to-Text API, which then returns a text transcription of that audio file. For more information about the service, see Speech-to-Text basics.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- Set up a Cloud Console project:
  - Create or select a project.
  - Enable the Speech-to-Text API for that project.
  - Create a service account.
  - Download a private key as JSON.
  You can view and manage these resources at any time in the Cloud Console.
- Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON file that contains your service account key. This variable only applies to your current shell session, so if you open a new session, set the variable again.
- Install and initialize the Cloud SDK.
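Before sending any requests, it can help to confirm that the credentials variable is visible to the current process and points at a readable JSON file. The following Python sketch does only that. The helper name `check_credentials` is hypothetical and is not part of any Google library; it inspects the environment and the file, nothing more.

```python
import json
import os


def check_credentials(env=os.environ):
    """Return the key file path if the variable is set and the file holds valid JSON.

    `check_credentials` is a hypothetical helper used for illustration only.
    """
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set in this session")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No key file found at {path}")
    with open(path, encoding="utf-8") as key_file:
        json.load(key_file)  # raises ValueError if the file is not valid JSON
    return path
```

Calling `check_credentials()` in a new shell session where the variable has not been set again raises `RuntimeError`, which matches the note about the variable applying only to the current session.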
Make an audio transcription request
Now you can use Speech-to-Text to transcribe an audio file to text. Use the following code sample to send a recognize REST request to the Speech-to-Text API.
- Create a JSON request file with the following text, and save it as a sync-request.json plain text file:

    {
      "config": {
        "encoding": "FLAC",
        "sampleRateHertz": 16000,
        "languageCode": "en-US",
        "enableWordTimeOffsets": false
      },
      "audio": {
        "uri": "gs://cloud-samples-tests/speech/brooklyn.flac"
      }
    }

This JSON snippet indicates that the audio file has a FLAC encoding format, a sample rate of 16000 Hz, and that the audio file is stored on Google Cloud Storage at the given URI. The audio file is publicly accessible, so you don't need authentication credentials to access the file.
- Use curl to make a speech:recognize request, passing it the filename of the JSON request you set up in step 1:

The sample curl command uses the gcloud auth application-default print-access-token command to get an authentication token.

    curl -s -H "Content-Type: application/json" \
        -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
        https://speech.googleapis.com/v1/speech:recognize \
        -d @sync-request.json

Note that to pass a filename to curl you use the -d option (for "data") and precede the filename with an @ sign. This file should be in the same directory in which you execute the curl command.

You should see a response similar to the following:

    {
      "results": [
        {
          "alternatives": [
            {
              "transcript": "how old is the Brooklyn Bridge",
              "confidence": 0.98267895
            }
          ]
        }
      ]
    }

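For readers who prefer a script to the command line, the two steps above can be sketched in Python using only the standard library. This is an illustrative sketch, not an official client. The function names `build_request_body`, `get_access_token`, `build_http_request`, and `recognize` are hypothetical, and the script assumes the `gcloud` command is installed and on your `PATH`. The endpoint, headers, and request fields mirror the curl example.

```python
import json
import subprocess
import urllib.request

ENDPOINT = "https://speech.googleapis.com/v1/speech:recognize"


def build_request_body():
    """Return the same payload as sync-request.json."""
    return {
        "config": {
            "encoding": "FLAC",
            "sampleRateHertz": 16000,
            "languageCode": "en-US",
            "enableWordTimeOffsets": False,
        },
        "audio": {"uri": "gs://cloud-samples-tests/speech/brooklyn.flac"},
    }


def get_access_token():
    """Run the same gcloud command the curl example uses (assumes gcloud is installed)."""
    result = subprocess.run(
        ["gcloud", "auth", "application-default", "print-access-token"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def build_http_request(body, token):
    """Assemble the POST request; plays the role of curl's -H and -d options."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def recognize():
    """Send the request and return the parsed JSON response as a dictionary."""
    request = build_http_request(build_request_body(), get_access_token())
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `recognize()` performs the network request; the other functions have no side effects beyond running `gcloud`, so the request can be inspected before it is sent.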
Congratulations! You've sent your first request to Speech-to-Text.
If you receive an error or an empty response from Speech-to-Text, take a look at the troubleshooting and error mitigation steps.
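To tell a successful response from an error or an empty one in a script, the response can be inspected before any troubleshooting. The sketch below assumes the response has already been parsed into a Python dictionary, that a successful response has the `results` and `alternatives` shape shown in the sample response, and that a failed request returns a top-level `error` object with a `message` field. That error layout is an assumption here, not something this page specifies. The function name `extract_transcripts` is hypothetical.

```python
def extract_transcripts(response):
    """Return (transcript, confidence) pairs, one per result, from the first alternative.

    Returns an empty list for an empty response. Raises RuntimeError when the
    response carries an "error" object (assumed error layout).
    """
    if "error" in response:
        message = response["error"].get("message", "unknown error")
        raise RuntimeError(f"Speech-to-Text request failed: {message}")

    pairs = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            top = alternatives[0]
            pairs.append((top.get("transcript", ""), top.get("confidence")))
    return pairs


# The sample response from this page, as a Python dictionary.
sample = {
    "results": [
        {
            "alternatives": [
                {
                    "transcript": "how old is the Brooklyn Bridge",
                    "confidence": 0.98267895,
                }
            ]
        }
    ]
}

print(extract_transcripts(sample))  # [('how old is the Brooklyn Bridge', 0.98267895)]
print(extract_transcripts({}))      # [] (an empty response yields no transcripts)
```

An empty list signals that the request succeeded but no transcript came back, which is the case the troubleshooting steps address.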
What's next
- Practice transcribing short audio files.
- Learn how to batch long audio files for speech recognition.
- Learn how to transcribe streaming audio, such as audio from a microphone.
- Get started with Speech-to-Text in your language of choice by using a Speech-to-Text client library.
- Work through the sample applications.
- For best performance, accuracy, and other tips, see the best practices documentation.