This page demonstrates how to transcribe a short audio file to text using synchronous speech recognition.
Synchronous speech recognition returns the recognized text for short audio (less than ~1 minute) in the response as soon as it is processed. To process a speech recognition request for long audio, use Asynchronous Speech Recognition.
Audio content can be sent directly to Speech-to-Text, or Speech-to-Text can process audio content that already resides in Google Cloud Storage. See also the audio limits for synchronous speech recognition requests.
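The one-minute guideline above can be checked before a request is sent. The following is a minimal Python sketch, assuming the audio is a WAV file readable by the standard-library wave module; the helper names and the 60-second constant are illustrative assumptions, not part of the Speech-to-Text API.

```python
import wave

# Approximate limit taken from the "~1 minute" guideline; an assumption for this sketch.
SYNC_LIMIT_SECONDS = 60


def wav_duration_seconds(wav_file):
    """Return the duration of a WAV file (path or binary file object) in seconds."""
    with wave.open(wav_file, "rb") as reader:
        return reader.getnframes() / reader.getframerate()


def fits_synchronous(wav_file, limit_seconds=SYNC_LIMIT_SECONDS):
    """Return True if the audio is shorter than the assumed synchronous limit."""
    return wav_duration_seconds(wav_file) < limit_seconds
```

Audio for which fits_synchronous returns False is a candidate for asynchronous recognition instead.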
Performing synchronous speech recognition on a local file
Here is an example of performing synchronous speech recognition on a local audio file:
Protocol
Refer to the speech:recognize API endpoint for complete details.
To perform synchronous speech recognition, make a POST request and provide the appropriate request body. The following shows an example of a POST request using curl. The example uses the access token for a service account set up for the project using the Google Cloud SDK. For instructions on installing the Cloud SDK, setting up a project with a service account, and obtaining an access token, see the quickstart.
curl -X POST \
  -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
  -H "Content-Type: application/json; charset=utf-8" \
  --data "{
    'config': {
      'encoding': 'LINEAR16',
      'sampleRateHertz': 16000,
      'languageCode': 'en-US',
      'enableWordTimeOffsets': false
    },
    'audio': {
      'content': '/9j/7QBEUGhvdG9zaG9...base64-encoded-audio-content...fXNWzvDEeYxxxzj/Coa6Bax//Z'
    }
  }" "https://speech.googleapis.com/v1/speech:recognize"
See the RecognitionConfig reference documentation for more information on configuring the request body.
The audio content supplied in the request body must be base64-encoded. For more information on how to base64-encode audio, see Base64 Encoding Audio Content. For more information on the content field, see RecognitionAudio.
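As a minimal sketch of the base64 step, the request body shown in the curl example could be assembled in Python with only the standard library. The helper name build_recognize_body is hypothetical; the field names and default values mirror the curl example and must match the actual audio being sent.

```python
import base64
import json


def build_recognize_body(audio_bytes, sample_rate_hertz=16000, language_code="en-US"):
    """Return the speech:recognize request body as a JSON string.

    Hypothetical helper for illustration; not part of any Google client library.
    """
    body = {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
            "enableWordTimeOffsets": False,
        },
        "audio": {
            # Raw audio bytes are carried as base64 text in the "content" field.
            "content": base64.b64encode(audio_bytes).decode("ascii"),
        },
    }
    return json.dumps(body)
```

Reading a local file in binary mode and passing its bytes to build_recognize_body yields a string that could serve as the POST body in place of the inline --data argument.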
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98267895
        }
      ]
    }
  ]
}
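A response of this shape can be reduced to plain transcripts with a few lines of Python. This is a sketch only: best_transcripts is a hypothetical helper, it takes the first alternative of each result, and it defaults to empty lists in case a field is missing from the response.

```python
import json


def best_transcripts(response_text):
    """Return a list of (transcript, confidence) pairs, one per result.

    Hypothetical helper for illustration; it reads only the fields shown
    in the sample response.
    """
    response = json.loads(response_text)
    pairs = []
    # Default to an empty list so a response without "results" yields no pairs.
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            top = alternatives[0]
            pairs.append((top.get("transcript", ""), top.get("confidence")))
    return pairs
```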
gcloud
Refer to the recognize command for complete details.
To perform speech recognition on a local file, use the gcloud command-line tool, passing in the local file path of the file to perform speech recognition on.
gcloud ml speech recognize PATH-TO-LOCAL-FILE --language-code='en-US'
If the request is successful, the server returns a response in JSON format:
{
  "results": [
    {
      "alternatives": [
        {
          "confidence": 0.9840146,
          "transcript": "how old is the Brooklyn Bridge"
        }
      ]
    }
  ]
}
Performing synchronous speech recognition on a remote file
For your convenience, the Speech-to-Text API can perform synchronous speech recognition directly on an audio file located in Google Cloud Storage, without the need to send the contents of the audio file in the body of your request.
Here is an example of performing synchronous speech recognition on a file located in Cloud Storage:
Protocol
Refer to the speech:recognize API endpoint for complete details.
To perform synchronous speech recognition, make a POST request and provide the appropriate request body. The following shows an example of a POST request using curl. The example uses the access token for a service account set up for the project using the Google Cloud SDK. For instructions on installing the Cloud SDK, setting up a project with a service account, and obtaining an access token, see the quickstart.
curl -X POST \
  -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
  -H "Content-Type: application/json; charset=utf-8" \
  --data "{
    'config': {
      'encoding': 'LINEAR16',
      'sampleRateHertz': 16000,
      'languageCode': 'en-US'
    },
    'audio': {
      'uri': 'gs://YOUR_BUCKET_NAME/YOUR_FILE_NAME'
    }
  }" "https://speech.googleapis.com/v1/speech:recognize"
See the RecognitionConfig reference documentation for more information on configuring the request body.
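The only difference from the local-file request is that the audio object carries a uri field instead of base64 content. A minimal Python sketch of that body follows; build_remote_recognize_body is a hypothetical helper, and the gs:// prefix check is an illustrative safeguard rather than documented API behavior.

```python
import json


def build_remote_recognize_body(gcs_uri, sample_rate_hertz=16000, language_code="en-US"):
    """Return the speech:recognize request body for audio stored in Cloud Storage.

    Hypothetical helper for illustration; not part of any Google client library.
    """
    # Guard against accidentally passing a local path where a gs:// URI is expected.
    if not gcs_uri.startswith("gs://"):
        raise ValueError("expected a Cloud Storage URI starting with gs://")
    body = {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        # No audio bytes are sent; the service reads the file from the bucket.
        "audio": {"uri": gcs_uri},
    }
    return json.dumps(body)
```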
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98267895
        }
      ]
    }
  ]
}
gcloud
Refer to the recognize command for complete details.
To perform speech recognition on a remote file, use the gcloud command-line tool, passing in the Cloud Storage URI of the file to perform speech recognition on.
gcloud ml speech recognize 'gs://cloud-samples-tests/speech/brooklyn.flac' \
  --language-code='en-US'
If the request is successful, the server returns a response in JSON format:
{
  "results": [
    {
      "alternatives": [
        {
          "confidence": 0.9840146,
          "transcript": "how old is the Brooklyn Bridge"
        }
      ]
    }
  ]
}