Transcribe speech to text by using the command line
This page shows you how to send a speech recognition request to Speech-to-Text using the REST interface and the curl command.
Speech-to-Text enables easy integration of Google speech recognition technologies into developer applications. You can send audio data to the Speech-to-Text API, which then returns a text transcription of that audio file. For more information about the service, see Speech-to-Text basics.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.
- Enable the Speech-to-Text APIs.
- Make sure that you have the following role or roles on the project: Cloud Speech Administrator
Check for the roles
  - In the Google Cloud console, go to the IAM page.
  - Select the project.
  - Find the row that has your email address in the Principal column. If your email address isn't in that column, then you do not have any roles.
  - In the Role column for the row with your email address, check whether the list of roles includes the required roles.
Grant the roles
  - In the Google Cloud console, go to the IAM page.
  - Select the project.
  - Click Grant access.
  - In the New principals field, enter your email address.
  - In the Select a role list, select a role.
  - To grant additional roles, click Add another role and add each additional role.
  - Click Save.
- Install the Google Cloud CLI.
- To initialize the gcloud CLI, run the following command:
  gcloud init
Set up authentication
Client libraries can use Application Default Credentials to easily authenticate with Google APIs and send requests to those APIs. With Application Default Credentials, you can test your application locally and deploy it without changing the underlying code. For more information, including code samples, see Google Cloud Auth Guide.
Create authentication credentials for your Google Account:
gcloud auth application-default login
Create a recognizer
To send a recognition request, you must first create a Recognizer. Use the following command to create a Recognizer.
Replace PROJECT_ID with your Google Cloud project ID, and RECOGNIZER_ID with an identifier for your Recognizer.
curl -X POST \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{\"languageCodes\": \"en-US\", \"model\": \"latest_long\"}" \
    "https://speech.googleapis.com/v2/projects/PROJECT_ID/locations/global/recognizers?recognizer_id=RECOGNIZER_ID"
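To confirm that the Recognizer exists before sending audio, one option is to fetch it by its resource name. The following is a minimal sketch, assuming the v2 REST API accepts an authenticated GET on the same recognizer path used elsewhere on this page. The helper names recognizer_url and get_recognizer are hypothetical and are not part of any Google tool.

```shell
# Hypothetical helpers; only the URL pattern is taken from the commands on this page.

recognizer_url() {
  # $1 = project ID, $2 = recognizer ID
  printf 'https://speech.googleapis.com/v2/projects/%s/locations/global/recognizers/%s' "$1" "$2"
}

get_recognizer() {
  # Sends an authenticated GET for the recognizer resource and prints the
  # JSON response. Assumes GET is supported for this resource name.
  curl -s \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    "$(recognizer_url "$1" "$2")"
}

# Usage (requires gcloud credentials and network access):
#   get_recognizer PROJECT_ID RECOGNIZER_ID
```

The functions are only defined here and nothing is sent until get_recognizer is called. Keeping the URL in one helper makes it easy to reuse the same path for the recognize request.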
Make an audio transcription request
Now you can use Speech-to-Text to transcribe an audio file to text. Use the following code sample to send a recognize REST request to the Speech-to-Text API.
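- Create a JSON request file and save it as a sync-request.json plain text file. The request body is written with shell escaping, and the $(...) substitution base64-encodes the audio file, so generate the file by running the following command in your shell. Replace /full/path/to/audio/file.wav with the path to the audio file you want to transcribe:
  echo "{
    \"config\": { \"auto_decoding_config\": {} },
    \"content\": \"$(base64 -w 0 /full/path/to/audio/file.wav | sed 's/+/-/g; s/\//_/g')\"
  }" > sync-request.json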
- Use curl to make a recognize request, passing it the filename of the JSON request you set up in step 1:
  curl -s -H "Content-Type: application/json" \
      -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
      https://speech.googleapis.com/v2/projects/PROJECT_ID/locations/global/recognizers/RECOGNIZER_ID:recognize \
      -d @sync-request.json
  Note that to pass a filename to curl you use the -d option (for "data") and precede the filename with an @ sign. This file should be in the same directory in which you execute the curl command.
You should see a response similar to the following:
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98267895
        }
      ]
    }
  ]
}
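If you only need the recognized text, you can filter the response in the shell. The following is a rough sketch that uses grep and sed pattern matching rather than a real JSON parser, so it assumes each "transcript" key and its value sit on one line and that transcripts contain no escaped quotes. The name extract_transcripts is hypothetical.

```shell
# Hypothetical helper; pattern matching only, not a JSON parser.
extract_transcripts() {
  # Reads a recognize response on stdin and prints one transcript per line.
  grep -o '"transcript": *"[^"]*"' | sed 's/^"transcript": *"//; s/"$//'
}

# Usage (requires credentials and network access): pipe the output of the
# curl recognize command on this page into extract_transcripts.
```

For the sample response shown above, this prints the single line how old is the Brooklyn Bridge. For anything beyond quick checks, a dedicated JSON tool is a safer choice.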
Congratulations! You've sent your first request to Speech-to-Text.
If you receive an error or an empty response from Speech-to-Text, take a look at the troubleshooting and error mitigation steps.
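When a request fails, one thing worth checking is how the request file was generated. The -w 0 flag used in the request step may not be accepted by non-GNU base64 implementations. The sketch below builds the same request body in a more portable way, assuming a POSIX shell and a base64 tool that reads standard input. The name build_request is hypothetical.

```shell
# Hypothetical helper that prints the request JSON for one audio file.
build_request() {
  # $1 = path to the audio file
  # Strip newlines with tr instead of relying on a base64 line-wrap flag,
  # then apply the same +/ to -_ substitution that the page's command uses.
  content=$(base64 < "$1" | tr -d '\n' | tr '/+' '_-')
  printf '{"config": {"auto_decoding_config": {}}, "content": "%s"}\n' "$content"
}

# Usage:
#   build_request /full/path/to/audio/file.wav > sync-request.json
```

After writing the file, you can inspect it to confirm that the content field is not empty before sending the request again.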
Clean up
To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.
- Optional: Revoke the authentication credentials that you created, and delete the local credential file.
  gcloud auth application-default revoke
- Optional: Revoke credentials from the gcloud CLI.
  gcloud auth revoke
- Delete a Cloud project by using the gcloud CLI:
  gcloud projects delete PROJECT_ID
What's next
- Practice transcribing short audio files.
- Learn how to transcribe streaming audio, such as input from a microphone.
- Get started with Speech-to-Text in your language of choice by using a Speech-to-Text client library.
- For best performance, accuracy, and other tips, see the best practices documentation.