Quickstart: Use the gcloud tool

This page shows you how to send a speech recognition request to Speech-to-Text using the gcloud tool from the command line.

Speech-to-Text enables easy integration of Google speech recognition technologies into developer applications. You can send audio data to the Speech-to-Text API, which then returns a text transcription of that audio file. For more information about the service, see Speech-to-Text basics.

Before you begin

Before you can send a request to the Speech-to-Text API, complete the following steps. See the before you begin page for details.

  • Enable Speech-to-Text on a GCP project.
    1. Make sure billing is enabled for Speech-to-Text.
    2. Create or assign one or more service accounts for Speech-to-Text.
    3. Download a service account credential key.
  • Set your authentication environment variable.
  • (Optional) Create a new Google Cloud Storage bucket to store your audio data.
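The authentication step above can be sketched as follows. This is a minimal sketch that assumes the "authentication environment variable" is GOOGLE_APPLICATION_CREDENTIALS, the variable read by Application Default Credentials, and that the key was saved to the hypothetical path $HOME/keys/speech-key.json. Substitute the location of your own key file.

```shell
# Hypothetical key location; replace with the path to your downloaded key.
KEY_FILE="$HOME/keys/speech-key.json"

# Point Application Default Credentials at the service account key.
export GOOGLE_APPLICATION_CREDENTIALS="$KEY_FILE"

# Warn early if the file is missing, since requests would fail to authenticate.
if [ ! -f "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
  echo "warning: key file not found at $GOOGLE_APPLICATION_CREDENTIALS" >&2
fi
```

The variable applies only to the current shell session. If you open a new session, set it again.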

Make an audio transcription request

Now you can use Speech-to-Text to transcribe an audio file to text. The following command sends a recognize request to the Speech-to-Text API.

Open a command-line shell and run the command:

gcloud ml speech recognize gs://cloud-samples-tests/speech/brooklyn.flac \
    --language-code=en-US

This command asks Speech-to-Text to transcribe the audio in a FLAC file hosted at a publicly accessible Cloud Storage location, and specifies US English (en-US) as the language of the recording.

If the request is successful, the server returns a response in JSON format:

{
  "results": [
    {
      "alternatives": [
        {
          "confidence": 0.9840146,
          "transcript": "how old is the Brooklyn Bridge"
        }
      ]
    }
  ]
}

Congratulations! You've sent your first request to Speech-to-Text.

If you receive an error or an empty response from Speech-to-Text, take a look at the troubleshooting and error mitigation steps.
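When scripting around this command, you may want only the transcript and confidence from a response like the one shown earlier. The sketch below is a quick illustration, not part of the gcloud tool: it stores the sample response in a shell variable (the names response, transcript, and confidence are hypothetical) and extracts the fields with sed. It assumes each key sits on its own line, as in the sample output. A dedicated JSON parser is the more robust choice for anything beyond a quick check.

```shell
# Sample response copied from this page; in practice, capture the
# command's output instead, for example: response=$(gcloud ml speech recognize ...)
response='{
  "results": [
    {
      "alternatives": [
        {
          "confidence": 0.9840146,
          "transcript": "how old is the Brooklyn Bridge"
        }
      ]
    }
  ]
}'

# Print only the text between the quotes that follow "transcript":
transcript=$(printf '%s\n' "$response" |
  sed -n 's/.*"transcript": "\(.*\)".*/\1/p')

# Print only the number that follows "confidence":
confidence=$(printf '%s\n' "$response" |
  sed -n 's/.*"confidence": \([0-9.]*\).*/\1/p')

echo "transcript: $transcript"
echo "confidence: $confidence"
```

An empty transcript variable here corresponds to the empty-response case mentioned above, so a script can test for it before continuing.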

What's next