Quickstart

The Google Cloud Speech API enables easy integration of Google speech recognition technologies into developer applications. The Speech API allows you to send audio and receive a text transcription from the service (see What is the Google Cloud Speech API? for more information).

Installing the Google Cloud SDK

This quickstart uses the gcloud command-line tool, which is distributed as part of the Google Cloud SDK. Follow the Cloud SDK installation instructions to install and set up the SDK.

Need a command prompt? You can use the Google Cloud Shell. The Google Cloud Shell is a command line environment that already includes the Google Cloud SDK, so you don't need to install it. (The Google Cloud SDK also comes preinstalled on Google Compute Engine Virtual Machines.)

Set up your project

If you haven't already done so:

  1. Sign in to your Google account.

    If you don't already have one, sign up for a new account.

  2. Set up a Cloud Platform Console project.

    In the Cloud Platform Console:

    • Create or select a project.
    • Enable the Cloud Speech API for that project.
    • Create a service account.
    • Download a private key as JSON.

    You can view and manage these resources at any time in the Cloud Platform Console.
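Before moving on, it can help to confirm that the downloaded key file is a service account key. The following is a minimal sketch, not part of the official setup: the function name check_key_file and the list of fields it checks are assumptions chosen for illustration, based on the fields a service account JSON key normally carries.

```python
import json

# Fields a service account key downloaded as JSON normally contains.
# This list is an illustrative subset, not an official schema.
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key_file(path):
    """Return a list of problems found in the key file (empty if none)."""
    with open(path, encoding="utf-8") as f:
        key = json.load(f)
    problems = [
        f"missing field: {name}" for name in EXPECTED_FIELDS if name not in key
    ]
    if key.get("type") != "service_account":
        problems.append("type is not 'service_account'")
    return problems


# Example use (the path is a placeholder for your own key file):
#   problems = check_key_file("service-account-key-file")
#   print(problems or "key file looks like a service account key")
```

An empty list only means the file has the expected shape; it does not prove that the key is still valid or that the Speech API is enabled for the project.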

Make a Speech API request

Next, make a Speech API request using the recognize REST method.

  1. Create a JSON request file with the following text, and save it as a plain text file named sync-request.json:

    {
      "config": {
        "encoding": "FLAC",
        "sampleRateHertz": 16000,
        "languageCode": "en-US",
        "enableWordTimeOffsets": false
      },
      "audio": {
        "uri": "gs://cloud-samples-tests/speech/brooklyn.flac"
      }
    }
      

    This JSON snippet states that the audio file is FLAC-encoded, has a sample rate of 16000 Hz, contains US English speech (languageCode en-US), and is stored on Google Cloud Storage at the given URI. Word time offsets are not requested. The audio file is publicly accessible, so you do not need credentials to read it, though you still need credentials to call the API.

  2. Authenticate as your service account, passing the path of the key file you downloaded (replace service-account-key-file with that path):

    gcloud auth activate-service-account --key-file=service-account-key-file
      
  3. Obtain an access token for your service account. The command prints the token, shown below as the placeholder access_token:

    gcloud auth application-default print-access-token
    access_token
      
  4. Use curl to make a speech:recognize request, passing the access token you printed in step 3 (in place of access_token) and the filename of the JSON request you created in step 1:

    curl -s -H "Content-Type: application/json" \
        -H "Authorization: Bearer access_token" \
        https://speech.googleapis.com/v1/speech:recognize \
        -d @sync-request.json
      

    To have curl read the request body from a file, use the -d option (for "data") and prefix the filename with an @ sign. Using -d also makes curl send the request as a POST. Because the filename is given as a relative path, run the curl command from the directory that contains the file.

    You should see a response similar to the following:

    {
      "results": [
        {
          "alternatives": [
            {
              "transcript": "how old is the Brooklyn Bridge",
              "confidence": 0.98267895
            }
          ]
        }
      ]
    }
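The response nests each transcript inside results and alternatives, so a short sketch of how to pull the text out may be useful. The example below uses only the Python standard library and the sample response shown above. The helper name best_transcripts is hypothetical, and taking the first alternative of each result as the preferred one is an assumption made for this sketch.

```python
import json

# The sample response from step 4, pasted in as a string.
RESPONSE = """
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98267895
        }
      ]
    }
  ]
}
"""


def best_transcripts(response):
    """Yield (transcript, confidence) for the first alternative of each result."""
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            top = alternatives[0]
            yield top.get("transcript", ""), top.get("confidence")


for transcript, confidence in best_transcripts(json.loads(RESPONSE)):
    print(transcript, confidence)
```

Using .get with defaults keeps the helper from raising an error when a response has no results, which can happen if no speech is recognized in the audio.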
      

Congratulations! You've sent your first request to the Cloud Speech API!
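The same request can also be sent from a program instead of curl. The following is a minimal sketch using only the Python standard library: it mirrors the request body from step 1 and the headers from step 4. The function names build_body, build_request, and recognize are hypothetical, and the access token is assumed to be the one printed in step 3.

```python
import json
import urllib.request

# Endpoint taken from the curl command in step 4.
ENDPOINT = "https://speech.googleapis.com/v1/speech:recognize"


def build_body(uri, encoding="FLAC", sample_rate_hertz=16000, language_code="en-US"):
    """Build the same JSON structure as sync-request.json."""
    return {
        "config": {
            "encoding": encoding,
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
            "enableWordTimeOffsets": False,
        },
        "audio": {"uri": uri},
    }


def build_request(body, access_token):
    """Prepare (but do not send) the POST request that curl would make."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


def recognize(body, access_token):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(body, access_token)) as resp:
        return json.load(resp)


# Example use (requires network access and a valid token from step 3):
#   body = build_body("gs://cloud-samples-tests/speech/brooklyn.flac")
#   print(recognize(body, "access_token"))
```

Keeping build_request separate from recognize lets you inspect the URL, headers, and body before anything is sent over the network.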
