Before You Begin

Speech-to-Text is an API powered by Google's artificial intelligence (AI) technology. You send audio data to Speech-to-Text and receive a text transcription of that audio in response. For more information on how Speech-to-Text works, see the basics page.

There are two ways to access the service: the REST API and the Speech-to-Text Console. We provide code samples that show you how to make a request to the REST API and receive a response. You can learn how to use these samples by following the Speech-to-Text quickstarts and how-to guides. If you prefer to use Speech-to-Text with minimal coding, you can use the Speech-to-Text Console.

This guide walks you through the steps necessary to start sending requests to the REST API. To get started using the Speech-to-Text Console, see the UI Console quickstart.
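To make the request-and-response exchange described above more concrete, here is a minimal sketch of the JSON involved. It makes no network call. The endpoint URL, the field names, and the LINEAR16 / 16000 Hz values are assumptions based on the v1 REST API and do not come from this guide; the helper names build_recognize_body and extract_transcripts are hypothetical.

```python
import base64
import json

# Assumed v1 endpoint for synchronous recognition; not stated in this guide.
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_body(audio_bytes, language_code="en-US", sample_rate_hertz=16000):
    """Build the JSON body for a synchronous recognize request (hypothetical helper)."""
    return {
        "config": {
            "encoding": "LINEAR16",  # illustrative value; must match your audio
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        # Inline audio is sent as base64-encoded text.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


def extract_transcripts(response):
    """Return the first alternative's transcript for each result in a parsed response."""
    return [
        result["alternatives"][0]["transcript"]
        for result in response.get("results", [])
        if result.get("alternatives")
    ]


if __name__ == "__main__":
    # Placeholder bytes stand in for real audio data.
    body = build_recognize_body(b"\x00\x01" * 4)
    print("POST", RECOGNIZE_URL)
    print(json.dumps(body, indent=2))
    # Hand-written sample shaped like a response, not real service output.
    sample_response = {"results": [{"alternatives": [{"transcript": "hello world"}]}]}
    print(extract_transcripts(sample_response))
```

Sending the body requires the authentication set up in the rest of this guide, which is why the sketch stops at building and parsing JSON.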

Overview

Before you can begin sending requests to Speech-to-Text, you must enable the API in the Google Cloud Platform Console. The steps below walk you through the following actions:

  • Enable Speech-to-Text on a project.
    1. Make sure billing is enabled for your project.
    2. Make sure your project has at least one service account.
    3. Download a service account credential key.
  • Set your authentication environment variable.
  • (Optional) Create a new Google Cloud Storage bucket to store your audio data.

Setting up your Google Cloud Platform project

  1. Sign in to Cloud Console

  2. Go to the project selector page

    You can either choose an existing project or create a new one. For more details about creating a project, see Google Cloud Platform documentation.

  3. If you create a new project, you will be prompted to link a billing account to this project. If you are using a pre-existing project, make sure that you have billing enabled.

    Learn how to confirm that billing is enabled for your project

  4. Once you have selected a project and linked it to a billing account, you can enable the Speech-to-Text API. Go to the Search products and resources bar at the top of the page and type in "speech". Select the Cloud Speech-to-Text API from the list of results.

  5. To try Speech-to-Text without linking it to your project, choose the TRY THIS API option. To enable the Speech-to-Text API for use with your project, click ENABLE.

  6. (Optional) Enable data logging. By opting in to data logging, you allow Google to record any audio data that you send to Speech-to-Text. This data is used to improve the Speech-to-Text models. Users who opt in to data logging benefit from lower pricing. See the pricing and data logging terms and conditions pages for more information.

  7. You must now link one or more service accounts to the Speech-to-Text API. Click the Credentials menu item on the left side of the Speech-to-Text API main page. If no service accounts are associated with this project, create one by following the instructions in the creating a new service account section.

    If service accounts were previously created for this project, they appear on this page. Make sure that you have access to a downloaded JSON key for the service account you'd like to use to authenticate with Speech-to-Text. A service account key can be downloaded only once, at the time it is created. If your service account has an existing key but you can't locate the downloaded .json file, create a new key for that service account and download its .json file. To create a new key for an existing service account, follow the instructions in the creating a JSON key section.

    If you already have a service account and its JSON key, you can now set your authentication environment variable.

Creating a new service account

  1. Create a new service account if your project doesn't already have one. You need a service account to use Speech-to-Text.

    Go to Create service account

    In the service account name box, type a unique name for the new service account. Your input is automatically populated in the Service account ID box. The Service account description box is optional but recommended if you plan to associate multiple service accounts with your project. Enter a brief description of the service account into this box, then click CREATE AND CONTINUE.

  2. We recommend that you assign one of the basic IAM roles to your service account. You can also assign multiple roles to a single service account if needed. See IAM roles for details on the available roles and the permissions that each grants. Click the Select a role drop-down menu and scroll down to Basic. Choose a role for this service account from the options that appear in the right-hand column, then click CONTINUE.

  3. In the final step, you can optionally grant other entities (individuals, Google groups, and so on) access to your service account. If you don't need to grant additional access, click DONE without entering any information.

  4. The service account is now listed on the Service Accounts page. You can change the service account's permissions, add or generate new keys, and grant access at any time.

Creating a JSON key for your service account

  1. The newly created service account appears on the service accounts page. Create a private key for that account; you use this key to authenticate when you send a request to Speech-to-Text. If you choose not to create a key now, you can generate a key or change individual user information at any time by opening the service account through the IAM & Admin -> Service Accounts option in the main navigation menu.

    To create a key, click on the service account and select the KEYS tab. Click ADD KEY -> Create new key. We recommend that you create a key in JSON format.

  2. A new key in the format of your choice is automatically downloaded. Store this file in a safe location and make a note of the file path. To authenticate requests to Speech-to-Text, you need to point the GOOGLE_APPLICATION_CREDENTIALS environment variable to this file at the beginning of each new session. The key's unique ID appears next to the name of the service account.
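Before relying on a downloaded key, it can help to confirm that the file is intact. The sketch below checks that a file parses as JSON and carries the fields a service account key normally contains. The field list is an assumption about the usual key layout, and check_key_file is a hypothetical helper, not part of any Google library.

```python
import json
import os
import tempfile

# Fields commonly present in a service account JSON key (assumed layout).
REQUIRED_FIELDS = ("type", "project_id", "private_key_id", "private_key", "client_email")


def check_key_file(path):
    """Return a list of problems found in a key file; an empty list means it looks usable."""
    if not os.path.isfile(path):
        return [f"no file at {path}"]
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return ["file is not valid JSON"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    if data.get("type") != "service_account":
        problems.append('"type" is not "service_account"')
    return problems


if __name__ == "__main__":
    # Demo with a throwaway file holding placeholder values, not a real key.
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "demo-key.json")
        with open(demo_path, "w", encoding="utf-8") as handle:
            json.dump({"type": "service_account", "project_id": "example"}, handle)
        for problem in check_key_file(demo_path):
            print(problem)
```

The check is structural only: it cannot tell whether the key is still active for the service account.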

Set your authentication environment variable

To set the GOOGLE_APPLICATION_CREDENTIALS environment variable, you must have a service account associated with your project and have access to that service account's JSON key.

Provide authentication credentials to your application code by setting the environment variable GOOGLE_APPLICATION_CREDENTIALS. This variable only applies to your current shell session, so if you open a new session, set the variable again.

Linux or macOS

export GOOGLE_APPLICATION_CREDENTIALS="KEY_PATH"

Replace KEY_PATH with the path of the JSON file that contains your service account key.

For example:

export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/service-account-file.json"

Windows

For PowerShell:

$env:GOOGLE_APPLICATION_CREDENTIALS="KEY_PATH"

Replace KEY_PATH with the path of the JSON file that contains your service account key.

For example:

$env:GOOGLE_APPLICATION_CREDENTIALS="C:\Users\username\Downloads\service-account-file.json"

For command prompt:

set GOOGLE_APPLICATION_CREDENTIALS=KEY_PATH

Replace KEY_PATH with the path of the JSON file that contains your service account key.

For more information, see the Google Cloud Platform authentication documentation.
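Because the variable applies only to the current shell session, a quick check before sending requests can save some debugging. The following sketch reports whether GOOGLE_APPLICATION_CREDENTIALS is set and points at an existing file. The function name credentials_status and its three status strings are hypothetical, chosen only for this illustration.

```python
import os
import tempfile

ENV_VAR = "GOOGLE_APPLICATION_CREDENTIALS"


def credentials_status(environ=None):
    """Return "unset", "missing-file", or "ok" for the credentials variable."""
    environ = os.environ if environ is None else environ
    value = environ.get(ENV_VAR)
    if not value:
        return "unset"
    if not os.path.isfile(value):
        return "missing-file"
    return "ok"


if __name__ == "__main__":
    # Status for the shell session this script runs in.
    print(credentials_status())
    # Status for a mapping that points at a temporary stand-in file.
    with tempfile.NamedTemporaryFile(suffix=".json") as handle:
        print(credentials_status({ENV_VAR: handle.name}))
```

Passing a mapping instead of reading os.environ directly keeps the check easy to exercise without changing the real environment.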

(Optional) Creating a Google Cloud Storage bucket

If you intend to transcribe audio longer than 60 seconds or with a file size larger than 10 MB, you must store the audio data in a Cloud Storage bucket before you can transcribe it using Speech-to-Text. The following steps walk you through the process of creating a new bucket.
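The rule above can be expressed as a small check. This is a sketch only: it reads "10 MB" as 10 * 1024 * 1024 bytes, which is an assumption, and needs_cloud_storage is a hypothetical helper name.

```python
MAX_INLINE_SECONDS = 60
MAX_INLINE_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB


def needs_cloud_storage(duration_seconds, size_bytes):
    """Return True when audio exceeds either limit and should be stored in a bucket first."""
    return duration_seconds > MAX_INLINE_SECONDS or size_bytes > MAX_INLINE_BYTES


if __name__ == "__main__":
    # A 45-second, 2 MiB clip versus a 5-minute, 30 MiB recording.
    print(needs_cloud_storage(45, 2 * 1024 * 1024))
    print(needs_cloud_storage(300, 30 * 1024 * 1024))
```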

  • In the Cloud Console, go to the Cloud Storage Browser page.

    Go to Browser

  • Click Create bucket.
  • On the Create a bucket page, enter your bucket information. To go to the next step, click Continue.
    • For Name your bucket, enter a unique bucket name. Don't include sensitive information in the bucket name, because the bucket namespace is global and publicly visible.
    • For Choose where to store your data, do the following:
      • Select a Location type option.
      • Select a Location option.
    • For Choose a default storage class for your data, select a storage class.
    • For Choose how to control access to objects, select an Access control option.
    • For Advanced settings (optional), specify an encryption method, a retention policy, or bucket labels.
  • Click Create.
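Once the bucket exists, audio stored in it is usually referenced by a gs:// URI. The sketch below screens a candidate bucket name and builds such a URI. The pattern is a simplified assumption about Cloud Storage naming rules (3 to 63 characters; lowercase letters, digits, dots, hyphens, and underscores; starting and ending with a letter or digit) and does not cover every rule. Both function names are hypothetical.

```python
import re

# Simplified naming pattern (assumption); the full rules have more cases.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]")


def looks_like_bucket_name(name):
    """Return True when the name passes the simplified naming check."""
    return _BUCKET_NAME.fullmatch(name) is not None


def gs_uri(bucket, object_name):
    """Build a gs:// URI for an object, rejecting names that fail the check."""
    if not looks_like_bucket_name(bucket):
        raise ValueError(f"bucket name fails the simplified check: {bucket!r}")
    return f"gs://{bucket}/{object_name.lstrip('/')}"


if __name__ == "__main__":
    # Placeholder bucket and object names for illustration.
    print(gs_uri("my-audio-bucket", "recordings/interview.flac"))
```

The console performs its own validation, including the global uniqueness check, so this screen is only a convenience for scripts that assemble URIs.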

Disabling the Speech-to-Text API

To disable the Speech-to-Text API, navigate to your Google Cloud Platform dashboard and click on the Go to APIs overview link in the APIs box. Click on the Speech-to-Text API, then select the DISABLE API button at the top of the page.

What's next

Learn how to send a transcription request to the Speech-to-Text API using client libraries, gcloud, the command line, or the Speech-to-Text UI.