Quickstart: Using the command line

This quickstart introduces you to Text-to-Speech. In this quickstart, you set up your Google Cloud Platform project and authorization and then make a request for Text-to-Speech to create audio from text.

To learn more about the fundamental concepts in Text-to-Speech, read Text-to-Speech Basics.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.

  4. Enable the Cloud Text-to-Speech API.

    Enable the API

  5. Create a service account:

    1. In the Cloud Console, go to the Create service account page.

      Go to Create service account
    2. Select a project.
    3. In the Service account name field, enter a name. The Cloud Console fills in the Service account ID field based on this name.

      In the Service account description field, enter a description. For example, Service account for quickstart.

    4. Click Done to finish creating the service account.

      Do not close your browser window. You will use it in the next step.

  6. Create a service account key:

    1. In the Cloud Console, click the email address for the service account that you created.
    2. Click Keys.
    3. Click Add key, then click Create new key.
    4. Click Create. A JSON key file is downloaded to your computer.
    5. Click Close.
  7. Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON file that contains your service account key. This variable only applies to your current shell session, so if you open a new session, set the variable again.

  8. Install and initialize the Cloud SDK.

Synthesize audio from text

You can convert text to audio by making an HTTP POST request to the https://texttospeech.googleapis.com/v1/text:synthesize endpoint. In the body of your POST command, specify the type of voice to synthesize in the voice configuration section, specify the text to synthesize in the text field of the input section, and specify the type of audio to create in the audioConfig section.

  1. Execute the REST request below at the command line to synthesize audio from text using Text-to-Speech. The command uses the gcloud auth application-default print-access-token command to retrieve an authorization token for the request.

    HTTP method and URL:

    POST https://texttospeech.googleapis.com/v1/text:synthesize

    Request JSON body:

    {
      "input":{
        "text":"Android is a mobile operating system developed by Google, based on the Linux kernel and designed primarily for touchscreen mobile devices such as smartphones and tablets."
      },
      "voice":{
        "languageCode":"en-gb",
        "name":"en-GB-Standard-A",
        "ssmlGender":"FEMALE"
      },
      "audioConfig":{
        "audioEncoding":"MP3"
      }
    }
    

    To send your request, expand one of these options:

    You should receive a JSON response similar to the following:

    {
      "audioContent": "//NExAASCCIIAAhEAGAAEMW4kAYPnwwIKw/BBTpwTvB+IAxIfghUfW.."
    }
    

  2. The JSON output for the REST command contains the synthesized audio in base64-encoded format. Copy the contents of the audioContent field into a new file named synthesize-output-base64.txt. Your new file will look something like the following:

    //NExAARqoIIAAhEuWAAAGNmBGMY4EBcxvABAXBPmPIAF//yAuh9Tn5CEap3/o
    ...
    VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
    
  3. Decode the contents of the synthesize-output-base64.txt file into a new file named synthesized-audio.mp3. For information on decoding base64, see Decoding Base64-Encoded Audio Content.

    Linux

    1. Copy only the base-64 encoded content into a text file.

    2. Decode the source text file using the base64 command line tool by using the -d flag:

        $ base64 SOURCE_BASE64_TEXT_FILE -d > DESTINATION_AUDIO_FILE
    

    Mac OSX

    1. Copy only the base-64 encoded content into a text file.

    2. Decode the source text file using the base64 command line tool:

        $ base64 --decode SOURCE_BASE64_TEXT_FILE > DESTINATION_AUDIO_FILE
    

    Windows

    1. Copy only the base-64 encoded content into a text file.

    2. Decode the source text file using the certutil command.

       certutil -decode SOURCE_BASE64_TEXT_FILE DESTINATION_AUDIO_FILE
    
  4. Play the contents of synthesized-audio.mp3 in an audio application or on an audio device. You can also open the synthesized-audio.mp3 in the Chrome browser to play the audio by navigating to the folder that contains the file, for example file://my_file_path/synthesized-audio.mp3

Clean up

To avoid unnecessary Google Cloud Platform charges, use the Cloud Console to delete your project if you do not need it.

What's next

  • Learn more about Cloud Text-to-Speech by reading the basics.
  • Review the list of available voices you can use for synthetic speech.