Create audio from text by using the command line
This document walks you through the process of making a request to Text-to-Speech using the command line. To learn more about the fundamental concepts in Text-to-Speech, read Text-to-Speech Basics.
Before you begin
Before you can send a request to the Text-to-Speech API, complete the following steps. See the before you begin page for details.
- Enable Text-to-Speech on a GCP project.
- Make sure billing is enabled for Text-to-Speech.
- After installing the Google Cloud CLI, configure the gcloud CLI to use your federated identity and then initialize it by running the following command:
gcloud init
Synthesize audio from text
You can convert text to audio by making an HTTP POST request to the https://texttospeech.googleapis.com/v1/text:synthesize endpoint. In the body of your POST request, specify the type of voice to synthesize in the voice configuration section, specify the text to synthesize in the text field of the input section, and specify the type of audio to create in the audioConfig section.
Execute the REST request below at the command line to synthesize audio from text using Text-to-Speech. The command uses the gcloud auth application-default print-access-token command to retrieve an authorization token for the request.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the alphanumeric ID of your Google Cloud project.
HTTP method and URL:
POST https://texttospeech.googleapis.com/v1/text:synthesize
Request JSON body:
{
  "input": {
    "text": "Android is a mobile operating system developed by Google, based on the Linux kernel and designed primarily for touchscreen mobile devices such as smartphones and tablets."
  },
  "voice": {
    "languageCode": "en-gb",
    "name": "en-GB-Standard-A",
    "ssmlGender": "FEMALE"
  },
  "audioConfig": {
    "audioEncoding": "MP3"
  }
}
To send your request, use an HTTP client such as curl and pass the access token as a bearer token in the Authorization header.
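One way to send the request is with curl. The following is a minimal sketch, not the only supported method. It assumes the request body is saved in a file named request.json, that the response should be written to response.json, and that the project is passed in an x-goog-user-project header; the file names and the synthesize helper are illustrative and do not come from this page.

```shell
# Save the request body shown above to a file (request.json is an assumed name).
cat > request.json <<'EOF'
{
  "input": {
    "text": "Android is a mobile operating system developed by Google, based on the Linux kernel and designed primarily for touchscreen mobile devices such as smartphones and tablets."
  },
  "voice": {
    "languageCode": "en-gb",
    "name": "en-GB-Standard-A",
    "ssmlGender": "FEMALE"
  },
  "audioConfig": {
    "audioEncoding": "MP3"
  }
}
EOF

# Hypothetical helper: POST request.json to the synthesize endpoint and
# write the JSON response to response.json (an assumed output name).
# Usage: synthesize PROJECT_ID
synthesize() {
  curl -s -X POST \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    -H "x-goog-user-project: $1" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://texttospeech.googleapis.com/v1/text:synthesize" > response.json
}
```

Defining the helper does not send anything. Calling synthesize with your own project ID in place of PROJECT_ID performs the request, provided the gcloud CLI is authenticated as described in Before you begin.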
You should receive a JSON response similar to the following:
{ "audioContent": "//NExAASCCIIAAhEAGAAEMW4kAYPnwwIKw/BBTpwTvB+IAxIfghUfW.." }
The JSON output for the REST command contains the synthesized audio in base64-encoded format. Copy the contents of the audioContent field into a new file named synthesize-output-base64.txt. Your new file will look something like the following:
//NExAARqoIIAAhEuWAAAGNmBGMY4EBcxvABAXBPmPIAF//yAuh9Tn5CEap3/o ... VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
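If you prefer not to copy the field by hand, the extraction can be scripted. The sketch below assumes the response is a JSON file in which audioContent is a plain quoted string, as in the sample response above. The extract_audio_content helper and the sample-response.json file are hypothetical names used only for illustration.

```shell
# Hypothetical helper: print the value of the "audioContent" field
# from the JSON response file given as the first argument.
extract_audio_content() {
  # Join all lines first so the pattern also matches pretty-printed JSON,
  # then keep only the text between the quotes after "audioContent":.
  tr -d '\n' < "$1" |
    sed -n 's/.*"audioContent"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Demo with a stand-in response (the payload is base64 for "hello", not real audio).
printf '{ "audioContent": "aGVsbG8=" }\n' > sample-response.json
extract_audio_content sample-response.json   # prints aGVsbG8=
```

With a real response saved to a file, redirecting the helper's output to synthesize-output-base64.txt produces the file described above. A dedicated JSON parser is more robust than a sed pattern if one is available.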
Decode the contents of the synthesize-output-base64.txt file into a new file named synthesized-audio.mp3. For information on decoding base64, see Decoding Base64-Encoded Audio Content.
Linux
Copy only the base-64 encoded content into a text file.
Decode the source text file using the base64 command line tool with the -d flag:
$ base64 -d SOURCE_BASE64_TEXT_FILE > DESTINATION_AUDIO_FILE
macOS
Copy only the base-64 encoded content into a text file.
Decode the source text file using the base64 command line tool:
$ base64 --decode SOURCE_BASE64_TEXT_FILE > DESTINATION_AUDIO_FILE
Windows
Copy only the base-64 encoded content into a text file.
Decode the source text file using the certutil command:
certutil -decode SOURCE_BASE64_TEXT_FILE DESTINATION_AUDIO_FILE
Play the contents of synthesized-audio.mp3 in an audio application or on an audio device. You can also open the synthesized-audio.mp3 file in the Chrome browser to play the audio by navigating to the folder that contains the file, for example file://my_file_path/synthesized-audio.mp3.
Clean up
To avoid unnecessary Google Cloud Platform charges, use the Google Cloud console to delete your project if you do not need it.
What's next
- Learn more about Cloud Text-to-Speech by reading the basics.
- Review the list of available voices you can use for synthetic speech.