This quickstart introduces you to Text-to-Speech. You set up your Google Cloud Platform project and authorization, and then send a request that asks Text-to-Speech to create audio from text.
To learn more about the fundamental concepts in Text-to-Speech, read Text-to-Speech Basics.
Before you begin
- Sign in to your Google Account. If you don't already have one, sign up for a new account.
- In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.
- Enable the Cloud Text-to-Speech API.
- Set up authentication:
  - In the Cloud Console, go to the Create service account key page.
  - From the Service account list, select New service account.
  - In the Service account name field, enter a name.
  - Don't select a value from the Role list. No role is required to access this service.
  - Click Create. A note appears, warning that this service account has no role.
  - Click Create without role. A JSON file that contains your key downloads to your computer.
- Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON file that contains your service account key. This variable only applies to your current shell session, so if you open a new session, set the variable again.
- Install and initialize the Cloud SDK.
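Before sending a request, it can help to confirm that the environment variable is visible in the current session and points at a readable key file. The following is a minimal sketch, not part of the official setup: the helper name check_credentials is hypothetical, and it assumes the downloaded key is a JSON object with "type" and "client_email" fields.

```python
import json
import os


def check_credentials(env=None):
    """Return the service account email from the key file named by
    GOOGLE_APPLICATION_CREDENTIALS, or raise ValueError with a hint.

    Hypothetical helper; the field names checked here are assumptions
    about the layout of the downloaded key file.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise ValueError("GOOGLE_APPLICATION_CREDENTIALS is not set in this session")
    if not os.path.isfile(path):
        raise ValueError(f"no key file found at {path}")
    with open(path, encoding="utf-8") as f:
        key = json.load(f)
    if key.get("type") != "service_account":
        raise ValueError("file does not look like a service account key")
    return key.get("client_email")
```

Calling check_credentials() with no arguments reads the variable from the current process environment, so a new shell session that has not set the variable again raises the first error.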
Synthesize audio from text
You can convert text to audio by making an HTTP POST request to the https://texttospeech.googleapis.com/v1/text:synthesize endpoint. In the body of your POST request, specify the type of voice to synthesize in the voice configuration section, the text to synthesize in the text field of the input section, and the type of audio to create in the audioConfig section.
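The three sections of the request body can be sketched as a small Python helper. The function name build_request_body and its parameter names are hypothetical; the default voice values are the ones used in this quickstart's sample request.

```python
import json


def build_request_body(text, language_code="en-gb", voice_name="en-GB-Standard-A",
                       ssml_gender="FEMALE", audio_encoding="MP3"):
    """Assemble the JSON body for a text:synthesize request (illustrative helper)."""
    return {
        # The text to synthesize goes in the "text" field of "input".
        "input": {"text": text},
        # The type of voice to synthesize.
        "voice": {
            "languageCode": language_code,
            "name": voice_name,
            "ssmlGender": ssml_gender,
        },
        # The type of audio to create.
        "audioConfig": {"audioEncoding": audio_encoding},
    }


body = build_request_body("Hello from Text-to-Speech.")
print(json.dumps(body, indent=2))
```

Keeping the body as a dictionary until the last moment makes it easy to change one section, for example the audio encoding, without touching the others.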
Execute the REST request below at the command line to synthesize audio from text using Text-to-Speech. The command uses the gcloud auth application-default print-access-token command to retrieve an authorization token for the request.
HTTP method and URL:
POST https://texttospeech.googleapis.com/v1/text:synthesize
Request JSON body:
{
  "input": {
    "text": "Android is a mobile operating system developed by Google, based on the Linux kernel and designed primarily for touchscreen mobile devices such as smartphones and tablets."
  },
  "voice": {
    "languageCode": "en-gb",
    "name": "en-GB-Standard-A",
    "ssmlGender": "FEMALE"
  },
  "audioConfig": {
    "audioEncoding": "MP3"
  }
}
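If you prefer a script to the command line, the same request can be sent from Python using only the standard library. This is a sketch under stated assumptions, not the official client: the function names are hypothetical, the token comes from the gcloud command named above (so the Cloud SDK must be installed and on your PATH), and the bearer-token Authorization header is assumed to be accepted by the endpoint.

```python
import json
import subprocess
import urllib.request

SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def get_access_token():
    """Run the gcloud command from this quickstart and return the token it prints."""
    result = subprocess.run(
        ["gcloud", "auth", "application-default", "print-access-token"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def build_http_request(body, token):
    """Build (but do not send) the POST request for the synthesize endpoint."""
    return urllib.request.Request(
        SYNTHESIZE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def synthesize(body):
    """Send the request and return the parsed JSON response as a dict."""
    request = build_http_request(body, get_access_token())
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling synthesize(body) with the request JSON body shown above, loaded as a dictionary, performs the network call. Building the request is kept separate from sending it so that the URL, headers, and payload can be inspected without contacting the service.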
You should receive a JSON response similar to the following:
{ "audioContent": "//NExAASCCIIAAhEAGAAEMW4kAYPnwwIKw/BBTpwTvB+IAxIfghUfW.." }
The JSON output for the REST command contains the synthesized audio in base64-encoded format. Copy the contents of the audioContent field into a new file named synthesize-output-base64.txt. Your new file will look something like the following:
//NExAARqoIIAAhEuWAAAGNmBGMY4EBcxvABAXBPmPIAF//yAuh9Tn5CEap3/o ... VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
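Copying a long base64 string by hand is error-prone, so the extraction step can also be scripted. The sketch below assumes you have the raw JSON response as a string; the helper name save_audio_content is hypothetical.

```python
import json


def save_audio_content(response_json, out_path="synthesize-output-base64.txt"):
    """Write the audioContent field of a synthesize response to out_path.

    Illustrative helper: raises ValueError if the response has no
    audioContent field, for example because the request failed.
    """
    response = json.loads(response_json)
    if "audioContent" not in response:
        raise ValueError("response has no audioContent field")
    # Base64 text is plain ASCII, so write it as text without modification.
    with open(out_path, "w", encoding="ascii") as f:
        f.write(response["audioContent"])
    return out_path
```

The function returns the path it wrote, which can be passed straight to the decoding step.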
Decode the contents of the synthesize-output-base64.txt file into a new file named synthesized-audio.mp3. For information on decoding base64, see Decoding Base64-Encoded Audio Content.
base64 synthesize-output-base64.txt --decode > synthesized-audio.mp3
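On systems where the base64 command is unavailable or takes different flags, the same decoding step can be sketched with Python's standard base64 module. The function name decode_audio_file is hypothetical; the default file names are the ones used in this quickstart.

```python
import base64


def decode_audio_file(src="synthesize-output-base64.txt", dst="synthesized-audio.mp3"):
    """Decode the base64 text in src and write the raw audio bytes to dst.

    Returns the number of audio bytes written.
    """
    with open(src, encoding="ascii") as f:
        encoded = f.read()
    # b64decode ignores newlines and other non-alphabet characters by default,
    # so a trailing newline in the text file is harmless.
    audio = base64.b64decode(encoded)
    with open(dst, "wb") as f:
        f.write(audio)
    return len(audio)
```

The output file must be opened in binary mode ("wb"); writing the decoded bytes through a text-mode file would corrupt the audio.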
Play the contents of synthesized-audio.mp3 in an audio application or on an audio device. You can also open synthesized-audio.mp3 in the Chrome browser by navigating to the folder that contains the file, for example file://my_file_path/synthesized-audio.mp3.
Clean up
To avoid unnecessary Google Cloud Platform charges, use the Cloud Console to delete your project if you do not need it.
What's next
- Learn more about Cloud Text-to-Speech by reading the basics.
- Review the list of available voices you can use for synthetic speech.