Try Gemini 1.5 Pro, our most advanced multimodal model in Vertex AI, and see what you can build with a 1M token context window. Try Gemini 1.5 Pro, our most advanced multimodal model in Vertex AI, and see what you can build with a 1M token context window.

Create audio from text by using the command line

This document walks you through the process of making a request to Text-to-Speech using the command line. To learn more about the fundamental concepts in Text-to-Speech, read Text-to-Speech Basics.

Before you begin

Before you can send a request to the Text-to-Speech API, you must have completed the following actions. See the before you begin page for details.

Enable Text-to-Speech on a GCP project.
Make sure billing is enabled for Text-to-Speech.
Install the Google Cloud CLI, then initialize it by running the following command:
```
gcloud init
```

Synthesize audio from text

You can convert text to audio by making an HTTP POST request to the https://texttospeech.googleapis.com/v1/text:synthesize endpoint. In the body of your POST command, specify the type of voice to synthesize in the voice configuration section, specify the text to synthesize in the text field of the input section, and specify the type of audio to create in the audioConfig section.

Execute the REST request below at the command line to synthesize audio from text using Text-to-Speech. The command uses the gcloud auth application-default print-access-token command to retrieve an authorization token for the request.

Before using any of the request data, make the following replacements:
- PROJECT_ID: the alphanumeric ID of your Google Cloud project.
HTTP method and URL:
```
POST https://texttospeech.googleapis.com/v1/text:synthesize
```
Request JSON body:
```
{
  "input": {
    "text": "Android is a mobile operating system developed by Google, based on the Linux kernel and designed primarily for touchscreen mobile devices such as smartphones and tablets."
  },
  "voice": {
    "languageCode": "en-gb",
    "name": "en-GB-Standard-A",
    "ssmlGender": "FEMALE"
  },
  "audioConfig": {
    "audioEncoding": "MP3"
  }
}
```
To send your request, expand one of these options:
curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login , or by using Cloud Shell, which automatically logs you into the gcloud CLI . You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:
```
curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_ID" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://texttospeech.googleapis.com/v1/text:synthesize"
```
PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login . You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:
```
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://texttospeech.googleapis.com/v1/text:synthesize" | Select-Object -Expand Content
```
You should receive a JSON response similar to the following:
```
{
  "audioContent": "//NExAASCCIIAAhEAGAAEMW4kAYPnwwIKw/BBTpwTvB+IAxIfghUfW.."
}
```
The JSON output for the REST command contains the synthesized audio in base64-encoded format. Copy the contents of the audioContent field into a new file named synthesize-output-base64.txt. Your new file will look something like the following:
```
//NExAARqoIIAAhEuWAAAGNmBGMY4EBcxvABAXBPmPIAF//yAuh9Tn5CEap3/o
...
VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
```
Decode the contents of the synthesize-output-base64.txt file into a new file named synthesized-audio.mp3. For information on decoding base64, see Decoding Base64-Encoded Audio Content.
Linux
1. Copy only the base-64 encoded content into a text file.
2. Decode the source text file using the base64 command line tool by using the -d flag:
```
    $ base64 SOURCE_BASE64_TEXT_FILE -d > DESTINATION_AUDIO_FILE
```
Mac OSX
1. Copy only the base-64 encoded content into a text file.
2. Decode the source text file using the base64 command line tool:
```
    $ base64 --decode SOURCE_BASE64_TEXT_FILE > DESTINATION_AUDIO_FILE
```
Windows
1. Copy only the base-64 encoded content into a text file.
2. Decode the source text file using the certutil command.
```
   certutil -decode SOURCE_BASE64_TEXT_FILE DESTINATION_AUDIO_FILE
```
Play the contents of synthesized-audio.mp3 in an audio application or on an audio device. You can also open the synthesized-audio.mp3 in the Chrome browser to play the audio by navigating to the folder that contains the file, for example file://my_file_path/synthesized-audio.mp3

Clean up

To avoid unnecessary Google Cloud Platform charges, use the Google Cloud console to delete your project if you do not need it.

What's next

Learn more about Cloud Text-to-Speech by reading the basics.
Review the list of available voices you can use for synthetic speech.

Create audio from text by using the command line

Before you begin

Synthesize audio from text

curl (Linux, macOS, or Cloud Shell)

PowerShell (Windows)

Linux

Mac OSX

Windows

Clean up

What's next