 
Synthesizes speech synchronously: receive results after all text input has been processed.
HTTP request
POST https://texttospeech.googleapis.com/v1beta1/text:synthesize
The URL uses gRPC Transcoding syntax.
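As a minimal sketch, the request can be assembled with only the Python standard library. The request below is built but deliberately not sent; `ACCESS_TOKEN` is a placeholder for a real OAuth 2.0 access token obtained out of band.

```python
import json
import urllib.request

# Placeholder token: substitute a real OAuth 2.0 access token.
ACCESS_TOKEN = "ya29.placeholder-token"

url = "https://texttospeech.googleapis.com/v1beta1/text:synthesize"
body = {
    "input": {"text": "Hello, world!"},
    "voice": {"languageCode": "en-US"},
    "audioConfig": {"audioEncoding": "MP3"},
}

# Build the POST request; sending it would be urllib.request.urlopen(request).
request = urllib.request.Request(
    url,
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {ACCESS_TOKEN}",
        "Content-Type": "application/json; charset=utf-8",
    },
    method="POST",
)
print(request.get_method(), request.full_url)
```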
Request body
The request body contains data with the following structure:
| JSON representation |
|---|
| `{ "input": { object (SynthesisInput) }, "voice": { object (VoiceSelectionParams) }, "audioConfig": { object (AudioConfig) }, "enableTimePointing": [ enum (TimepointType) ], "advancedVoiceOptions": { object (AdvancedVoiceOptions) } }` |

| Fields | |
|---|---|
| `input` | `object (SynthesisInput)`. Required. The Synthesizer requires either plain text or SSML as input. |
| `voice` | `object (VoiceSelectionParams)`. Required. The desired voice of the synthesized audio. |
| `audioConfig` | `object (AudioConfig)`. Required. The configuration of the synthesized audio. |
| `enableTimePointing[]` | `enum (TimepointType)`. Whether and what timepoints are returned in the response. |
| `advancedVoiceOptions` | `object (AdvancedVoiceOptions)`. Advanced voice options. |
Response body
The message returned to the client by the text.synthesize method.
If successful, the response body contains data with the following structure:
| JSON representation |
|---|
| `{ "audioContent": string, "timepoints": [ { object (Timepoint) } ], "audioConfig": { object (AudioConfig) } }` |

| Fields | |
|---|---|
| `audioContent` | `string (bytes format)`. The audio data bytes encoded as specified in the request, including the header for encodings that are wrapped in containers (e.g. MP3, OGG_OPUS). For LINEAR16 audio, we include the WAV header. Note: as with all bytes fields, protocol buffers use a pure binary representation, whereas JSON representations use base64. A base64-encoded string. |
| `timepoints[]` | `object (Timepoint)`. A link between a position in the original request input and a corresponding time in the output audio. It's only supported via `<mark>` of SSML input. |
| `audioConfig` | `object (AudioConfig)`. The audio metadata of `audioContent`. |
Authorization scopes
Requires the following OAuth scope:
https://www.googleapis.com/auth/cloud-platform
For more information, see the Authentication Overview.
TimepointType
The type of timepoint information that is returned in the response.
| Enums | |
|---|---|
| `TIMEPOINT_TYPE_UNSPECIFIED` | Not specified. No timepoint information will be returned. |
| `SSML_MARK` | Timepoint information of `<mark>` tags in SSML input will be returned. |
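As a sketch of requesting `SSML_MARK` timepoints: each `<mark name="..."/>` in the SSML input should produce one Timepoint in the response.

```python
# SSML input with two named marks.
ssml = (
    "<speak>"
    'First item<mark name="item1"/>, second item<mark name="item2"/>.'
    "</speak>"
)
request_body = {
    "input": {"ssml": ssml},
    "voice": {"languageCode": "en-US"},
    "audioConfig": {"audioEncoding": "MP3"},
    "enableTimePointing": ["SSML_MARK"],
}
print("marks in input:", ssml.count("<mark"))
```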
AdvancedVoiceOptions
Used for advanced voice options.
| JSON representation |
|---|
| `{ "lowLatencyJourneySynthesis": boolean }` |

| Fields | |
|---|---|
| `lowLatencyJourneySynthesis` | `boolean`. Only for Journey voices. If false, the synthesis is context aware and has a higher latency. |
Timepoint
This contains a mapping between a certain point in the input text and a corresponding time in the output audio.
| JSON representation |
|---|
| `{ "markName": string, "timeSeconds": number }` |

| Fields | |
|---|---|
| `markName` | `string`. Timepoint name as received from the client within `<mark>` tag. |
| `timeSeconds` | `number`. Time offset in seconds from the start of the synthesized audio. |
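A sketch of consuming the `timepoints` array from a response: build a lookup table from mark name to time offset in seconds. The values below are illustrative.

```python
# Simulated timepoints as they would appear in a response.
timepoints = [
    {"markName": "item1", "timeSeconds": 0.62},
    {"markName": "item2", "timeSeconds": 1.48},
]

# Map each mark name to its offset from the start of the audio.
mark_times = {tp["markName"]: tp["timeSeconds"] for tp in timepoints}
print(mark_times)
```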