Contains text input to be synthesized. Either text or ssml must be supplied. Supplying both or neither returns google.rpc.Code.INVALID_ARGUMENT. The input size is limited to 5000 bytes.
| JSON representation | 
|---|
| { "customPronunciations": { object ( | 
| Fields | |
|---|---|
| customPronunciations | 
 Optional. Pronunciation customizations to apply to the input. If this is set, the input is synthesized using the given pronunciation customizations. Initial support is for en-us, with plans to expand to other locales in the future. Instant Clone voices aren't supported. To customize the pronunciation of a phrase, the phrase must match exactly in the input types. If using SSML, the phrase must not be inside a phoneme tag. | 
| Union field input_source. The input source, which is either plain text or SSML. input_source can be only one of the following: | |
| text | 
 The raw text to be synthesized. | 
| markup | 
 Markup for HD voices specifically. This field may not be used with any other voices. | 
| ssml | 
 The SSML document to be synthesized. The SSML document must be valid and well-formed. Otherwise the RPC will fail and return  | 
| multiSpeakerMarkup | 
 The multi-speaker input to be synthesized. Only applicable for multi-speaker synthesis. | 
| prompt | 
 This system instruction is supported only for controllable/promptable voice models. If this system instruction is used, we pass the unedited text to Gemini-TTS. Otherwise, a default system instruction is used. AI Studio calls this system instruction "Style Instructions". |
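
The constraints described above (exactly one of text or ssml, and a 5000-byte input limit) can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_synthesis_input` and `MAX_INPUT_BYTES` are hypothetical names, and the sketch assumes the size limit is measured on the UTF-8 encoding of the chosen field.

```python
import json

# Limit quoted in the description above; the constant name is hypothetical.
MAX_INPUT_BYTES = 5000


def build_synthesis_input(text=None, ssml=None):
    """Hypothetical helper: build the JSON body for the input object.

    Mirrors the documented rule that exactly one of text or ssml is set.
    """
    # Both set or both missing is the case the service rejects.
    if (text is None) == (ssml is None):
        raise ValueError("supply exactly one of text or ssml")

    field, value = ("text", text) if text is not None else ("ssml", ssml)

    # Assumption: the limit applies to the UTF-8 encoded size of the field.
    size = len(value.encode("utf-8"))
    if size > MAX_INPUT_BYTES:
        raise ValueError(f"input is {size} bytes; limit is {MAX_INPUT_BYTES}")

    return {field: value}


if __name__ == "__main__":
    print(json.dumps(build_synthesis_input(text="Hello, world.")))
```

Checking locally only avoids a round trip for the obvious cases; the service remains the authority on what is valid, in particular for SSML well-formedness, which this sketch does not inspect.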