Method: text.synthesize

Synthesizes speech synchronously: results are returned after all text input has been processed.

HTTP request

POST https://texttospeech.googleapis.com/v1/text:synthesize

The URL uses gRPC Transcoding syntax.

Request body

The request body contains data with the following structure:

JSON representation
{
  "input": {
    object (SynthesisInput)
  },
  "voice": {
    object (VoiceSelectionParams)
  },
  "audioConfig": {
    object (AudioConfig)
  },
  "advancedVoiceOptions": {
    object (AdvancedVoiceOptions)
  }
}
Fields
input

object (SynthesisInput)

Required. The Synthesizer requires either plain text or SSML as input.

voice

object (VoiceSelectionParams)

Required. The desired voice of the synthesized audio.

audioConfig

object (AudioConfig)

Required. The configuration of the synthesized audio.

advancedVoiceOptions

object (AdvancedVoiceOptions)

Optional. Advanced voice options.
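As a rough illustration of the request body described above, the following Python sketch assembles the three required top-level fields into a JSON document. The nested field names (`text`, `languageCode`, `audioEncoding`) are assumptions about the SynthesisInput, VoiceSelectionParams, and AudioConfig types, which are not defined in this section, and `build_synthesize_request` is a hypothetical helper, not part of any client library.

```python
import json


def build_synthesize_request(text, language_code="en-US", audio_encoding="MP3"):
    """Return a dict shaped like the text.synthesize request body.

    The nested keys are assumptions; check the SynthesisInput,
    VoiceSelectionParams, and AudioConfig references for the real fields.
    """
    return {
        # Required: plain text input (SSML would use a different field).
        "input": {"text": text},
        # Required: the desired voice; "languageCode" is an assumed field name.
        "voice": {"languageCode": language_code},
        # Required: output configuration; "audioEncoding" is an assumed field name.
        "audioConfig": {"audioEncoding": audio_encoding},
    }


body = build_synthesize_request("Hello, world.")
print(json.dumps(body, indent=2))
```

The optional `advancedVoiceOptions` field is left out here, so the body carries only the fields marked Required.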

Response body

The message returned to the client by the text.synthesize method.

If successful, the response body contains data with the following structure:

JSON representation
{
  "audioContent": string
}
Fields
audioContent

string (bytes format)

The audio data bytes, encoded as specified in the request, including the header for encodings that are wrapped in containers (e.g. MP3, OGG_OPUS). For LINEAR16 audio, the WAV header is included. Note: as with all bytes fields, protocol buffers use a pure binary representation, whereas JSON representations use base64.

A base64-encoded string.
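Because the JSON response carries `audioContent` as base64 text, a client has to decode it before writing or playing the audio. A minimal sketch, using a made-up placeholder payload rather than real synthesized audio (`decode_audio_content` is a hypothetical helper name):

```python
import base64


def decode_audio_content(response_json):
    """Decode the base64 audioContent field of a parsed JSON response into raw bytes."""
    return base64.b64decode(response_json["audioContent"])


# Placeholder response for illustration only; a real response holds encoded audio.
sample_response = {
    "audioContent": base64.b64encode(b"placeholder audio bytes").decode("ascii")
}

audio_bytes = decode_audio_content(sample_response)
print(len(audio_bytes))
```

The decoded bytes can then be written to a file opened in binary mode (`"wb"`). For container encodings the header is already part of the data, so no extra wrapping should be needed.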

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.
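One way to attach credentials is to send an OAuth 2.0 access token for the scope above as a bearer token in the `Authorization` header. The sketch below only constructs the HTTP request with Python's standard library and does not send it. The token value is a placeholder, `make_synthesize_request` is a hypothetical helper, and obtaining a real token is outside the scope of this example.

```python
import json
import urllib.request

SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def make_synthesize_request(body, access_token):
    """Build (but do not send) a POST request for text.synthesize."""
    return urllib.request.Request(
        SYNTHESIZE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumes a bearer token issued for the cloud-platform scope.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


# Empty sub-objects stand in for real SynthesisInput, voice, and audio settings.
request = make_synthesize_request(
    {"input": {}, "voice": {}, "audioConfig": {}},
    "ACCESS_TOKEN_PLACEHOLDER",
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would perform the call. With a valid token and a fully populated body, the response should be the JSON document described under Response body.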

AdvancedVoiceOptions

Used for advanced voice options.

JSON representation
{
  "lowLatencyJourneySynthesis": boolean
}
Fields
lowLatencyJourneySynthesis

boolean

Applies only to Journey voices. If false, the synthesis is context-aware and has higher latency.
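As an illustration, the flag can be added to an existing request body under the `advancedVoiceOptions` key. This sketch uses a hypothetical helper, `with_low_latency`, and copies the body instead of mutating it. It assumes the selected voice is a Journey voice, since the option applies only to those.

```python
def with_low_latency(body, enabled):
    """Return a copy of the request body with lowLatencyJourneySynthesis set."""
    updated = dict(body)  # shallow copy so the caller's dict is left unchanged
    updated["advancedVoiceOptions"] = {"lowLatencyJourneySynthesis": bool(enabled)}
    return updated


# Empty sub-objects stand in for real input, voice, and audio settings.
base_body = {"input": {}, "voice": {}, "audioConfig": {}}
low_latency_body = with_low_latency(base_body, True)
print(low_latency_body["advancedVoiceOptions"])
```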