- Supports 100+ voices across 20+ languages and variants, with more to come soon.
- WaveNet Voices
- Exclusive multilingual access to DeepMind WaveNet voices that provide the most natural-sounding speech.
- Text and SSML Support
- Customize your speech with SSML tags that allow you to add pauses, numbers, date and time formatting, and other pronunciation instructions.
- Speaking Rate Tuning
- Adjust the speaking rate to be up to 4x faster or 4x slower than the normal rate.
- Pitch Tuning
- Customize the pitch of your selected voice, up to 20 semitones above or below the default output.
- Volume Gain Control
- Increase the volume of the output by up to 16 dB or decrease it by up to 96 dB.
- Audio Format Flexibility
- Choose from a number of audio formats, including MP3, LINEAR16, and Ogg Opus.
- Audio Profiles
- Optimize for the type of speaker from which your speech is intended to play, such as headphones or phone lines.
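The tuning options above can be sketched as a single request body. The following is a minimal illustration, not an official client: it only builds and validates a JSON payload and does not call the service. The field names (`speakingRate`, `pitch`, `volumeGainDb`, `audioEncoding`, `effectsProfileId`) follow the REST `text:synthesize` request as I understand it and should be checked against the current API reference. The helper `build_synthesize_request`, the voice name `en-US-Wavenet-D`, and the profile ID `headphone-class-device` are illustrative assumptions. The numeric limits come from the feature list above, with "4x slower" read as a rate of 0.25.

```python
import json

# Limits taken from the feature list; the field names used below are assumptions
# based on the REST text:synthesize request body.
SPEAKING_RATE_RANGE = (0.25, 4.0)      # 4x slower .. 4x faster than normal (1.0)
PITCH_RANGE = (-20.0, 20.0)            # semitones relative to the default
VOLUME_GAIN_DB_RANGE = (-96.0, 16.0)   # dB relative to the default volume
AUDIO_ENCODINGS = {"MP3", "LINEAR16", "OGG_OPUS"}


def _check_range(name, value, bounds):
    """Return value unchanged, or raise ValueError if it is out of bounds."""
    low, high = bounds
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return value


def build_synthesize_request(ssml, language_code, voice_name=None,
                             speaking_rate=1.0, pitch=0.0, volume_gain_db=0.0,
                             audio_encoding="MP3", effects_profile_ids=()):
    """Hypothetical helper: assemble a synthesize request as a plain dict."""
    if audio_encoding not in AUDIO_ENCODINGS:
        raise ValueError(f"unsupported audio encoding: {audio_encoding}")

    audio_config = {
        "audioEncoding": audio_encoding,
        "speakingRate": _check_range("speaking_rate", speaking_rate,
                                     SPEAKING_RATE_RANGE),
        "pitch": _check_range("pitch", pitch, PITCH_RANGE),
        "volumeGainDb": _check_range("volume_gain_db", volume_gain_db,
                                     VOLUME_GAIN_DB_RANGE),
    }
    if effects_profile_ids:
        # Audio profile(s) matching the playback device.
        audio_config["effectsProfileId"] = list(effects_profile_ids)

    voice = {"languageCode": language_code}
    if voice_name:
        voice["name"] = voice_name

    return {"input": {"ssml": ssml}, "voice": voice, "audioConfig": audio_config}


request = build_synthesize_request(
    ssml='<speak>Hello<break time="500ms"/>world.</speak>',  # SSML pause
    language_code="en-US",
    voice_name="en-US-Wavenet-D",                    # assumed voice name
    speaking_rate=1.25,
    pitch=-2.0,
    volume_gain_db=3.0,
    audio_encoding="OGG_OPUS",
    effects_profile_ids=["headphone-class-device"],  # assumed profile ID
)
print(json.dumps(request, indent=2))
```

Validating the ranges locally is a design choice that surfaces out-of-range values before any network call. The service performs its own validation, and its limits may differ from those assumed here.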
|Feature|Monthly free tier|Paid usage|
|---|---|---|
|Standard (non-WaveNet) voices|0 to 4 million characters|$4.00 USD / 1 million characters|
|WaveNet voices|0 to 1 million characters|$16.00 USD / 1 million characters|
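As a rough worked example of the pricing table, the sketch below estimates a monthly bill from a character count. It assumes that only characters beyond the free tier are billed and that they are billed pro rata per character. Actual billing granularity and rounding are not stated on this page, so treat the result as an estimate. The function name `estimate_monthly_cost` and the tier keys are hypothetical.

```python
from decimal import Decimal

# Free-tier sizes and per-million rates copied from the pricing table.
PRICING = {
    "standard": {"free_chars": 4_000_000, "usd_per_million": Decimal("4.00")},
    "wavenet": {"free_chars": 1_000_000, "usd_per_million": Decimal("16.00")},
}


def estimate_monthly_cost(characters, voice_type):
    """Estimate the monthly cost in USD (assumes pro-rata billing past the free tier)."""
    if characters < 0:
        raise ValueError("characters must be non-negative")
    tier = PRICING[voice_type]
    billable = max(0, characters - tier["free_chars"])
    cost = Decimal(billable) / Decimal(1_000_000) * tier["usd_per_million"]
    return cost.quantize(Decimal("0.01"))


print(estimate_monthly_cost(5_000_000, "standard"))  # 1M billable chars -> 4.00
print(estimate_monthly_cost(1_500_000, "wavenet"))   # 0.5M billable chars -> 8.00
print(estimate_monthly_cost(900_000, "wavenet"))     # within free tier -> 0.00
```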
A product or feature listed on this page is in beta. For more information on our product launch stages, see here.
Cloud AI products comply with the SLA policies listed here. They may offer different latency or availability guarantees from other Google Cloud services.