Text-to-Speech release notes

This page documents production updates to Text-to-Speech. You can periodically check this page for announcements about new or updated features, bug fixes, known issues, and deprecated functionality.

You can see the latest product updates for all of Google Cloud on the Google Cloud page, browse and filter all release notes in the Google Cloud console, or programmatically access release notes in BigQuery.

To get the latest product updates delivered to you, add the URL of this page to your feed reader, or add the feed URL directly.

October 30, 2024

Studio Voices now support synthesis with multiple speakers to generate audio for interviews, interactive storytelling, video games, e-learning platforms, and accessibility solutions.

October 18, 2024

Journey Voices and streaming synthesis now support the de-de, en-gb, en-in, es-us, fr-ca, fr-fr, and it-it locales.

September 10, 2024

Journey Voices are now in Preview and support text streaming.

May 14, 2024

Cloud Text-to-Speech now offers updated Journey voices with an additional speaker, en-us-Journey-O.

April 19, 2024

Cloud Text-to-Speech now offers es-ES Studio voices: es-ES-Studio-C and es-ES-Studio-F.

February 26, 2024

Studio voices are now GA.

Casual voices are now in preview.

December 29, 2023

Journey voices are now available as an experimental feature.

November 29, 2023

Cloud Text-to-Speech now offers de-DE and fr-FR Studio voices: de-DE-Studio-B, de-DE-Studio-C, fr-FR-Studio-A, and fr-FR-Studio-D.

November 06, 2023

As of November 13, 2023, speaker en-US-Studio-M will no longer be available. All requests sent to en-US-Studio-M will be routed to speaker en-US-Studio-Q. No action is needed.

November 03, 2023

Cloud Text-to-Speech now offers en-GB Studio voices: en-GB-Studio-B and en-GB-Studio-C.

October 25, 2023

Styles are now supported through SSML in the en-us-Neural2-F and en-us-Neural2-J voices. The following styles are supported:

  • <google:emotion name="apologetic">
  • <google:emotion name="calm">
  • <google:emotion name="empathetic">
  • <google:emotion name="firm">
  • <google:emotion name="lively">
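
As a rough illustration of how one of these styles might be applied, the sketch below wraps text in the <google:emotion> element named in this note and places it inside a standard <speak> root. The helper name styled_ssml is hypothetical, and treating the element as a paired tag around the styled text is an assumption; check the SSML reference for the exact form.

```python
from xml.sax.saxutils import escape

# Style names listed in this release note.
SUPPORTED_STYLES = {"apologetic", "calm", "empathetic", "firm", "lively"}


def styled_ssml(text: str, style: str) -> str:
    """Wrap text in a <google:emotion> element inside a <speak> root.

    Assumption: the element is used as an open/close pair around the text.
    """
    if style not in SUPPORTED_STYLES:
        raise ValueError(f"unsupported style: {style!r}")
    # Escape &, <, and > so the input text cannot break the markup.
    body = escape(text)
    return f'<speak><google:emotion name="{style}">{body}</google:emotion></speak>'


print(styled_ssml("Thanks for waiting & sorry for the delay.", "apologetic"))
```

The resulting string would be sent as the SSML input of a synthesis request that selects one of the two voices above.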

October 24, 2023

Studio voices now support 5,000 bytes of either text or SSML input per synthesis request.
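
Because the limit is expressed in bytes rather than characters, non-ASCII text reaches it sooner than its character count suggests. The following is a minimal sketch of a pre-flight check and a naive splitter for plain text, assuming the limit applies to the UTF-8 encoding of the input. The names MAX_INPUT_BYTES, fits_in_one_request, and split_by_bytes are hypothetical and not part of the API.

```python
from typing import List

MAX_INPUT_BYTES = 5000  # per-request limit stated in this note


def fits_in_one_request(text: str) -> bool:
    """Return True if the UTF-8 encoding of text is within the limit."""
    return len(text.encode("utf-8")) <= MAX_INPUT_BYTES


def split_by_bytes(text: str, limit: int = MAX_INPUT_BYTES) -> List[str]:
    """Greedily pack whitespace-separated words into chunks of at most `limit` bytes."""
    chunks: List[str] = []
    current = ""
    for word in text.split():
        if len(word.encode("utf-8")) > limit:
            raise ValueError("a single word exceeds the byte limit")
        candidate = f"{current} {word}" if current else word
        if len(candidate.encode("utf-8")) <= limit:
            current = candidate
        else:
            # The current chunk is full; start a new one with this word.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


print(fits_in_one_request("é" * 2501))  # 2,501 characters, but 5,002 bytes
```

Splitting on whitespace is only suitable for plain text; SSML input would need to be split on element boundaries so that each chunk stays well-formed.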

Long Audio Synthesis now supports SSML inputs.

October 16, 2023

The Long Audio Synthesis API now supports the following languages: English, Spanish, French, German, Japanese, Hindi, Italian, Korean, Portuguese, Thai, Vietnamese, Danish, Filipino.

There is no longer billing differentiation for Cloud Text-to-Speech Offline Custom Voice API calls. See the <ReportedUsage> documentation for more details.

June 28, 2023

Studio voices now support SSML, except for the following tags: <mark>, <emphasis>, <prosody>, and <lang>.

March 16, 2023

Cloud Text-to-Speech now offers Long Audio Synthesis. This new API can be used to synthesize texts longer than 5 KB. For more information about API usage using the command line, see Create long audio from text by using the command line.

March 06, 2023

Text-to-Speech now offers a Spanish Studio voice, es-US-Studio-B, in addition to its existing English Studio voices.

February 16, 2023

Text-to-Speech offers these new voices. See the supported voices page for a complete list of voices and audio samples.

  • eu-ES-Standard-A
  • gl-ES-Standard-A

February 08, 2023

Text-to-Speech now offers Studio voices. This voice type is designed specifically for use with long-form texts such as narration and news reading. See the supported voices page for a complete list of voices and audio samples.

  1. en-US-Studio-M
  2. en-US-Studio-O

January 10, 2023

On or after July 9th, 2023, Cloud Text-to-Speech will replace the following voices with new voices of similar quality and accent. The new voices are available to try now. No action will be needed from you to switch to the new voice on July 9th, 2023. However, you are free to switch to the new voice at any time.

  • Removing ml-IN-Standard-A
    • Redirecting to ml-IN-Standard-C
  • Removing ml-IN-Wavenet-A
    • Redirecting to ml-IN-Wavenet-C
  • Removing ml-IN-Standard-B
    • Redirecting to ml-IN-Standard-D
  • Removing ml-IN-Wavenet-B
    • Redirecting to ml-IN-Wavenet-D
  • Removing bn-IN-Standard-A
    • Redirecting to bn-IN-Standard-C
  • Removing bn-IN-Wavenet-A
    • Redirecting to bn-IN-Wavenet-C
  • Removing bn-IN-Standard-B
    • Redirecting to bn-IN-Standard-D
  • Removing bn-IN-Wavenet-B
    • Redirecting to bn-IN-Wavenet-D
  • Removing kn-IN-Standard-A
    • Redirecting to kn-IN-Standard-C
  • Removing kn-IN-Wavenet-A
    • Redirecting to kn-IN-Wavenet-C
  • Removing kn-IN-Standard-B
    • Redirecting to kn-IN-Standard-D
  • Removing kn-IN-Wavenet-B
    • Redirecting to kn-IN-Wavenet-D
  • Removing gu-IN-Standard-A
    • Redirecting to gu-IN-Standard-C
  • Removing gu-IN-Wavenet-A
    • Redirecting to gu-IN-Wavenet-C
  • Removing gu-IN-Standard-B
    • Redirecting to gu-IN-Standard-D
  • Removing gu-IN-Wavenet-B
    • Redirecting to gu-IN-Wavenet-D
  • Removing it-IT-Standard-A
    • Redirecting to it-IT-Standard-B
  • Removing it-IT-Wavenet-A
    • Redirecting to it-IT-Wavenet-B
  • Removing es-ES-Standard-A
    • Redirecting to es-ES-Standard-C
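
Callers who prefer to switch ahead of the cutover could resolve voice names through a lookup table. The following is a minimal sketch using a subset of the pairs listed above; VOICE_REDIRECTS and resolve_voice are hypothetical names, not part of the Text-to-Speech API.

```python
# Subset of the old -> new pairs listed in this note; extend as needed.
VOICE_REDIRECTS = {
    "ml-IN-Standard-A": "ml-IN-Standard-C",
    "ml-IN-Wavenet-A": "ml-IN-Wavenet-C",
    "bn-IN-Standard-B": "bn-IN-Standard-D",
    "it-IT-Standard-A": "it-IT-Standard-B",
    "it-IT-Wavenet-A": "it-IT-Wavenet-B",
    "es-ES-Standard-A": "es-ES-Standard-C",
}


def resolve_voice(name: str) -> str:
    """Return the replacement voice if one is listed, else the name unchanged."""
    return VOICE_REDIRECTS.get(name, name)


print(resolve_voice("it-IT-Wavenet-A"))  # it-IT-Wavenet-B
print(resolve_voice("mr-IN-Wavenet-A"))  # not in the table, returned unchanged
```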

December 22, 2022

Text-to-Speech now offers these new voices. See the supported voices page for a complete list of voices and audio samples.

  1. ml-IN-Wavenet-C
  2. ml-IN-Wavenet-D

Note that these voices are bilingual with en-IN.

Text-to-Speech now offers these new news reading voices. See the supported voices page for a complete list of voices and audio samples.

  1. es-US-News-D
  2. es-US-News-E
  3. es-US-News-F
  4. es-US-News-G
  5. en-AU-News-E
  6. en-AU-News-F
  7. en-AU-News-G
  8. en-GB-News-G
  9. en-GB-News-H
  10. en-GB-News-I
  11. en-GB-News-J
  12. en-GB-News-K
  13. en-GB-News-L
  14. en-GB-News-M

November 29, 2022

Text-to-Speech now offers additional Neural2 voices across 9 locales with 40+ speakers. Voices are available in the us-central1, us, and eu endpoints. See the supported voices page for a complete list of voices and audio samples.

November 10, 2022

Text-to-Speech now offers these new voices. See the supported voices page for a complete list of voices and audio samples.

  1. en-US-News-K
  2. en-US-News-L
  3. en-US-News-M
  4. en-US-News-N

October 24, 2022

Text-to-Speech improved the quality of these voices. See the supported voices page for a complete list of voices and audio samples.

  1. en-GB-Wavenet-A
  2. en-GB-Wavenet-B
  3. en-GB-Wavenet-C
  4. en-GB-Wavenet-D
  5. en-GB-Wavenet-F
  6. es-ES-Wavenet-B
  7. es-ES-Wavenet-C
  8. es-ES-Wavenet-D
  9. hi-IN-Wavenet-A
  10. hi-IN-Wavenet-B
  11. hi-IN-Wavenet-C
  12. hi-IN-Wavenet-D

October 07, 2022

Text-to-Speech now offers these new voices. See the supported voices page for a complete list of voices and audio samples.

  1. mr-IN-Wavenet-A
  2. mr-IN-Standard-A
  3. mr-IN-Wavenet-B
  4. mr-IN-Standard-B
  5. mr-IN-Wavenet-C
  6. mr-IN-Standard-C

On or after April 8th, 2023, Cloud Text-to-Speech will replace the following voices with new voices of similar quality and accent. The new voices are available to try now. No action will be needed from you to switch to the new voice on April 8th, 2023. However, you are free to switch to the new voice at any time.

  1. Removing ta-IN-Standard-A
    1. Redirecting to ta-IN-Standard-C
  2. Removing ta-IN-Wavenet-A
    1. Redirecting to ta-IN-Wavenet-C
  3. Removing ta-IN-Standard-B
    1. Redirecting to ta-IN-Standard-D
  4. Removing ta-IN-Wavenet-B
    1. Redirecting to ta-IN-Wavenet-D
  5. Removing pt-BR-Standard-A
    1. Redirecting to pt-BR-Standard-C
  6. Removing pt-BR-Wavenet-A
    1. Redirecting to pt-BR-Wavenet-C
  7. Removing ja-JP-Standard-A
    1. Redirecting to ja-JP-Standard-B
  8. Removing ja-JP-Wavenet-A
    1. Redirecting to ja-JP-Wavenet-B

September 01, 2022

Text-to-Speech now offers these new voices. See the supported voices page for a complete list of voices and audio samples.

  1. ta-IN-Wavenet-C
  2. ta-IN-Standard-C
  3. ta-IN-Wavenet-D
  4. ta-IN-Standard-D

August 19, 2022

Text-to-Speech has improved the quality of these voices:

  1. pt-BR-Standard-A
  2. pt-BR-Standard-B

August 05, 2022

Text-to-Speech now offers these new voices. See the supported voices page for a complete list of voices and audio samples.

  1. pt-BR-Standard-C
  2. pt-BR-Wavenet-C

June 27, 2022

Cloud Text-to-Speech now supports Neural2 voices in addition to Standard and WaveNet voice generation models. Neural2 uses Custom Voice technology without the need to train a unique voice. Neural2 voices are in Preview and are currently available in a single region for a limited number of languages.

March 09, 2022

Text-to-Speech now offers regional endpoints for the following places. See the How-to guides for more information on how to use these endpoints.

  • Europe: https://eu-texttospeech.googleapis.com
  • US: https://us-texttospeech.googleapis.com
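
The following is a minimal sketch of selecting one of these endpoints in client code. The two regional hosts come from this note; the global host https://texttospeech.googleapis.com, the /v1/text:synthesize path, and the helper name synthesize_url are assumptions to be checked against the REST reference.

```python
ENDPOINTS = {
    "global": "https://texttospeech.googleapis.com",  # assumed default host
    "eu": "https://eu-texttospeech.googleapis.com",   # from this note
    "us": "https://us-texttospeech.googleapis.com",   # from this note
}


def synthesize_url(region: str = "global") -> str:
    """Build the synthesize URL for a region (path is an assumption)."""
    try:
        base = ENDPOINTS[region]
    except KeyError:
        raise ValueError(f"unknown region: {region!r}") from None
    return f"{base}/v1/text:synthesize"


print(synthesize_url("eu"))
```

Client libraries usually expose an option for overriding the API host instead of building URLs by hand; the How-to guides describe the supported approach.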

June 17, 2021

Text-to-Speech now offers voices in the following new languages. See the supported voices page for a complete list of voices and audio samples.

  • ms-MY (Malay, Malaysia)
  • nl-BE (Dutch, Belgium)

April 09, 2021

Text-to-Speech now offers voices in the following new languages. See the supported voices page for a complete list of voices and audio samples.

  • es-US (Spanish, US)
  • af-ZA (Afrikaans, South Africa)
  • bg-BG (Bulgarian, Bulgaria)
  • ca-ES (Catalan, Spain)
  • is-IS (Icelandic, Iceland)
  • lv-LV (Latvian, Latvia)
  • sr-RS (Serbian, Cyrillic)

April 07, 2021

Text-to-Speech now supports MULAW and ALAW audio encodings. See the AudioEncoding reference documentation for details.
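
As an illustration, a request that selects one of these encodings might look like the body built below. This is a sketch under assumptions: the JSON field names (input, voice, audioConfig, audioEncoding, sampleRateHertz) follow the REST request shape as commonly documented, the 8000 Hz sample rate is a typical telephony value rather than something stated in this note, and build_request is a hypothetical helper.

```python
import json

TELEPHONY_ENCODINGS = {"MULAW", "ALAW"}  # encodings named in this note


def build_request(text: str, encoding: str = "MULAW") -> str:
    """Return a JSON request body that asks for MULAW or ALAW audio."""
    if encoding not in TELEPHONY_ENCODINGS:
        raise ValueError(f"expected one of {sorted(TELEPHONY_ENCODINGS)}")
    body = {
        "input": {"text": text},
        "voice": {"languageCode": "en-US"},
        "audioConfig": {
            "audioEncoding": encoding,
            "sampleRateHertz": 8000,  # assumed telephony sample rate
        },
    }
    return json.dumps(body, indent=2)


print(build_request("Your call is important to us.", "ALAW"))
```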

March 01, 2021

Text-to-Speech has launched Beta support of new SSML tags: <phoneme>, <mark>, <lang>, <voice>, and <say-as interpret-as="duration"> to specify durations. See the phonemes page for a list of phonemes available for your language.
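
To show how two of these tags fit together, the sketch below builds a small SSML document with <phoneme> and <mark> using Python's standard XML library, which keeps the markup well-formed. The attribute names alphabet, ph, and name follow the W3C SSML specification; the IPA string, the mark name, and the helper build_ssml are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def build_ssml() -> str:
    """Return an SSML string that uses <phoneme> and <mark>."""
    speak = ET.Element("speak")
    speak.text = "The next stop is "
    # <phoneme> supplies an explicit pronunciation for the enclosed word.
    phoneme = ET.SubElement(
        speak, "phoneme", {"alphabet": "ipa", "ph": "ˌmænɪˈtoʊbə"}
    )
    phoneme.text = "Manitoba"
    phoneme.tail = "."
    # <mark> is an empty element that labels a position in the audio.
    ET.SubElement(speak, "mark", {"name": "after_place"})
    return ET.tostring(speak, encoding="unicode")


print(build_ssml())
```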

Support for the <prosody> SSML tag has been enhanced to produce continuous TTS when possible.

  • Text-to-Speech has resolved an issue that affected how volume changes are calculated, resulting in different but correct behavior.
  • Text-to-Speech has resolved an issue that affected how pitch changes are calculated, resulting in different but correct behavior.

Text-to-Speech has improved the continuity of mixed-media results. Now when you mix text and sounds within an <s>...</s> block, Text-to-Speech generates a much shorter pause and better transition between the synthesized speech and the sound.

Text-to-Speech has improved its handling of speech synthesis requests sent using SSML markup.

Text-to-Speech has improved the verbalization and pacing of phone numbers.

January 22, 2021

New language: Text-to-Speech now supports Romanian (ro-RO). See the supported voices page for details and audio samples.

New voice: Text-to-Speech now offers 2 new Bengali (bn-IN) WaveNet voices. See the supported voices page for details and audio samples.

August 24, 2020

Text-to-Speech now offers four new English (US) voices, available as both WaveNet and Standard models. See the supported voices and languages page for more details.

Text-to-Speech now offers four new Chinese (Hong Kong) voices, available as Standard models. See the supported voices and languages page for more details.

May 01, 2020

Cloud Text-to-Speech now offers 36 new voices (both Standard and WaveNet) in the following languages. See the Supported Voices and Languages page for complete details.

  • Arabic
  • Bengali (India)
  • English (India)
  • French (France)
  • German (Germany)
  • Gujarati (India)
  • Hindi (India)
  • Indonesian (Indonesia)
  • Kannada (India)
  • Malayalam (India)
  • Mandarin Chinese
  • Russian (Russia)
  • Tamil (India)
  • Telugu (India)
  • Thai (Thailand)

August 27, 2019

Cloud Text-to-Speech now offers 76 new voices, both standard and WaveNet, in the following languages:

  • Arabic
  • Czech
  • Dutch
  • English (India)
  • Filipino
  • Finnish
  • Greek
  • Hindi
  • Hungarian
  • Indonesian
  • Italian
  • Japanese
  • Mandarin Chinese
  • Norwegian
  • Vietnamese

February 05, 2019

The audio profile feature is generally available for use in new applications. Cloud Text-to-Speech API now allows developers to specify an audio profile for the audio generated from Cloud Text-to-Speech API. Audio profiles are optimized for specific hardware used for playback, from headphones to car stereos.
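
A minimal sketch of attaching an audio profile to an existing request body follows. The field name effectsProfileId and the profile ID headphone-class-device are assumptions drawn from the API reference as commonly documented and should be verified there; with_audio_profile is a hypothetical helper.

```python
import copy


def with_audio_profile(request: dict, profile_id: str) -> dict:
    """Return a copy of a synthesize request with an audio profile appended."""
    updated = copy.deepcopy(request)  # leave the caller's dict untouched
    audio_config = updated.setdefault("audioConfig", {})
    audio_config.setdefault("effectsProfileId", []).append(profile_id)
    return updated


base = {"input": {"text": "Hello"}, "audioConfig": {"audioEncoding": "MP3"}}
print(with_audio_profile(base, "headphone-class-device"))
```

The profile is stored as a list because more than one profile ID may be supplied in a single request.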

Added new Standard and WaveNet voices in the following languages and variants:

  • Danish (Denmark)
  • Polish (Poland)
  • Portuguese (Brazil)
  • Russian (Russia)
  • Slovak (Slovakia)
  • Turkish (Turkey)
  • Ukrainian (Ukraine)

Review the voices list for complete details.

August 28, 2018

Cloud Text-to-Speech API general availability (GA) release.

This release includes the public availability of the v1 API endpoint, both in REST and RPC.

July 24, 2018

Added new WaveNet voices in the following languages and variants:

  • Dutch (Netherlands)
  • English (Australia)
  • English (UK)
  • German
  • Italian
  • Japanese

Review the voices list for complete details.

Cloud Text-to-Speech API now allows developers to specify an audio profile for the audio generated from Cloud Text-to-Speech API. Audio profiles are optimized for specific hardware used for playback, from headphones to car stereos.

June 01, 2018

Added Korean (ko-KR) voice. Review the voices list for complete details.

March 27, 2018

Cloud Text-to-Speech API is now available in beta.

March 20, 2018

The names of voices provided for speech synthesis in the Text-to-Speech API have changed. Previous versions of the voice names do not work. To see the list of voices provided in the Text-to-Speech API including the correct names, see the voice list.

March 02, 2018

The gender field changed to ssmlGender.
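
Requests written before this change may still carry the old key. The sketch below renames it in a dictionary that represents a voice selection; the surrounding keys and the value FEMALE are illustrative assumptions, and migrate_gender_field is a hypothetical helper.

```python
def migrate_gender_field(voice: dict) -> dict:
    """Return a copy of a voice dict with `gender` renamed to `ssmlGender`."""
    migrated = dict(voice)  # shallow copy so the input is not modified
    if "gender" in migrated:
        value = migrated.pop("gender")
        # Keep an existing ssmlGender value if both keys are present.
        migrated.setdefault("ssmlGender", value)
    return migrated


print(migrate_gender_field({"languageCode": "en-US", "gender": "FEMALE"}))
```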

February 02, 2018

Added voices in the following languages:

  • English (US)
  • French (Canada)
  • Dutch
  • Portuguese (Brazil)
  • Swedish
  • Turkish

See the voices list for complete details.

November 10, 2017

Cloud Text-to-Speech API Alpha release.