AI & Machine Learning

Google Cloud Text-to-Speech API now supports custom voices

March 4, 2022

Calum Barnes

Product Manager, Cloud Speech

With the rise of digital assistants and conversational interfaces, people have grown accustomed to hearing and speaking to synthetic voices. But how do these voices sound and how do they reflect on your brand? It's important for all companies to build a strong identity and brand association with their conversational AI systems and this starts with the synthetic voice.

That’s why we are excited to announce the general availability of Custom Voice in our Cloud Text-to-Speech (TTS) API, a new feature that lets you train custom voice models with your own audio recordings to create unique experiences.

For businesses looking to build a strong brand identity, establishing a unique voice can help turn mobile app interactions or customer service based on interactive voice responses (IVR) into differentiated customer experiences. Our TTS API has included a speech synthesis service with a static list of voices for some time, but now, with Custom Voice, moving beyond these predefined options is easier than ever.

Custom Voice lets you submit your audio recordings and then access the resulting voice directly in the TTS API. Custom Voice TTS includes guidance on the audio requirements to help make sure you generate a high-quality custom TTS voice model. Once the model is trained, all you have to do to start using the new voice is reference the model ID in your calls to the Cloud TTS API.
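To make "reference the model ID in your calls" more concrete, here is a minimal sketch of what a request body for the REST `text:synthesize` method might look like. It assumes the custom model is passed in a `customVoice.model` field inside `voice`; the helper name `build_custom_voice_request` and the model ID string are hypothetical placeholders, so check the current API reference for the exact field names and ID format before relying on this.

```python
import json


def build_custom_voice_request(text, language_code, model_id):
    """Build a JSON-serializable body for a synthesis request.

    Assumption: the custom voice is selected via voice.customVoice.model.
    This helper is illustrative and not part of any Google library.
    """
    return {
        "input": {"text": text},
        "voice": {
            "languageCode": language_code,
            # The trained model's ID goes here instead of a predefined voice name.
            "customVoice": {"model": model_id},
        },
        "audioConfig": {"audioEncoding": "LINEAR16"},
    }


if __name__ == "__main__":
    body = build_custom_voice_request(
        text="Thanks for calling. How can I help you today?",
        language_code="en-US",
        model_id="YOUR_MODEL_ID",  # placeholder, not a real model ID
    )
    print(json.dumps(body, indent=2))
```

The only difference from a request that uses a predefined voice is the `voice` block: the model ID takes the place of a voice name, while the input text and audio configuration stay the same.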

At Google, we are committed to building safe and accountable AI products, not only because it’s the right thing to do, but because it is a critical step in ensuring successful use in production. As part of Google Cloud’s Responsible AI governance process, we conducted a deep ethical evaluation of Custom Voice TTS, and its relation to synthetic media, in order to surface and mitigate potential harms that it may create. If you are interested in Custom Voice TTS, there is a review process to help ensure each use case is aligned with our AI Principles and adequate voice actor consent is given.

Additionally, to verify that voice actors are actually the ones producing the audio, you will need to submit an audio file of the voice actor reading a sentence that Google Cloud chooses (for example: “I agree that my voice will be used to create a synthetic custom Text-to-Speech voice”).

We’re looking forward to seeing this API help businesses solve problems in an easy, fast, and scalable way. TTS Custom Voice is now GA in these languages:

English (US)
English (AU)
English (UK)
Spanish (US)
Spanish (Spain)
French (France)
French (Canada)
Italian (Italy)
German (Germany)
Portuguese (Brazil)
Japanese (Japan)

We plan to continue expanding this lineup in order to meet your needs. Ready to try for yourself? Contact your seller to get started on your use case evaluation today!
