Note: This feature is available only to allowlisted projects. Contact us if you want to use this feature.
This page describes how to create a dialogue with multiple speakers using Text-to-Speech.
You can generate audio with multiple speakers to create a dialogue. This can be useful for interviews, interactive storytelling, video games, e-learning platforms, and accessibility solutions.
The following voice is supported for audio with multiple speakers:
en-US-Studio-MultiSpeaker
Speaker: R
Speaker: S
Speaker: T
Speaker: U
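Before sending a request, it can help to check a dialogue against the speaker labels listed above. The following is a minimal sketch in plain Python that does not call the API. The helper name `check_speakers` and the `max_speakers` parameter are hypothetical and not part of the client library. `max_speakers` defaults to 2 as an assumption about the feature's current limit, so adjust it if your project's limits differ.

```python
# Speaker labels listed on this page for the multi-speaker voice.
SUPPORTED_SPEAKERS = frozenset({"R", "S", "T", "U"})


def check_speakers(turns, max_speakers=2):
    """Validate (speaker, text) pairs and return the distinct speakers.

    Speakers are returned in order of first appearance. `max_speakers`
    is an assumption of this sketch, not an API parameter.
    """
    seen = []
    for index, (speaker, text) in enumerate(turns):
        if speaker not in SUPPORTED_SPEAKERS:
            raise ValueError(f"turn {index}: unsupported speaker {speaker!r}")
        if not text.strip():
            raise ValueError(f"turn {index}: empty text")
        if speaker not in seen:
            seen.append(speaker)
    if len(seen) > max_speakers:
        raise ValueError(
            f"dialogue uses {len(seen)} speakers, limit is {max_speakers}"
        )
    return seen


if __name__ == "__main__":
    dialogue = [("R", "Hello!"), ("S", "Hi there."), ("R", "Shall we start?")]
    print(check_speakers(dialogue))
```

Running the check locally surfaces a mistyped label or an empty turn before any request is made.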
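Example. This sample is audio that was generated using multiple speakers.
Note: This feature supports a maximum of two speakers. It is an experimental offering and is not in the list of voices.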
Example of how to use multi-speaker markup
The following Python example demonstrates how to use multi-speaker markup. To authenticate to Text-to-Speech, set up Application Default Credentials.

"""Synthesizes speech for multiple speakers.
Make sure to be working in a virtual environment.
"""
from google.cloud import texttospeech_v1beta1 as texttospeech

# Instantiates a client
client = texttospeech.TextToSpeechClient()

# Each turn pairs a line of dialogue with the speaker who says it
multi_speaker_markup = texttospeech.MultiSpeakerMarkup(
    turns=[
        texttospeech.MultiSpeakerMarkup.Turn(
            text="I've heard that the Google Cloud multi-speaker audio generation sounds amazing!",
            speaker="R",
        ),
        texttospeech.MultiSpeakerMarkup.Turn(
            text="Oh? What's so good about it?", speaker="S"
        ),
        texttospeech.MultiSpeakerMarkup.Turn(text="Well..", speaker="R"),
        texttospeech.MultiSpeakerMarkup.Turn(text="Well what?", speaker="S"),
        texttospeech.MultiSpeakerMarkup.Turn(
            text="Well, you should find it out by yourself!", speaker="R"
        ),
        texttospeech.MultiSpeakerMarkup.Turn(
            text="Alright alright, let's try it out!", speaker="S"
        ),
    ]
)

# Set the text input to be synthesized
synthesis_input = texttospeech.SynthesisInput(
    multi_speaker_markup=multi_speaker_markup
)

# Build the voice request, select the language code ('en-US') and the voice
voice = texttospeech.VoiceSelectionParams(
    language_code="en-US", name="en-US-Studio-MultiSpeaker"
)

# Select the type of audio file you want returned
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3
)

# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
    input=synthesis_input, voice=voice, audio_config=audio_config
)

# The response's audio_content is binary.
with open("output.mp3", "wb") as out:
    # Write the response to the output file.
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')
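Writing each turn by hand gets verbose for longer dialogues. As a sketch only, the turns could be built from a plain-text script in which each line reads `SPEAKER: text`. That script format and the helper names `parse_script` and `to_markup` are assumptions made for illustration and are not part of the client library. Only `MultiSpeakerMarkup` and `MultiSpeakerMarkup.Turn` come from the sample above.

```python
import re

# One dialogue line: a single-letter speaker label, a colon, then the text.
# This script format is an assumption of the sketch, not an API format.
_LINE = re.compile(r"^\s*([A-Za-z])\s*:\s*(.+?)\s*$")


def parse_script(script):
    """Parse 'SPEAKER: text' lines into a list of (speaker, text) pairs."""
    turns = []
    for line_number, line in enumerate(script.splitlines(), start=1):
        if not line.strip():
            continue  # Skip blank lines between turns.
        match = _LINE.match(line)
        if match is None:
            raise ValueError(f"line {line_number}: expected 'SPEAKER: text'")
        turns.append((match.group(1).upper(), match.group(2)))
    return turns


def to_markup(turns):
    """Convert (speaker, text) pairs into a MultiSpeakerMarkup message."""
    # Imported here so parse_script works without the client library installed.
    from google.cloud import texttospeech_v1beta1 as texttospeech

    return texttospeech.MultiSpeakerMarkup(
        turns=[
            texttospeech.MultiSpeakerMarkup.Turn(text=text, speaker=speaker)
            for speaker, text in turns
        ]
    )


if __name__ == "__main__":
    script = """
    R: Have you tried multi-speaker audio yet?
    S: Not yet. Is it hard to set up?
    """
    print(parse_script(script))
```

The result of `to_markup(parse_script(script))` can be passed as `multi_speaker_markup` to `texttospeech.SynthesisInput`, as in the sample above.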
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-09-08 UTC.