# Generate dialogue with multiple speakers

**Note:** This feature is only available to projects on an allowlist. Contact us if you want to use this feature.

This page describes how to create a dialogue with multiple speakers using Text-to-Speech.
You can generate audio with multiple speakers to create a dialogue. This can be useful for interviews, interactive storytelling, video games, e-learning platforms, and accessibility solutions.
The following voice is supported for multi-speaker audio:

- `en-US-Studio-Multispeaker`
  - speaker: `R`
  - speaker: `S`
  - speaker: `T`
  - speaker: `U`
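Each speaker label is referenced from the `speaker` field of a turn in the multi-speaker markup, while the voice itself is selected once for the whole request. The following minimal sketch illustrates that pairing; it assumes the `google-cloud-texttospeech` Python client library used in the complete example later on this page, and the dialogue text is purely illustrative.

    from google.cloud import texttospeech_v1beta1 as texttospeech

    # One turn of dialogue, spoken by speaker "R".
    turn = texttospeech.MultiSpeakerMarkup.Turn(
        text="Hello! Have you tried multi-speaker synthesis yet?", speaker="R"
    )

    # The multi-speaker Studio voice is selected for the request as a whole;
    # individual speakers are chosen per turn, not here.
    voice = texttospeech.VoiceSelectionParams(
        language_code="en-US", name="en-US-Studio-MultiSpeaker"
    )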
*Example. This sample is audio that was generated using multiple speakers.*

**Note:** This feature supports a maximum of two speakers. It's an experimental offering and not in the [list of voices](/text-to-speech/docs/list-voices).
Example of how to use multi-speaker markup
------------------------------------------

The following example demonstrates how to use multi-speaker markup.

### Python

To learn how to install and use the client library for Text-to-Speech, see [Text-to-Speech client libraries](/text-to-speech/docs/libraries). For more information, see the [Text-to-Speech Python API reference documentation](/python/docs/reference/texttospeech/latest).

To authenticate to Text-to-Speech, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).
"""Synthesizes speech for multiple speakers.Make sure to be working in a virtual environment."""fromgoogle.cloudimporttexttospeech_v1beta1astexttospeech# Instantiates a clientclient=texttospeech.TextToSpeechClient()multi_speaker_markup=texttospeech.MultiSpeakerMarkup(turns=[texttospeech.MultiSpeakerMarkup.Turn(text="I've heard that the Google Cloud multi-speaker audio generation sounds amazing!",speaker="R",),texttospeech.MultiSpeakerMarkup.Turn(text="Oh? What's so good about it?",speaker="S"),texttospeech.MultiSpeakerMarkup.Turn(text="Well..",speaker="R"),texttospeech.MultiSpeakerMarkup.Turn(text="Well what?",speaker="S"),texttospeech.MultiSpeakerMarkup.Turn(text="Well, you should find it out by yourself!",speaker="R"),texttospeech.MultiSpeakerMarkup.Turn(text="Alright alright, let's try it out!",speaker="S"),])# Set the text input to be synthesizedsynthesis_input=texttospeech.SynthesisInput(multi_speaker_markup=multi_speaker_markup)# Build the voice request, select the language code ('en-US') and the voicevoice=texttospeech.VoiceSelectionParams(language_code="en-US",name="en-US-Studio-MultiSpeaker")# Select the type of audio file you want returnedaudio_config=texttospeech.AudioConfig(audio_encoding=texttospeech.AudioEncoding.MP3)# Perform the text-to-speech request on the text input with the selected# voice parameters and audio file typeresponse=client.synthesize_speech(input=synthesis_input,voice=voice,audio_config=audio_config)# The response's audio_content is binary.
withopen("output.mp3","wb")asout:# Write the response to the output file.out.write(response.audio_content)print('Audio content written to file "output.mp3"')
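If the dialogue comes from structured data such as a transcript, the turns can be built programmatically instead of being listed by hand. The sketch below is a hypothetical convenience helper: the `build_multi_speaker_markup` name and the `(speaker, text)` tuple format are illustrative assumptions, not part of the API, and the helper only composes the `MultiSpeakerMarkup` and `Turn` types shown in the example above.

    from google.cloud import texttospeech_v1beta1 as texttospeech

    def build_multi_speaker_markup(dialogue):
        """Builds MultiSpeakerMarkup from an iterable of (speaker, text) tuples.

        Hypothetical helper for illustration; `dialogue` is assumed to use only
        the supported speaker labels (for example "R" and "S").
        """
        return texttospeech.MultiSpeakerMarkup(
            turns=[
                texttospeech.MultiSpeakerMarkup.Turn(text=text, speaker=speaker)
                for speaker, text in dialogue
            ]
        )

    # Usage: pass the result as multi_speaker_markup to SynthesisInput,
    # then synthesize it exactly as in the example above.
    markup = build_multi_speaker_markup(
        [
            ("R", "Did you finish the demo?"),
            ("S", "Almost. I just need to render the audio."),
        ]
    )
    synthesis_input = texttospeech.SynthesisInput(multi_speaker_markup=markup)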
[[["Fácil de comprender","easyToUnderstand","thumb-up"],["Resolvió mi problema","solvedMyProblem","thumb-up"],["Otro","otherUp","thumb-up"]],[["Difícil de entender","hardToUnderstand","thumb-down"],["Información o código de muestra incorrectos","incorrectInformationOrSampleCode","thumb-down"],["Faltan la información o los ejemplos que necesito","missingTheInformationSamplesINeed","thumb-down"],["Problema de traducción","translationIssue","thumb-down"],["Otro","otherDown","thumb-down"]],["Última actualización: 2025-09-08 (UTC)"],[],[],null,["# Generate dialogue with multiple speakers\n\n| **Note:** This feature is only available to projects in allowlist. Please contact us if you want to use this feature.\n\nThis page describes how to create a dialogue with multiple speakers\ncreated by Text-to-Speech.\n\nYou can generate audio with multiple speakers to create a dialogue. This can be\nuseful for interviews, interactive storytelling, video games,\ne-learning platforms, and accessibility solutions.\n\nThe following voice is supported for audio with multiple speakers:\n\n- `en-US-Studio-Multispeaker`\n - speaker: `R`\n - speaker: `S`\n - speaker: `T`\n - speaker: `U`\n\nYour browser does not support the audio element. \n\n*Example. This sample is audio that was generated using multiple speakers.* \n| **Note:** This feature supports a maximum of two speakers. It's an experimental offering and not in the [list of voices](/text-to-speech/docs/list-voices).\n\nExample of how to use multi-speaker markup\n------------------------------------------\n\nThis is an example that demonstrates how to use multi-speaker markup. \n\n### Python\n\n\nTo learn how to install and use the client library for Text-to-Speech, see\n[Text-to-Speech client libraries](/text-to-speech/docs/libraries).\n\n\nFor more information, see the\n[Text-to-Speech Python API\nreference documentation](/python/docs/reference/texttospeech/latest).\n\n\nTo authenticate to Text-to-Speech, set up Application Default Credentials.\nFor more information, see\n\n[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).\n\n \"\"\"Synthesizes speech for multiple speakers.\n Make sure to be working in a virtual environment.\n \"\"\"\n from google.cloud import texttospeech_v1beta1 as texttospeech\n\n # Instantiates a client\n client = texttospeech.TextToSpeechClient()\n\n multi_speaker_markup = texttospeech.MultiSpeakerMarkup(\n turns=[\n texttospeech.MultiSpeakerMarkup.Turn(\n text=\"I've heard that the Google Cloud multi-speaker audio generation sounds amazing!\",\n speaker=\"R\",\n ),\n texttospeech.MultiSpeakerMarkup.Turn(\n text=\"Oh? 
What's so good about it?\", speaker=\"S\"\n ),\n texttospeech.MultiSpeakerMarkup.Turn(text=\"Well..\", speaker=\"R\"),\n texttospeech.MultiSpeakerMarkup.Turn(text=\"Well what?\", speaker=\"S\"),\n texttospeech.MultiSpeakerMarkup.Turn(\n text=\"Well, you should find it out by yourself!\", speaker=\"R\"\n ),\n texttospeech.MultiSpeakerMarkup.Turn(\n text=\"Alright alright, let's try it out!\", speaker=\"S\"\n ),\n ]\n )\n\n # Set the text input to be synthesized\n synthesis_input = texttospeech.SynthesisInput(\n multi_speaker_markup=multi_speaker_markup\n )\n\n # Build the voice request, select the language code ('en-US') and the voice\n voice = texttospeech.VoiceSelectionParams(\n language_code=\"en-US\", name=\"en-US-Studio-MultiSpeaker\"\n )\n\n # Select the type of audio file you want returned\n audio_config = texttospeech.AudioConfig(\n audio_encoding=texttospeech.AudioEncoding.MP3\n )\n\n # Perform the text-to-speech request on the text input with the selected\n # voice parameters and audio file type\n response = client.synthesize_speech(\n input=synthesis_input, voice=voice, audio_config=audio_config\n )\n\n # The response's audio_content is binary.\n with open(\"output.mp3\", \"wb\") as out:\n # Write the response to the output file.\n out.write(response.audio_content)\n print('Audio content written to file \"output.mp3\"')\n\n\u003cbr /\u003e"]]