컬렉션을 사용해 정리하기
내 환경설정을 기준으로 콘텐츠를 저장하고 분류하세요.
양방향 스트리밍으로 음성 합성
이 문서에서는 양방향 스트리밍을 사용하여 오디오를 합성하는 과정을 안내합니다.
양방향 스트리밍을 사용하면 텍스트 입력을 전송하고 동시에 오디오 데이터를 수신할 수 있습니다. 즉, 전체 입력 텍스트가 전송되기 전에 음성 합성을 시작할 수 있으므로, 지연 시간을 줄이고 실시간 상호작용을 지원할 수 있습니다. 음성 어시스턴트와 대화형 게임은 양방향 스트리밍을 이용해서 보다 역동적이고 응답성이 뛰어난 애플리케이션을 만듭니다.
Text-to-Speech의 기본 개념에 대한 자세한 내용은 Text-to-Speech 기본 사항을 참조하세요.
시작하기 전에
Text-to-Speech API에 요청을 보내려면 먼저 다음 작업을 완료해야 합니다. 자세한 내용은 시작하기 전에 페이지를 참조하세요.
양방향 스트리밍으로 음성 합성
클라이언트 라이브러리 설치
Python
라이브러리를 설치하기 전에 Python 개발을 위한 환경이 준비됐는지 확인하세요.
pip install --upgrade google-cloud-texttospeech
텍스트 스트림을 전송하고 오디오 스트림 수신
API는 StreamingSynthesisInput
또는 StreamingSynthesizeConfig
가 포함된 StreamingSynthesizeRequest
유형의 요청 스트림을 수락합니다.
텍스트 입력을 제공하는 StreamingSynthesisInput
으로 StreamingSynthesizeRequest
스트림을 전송하기 전에 StreamingSynthesizeConfig
가 포함된 StreamingSynthesizeRequest
를 정확히 하나만 전송합니다.
Text-to-Speech 스트리밍은 Chirp 3: HD 음성만 호환됩니다.
달리 명시되지 않는 한 이 페이지의 콘텐츠에는 Creative Commons Attribution 4.0 라이선스에 따라 라이선스가 부여되며, 코드 샘플에는 Apache 2.0 라이선스에 따라 라이선스가 부여됩니다. 자세한 내용은 Google Developers 사이트 정책을 참조하세요. 자바는 Oracle 및/또는 Oracle 계열사의 등록 상표입니다.
최종 업데이트: 2025-07-21(UTC)
[[["이해하기 쉬움","easyToUnderstand","thumb-up"],["문제가 해결됨","solvedMyProblem","thumb-up"],["기타","otherUp","thumb-up"]],[["이해하기 어려움","hardToUnderstand","thumb-down"],["잘못된 정보 또는 샘플 코드","incorrectInformationOrSampleCode","thumb-down"],["필요한 정보/샘플이 없음","missingTheInformationSamplesINeed","thumb-down"],["번역 문제","translationIssue","thumb-down"],["기타","otherDown","thumb-down"]],["최종 업데이트: 2025-07-21(UTC)"],[],[],null,["# Quickstart: Synthesize speech with bidirectional streaming quickstart\n\nSynthesize speech with bidirectional streaming\n==============================================\n\nThis document walks you through the process of synthesizing audio using\nbidirectional streaming.\n\nBidirectional streaming lets you send text input and receive audio data\nsimultaneously. This means that you can start synthesizing speech before the\ncomplete input text is sent, which reduces latency and enables real-time\ninteractions. Voice assistants and interactive games use bidirectional streaming\nto create more dynamic and responsive applications.\n\nTo learn more about the fundamental concepts in Text-to-Speech, read\n[Text-to-Speech Basics](/text-to-speech/docs/basics).\n\nBefore you begin\n----------------\n\nBefore you can send a request to the Text-to-Speech API, you must have completed\nthe following actions. See the\n[before you begin](/text-to-speech/docs/before-you-begin) page for details.\n\n- Enable Text-to-Speech on a Google Cloud project.\n 1. Make sure billing is enabled for Text-to-Speech.\n-\n [Install](/sdk/docs/install) the Google Cloud CLI, and then\n [sign in to the gcloud CLI with your federated identity](/iam/docs/workforce-log-in-gcloud).\n\n After signing in,\n [initialize](/sdk/docs/initializing) the Google Cloud CLI by running the following command:\n\n ```bash\n gcloud init\n ```\n\nSynthesize speech with bidirectional streaming\n----------------------------------------------\n\n### Install the client library\n\n### Python\n\nBefore installing the library, make sure you've [prepared your environment for Python development](/python/docs/setup). \n\n```\npip install --upgrade google-cloud-texttospeech\n```\n\n\u003cbr /\u003e\n\n### Send a stream of text and receive a stream of audio\n\nThe API accepts a stream of requests with type `StreamingSynthesizeRequest`,\nwhich contain either `StreamingSynthesisInput` or `StreamingSynthesizeConfig`.\n\nBefore sending a stream `StreamingSynthesizeRequest` with\n`StreamingSynthesisInput`, which provides text input, send exactly one\n`StreamingSynthesizeRequest` with a `StreamingSynthesizeConfig`.\n\nStreaming Text-to-Speech is only compatible with [Chirp 3: HD voices](/text-to-speech/docs/chirp3-hd). \n\n### Python\n\nBefore running the example, make sure you've [prepared your environment for Python development](/python/docs/setup). \n\n #!/usr/bin/env python\n # Copyright 2024 Google LLC\n #\n # Licensed under the Apache License, Version 2.0 (the \"License\");\n # you may not use this file except in compliance with the License.\n # You may obtain a copy of the License at\n #\n # http://www.apache.org/licenses/LICENSE-2.0\n #\n # Unless required by applicable law or agreed to in writing, software\n # distributed under the License is distributed on an \"AS IS\" BASIS,\n # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n # See the License for the specific language governing permissions and\n # limitations under the License.\n #\n\n \"\"\"Google Cloud Text-To-Speech API streaming sample application .\n\n Example usage:\n python streaming_tts_quickstart.py\n \"\"\"\n\n\n def run_streaming_tts_quickstart():\n \"\"\"Synthesizes speech from a stream of input text.\"\"\"\n from google.cloud import texttospeech\n\n client = texttospeech.https://cloud.google.com/python/docs/reference/texttospeech/latest/google.cloud.texttospeech_v1.services.text_to_speech.TextToSpeechClient.html()\n\n # See https://cloud.google.com/text-to-speech/docs/voices for all voices.\n streaming_config = texttospeech.https://cloud.google.com/python/docs/reference/texttospeech/latest/google.cloud.texttospeech_v1.types.StreamingSynthesizeConfig.html(\n voice=texttospeech.https://cloud.google.com/python/docs/reference/texttospeech/latest/google.cloud.texttospeech_v1.types.VoiceSelectionParams.html(\n name=\"en-US-Chirp3-HD-Charon\",\n language_code=\"en-US\",\n )\n )\n\n # Set the config for your stream. The first request must contain your config, and then each subsequent request must contain text.\n config_request = texttospeech.https://cloud.google.com/python/docs/reference/texttospeech/latest/google.cloud.texttospeech_v1.types.StreamingSynthesizeRequest.html(\n streaming_config=streaming_config\n )\n\n text_iterator = [\n \"Hello there. \",\n \"How are you \",\n \"today? It's \",\n \"such nice weather outside.\",\n ]\n\n # Request generator. Consider using Gemini or another LLM with output streaming as a generator.\n def request_generator():\n yield config_request\n for text in text_iterator:\n yield texttospeech.https://cloud.google.com/python/docs/reference/texttospeech/latest/google.cloud.texttospeech_v1.types.StreamingSynthesizeRequest.html(\n input=texttospeech.https://cloud.google.com/python/docs/reference/texttospeech/latest/google.cloud.texttospeech_v1.types.StreamingSynthesisInput.html(text=text)\n )\n\n streaming_responses = client.https://cloud.google.com/python/docs/reference/texttospeech/latest/google.cloud.texttospeech_v1.services.text_to_speech.TextToSpeechClient.html#google_cloud_texttospeech_v1_services_text_to_speech_TextToSpeechClient_streaming_synthesize(request_generator())\n\n for response in streaming_responses:\n print(f\"Audio content size in bytes is: {len(response.audio_content)}\")\n\n\n if __name__ == \"__main__\":\n run_streaming_tts_quickstart()\n\n\u003cbr /\u003e\n\nClean up\n--------\n\nTo avoid unnecessary Google Cloud Platform charges, use the\n[Google Cloud console](https://console.cloud.google.com/) to delete your project if you do not need it.\n\nWhat's next\n-----------\n\n\n- Learn more about Cloud Text-to-Speech by reading the [basics](/text-to-speech/docs/basics).\n- Review the list of [available voices](/text-to-speech/docs/voices) you can use for synthetic speech.\n\n\u003cbr /\u003e"]]