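Create audio from text by using the command line
This document walks you through making a request to Text-to-Speech from the command line. To learn more about the fundamental concepts in Text-to-Speech, see "Text-to-Speech Basics".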
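Before you begin
Before you can send a request to the Text-to-Speech API, you must complete the following actions. See the "Before you begin" page for details.
- Enable Text-to-Speech on a Google Cloud project.
- Make sure billing is enabled for Text-to-Speech.
- Install the Google Cloud CLI, sign in to it, and then initialize it by running gcloud init.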
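Synthesize audio from text
To convert text to audio, send an HTTP POST request to the https://texttospeech.googleapis.com/v1/text:synthesize endpoint. In the body of the request, specify the type of voice to synthesize in the voice section, the text to synthesize in the text field of the input section, and the type of audio to create in the audioConfig section.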
Run the following REST request at the command line to synthesize audio from text with Text-to-Speech. The commands below call gcloud auth print-access-token to retrieve an authorization token for the request.
Before using any of the request data, make the following replacement:
- PROJECT_ID: the alphanumeric ID of your Google Cloud project.
HTTP method and URL:
POST https://texttospeech.googleapis.com/v1/text:synthesize
Request JSON body:
{
  "input": {
    "text": "Android is a mobile operating system developed by Google, based on the Linux kernel and designed primarily for touchscreen mobile devices such as smartphones and tablets."
  },
  "voice": {
    "languageCode": "en-gb",
    "name": "en-GB-Standard-A",
    "ssmlGender": "FEMALE"
  },
  "audioConfig": {
    "audioEncoding": "MP3"
  }
}
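As a minimal sketch, the request body above can be written to request.json from the shell with a heredoc. The variable names TEXT, LANGUAGE_CODE, and VOICE_NAME are illustrative assumptions, not part of the API, and the sample text is shortened. No JSON escaping is performed, so this only works for text without double quotes, backslashes, or newlines.

```shell
#!/bin/sh
# Hypothetical variables; the voice values mirror the request body shown above.
TEXT="Hello from Text-to-Speech."
LANGUAGE_CODE="en-gb"
VOICE_NAME="en-GB-Standard-A"

# Write the request body to request.json. The unquoted EOF delimiter lets the
# shell expand the variables inside the heredoc. Nothing is JSON-escaped, so
# TEXT must not contain double quotes, backslashes, or newlines.
cat > request.json <<EOF
{
  "input": {
    "text": "${TEXT}"
  },
  "voice": {
    "languageCode": "${LANGUAGE_CODE}",
    "name": "${VOICE_NAME}",
    "ssmlGender": "FEMALE"
  },
  "audioConfig": {
    "audioEncoding": "MP3"
  }
}
EOF
```

For arbitrary input text, a JSON-aware tool is a safer way to build the file than string substitution.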
To send the request, use one of the following options:
curl (Linux, macOS, or Cloud Shell)
Save the request body in a file named request.json, and then run the following command:
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "x-goog-user-project: PROJECT_ID" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  "https://texttospeech.googleapis.com/v1/text:synthesize"
PowerShell (Windows)
Save the request body in a file named request.json, and then run the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://texttospeech.googleapis.com/v1/text:synthesize" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
"audioContent": "//NExAASCCIIAAhEAGAAEMW4kAYPnwwIKw/BBTpwTvB+IAxIfghUfW.."
}
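The JSON output of the REST command contains the synthesized audio in base64-encoded format. Copy the contents of the audioContent field into a new file named synthesize-output-base64.txt. The new file looks similar to the following: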
//NExAARqoIIAAhEuWAAAGNmBGMY4EBcxvABAXBPmPIAF//yAuh9Tn5CEap3/o
...
VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
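Instead of copying the field by hand, the extraction can be scripted. The sketch below assumes the JSON response was saved to a file named response.json (a hypothetical name, for example by adding -o response.json to the curl command) and that the audioContent field sits on a single line. The function name extract_audio_content is illustrative, and the sample response holds the base64 of the word "hello", not real audio.

```shell
#!/bin/sh
# Print the value of the "audioContent" field from a saved JSON response.
# This relies on base64 text never containing a double quote.
extract_audio_content() {
  sed -n 's/.*"audioContent"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Sample response for illustration only; "aGVsbG8=" is base64 for "hello".
cat > response.json <<'EOF'
{
  "audioContent": "aGVsbG8="
}
EOF

# Write only the base64 payload to the file used in the next step.
extract_audio_content response.json > synthesize-output-base64.txt
```

A JSON parser is more robust than a sed pattern; the pattern is used here only to avoid extra dependencies.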
Decode the contents of the synthesize-output-base64.txt file into a new file named synthesized-audio.mp3. For information on decoding base64, see "Decoding Base64-Encoded Audio Content".
Linux
Copy only the base64-encoded content into a text file.
Decode the source text file with the base64 command-line tool and its -d flag:
$ base64 SOURCE_BASE64_TEXT_FILE -d > DESTINATION_AUDIO_FILE
macOS
Copy only the base64-encoded content into a text file.
Decode the source text file with the base64 command-line tool:
$ base64 --decode SOURCE_BASE64_TEXT_FILE > DESTINATION_AUDIO_FILE
Windows
Copy only the base64-encoded content into a text file.
Decode the source text file with the certutil command:
certutil -decode SOURCE_BASE64_TEXT_FILE DESTINATION_AUDIO_FILE
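For Linux and macOS, the decode step can be wrapped in a small helper that also checks that the output is not empty. This is a sketch under assumptions: the function name decode_base64_file and the sample file names are hypothetical, and it relies on the --decode long option, which both GNU and macOS base64 accept. The sample input decodes to the text "hello" rather than audio.

```shell
#!/bin/sh
# Decode a base64 text file ($1) into a binary file ($2) and fail if the
# result is empty. Reading from stdin keeps the call the same for the
# GNU and macOS versions of base64.
decode_base64_file() {
  base64 --decode < "$1" > "$2" && [ -s "$2" ]
}

# Illustrative input only: "aGVsbG8=" is base64 for "hello".
printf '%s\n' 'aGVsbG8=' > sample-base64.txt
decode_base64_file sample-base64.txt sample-decoded.bin
```

With real output from the API, the call would be decode_base64_file synthesize-output-base64.txt synthesized-audio.mp3.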
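Play the contents of synthesized-audio.mp3 in an audio application or on an audio device. You can also play the audio in the Chrome browser by opening the file from the folder that contains it, for example file://my_file_path/synthesized-audio.mp3.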
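Clean up
To avoid unnecessary Google Cloud charges, use the Google Cloud console to delete your project if you do not need it.
What's next
- Learn more about Cloud Text-to-Speech by reading the basics.
- Review the list of available voices you can use for synthetic speech.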
Last updated: 2025-09-04 (UTC).