Before you begin
You provided Google with a Google Cloud project ID to get access to the Custom Voice feature. For this project, make sure that billing is enabled, that the Text-to-Speech API and the AutoML API are enabled, and that you have installed and initialized the Google Cloud CLI.
Grant the AutoML Predictor role to your Google Account on the project. For instructions, see Granting a single role.
Use the command line
HTTP method and URL:
POST https://texttospeech.googleapis.com/v1beta1/text:synthesize
Request JSON body:
{
  "input": {
    "text": "Android is a mobile operating system developed by Google, based on the Linux kernel and designed primarily for touchscreen mobile devices such as smartphones and tablets."
  },
  "voice": {
    "custom_voice": {
      "reportedUsage": "REALTIME",
      "model": "projects/{project_id}/locations/us-central1/models/{model_id}"
    }
  },
  "audioConfig": {
    "audioEncoding": "LINEAR16"
  }
}
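Because the model field has to carry your own project ID and model ID, generating the body can be less error-prone than editing it by hand. The following is a minimal Python sketch of that idea. The helper name build_request_body and the placeholder values "my-project" and "my-model" are assumptions for illustration and are not part of any Google library.

```python
import json


def build_request_body(project_id: str, model_id: str, text: str) -> dict:
    """Return a text:synthesize request body for a custom voice model."""
    # Resource name format copied from the request body shown above.
    model = f"projects/{project_id}/locations/us-central1/models/{model_id}"
    return {
        "input": {"text": text},
        "voice": {
            "custom_voice": {
                "reportedUsage": "REALTIME",
                "model": model,
            }
        },
        "audioConfig": {"audioEncoding": "LINEAR16"},
    }


if __name__ == "__main__":
    # "my-project" and "my-model" are placeholders, not real identifiers.
    body = build_request_body("my-project", "my-model", "Hello, World!")
    # json.dumps always emits valid JSON, so stray trailing commas cannot occur.
    print(json.dumps(body, indent=2))
```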
Save the request body in a file named request.json and run the following command, replacing PROJECT_ID with your project ID:
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "x-goog-user-project: PROJECT_ID" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  https://texttospeech.googleapis.com/v1beta1/text:synthesize
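A successful REST response is a JSON object whose audioContent field holds the audio as a base64-encoded string, so it has to be decoded before it can be played. The sketch below shows one way to do that in Python. It assumes the curl output was redirected to a file; the helper name save_audio and the file name response.json are hypothetical.

```python
import base64
import json


def save_audio(response_text: str, path: str) -> int:
    """Decode audioContent from a synthesize response and write it to path.

    Returns the number of audio bytes written.
    """
    response = json.loads(response_text)
    if "audioContent" not in response:
        # Failed requests return an "error" object instead of audio.
        raise ValueError(f"no audioContent in response: {response.get('error')}")
    audio = base64.b64decode(response["audioContent"])
    with open(path, "wb") as out:
        out.write(audio)
    return len(audio)
```

For example, after saving the curl output to response.json, calling save_audio(open("response.json").read(), "output.wav") would write the decoded audio to output.wav.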
Use the Python client library
Download the client library and run the following commands:
pip install texttospeech-custom-voice-beta-v1beta1-py.tar.gz
pip install protobuf --upgrade pip
Python code sample
"""Synthesize custom voice from the input string of text or ssml.
"""
from google.cloud import texttospeech_v1beta1
# Instantiate a client
client = texttospeech_v1beta1.TextToSpeechClient()
# Set the text input to be synthesized
synthesis_input = texttospeech_v1beta1.types.SynthesisInput(text="Hello, World!")
# Build the voice request: select the language code ("en-US") and specify
# the custom voice model.
custom_voice = texttospeech_v1beta1.types.CustomVoiceParams(
    reported_usage=texttospeech_v1beta1.enums.CustomVoiceParams.ReportedUsage.REALTIME,
    model='projects/{project_id}/locations/us-central1/models/{model_id}')
voice = texttospeech_v1beta1.types.VoiceSelectionParams(
    language_code='en-US',
    custom_voice=custom_voice)
# Select the type of audio file you want returned
audio_config = texttospeech_v1beta1.types.AudioConfig(
    audio_encoding=texttospeech_v1beta1.enums.AudioEncoding.LINEAR16)
# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(synthesis_input, voice, audio_config)
# The response's audio_content is binary.
with open('output.wav', 'wb') as out:
    # Write the response to the output file.
    out.write(response.audio_content)
print('Audio content written to file "output.wav"')
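To sanity-check the result, the written file can be inspected with Python's standard wave module. This is a sketch that assumes the LINEAR16 audio is returned with a WAV header, as the output.wav file name in the sample suggests; describe_wav is a hypothetical helper name.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return basic properties of a WAV file, read from its header."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            # Duration follows from frame count divided by frames per second.
            "duration_seconds": frames / rate,
        }
```

Calling describe_wav("output.wav") after running the sample would report the sample rate and duration of the synthesized audio. For LINEAR16, a sample width of 2 bytes is expected.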