Coba Gemini 1.5 Pro, model multimodal kami yang paling canggih di Vertex AI, dan lihat apa yang dapat Anda bangun dengan jendela konteks token 1 juta. Coba Gemini 1.5 Pro, model multimodal kami yang paling canggih di Vertex AI, dan lihat apa yang dapat Anda bangun dengan jendela konteks token 1 juta.

Membuat audio dari teks menggunakan library klien

Panduan memulai ini akan memandu Anda melalui proses penggunaan library klien untuk membuat permintaan ke Text-to-Speech, yang akan membuat audio dari teks.

Untuk mempelajari konsep dasar Text-to-Speech lebih lanjut, baca Dasar-Dasar Text-to-Speech. Untuk mengetahui suara sintetis yang tersedia untuk bahasa Anda, lihat halaman suara dan bahasa yang didukung.

Sebelum memulai

Sebelum dapat mengirim permintaan ke Text-to-Speech API, Anda harus sudah menyelesaikan tindakan berikut. Lihat halaman sebelum memulai untuk mengetahui detailnya.

Aktifkan Text-to-Speech di project Google Cloud.
Pastikan penagihan diaktifkan untuk Text-to-Speech.
Instal Google Cloud CLI, lalu initialize dengan menjalankan perintah berikut:
```
gcloud init
```
Buat kredensial autentikasi lokal untuk Akun Google Anda:
```
gcloud auth application-default login
```

Menginstal library klien

Go

go get cloud.google.com/go/texttospeech/apiv1

Java

If you are using Maven, add the following to your pom.xml file. For more information about BOMs, see The Google Cloud Platform Libraries BOM.

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.37.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-texttospeech</artifactId>
  </dependency>

If you are using Gradle, add the following to your dependencies:

implementation 'com.google.cloud:google-cloud-texttospeech:2.42.0'

If you are using sbt, add the following to your dependencies:

libraryDependencies += "com.google.cloud" % "google-cloud-texttospeech" % "2.42.0"

If you're using Visual Studio Code, IntelliJ, or Eclipse, you can add client libraries to your project using the following IDE plugins:

The plugins provide additional functionality, such as key management for service accounts. Refer to each plugin's documentation for details.

Node.js

Sebelum menginstal library, pastikan Anda telah menyiapkan lingkungan untuk pengembangan Node.js.

npm install --save @google-cloud/text-to-speech

Python

Sebelum menginstal library, pastikan Anda telah menyiapkan lingkungan untuk pengembangan Python.

pip install --upgrade google-cloud-texttospeech

Bahasa tambahan

C# : Ikuti Petunjuk penyiapan C# di halaman library klien, lalu kunjungi dokumentasi referensi Text-to-Speech untuk .NET.

PHP : Ikuti Petunjuk penyiapan PHP di halaman library klien, lalu kunjungi dokumentasi referensi Text-to-Speech untuk PHP.

Ruby : Ikuti Petunjuk penyiapan Ruby di halaman library klien, lalu buka Dokumentasi referensi Text-to-Speech untuk Ruby.

Membuat data audio

Sekarang Anda dapat menggunakan Text-to-Speech untuk membuat file audio sintetis dari ucapan manusia. Gunakan kode berikut untuk mengirim permintaan synthesize ke Text-to-Speech API.

Go


// Command quickstart generates an audio file with the content "Hello, World!".
package main

import (
	"context"
	"fmt"
	"io/ioutil"
	"log"

	texttospeech "cloud.google.com/go/texttospeech/apiv1"
	"cloud.google.com/go/texttospeech/apiv1/texttospeechpb"
)

func main() {
	// Instantiates a client.
	ctx := context.Background()

	client, err := texttospeech.NewClient(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Perform the text-to-speech request on the text input with the selected
	// voice parameters and audio file type.
	req := texttospeechpb.SynthesizeSpeechRequest{
		// Set the text input to be synthesized.
		Input: &texttospeechpb.SynthesisInput{
			InputSource: &texttospeechpb.SynthesisInput_Text{Text: "Hello, World!"},
		},
		// Build the voice request, select the language code ("en-US") and the SSML
		// voice gender ("neutral").
		Voice: &texttospeechpb.VoiceSelectionParams{
			LanguageCode: "en-US",
			SsmlGender:   texttospeechpb.SsmlVoiceGender_NEUTRAL,
		},
		// Select the type of audio file you want returned.
		AudioConfig: &texttospeechpb.AudioConfig{
			AudioEncoding: texttospeechpb.AudioEncoding_MP3,
		},
	}

	resp, err := client.SynthesizeSpeech(ctx, &req)
	if err != nil {
		log.Fatal(err)
	}

	// The resp's AudioContent is binary.
	filename := "output.mp3"
	err = ioutil.WriteFile(filename, resp.AudioContent, 0644)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Audio content written to file: %v\n", filename)
}

Java

// Imports the Google Cloud client library
import com.google.cloud.texttospeech.v1.AudioConfig;
import com.google.cloud.texttospeech.v1.AudioEncoding;
import com.google.cloud.texttospeech.v1.SsmlVoiceGender;
import com.google.cloud.texttospeech.v1.SynthesisInput;
import com.google.cloud.texttospeech.v1.SynthesizeSpeechResponse;
import com.google.cloud.texttospeech.v1.TextToSpeechClient;
import com.google.cloud.texttospeech.v1.VoiceSelectionParams;
import com.google.protobuf.ByteString;
import java.io.FileOutputStream;
import java.io.OutputStream;

/**
 * Google Cloud TextToSpeech API sample application. Example usage: mvn package exec:java
 * -Dexec.mainClass='com.example.texttospeech.QuickstartSample'
 */
public class QuickstartSample {

  /** Demonstrates using the Text-to-Speech API. */
  public static void main(String... args) throws Exception {
    // Instantiates a client
    try (TextToSpeechClient textToSpeechClient = TextToSpeechClient.create()) {
      // Set the text input to be synthesized
      SynthesisInput input = SynthesisInput.newBuilder().setText("Hello, World!").build();

      // Build the voice request, select the language code ("en-US") and the ssml voice gender
      // ("neutral")
      VoiceSelectionParams voice =
          VoiceSelectionParams.newBuilder()
              .setLanguageCode("en-US")
              .setSsmlGender(SsmlVoiceGender.NEUTRAL)
              .build();

      // Select the type of audio file you want returned
      AudioConfig audioConfig =
          AudioConfig.newBuilder().setAudioEncoding(AudioEncoding.MP3).build();

      // Perform the text-to-speech request on the text input with the selected voice parameters and
      // audio file type
      SynthesizeSpeechResponse response =
          textToSpeechClient.synthesizeSpeech(input, voice, audioConfig);

      // Get the audio contents from the response
      ByteString audioContents = response.getAudioContent();

      // Write the response to the output file.
      try (OutputStream out = new FileOutputStream("output.mp3")) {
        out.write(audioContents.toByteArray());
        System.out.println("Audio content written to file \"output.mp3\"");
      }
    }
  }
}

Node.js

Sebelum menjalankan contoh, pastikan Anda telah menyiapkan lingkungan untuk pengembangan Node.js.

// Imports the Google Cloud client library
const textToSpeech = require('@google-cloud/text-to-speech');

// Import other required libraries
const fs = require('fs');
const util = require('util');
// Creates a client
const client = new textToSpeech.TextToSpeechClient();
async function quickStart() {
  // The text to synthesize
  const text = 'hello, world!';

  // Construct the request
  const request = {
    input: {text: text},
    // Select the language and SSML voice gender (optional)
    voice: {languageCode: 'en-US', ssmlGender: 'NEUTRAL'},
    // select the type of audio encoding
    audioConfig: {audioEncoding: 'MP3'},
  };

  // Performs the text-to-speech request
  const [response] = await client.synthesizeSpeech(request);
  // Write the binary audio content to a local file
  const writeFile = util.promisify(fs.writeFile);
  await writeFile('output.mp3', response.audioContent, 'binary');
  console.log('Audio content written to file: output.mp3');
}
quickStart();

Python

Sebelum menjalankan contoh, pastikan Anda telah menyiapkan lingkungan untuk pengembangan Python.

"""Synthesizes speech from the input string of text or ssml.
Make sure to be working in a virtual environment.

Note: ssml must be well-formed according to:
    https://www.w3.org/TR/speech-synthesis/
"""
from google.cloud import texttospeech

# Instantiates a client
client = texttospeech.TextToSpeechClient()

# Set the text input to be synthesized
synthesis_input = texttospeech.SynthesisInput(text="Hello, World!")

# Build the voice request, select the language code ("en-US") and the ssml
# voice gender ("neutral")
voice = texttospeech.VoiceSelectionParams(
    language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL
)

# Select the type of audio file you want returned
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3
)

# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
    input=synthesis_input, voice=voice, audio_config=audio_config
)

# The response's audio_content is binary.
with open("output.mp3", "wb") as out:
    # Write the response to the output file.
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')

Selamat! Anda telah mengirimkan permintaan pertama ke Text-to-Speech.

Bagaimana hasilnya?

Pembersihan

Agar tidak menimbulkan biaya pada akun Google Cloud Anda untuk resource yang digunakan pada halaman ini, ikuti langkah-langkah berikut.

Gunakan konsol Google Cloud untuk menghapus project jika Anda tidak membutuhkannya.

Langkah selanjutnya

Pelajari Cloud Text-to-Speech lebih lanjut dengan membaca dasar-dasarnya.
Tinjau daftar suara yang tersedia yang dapat Anda gunakan untuk ucapan sintetis.