Get automatic punctuation

This page describes how to get automatic punctuation in transcription results from Speech-to-Text. When you enable this feature, Speech-to-Text automatically infers the presence of periods, commas, and question marks in your audio data and adds them to the transcript.

By default, Speech-to-Text does not include punctuation marks in the results from speech recognition. However, you can request that Speech-to-Text automatically detect punctuation and insert it into the transcription results. When you enable automatic punctuation, Speech-to-Text also automatically capitalizes the first letter after each period and question mark.

To enable automatic punctuation, set the enableAutomaticPunctuation field to true in the RecognitionConfig parameters for the request. The Speech-to-Text API supports automatic punctuation for all speech recognition methods: speech:recognize, speech:longrunningrecognize, and Streaming.
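For example, the field works the same way in a streaming request. The following minimal sketch uses the Python client library; the LINEAR16 encoding, 16,000 Hz sample rate, local file path, and single-chunk upload are assumptions for illustration only:

from google.cloud import speech

client = speech.SpeechClient()

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    # The same field enables automatic punctuation for streaming.
    enable_automatic_punctuation=True,
)
streaming_config = speech.StreamingRecognitionConfig(config=config)

# Assumption: send the whole file as a single chunk; a real application
# would stream smaller chunks as they become available.
with open("audio.raw", "rb") as f:  # hypothetical local file
    content = f.read()

requests = (
    speech.StreamingRecognizeRequest(audio_content=chunk)
    for chunk in [content]
)

responses = client.streaming_recognize(config=streaming_config, requests=requests)
for response in responses:
    for result in response.results:
        print(result.alternatives[0].transcript)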

The following code samples demonstrate how to get automatic punctuation in a transcription request.

Protocol

Refer to the speech:recognize API endpoint for complete details.

To perform synchronous speech recognition, make a POST request and provide the appropriate request body. The following shows an example of a POST request using curl. The example uses the Google Cloud CLI to generate an access token. For instructions on installing the gcloud CLI, see the quickstart.

curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    https://speech.googleapis.com/v1/speech:recognize \
    --data '{
  "config": {
    "encoding":"FLAC",
    "sampleRateHertz": 16000,
    "languageCode": "en-US",
    "enableAutomaticPunctuation": true
  },
  "audio": {
    "uri":"gs://cloud-samples-tests/speech/brooklyn.flac"
  }
}'

For more information on configuring the request body, see the RecognitionConfig reference documentation.

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "How old is the Brooklyn Bridge?",
          "confidence": 0.98360395
        }
      ]
    }
  ]
}
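If you process the response programmatically, each transcript is nested under results[].alternatives[]. A minimal sketch in Python, assuming the JSON body above is stored in the string raw:

import json

# raw is assumed to hold the JSON response body shown above.
raw = '{"results": [{"alternatives": [{"transcript": "How old is the Brooklyn Bridge?", "confidence": 0.98360395}]}]}'

response = json.loads(raw)
for result in response["results"]:
    # Alternatives are ordered by confidence; take the most likely one.
    best = result["alternatives"][0]
    print(best["transcript"], best["confidence"])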

Go

To learn how to install and use the client library for Speech-to-Text, see Speech-to-Text client libraries. For more information, see the Speech-to-Text Go API reference documentation.

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.


import (
	"context"
	"fmt"
	"io"
	"os"
	"strings"

	speech "cloud.google.com/go/speech/apiv1"
	"cloud.google.com/go/speech/apiv1/speechpb"
)

func autoPunctuation(w io.Writer, path string) error {
	ctx := context.Background()

	client, err := speech.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("NewClient: %w", err)
	}
	defer client.Close()

	// path = "../testdata/commercial_mono.wav"
	data, err := os.ReadFile(path)
	if err != nil {
		return fmt.Errorf("ReadFile: %w", err)
	}

	resp, err := client.Recognize(ctx, &speechpb.RecognizeRequest{
		Config: &speechpb.RecognitionConfig{
			Encoding:        speechpb.RecognitionConfig_LINEAR16,
			SampleRateHertz: 8000,
			LanguageCode:    "en-US",
			// Enable automatic punctuation.
			EnableAutomaticPunctuation: true,
		},
		Audio: &speechpb.RecognitionAudio{
			AudioSource: &speechpb.RecognitionAudio_Content{Content: data},
		},
	})
	if err != nil {
		return fmt.Errorf("Recognize: %w", err)
	}

	for i, result := range resp.Results {
		fmt.Fprintf(w, "%s\n", strings.Repeat("-", 20))
		fmt.Fprintf(w, "Result %d\n", i+1)
		for j, alternative := range result.Alternatives {
			fmt.Fprintf(w, "Alternative %d: %s\n", j+1, alternative.Transcript)
		}
	}
	return nil
}

Java

To learn how to install and use the client library for Speech-to-Text, see Speech-to-Text client libraries. For more information, see the Speech-to-Text Java API reference documentation.

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * Performs transcription on remote FLAC file and prints the transcription.
 *
 * @param gcsUri the path to the remote FLAC audio file to transcribe.
 */
public static void transcribeGcsWithAutomaticPunctuation(String gcsUri) throws Exception {
  try (SpeechClient speechClient = SpeechClient.create()) {
    // Configure the request for the FLAC audio file
    RecognitionConfig config =
        RecognitionConfig.newBuilder()
            .setEncoding(AudioEncoding.FLAC)
            .setLanguageCode("en-US")
            .setSampleRateHertz(16000)
            .setEnableAutomaticPunctuation(true)
            .build();

    // Set the remote path for the audio file
    RecognitionAudio audio = RecognitionAudio.newBuilder().setUri(gcsUri).build();

    // Use non-blocking call for getting file transcription
    OperationFuture<LongRunningRecognizeResponse, LongRunningRecognizeMetadata> response =
        speechClient.longRunningRecognizeAsync(config, audio);

    while (!response.isDone()) {
      System.out.println("Waiting for response...");
      Thread.sleep(10000);
    }

    // Just print the first result here.
    SpeechRecognitionResult result = response.get().getResultsList().get(0);

    // There can be several alternative transcripts for a given chunk of speech. Just use the
    // first (most likely) one here.
    SpeechRecognitionAlternative alternative = result.getAlternativesList().get(0);

    // Print out the result
    System.out.printf("Transcript : %s\n", alternative.getTranscript());
  }
}

Node.js

To learn how to install and use the client library for Speech-to-Text, see Speech-to-Text client libraries. For more information, see the Speech-to-Text Node.js API reference documentation.

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

// Imports the Google Cloud client library for API
/**
 * TODO(developer): Update client library import to use new
 * version of API when desired features become available
 */

const speech = require('@google-cloud/speech');
const fs = require('fs');

// Creates a client
const client = new speech.SpeechClient();

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 * Include the sampleRateHertz field in the config object.
 */
// const filename = 'Local path to audio file, e.g. /path/to/audio.raw';
// const encoding = 'Encoding of the audio file, e.g. LINEAR16';
// const sampleRateHertz = 16000;
// const languageCode = 'BCP-47 language code, e.g. en-US';

const config = {
  encoding: encoding,
  sampleRateHertz: sampleRateHertz,
  languageCode: languageCode,
  enableAutomaticPunctuation: true,
};

const audio = {
  content: fs.readFileSync(filename).toString('base64'),
};

const request = {
  config: config,
  audio: audio,
};

// Detects speech in the audio file. Note: `await` requires an enclosing async function.
const [response] = await client.recognize(request);
const transcription = response.results
  .map(result => result.alternatives[0].transcript)
  .join('\n');
console.log('Transcription: ', transcription);

Python

To learn how to install and use the client library for Speech-to-Text, see Speech-to-Text client libraries. For more information, see the Speech-to-Text Python API reference documentation.

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.


from google.cloud import speech


def transcribe_file_with_auto_punctuation(audio_file: str) -> speech.RecognizeResponse:
    """Transcribe the given audio file with auto punctuation enabled.
    Args:
        audio_file (str): Path to the local audio file to be transcribed.
    Returns:
        speech.RecognizeResponse: The response containing the transcription results.
    """
    client = speech.SpeechClient()

    with open(audio_file, "rb") as f:
        audio_content = f.read()

    audio = speech.RecognitionAudio(content=audio_content)
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=8000,
        language_code="en-US",
        # Enable automatic punctuation
        enable_automatic_punctuation=True,
    )

    response = client.recognize(config=config, audio=audio)

    for i, result in enumerate(response.results):
        alternative = result.alternatives[0]
        print("-" * 20)
        print(f"First alternative of result {i}")
        print(f"Transcript: {alternative.transcript}")

    return response
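As a usage sketch, the function can be called with a local 8,000 Hz LINEAR16 file; the path below is a placeholder:

response = transcribe_file_with_auto_punctuation("commercial_mono.wav")  # hypothetical path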

Additional languages

C#: Follow the C# setup instructions on the client libraries page and then visit the Speech-to-Text reference documentation for .NET.

PHP: Follow the PHP setup instructions on the client libraries page and then visit the Speech-to-Text reference documentation for PHP.

Ruby: Follow the Ruby setup instructions on the client libraries page and then visit the Speech-to-Text reference documentation for Ruby.

What's next

Learn how to make a synchronous transcription request.