Speech-to-Text Client Libraries

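This page shows how to get started with the Cloud Client Libraries for the Speech-to-Text API. Client libraries make it easier to access Google Cloud APIs from a supported language. Although you can use Google Cloud APIs directly by making raw requests to the server, client libraries provide simplifications that significantly reduce the amount of code you need to write.

To learn more about the Cloud Client Libraries and the older Google API Client Libraries, see Client libraries explained.

Install the client library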

C#

If you are using Visual Studio 2017 or later, open the NuGet package manager window and enter the following:

Install-Package Google.Apis

If you are using the .NET Core command-line interface tools to install your dependencies, run the following command:

dotnet add package Google.Apis

For more information, see Setting Up a C# Development Environment.

Go

go get cloud.google.com/go/speech/apiv1

For more information, see Setting Up a Go Development Environment.

Java

If you are using Maven, add the following to your pom.xml file. For more information about BOMs, see The Google Cloud Platform Libraries BOM.

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.37.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-speech</artifactId>
  </dependency>
</dependencies>

If you are using Gradle, add the following to your dependencies:

implementation 'com.google.cloud:google-cloud-speech:4.36.0'

If you are using sbt, add the following to your dependencies:

libraryDependencies += "com.google.cloud" % "google-cloud-speech" % "4.36.0"

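If you are using Visual Studio Code, IntelliJ, or Eclipse, you can add client libraries to your project using the following IDE plugins:

Cloud Code for VS Code
Cloud Code for IntelliJ
Cloud Tools for Eclipse

These plugins provide additional functionality, such as key management for service accounts. For details, see each plugin's documentation.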

For more information, see Setting Up a Java Development Environment.

Node.js

npm install --save @google-cloud/speech

For more information, see Setting Up a Node.js Development Environment.

PHP

composer require google/apiclient

For more information, see Using PHP on Google Cloud.

Python

pip install --upgrade google-cloud-speech

For more information, see Setting Up a Python Development Environment.
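One way to confirm that the pip command above took effect is to ask the interpreter which version of the distribution it can see. The following is a minimal sketch using only the standard library; the helper name `installed_version` is an assumption made for illustration and is not part of the client library.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the active environment.
        return None


if __name__ == "__main__":
    version = installed_version("google-cloud-speech")
    if version is None:
        print("google-cloud-speech is not installed in this environment")
    else:
        print(f"google-cloud-speech {version} is installed")
```

Run the script with the same interpreter (or virtual environment) that you used for `pip install`, since each environment keeps its own set of packages.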

Ruby

gem install google-api-client

For more information, see Setting Up a Ruby Development Environment.

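Set up authentication

To authenticate calls to Google Cloud APIs, client libraries support Application Default Credentials (ADC). The libraries look for credentials in a set of defined locations and use those credentials to authenticate requests to the API. With ADC, you can make credentials available to your application in a variety of environments, such as local development or production, without modifying your application code.

For production environments, the way you set up ADC depends on the service and context. For more information, see Set up Application Default Credentials.

For a local development environment, you can set up ADC with the credentials that are associated with your Google Account:

1. Install and initialize the gcloud CLI.

   When you initialize the gcloud CLI, be sure to specify a Google Cloud project in which you have permission to access the resources your application needs.

2. Create your credential file: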
```
gcloud auth application-default login
```
A sign-in screen appears. After you sign in, your credentials are stored in the local credential file used by ADC.
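To make the "set of defined locations" more concrete, the sketch below mimics the local part of the ADC search order: first the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, then the file that `gcloud auth application-default login` writes. It is a simplified illustration and not the client library's actual implementation. The function name `find_local_adc_file` is hypothetical, only the Linux/macOS default path is handled, and the final ADC step (the service account attached to the running resource) is not modeled.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_local_adc_file(env: Mapping[str, str], home: Path) -> Optional[Path]:
    """Simplified lookup of a local ADC credential file (Linux/macOS paths only)."""
    # 1. An explicit path in GOOGLE_APPLICATION_CREDENTIALS takes priority.
    #    This sketch returns the path without checking that the file exists.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        return Path(explicit)

    # 2. Otherwise, look for the file created by
    #    `gcloud auth application-default login`.
    well_known = home / ".config" / "gcloud" / "application_default_credentials.json"
    if well_known.is_file():
        return well_known

    # 3. No local file found. Real ADC would next try the attached
    #    service account, which this sketch does not attempt.
    return None


if __name__ == "__main__":
    print(find_local_adc_file(os.environ, Path.home()))
```

In application code you normally do not need a lookup like this: constructing the client with no arguments, as the samples on this page do, lets the library resolve credentials through ADC on its own.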

Use the client library

The following examples show how to use the client library.

Go


// Sample speech-quickstart uses the Google Cloud Speech API to transcribe
// audio.
package main

import (
	"context"
	"fmt"
	"log"

	speech "cloud.google.com/go/speech/apiv1"
	"cloud.google.com/go/speech/apiv1/speechpb"
)

func main() {
	ctx := context.Background()

	// Creates a client.
	client, err := speech.NewClient(ctx)
	if err != nil {
		log.Fatalf("Failed to create client: %v", err)
	}
	defer client.Close()

	// The path to the remote audio file to transcribe.
	fileURI := "gs://cloud-samples-data/speech/brooklyn_bridge.raw"

	// Detects speech in the audio file.
	resp, err := client.Recognize(ctx, &speechpb.RecognizeRequest{
		Config: &speechpb.RecognitionConfig{
			Encoding:        speechpb.RecognitionConfig_LINEAR16,
			SampleRateHertz: 16000,
			LanguageCode:    "en-US",
		},
		Audio: &speechpb.RecognitionAudio{
			AudioSource: &speechpb.RecognitionAudio_Uri{Uri: fileURI},
		},
	})
	if err != nil {
		log.Fatalf("failed to recognize: %v", err)
	}

	// Prints the results.
	for _, result := range resp.Results {
		for _, alt := range result.Alternatives {
			fmt.Printf("\"%v\" (confidence=%.3f)\n", alt.Transcript, alt.Confidence)
		}
	}
}

Java

// Imports the Google Cloud client library
import com.google.cloud.speech.v1.RecognitionAudio;
import com.google.cloud.speech.v1.RecognitionConfig;
import com.google.cloud.speech.v1.RecognitionConfig.AudioEncoding;
import com.google.cloud.speech.v1.RecognizeResponse;
import com.google.cloud.speech.v1.SpeechClient;
import com.google.cloud.speech.v1.SpeechRecognitionAlternative;
import com.google.cloud.speech.v1.SpeechRecognitionResult;
import java.util.List;

public class QuickstartSample {

  /** Demonstrates using the Speech API to transcribe an audio file. */
  public static void main(String... args) throws Exception {
    // Instantiates a client
    try (SpeechClient speechClient = SpeechClient.create()) {

      // The path to the audio file to transcribe
      String gcsUri = "gs://cloud-samples-data/speech/brooklyn_bridge.raw";

      // Builds the sync recognize request
      RecognitionConfig config =
          RecognitionConfig.newBuilder()
              .setEncoding(AudioEncoding.LINEAR16)
              .setSampleRateHertz(16000)
              .setLanguageCode("en-US")
              .build();
      RecognitionAudio audio = RecognitionAudio.newBuilder().setUri(gcsUri).build();

      // Performs speech recognition on the audio file
      RecognizeResponse response = speechClient.recognize(config, audio);
      List<SpeechRecognitionResult> results = response.getResultsList();

      for (SpeechRecognitionResult result : results) {
        // There can be several alternative transcripts for a given chunk of speech. Just use the
        // first (most likely) one here.
        SpeechRecognitionAlternative alternative = result.getAlternativesList().get(0);
        System.out.printf("Transcription: %s%n", alternative.getTranscript());
      }
    }
  }
}

Node.js

// Imports the Google Cloud client library
const speech = require('@google-cloud/speech');

// Creates a client
const client = new speech.SpeechClient();

async function quickstart() {
  // The path to the remote LINEAR16 file
  const gcsUri = 'gs://cloud-samples-data/speech/brooklyn_bridge.raw';

  // The audio file's encoding, sample rate in hertz, and BCP-47 language code
  const audio = {
    uri: gcsUri,
  };
  const config = {
    encoding: 'LINEAR16',
    sampleRateHertz: 16000,
    languageCode: 'en-US',
  };
  const request = {
    audio: audio,
    config: config,
  };

  // Detects speech in the audio file
  const [response] = await client.recognize(request);
  const transcription = response.results
    .map(result => result.alternatives[0].transcript)
    .join('\n');
  console.log(`Transcription: ${transcription}`);
}
quickstart();

Python


# Imports the Google Cloud client library

from google.cloud import speech

def run_quickstart() -> speech.RecognizeResponse:
    # Instantiates a client
    client = speech.SpeechClient()

    # The Cloud Storage URI of the audio file to transcribe
    gcs_uri = "gs://cloud-samples-data/speech/brooklyn_bridge.raw"

    audio = speech.RecognitionAudio(uri=gcs_uri)

    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
    )

    # Detects speech in the audio file
    response = client.recognize(config=config, audio=audio)

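    for result in response.results:
        print(f"Transcript: {result.alternatives[0].transcript}")

    # Return the response so the function matches its declared return type.
    return response


if __name__ == "__main__":
    run_quickstart()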
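The Python sample above reads only the first alternative of each result. The sketch below shows the same traversal on stand-in data so it can be run without credentials or network access. The `Alternative` and `Result` dataclasses are hypothetical stand-ins that mirror only the `alternatives`, `transcript`, and `confidence` fields used on this page, the helper `best_transcript` is an assumption made for illustration, and the sample strings are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alternative:
    """Hypothetical stand-in for one recognition alternative."""
    transcript: str
    confidence: float


@dataclass
class Result:
    """Hypothetical stand-in for one recognition result."""
    alternatives: List[Alternative] = field(default_factory=list)


def best_transcript(results: List[Result]) -> str:
    """Join the first alternative of each result, skipping results with none."""
    return "\n".join(r.alternatives[0].transcript for r in results if r.alternatives)


if __name__ == "__main__":
    # Made-up data shaped like the fields the sample reads.
    fake_results = [
        Result([Alternative("first chunk", 0.9), Alternative("first chunks", 0.4)]),
        Result([]),  # a result without alternatives is skipped
        Result([Alternative("second chunk", 0.8)]),
    ]
    print(best_transcript(fake_results))
```

Guarding against an empty `alternatives` list avoids an `IndexError` that direct indexing with `[0]` would raise if a result carried no alternatives.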
