Migrating to the Python Client Library v0.27

The Python Client Library v0.27 makes some significant changes to how previous client libraries were designed. These changes can be summarized as follows:

  • Consolidation of modules into fewer types

  • Replacement of untyped parameters with strongly typed classes and enumerations

This topic details the changes that you need to make to your Speech-to-Text API Python code in order to use the v0.27 Python client library.

Running prior versions of the client library

You are not required to upgrade your Python client library to v0.27. However, new functionality in the Speech-to-Text API is supported only in v0.27 and later versions.

If you want to continue using a prior version of the Python client library and do not want to migrate your code, you should specify the version of the Python client library that your app uses. To pin the library to a specific version, modify the requirements.txt file as follows:

google-cloud-speech==0.26

Removed modules

The following modules have been removed from the Python Client Library v0.27 package. If your code imports from any of them, see the sketch after this list.

  • google.cloud.speech.alternatives

  • google.cloud.speech.client

  • google.cloud.speech.encoding

  • google.cloud.speech.operation

  • google.cloud.speech.result

  • google.cloud.speech.sample
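
If your code imported names directly from any of these removed modules, update those imports to use the consolidated google.cloud.speech package together with the new types and enums modules described below. The before-and-after sketch that follows is illustrative; the old-style import is an assumption about how existing code may have referenced the removed encoding module:

# Before (v0.26 and earlier) -- assumes the code imported from a removed module:
#
#     from google.cloud.speech.encoding import Encoding
#     encoding = Encoding.LINEAR16

# After (v0.27) -- use the consolidated enums module instead:
from google.cloud.speech import enums

encoding = enums.RecognitionConfig.AudioEncoding.LINEAR16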

Required code changes

Imports

New google.cloud.speech.types and google.cloud.speech.enums modules have been added for accessing the new types and enums in the Python Client Library v0.27.

The types module contains the new classes required for creating requests, such as types.RecognitionAudio. The enums module contains the enumerations for specifying the audio encoding. You can continue to use strings such as 'LINEAR16' to specify the audio encoding, but we recommend that you use the enumerations from the enums module.

from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types
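
As noted above, the encoding can be specified either as a string or as an enum value; the enum form is the recommended one. A minimal sketch reusing the imports shown above:

# Specifying the encoding as a string still works...
config = types.RecognitionConfig(
    encoding='LINEAR16',
    sample_rate_hertz=16000,
    language_code='en-US')

# ...but the strongly typed enum is recommended.
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='en-US')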

Creating a client

The Client class has been replaced by the SpeechClient class. Replace references to the Client class with SpeechClient.

Previous versions of the client library

old_client = speech.Client()

Python Client Library v0.27

client = speech.SpeechClient()

Constructing objects that represent audio content

To specify the audio content to be recognized, whether it comes from a local file or from a Google Cloud Storage URI, use the new RecognitionAudio and RecognitionConfig classes. Note that parameters such as language_code are now passed as part of the RecognitionConfig class rather than as arguments to the API methods.

Constructing an object that represents audio content from a local file

The following example shows the new way to represent audio content from a local file.

Previous versions of the client library

with io.open(file_name, 'rb') as audio_file:
    content = audio_file.read()

sample = old_client.sample(
    content,
    encoding='LINEAR16',
    sample_rate_hertz=16000)

Python Client Library v0.27

with io.open(speech_file, 'rb') as audio_file:
    content = audio_file.read()

audio = types.RecognitionAudio(content=content)
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='en-US')

Constructing an object that represents audio content from a Google Cloud Storage URI

The following example shows the new way to represent audio content from a Google Cloud Storage URI. gcs_uri is the URI of an audio file in Google Cloud Storage.

Previous versions of the client library

sample = old_client.sample(
    source_uri=gcs_uri,
    encoding='LINEAR16',
    sample_rate_hertz=16000)

Python Client Library v0.27

audio = types.RecognitionAudio(uri=gcs_uri)
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.FLAC,
    sample_rate_hertz=16000,
    language_code='en-US')

Making requests

Making a synchronous request

The following example shows the new way to make a synchronous speech recognition request.

Previous versions of the client library

with io.open(file_name, 'rb') as audio_file:
    content = audio_file.read()

sample = old_client.sample(
    content,
    encoding='LINEAR16',
    sample_rate_hertz=16000)

alternatives = sample.recognize(language_code='en-US')

Python Client Library v0.27

with io.open(speech_file, 'rb') as audio_file:
    content = audio_file.read()

audio = types.RecognitionAudio(content=content)
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='en-US')

response = client.recognize(config, audio)

Making an asynchronous request

The following example shows the new way to make an asynchronous speech recognition request.

Previous versions of the client library

with io.open(file_name, 'rb') as audio_file:
    content = audio_file.read()

sample = old_client.sample(
    content,
    encoding='LINEAR16',
    sample_rate_hertz=16000)

operation = sample.long_running_recognize(language_code='en-US')

Python Client Library v0.27

with io.open(speech_file, 'rb') as audio_file:
    content = audio_file.read()

audio = types.RecognitionAudio(content=content)
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='en-US')

operation = client.long_running_recognize(config, audio)

Making a streaming request

The following example shows the new way to make a streaming speech recognition request.

Previous versions of the client library

with io.open(file_name, 'rb') as audio_file:
    sample = old_client.sample(
        stream=audio_file,
        encoding='LINEAR16',
        sample_rate_hertz=16000)

    alternatives = sample.streaming_recognize(language_code='en-US')

Python Client Library v0.27

with io.open(stream_file, 'rb') as audio_file:
    content = audio_file.read()

# In practice, stream should be a generator yielding chunks of audio data.
stream = [content]
requests = (types.StreamingRecognizeRequest(audio_content=chunk)
            for chunk in stream)

config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='en-US')
streaming_config = types.StreamingRecognitionConfig(config=config)

# streaming_recognize returns a generator.
responses = client.streaming_recognize(streaming_config, requests)
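
As the comment in the example notes, a real application usually feeds audio to the API in chunks rather than as a single buffer. The following is a minimal sketch of such a chunking generator; the helper name and chunk size are illustrative choices, not part of the library:

def audio_chunks(path, chunk_size=32 * 1024):
    """Yield successive chunks of raw audio bytes from a local file."""
    with io.open(path, 'rb') as audio_file:
        while True:
            chunk = audio_file.read(chunk_size)
            if not chunk:
                return
            yield chunk

requests = (types.StreamingRecognizeRequest(audio_content=chunk)
            for chunk in audio_chunks(stream_file))
responses = client.streaming_recognize(streaming_config, requests)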

Handling responses

Handling a synchronous recognition response

The following example shows the new way to handle the response to a synchronous recognition request.

Previous versions of the client library

alternatives = sample.recognize(language_code='en-US')

for alternative in alternatives:
    print('Transcript: {}'.format(alternative.transcript))

Python Client Library v0.27

response = client.recognize(config, audio)
# Each result is for a consecutive portion of the audio. Iterate through
# them to get the transcripts for the entire audio file.
for result in response.results:
    # The first alternative is the most likely one for this portion.
    print(u'Transcript: {}'.format(result.alternatives[0].transcript))
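
Putting the v0.27 snippets above together, a complete synchronous transcription of a local file might look like the following. This is a minimal end-to-end sketch, assuming the file contains 16 kHz, LINEAR16-encoded audio; the file path is a placeholder:

import io

from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types

file_name = 'your-audio-file.raw'  # placeholder; replace with your own file

client = speech.SpeechClient()

# Read the local audio file into memory.
with io.open(file_name, 'rb') as audio_file:
    content = audio_file.read()

audio = types.RecognitionAudio(content=content)
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='en-US')

response = client.recognize(config, audio)
# Each result covers a consecutive portion of the audio.
for result in response.results:
    print(u'Transcript: {}'.format(result.alternatives[0].transcript))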

Handling an asynchronous recognition response

The following example shows the new way to handle the response to an asynchronous recognition request.

Previous versions of the client library

operation = sample.long_running_recognize('en-US')

# Sleep and poll operation.complete
# ...

if operation.complete:
    alternatives = operation.results
    for alternative in alternatives:
        print('Transcript: {}'.format(alternative.transcript))
        print('Confidence: {}'.format(alternative.confidence))

Python Client Library v0.27

operation = client.long_running_recognize(config, audio)

print('Waiting for operation to complete...')
response = operation.result(timeout=90)

# Each result is for a consecutive portion of the audio. Iterate through
# them to get the transcripts for the entire audio file.
for result in response.results:
    # The first alternative is the most likely one for this portion.
    print(u'Transcript: {}'.format(result.alternatives[0].transcript))
    print('Confidence: {}'.format(result.alternatives[0].confidence))

Handling a streaming recognition response

The following example shows the new way to handle the responses to a streaming recognition request.

Previous versions of the client library

alternatives = sample.streaming_recognize('en-US')

for alternative in alternatives:
    print('Finished: {}'.format(alternative.is_final))
    print('Stability: {}'.format(alternative.stability))
    print('Confidence: {}'.format(alternative.confidence))
    print('Transcript: {}'.format(alternative.transcript))

Python Client Library v0.27

responses = client.streaming_recognize(streaming_config, requests)

for response in responses:
    # Once the transcription has settled, the first result will contain the
    # is_final result. The other results will be for subsequent portions of
    # the audio.
    for result in response.results:
        print('Finished: {}'.format(result.is_final))
        print('Stability: {}'.format(result.stability))
        alternatives = result.alternatives
        # The alternatives are ordered from most likely to least.
        for alternative in alternatives:
            print('Confidence: {}'.format(alternative.confidence))
            print(u'Transcript: {}'.format(alternative.transcript))