Migrating to Python Client Library v0.27

The Client Library for Python v0.27 is a significant redesign compared to previous client libraries. The changes can be summarized as follows:

  • Consolidation of modules into fewer types

  • Replacing untyped parameters with strongly-typed classes and enumerations

This topic details the changes that you will need to make to your Speech-to-Text API Python code in order to use the v0.27 Python client library.

Running previous versions of the client library

You are not required to upgrade your Python client library to v0.27. However, new functionality in the Speech-to-Text API is only supported in v0.27 and later versions.

If you want to continue using a previous version of the Python client library and do not want to migrate your code, pin the version of the Python client library used by your app. To pin a specific library version, edit the requirements.txt file as in the following example:

google-cloud-speech==0.26
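A minimal sketch of pinning the version from the command line (the file location is illustrative; the pip command is shown for reference only):

```shell
# Write the pinned dependency into requirements.txt
echo "google-cloud-speech==0.26" > requirements.txt

# Verify the pin
cat requirements.txt

# Then install the pinned version with:
#   pip install -r requirements.txt
```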

Removed Modules

The following modules were removed in the Python Client Library v0.27 package.

  • google.cloud.speech.alternatives

  • google.cloud.speech.client

  • google.cloud.speech.encoding

  • google.cloud.speech.operation

  • google.cloud.speech.result

  • google.cloud.speech.sample

Required Code Changes

Imports

Include the new google.cloud.speech.types and google.cloud.speech.enums modules in order to access the new types and enumerations in the Python Client Library v0.27.

The types module contains the new classes that are required for creating requests, such as types.RecognitionAudio. The enums module contains the enumerations for specifying audio encodings. You can continue to use strings such as 'LINEAR16' to specify your audio encoding; however, we recommend that you use the enumerations in the enums module.

import io

from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types

Create a client

The Client class has been replaced with the SpeechClient class. Replace references to the Client class with SpeechClient.

Previous versions of the client libraries:

old_client = speech.Client()

Python Client Library v0.27:

client = speech.SpeechClient()

Constructing objects that represent audio content

To identify audio content from a local file or from a Google Cloud Storage URI, use the new RecognitionAudio and RecognitionConfig classes. Notice that parameters such as the language_code parameter are now passed as part of the RecognitionConfig class instead of being passed as a parameter to the API method.

Constructing objects that represent audio content from local file

The following example shows the new way to represent audio content from a local file.

Previous versions of the client libraries:

with io.open(file_name, 'rb') as audio_file:
    content = audio_file.read()

sample = old_client.sample(
    content,
    encoding='LINEAR16',
    sample_rate_hertz=16000)

Python Client Library v0.27:

with io.open(speech_file, 'rb') as audio_file:
    content = audio_file.read()

audio = types.RecognitionAudio(content=content)
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='en-US')

Constructing objects that represent audio content from Google Cloud Storage URI

The following example shows the new way to represent audio content from a Google Cloud Storage URI. gcs_uri is the URI to an audio file on Google Cloud Storage.

Previous versions of the client libraries:

sample = old_client.sample(
    source_uri=gcs_uri,
    encoding='LINEAR16',
    sample_rate_hertz=16000)

Python Client Library v0.27:

audio = types.RecognitionAudio(uri=gcs_uri)
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.FLAC,
    sample_rate_hertz=16000,
    language_code='en-US')

Making requests

Making a synchronous request

The following example shows the new way to make a synchronous recognize request.

Previous versions of the client libraries:

with io.open(file_name, 'rb') as audio_file:
    content = audio_file.read()

sample = old_client.sample(
    content,
    encoding='LINEAR16',
    sample_rate_hertz=16000)

alternatives = sample.recognize(language_code='en-US')

Python Client Library v0.27:

with io.open(speech_file, 'rb') as audio_file:
    content = audio_file.read()

audio = types.RecognitionAudio(content=content)
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='en-US')

response = client.recognize(config, audio)

Making an asynchronous request

The following example shows the new way to make an asynchronous recognize request.

Previous versions of the client libraries:

with io.open(file_name, 'rb') as audio_file:
    content = audio_file.read()

sample = old_client.sample(
    content,
    encoding='LINEAR16',
    sample_rate_hertz=16000)

operation = sample.long_running_recognize(language_code='en-US')

Python Client Library v0.27:

with io.open(speech_file, 'rb') as audio_file:
    content = audio_file.read()

audio = types.RecognitionAudio(content=content)
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='en-US')

operation = client.long_running_recognize(config, audio)

Making a streaming request

The following example shows the new way to make a streaming recognize request.

Previous versions of the client libraries:

with io.open(file_name, 'rb') as audio_file:
    sample = old_client.sample(
        stream=audio_file,
        encoding='LINEAR16',
        sample_rate_hertz=16000)

    alternatives = sample.streaming_recognize(language_code='en-US')

Python Client Library v0.27:

with io.open(stream_file, 'rb') as audio_file:
    content = audio_file.read()

# In practice, stream should be a generator yielding chunks of audio data.
stream = [content]
requests = (types.StreamingRecognizeRequest(audio_content=chunk)
            for chunk in stream)

config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='en-US')
streaming_config = types.StreamingRecognitionConfig(config=config)

# streaming_recognize returns a generator.
responses = client.streaming_recognize(streaming_config, requests)
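In a real application, audio usually arrives in pieces rather than as a single read. A minimal sketch of a chunking generator (the helper name and chunk size are illustrative, not part of the library):

```python
def audio_chunks(path, chunk_size=4096):
    """Yield successive chunks of raw bytes from an audio file."""
    with open(path, 'rb') as audio_file:
        while True:
            data = audio_file.read(chunk_size)
            if not data:
                return
            yield data
```

Each chunk would then be wrapped in a types.StreamingRecognizeRequest(audio_content=chunk), as in the example above.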

Processing responses

Processing synchronous recognition response

The following example shows the new way to process a synchronous recognition response.

Previous versions of the client libraries:

alternatives = sample.recognize(language_code='en-US')

for alternative in alternatives:
    print('Transcript: {}'.format(alternative.transcript))

Python Client Library v0.27:

response = client.recognize(config, audio)
# Each result is for a consecutive portion of the audio. Iterate through
# them to get the transcripts for the entire audio file.
for result in response.results:
    # The first alternative is the most likely one for this portion.
    print(u'Transcript: {}'.format(result.alternatives[0].transcript))
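Because each result covers a consecutive portion of the audio, the full transcript can be assembled by joining the top alternative of every result. A small helper sketch (the function name is ours; it only assumes a response shaped like the recognize() return value):

```python
def full_transcript(response):
    """Join the most likely transcript of each consecutive result."""
    return ''.join(result.alternatives[0].transcript
                   for result in response.results)
```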

Processing asynchronous recognition response

The following example shows the new way to process an asynchronous recognition response.

Previous versions of the client libraries:

operation = sample.long_running_recognize('en-US')

# Sleep and poll operation.complete
# ...

if operation.complete:
    alternatives = operation.results
    for alternative in alternatives:
        print('Transcript: {}'.format(alternative.transcript))
        print('Confidence: {}'.format(alternative.confidence))

Python Client Library v0.27:

operation = client.long_running_recognize(config, audio)

print('Waiting for operation to complete...')
response = operation.result(timeout=90)

# Each result is for a consecutive portion of the audio. Iterate through
# them to get the transcripts for the entire audio file.
for result in response.results:
    # The first alternative is the most likely one for this portion.
    print(u'Transcript: {}'.format(result.alternatives[0].transcript))
    print('Confidence: {}'.format(result.alternatives[0].confidence))

Processing streaming recognition response

The following example shows the new way to process a streaming recognition response.

Previous versions of the client libraries:

alternatives = sample.streaming_recognize('en-US')

for alternative in alternatives:
    print('Finished: {}'.format(alternative.is_final))
    print('Stability: {}'.format(alternative.stability))
    print('Confidence: {}'.format(alternative.confidence))
    print('Transcript: {}'.format(alternative.transcript))

Python Client Library v0.27:

responses = client.streaming_recognize(streaming_config, requests)

for response in responses:
    # Once the transcription has settled, the first result will contain the
    # is_final result. The other results will be for subsequent portions of
    # the audio.
    for result in response.results:
        print('Finished: {}'.format(result.is_final))
        print('Stability: {}'.format(result.stability))
        alternatives = result.alternatives
        # The alternatives are ordered from most likely to least.
        for alternative in alternatives:
            print('Confidence: {}'.format(alternative.confidence))
            print(u'Transcript: {}'.format(alternative.transcript))
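When only finalized hypotheses are needed, interim results can be filtered on is_final. A sketch of such a filter (the helper name is illustrative; it only assumes objects shaped like the streaming responses above):

```python
def final_transcripts(responses):
    """Collect the top transcript of every finalized streaming result."""
    transcripts = []
    for response in responses:
        for result in response.results:
            if result.is_final:
                transcripts.append(result.alternatives[0].transcript)
    return transcripts
```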