Migrating to Ruby Speech-to-Text API client library v0.30.0

The Ruby Speech-to-Text API Client Library v0.30.0 has significant design differences from previous client libraries.

The following changes have been made to the Speech-to-Text API library:

  • The client library is more consistent with the RPC API.

  • Content method parameters now distinguish between a local audio file and an audio file on Google Cloud Storage.

The following has not changed and has been carried over to the new client library:

  • The Stream class is still available, but is now returned from the streaming_recognize method.

This page details the changes that you need to make to your Ruby code in order to use the v0.30.0 Ruby client library.

Running previous versions of the client library

You are not required to upgrade your Ruby client library to v0.30. However, new functionality in the Speech-to-Text API is only supported in the v0.30 and later versions.

If you want to continue using a previous version of the Ruby client library and do not want to migrate your code, then you should pin the version of the Ruby client library used by your app. To pin a specific library version, edit your Gemfile as follows:

gem "google-cloud-speech", "~> 0.29.0"

Removed classes

The following classes have been removed in the Ruby Speech-to-Text API v0.30.0 gem.

  • Google::Cloud::Speech::Project

  • Google::Cloud::Speech::Audio

  • Google::Cloud::Speech::Result

  • Google::Cloud::Speech::InterimResult

  • Google::Cloud::Speech::Operation

Moved classes

The following classes are now namespaced by Speech-to-Text API version, which allows each class to be updated independently for each version of the API.

  • Google::Cloud::Speech::Stream is now namespaced under the API version Google::Cloud::Speech::V[API_VERSION]::Stream and is returned from Speech#streaming_recognize

  • Google::Cloud::Speech::Credentials is now namespaced under the API version Google::Cloud::Speech::V[API_VERSION]::Credentials

Both of these classes can be accessed from Google::Cloud::Speech, and a specific version of the Speech-to-Text API can be used by initializing the client as follows:

speech = Google::Cloud::Speech.new version: :v1

Required code changes

Gem dependency

There are no changes to the require path for the Ruby Speech-to-Text API Client Library v0.30.0 gem.

require "google/cloud/speech"

Create a client

When you create a Speech-to-Text API client using the Ruby Speech-to-Text API Client Library v0.30.0, you must make the following changes:

  • Remove the project parameter. The project parameter has been removed because the Speech-to-Text API doesn't include a project name in requests.
speech = Google::Cloud::Speech.new

Constructing requests that represent audio content

The following sections cover constructing requests using the Ruby Speech-to-Text API Client Library v0.30.0. The examples make requests using both a local audio file and a URI for a file from Google Cloud Storage.
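Because v0.30 accepts plain Ruby hashes for the config and audio arguments, both request shapes can be sketched with one small helper. (build_recognize_request is a hypothetical name for illustration, not part of the library.)

```ruby
# Hypothetical helper: builds the config and audio hashes that
# speech.recognize expects in v0.30. Not part of the library.
def build_recognize_request source
  config = { encoding:          :LINEAR16,
             sample_rate_hertz: 16_000,
             language_code:     "en-US" }

  # A Cloud Storage path becomes a uri reference; anything else is
  # treated as a local file and read in as raw bytes.
  audio = if source.start_with? "gs://"
            { uri: source }
          else
            { content: File.binread(source) }
          end

  [config, audio]
end

config, audio = build_recognize_request "gs://my-bucket/audio.raw"
# response = speech.recognize config, audio
```

The helper only decides between the two audio hash shapes; the config hash is identical in both cases.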

Constructing objects that represent audio content from local file

The following example shows the new way to represent audio content from a local file.

Previous versions of the client library:

audio = speech.audio audio_file_path, encoding:    :linear16,
                                      sample_rate: 16000,
                                      language:    "en-US"

Ruby Client Library v0.30:

# The raw audio
audio_file = File.binread file_name

# The audio file's encoding and sample rate
config = { encoding:          :LINEAR16,
           sample_rate_hertz: 16_000,
           language_code:     "en-US" }
audio  = { content: audio_file }

# Detects speech in the audio file
response = speech.recognize config, audio

Constructing objects that represent audio content from Google Cloud Storage URI

The following example shows the new way to represent audio content from a Google Cloud Storage URI. uri is the URI to an audio file on Google Cloud Storage.

Previous versions of the client library:

audio = speech.audio storage_path, encoding:    :linear16,
                                   sample_rate: 16000,
                                   language:    "en-US"

Ruby Client Library v0.30:

config = { encoding:          :LINEAR16,
           sample_rate_hertz: 16_000,
           language_code:     "en-US" }
audio  = { uri: storage_path }

response = speech.recognize config, audio
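Since audio[:uri] must point at a Cloud Storage object, a quick guard before sending the request can be sketched as follows. (gcs_uri? is a hypothetical helper, not part of the library.)

```ruby
# Hypothetical guard: checks that a path looks like a Cloud Storage
# URI of the form "gs://bucket/object". Not part of the library.
GCS_URI = %r{\Ags://[^/]+/.+\z}

def gcs_uri? path
  !!(path =~ GCS_URI)
end

puts gcs_uri?("gs://my-bucket/audio.raw") # => true
puts gcs_uri?("/local/audio.raw")         # => false
```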

Making requests

Making a synchronous request

The following example shows the new way to make a synchronous recognize request.

Previous versions of the client library:

audio = speech.audio audio_file_path, encoding:    :linear16,
                                      sample_rate: 16000,
                                      language:    "en-US"

results = audio.recognize

Ruby client library v0.30:

# The raw audio
audio_file = File.binread file_name

# The audio file's encoding and sample rate
config = { encoding:          :LINEAR16,
           sample_rate_hertz: 16_000,
           language_code:     "en-US" }
audio  = { content: audio_file }

# Detects speech in the audio file
response = speech.recognize config, audio

results = response.results

Making an asynchronous request

The following example shows the new way to make an asynchronous recognize request.

Previous versions of the client library:

audio = speech.audio audio_file_path, encoding:    :linear16,
                                      sample_rate: 16000,
                                      language:    "en-US"

operation = audio.process

puts "Operation started"

operation.wait_until_done!

results = operation.results

Ruby client library v0.30:

audio_file = File.binread audio_file_path
config     = { encoding:          :LINEAR16,
               sample_rate_hertz: 16_000,
               language_code:     "en-US" }
audio      = { content: audio_file }

operation = speech.long_running_recognize config, audio

puts "Operation started"

operation.wait_until_done!

raise operation.results.message if operation.error?

results = operation.response.results

Making a streaming request

The following example shows the new way to make a streaming recognize request.

Previous versions of the client library:

stream = speech.stream encoding:    :linear16,
                       sample_rate: 16000,
                       language:    "en-US"

audio_content = File.binread audio_file_path
bytes_total   = audio_content.size
bytes_sent    = 0
chunk_size    = 32000

# Send chunks of the audio content to the Speech API 1 second at a time
# This is an example of simulating microphone data.
while bytes_sent < bytes_total do
  stream.send audio_content[bytes_sent, chunk_size]
  bytes_sent += chunk_size
  sleep 1
end

# Signal the completion of audio content
stream.stop

stream.wait_until_complete!

results = stream.results

Ruby client library v0.30:

audio_content  = File.binread audio_file_path
bytes_total    = audio_content.size
bytes_sent     = 0
chunk_size     = 32_000

streaming_config = { config:          { encoding:                 :LINEAR16,
                                        sample_rate_hertz:        16_000,
                                        language_code:            "en-US",
                                        enable_word_time_offsets: true },
                     interim_results: true }

stream = speech.streaming_recognize streaming_config

# Simulated streaming from a microphone
# Stream bytes...
while bytes_sent < bytes_total
  stream.send audio_content[bytes_sent, chunk_size]
  bytes_sent += chunk_size
  sleep 1
end

puts "Stopped passing"
stream.stop

# Wait until processing is complete...
stream.wait_until_complete!

results = stream.results
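The 32_000-byte chunk size above is not arbitrary: for LINEAR16 (16-bit PCM) mono audio at 16,000 Hz, one second of audio is 16,000 samples × 2 bytes = 32,000 bytes, so each stream.send paired with sleep 1 roughly simulates real-time microphone input. A quick sanity check of that arithmetic:

```ruby
# One second of LINEAR16 (16-bit PCM) mono audio at 16 kHz:
sample_rate_hertz = 16_000 # samples per second
bytes_per_sample  = 2      # 16-bit samples

chunk_size = sample_rate_hertz * bytes_per_sample
puts chunk_size # => 32000

# Slicing a buffer into one-second chunks, as in the loop above:
audio_content = "\x00".b * 80_000 # 2.5 seconds of silence
chunks = (0...audio_content.bytesize).step(chunk_size).map do |offset|
  audio_content.byteslice(offset, chunk_size)
end
puts chunks.map(&:bytesize).inspect # => [32000, 32000, 16000]
```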

Processing responses

Processing synchronous recognition response

The following example shows the new way to process a synchronous recognition response.

Previous versions of the client library:

audio = speech.audio path, encoding:    :linear16,
                           sample_rate: 16000,
                           language:    "en-US"

results = audio.recognize

results.each do |result|
  puts "Transcription: #{result.transcript}"
end

Ruby client library v0.30:

audio_file = File.binread audio_file_path
config     = { encoding:          :LINEAR16,
               sample_rate_hertz: 16_000,
               language_code:     "en-US" }
audio      = { content: audio_file }

response = speech.recognize config, audio

results = response.results

alternatives = results.first.alternatives
alternatives.each do |alternative|
  puts "Transcription: #{alternative.transcript}"
end
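The example above reads only the first result, but a response may contain several results, each with multiple alternatives. Walking the whole response can be sketched as follows, using stand-in Structs and sample transcripts in place of the real API objects (in actual code, response comes from speech.recognize):

```ruby
# Stand-ins for the objects the gem returns; illustrative only.
Alternative = Struct.new(:transcript, :confidence)
Result      = Struct.new(:alternatives)
Response    = Struct.new(:results)

response = Response.new([
  Result.new([Alternative.new("how old is the Brooklyn Bridge", 0.98)]),
  Result.new([Alternative.new("it opened in 1883", 0.95)])
])

# Walk every result and every alternative, not just the first.
transcripts = response.results.flat_map do |result|
  result.alternatives.map(&:transcript)
end

transcripts.each { |t| puts "Transcription: #{t}" }
```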

Processing asynchronous recognition response

The following example shows the new way to process an asynchronous recognition response.

Previous versions of the client library:

audio = speech.audio path, encoding:    :linear16,
                           sample_rate: 16000,
                           language:    "en-US"

operation = audio.process

puts "Operation started"

operation.wait_until_done!

results = operation.results
results.each do |result|
  puts "Transcription: #{result.transcript}"
end

Ruby client library v0.30:

audio_file = File.binread audio_file_path
config     = { encoding:          :LINEAR16,
               sample_rate_hertz: 16_000,
               language_code:     "en-US" }
audio      = { content: audio_file }

operation = speech.long_running_recognize config, audio

puts "Operation started"

operation.wait_until_done!

raise operation.results.message if operation.error?

results = operation.response.results

alternatives = results.first.alternatives
alternatives.each do |alternative|
  puts "Transcription: #{alternative.transcript}"
end

Processing streaming recognition response

The following example shows the new way to process a streaming recognition response.

Previous versions of the client library:

stream = speech.stream encoding:    :linear16,
                       sample_rate: 16000,
                       language:    "en-US"

# Stream bytes...
# Wait until processing is complete...

results = stream.results
results.each do |result|
  puts "Transcript: #{result.transcript}"
end

Ruby client library v0.30:

audio_content  = File.binread audio_file_path
bytes_total    = audio_content.size
bytes_sent     = 0
chunk_size     = 32_000

streaming_config = { config:          { encoding:                 :LINEAR16,
                                        sample_rate_hertz:        16_000,
                                        language_code:            "en-US",
                                        enable_word_time_offsets: true },
                     interim_results: true }

stream = speech.streaming_recognize streaming_config

# Simulated streaming from a microphone
# Stream bytes...
while bytes_sent < bytes_total
  stream.send audio_content[bytes_sent, chunk_size]
  bytes_sent += chunk_size
  sleep 1
end

puts "Stopped passing"
stream.stop

# Wait until processing is complete...
stream.wait_until_complete!

results = stream.results

alternatives = results.first.alternatives
alternatives.each do |alternative|
  puts "Transcript: #{alternative.transcript}"
end