Dialogflow V2 API - Class Google::Cloud::Dialogflow::V2::StreamingRecognitionResult (v0.31.0)

Reference documentation and code samples for the Dialogflow V2 API class Google::Cloud::Dialogflow::V2::StreamingRecognitionResult.

Contains a speech recognition result corresponding to a portion of the audio that is currently being processed or an indication that this is the end of the single requested utterance.

While end-user audio is being processed, Dialogflow sends a series of results. Each result may contain a transcript value. A transcript represents a portion of the utterance. While the recognizer is processing audio, transcript values may be interim values or finalized values. Once a transcript is finalized, the is_final value is set to true and processing continues for the next transcript.

If StreamingDetectIntentRequest.query_input.audio_config.single_utterance was true, and the recognizer has completed processing audio, the message_type value is set to END_OF_SINGLE_UTTERANCE and the following (last) result contains the last finalized transcript.

The complete end-user utterance is determined by concatenating the finalized transcript values received for the series of results.

In the following example, single utterance is enabled. If single utterance were not enabled, result 7 would not occur.

Num | transcript              | message_type            | is_final
--- | ----------------------- | ----------------------- | --------
1   | "tube"                  | TRANSCRIPT              | false
2   | "to be a"               | TRANSCRIPT              | false
3   | "to be"                 | TRANSCRIPT              | false
4   | "to be or not to be"    | TRANSCRIPT              | true
5   | "that's"                | TRANSCRIPT              | false
6   | "that is"               | TRANSCRIPT              | false
7   | unset                   | END_OF_SINGLE_UTTERANCE | unset
8   | " that is the question" | TRANSCRIPT              | true

Concatenating the finalized transcripts with is_final set to true, the complete utterance becomes "to be or not to be that is the question".
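
To make the table concrete, here is a minimal sketch that rebuilds the same series of results as StreamingRecognitionResult messages and concatenates the finalized transcripts. It assumes the google-cloud-dialogflow-v2 gem is available; in a real application these results would arrive on the response stream of Sessions::Client#streaming_detect_intent rather than being constructed by hand.

    require "google/cloud/dialogflow/v2"

    # Rebuild the example series of results from the table above.
    result_class = Google::Cloud::Dialogflow::V2::StreamingRecognitionResult
    results = [
      { transcript: "tube",                  message_type: :TRANSCRIPT, is_final: false },
      { transcript: "to be a",               message_type: :TRANSCRIPT, is_final: false },
      { transcript: "to be",                 message_type: :TRANSCRIPT, is_final: false },
      { transcript: "to be or not to be",    message_type: :TRANSCRIPT, is_final: true  },
      { transcript: "that's",                message_type: :TRANSCRIPT, is_final: false },
      { transcript: "that is",               message_type: :TRANSCRIPT, is_final: false },
      { message_type: :END_OF_SINGLE_UTTERANCE },
      { transcript: " that is the question", message_type: :TRANSCRIPT, is_final: true  }
    ].map { |attrs| result_class.new(attrs) }

    # The complete utterance is the concatenation of the finalized transcripts.
    utterance = results
                .select { |r| r.message_type == :TRANSCRIPT && r.is_final }
                .map(&:transcript)
                .join
    puts utterance  # => "to be or not to be that is the question"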

Inherits

  • Object

Extended By

  • Google::Protobuf::MessageExts::ClassMethods

Includes

  • Google::Protobuf::MessageExts

Methods

#confidence

def confidence() -> ::Float
Returns
  • (::Float) — The Speech confidence between 0.0 and 1.0 for the current portion of audio. A higher number indicates an estimated greater likelihood that the recognized words are correct. The default of 0.0 is a sentinel value indicating that confidence was not set.

    This field is typically only provided if is_final is true and you should not rely on it being accurate or even set.

#confidence=

def confidence=(value) -> ::Float
Parameter
  • value (::Float) — The Speech confidence between 0.0 and 1.0 for the current portion of audio. A higher number indicates an estimated greater likelihood that the recognized words are correct. The default of 0.0 is a sentinel value indicating that confidence was not set.

    This field is typically only provided if is_final is true and you should not rely on it being accurate or even set.

Returns
  • (::Float) — The Speech confidence between 0.0 and 1.0 for the current portion of audio. A higher number indicates an estimated greater likelihood that the recognized words are correct. The default of 0.0 is a sentinel value indicating that confidence was not set.

    This field is typically only provided if is_final is true and you should not rely on it being accurate or even set.
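
As one way to honor both caveats above, the sketch below reads confidence only from finalized transcripts and treats the 0.0 sentinel as "not set"; report_confidence is an illustrative helper, not part of the API.

    # Illustrative helper (not part of the API): only trust confidence on
    # finalized transcripts, and treat the 0.0 default as "not reported"
    # rather than as zero confidence.
    def report_confidence(result)
      return unless result.is_final

      if result.confidence.zero?
        puts "#{result.transcript.inspect}: confidence not set"
      else
        puts format("%s: confidence %.2f", result.transcript.inspect, result.confidence)
      end
    end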

#is_final

def is_final() -> ::Boolean
Returns
  • (::Boolean) — If false, the StreamingRecognitionResult represents an interim result that may change. If true, the recognizer will not return any further hypotheses about this piece of the audio. May only be populated for message_type = TRANSCRIPT.

#is_final=

def is_final=(value) -> ::Boolean
Parameter
  • value (::Boolean) — If false, the StreamingRecognitionResult represents an interim result that may change. If true, the recognizer will not return any further hypotheses about this piece of the audio. May only be populated for message_type = TRANSCRIPT.
Returns
  • (::Boolean) — If false, the StreamingRecognitionResult represents an interim result that may change. If true, the recognizer will not return any further hypotheses about this piece of the audio. May only be populated for message_type = TRANSCRIPT.

#language_code

def language_code() -> ::String
Returns
  • (::String) — Detected language code for the transcript.

#language_code=

def language_code=(value) -> ::String
Parameter
  • value (::String) — Detected language code for the transcript.
Returns
  • (::String) — Detected language code for the transcript.

#message_type

def message_type() -> ::Google::Cloud::Dialogflow::V2::StreamingRecognitionResult::MessageType
Returns
  • (::Google::Cloud::Dialogflow::V2::StreamingRecognitionResult::MessageType) — Type of the result message.

#message_type=

def message_type=(value) -> ::Google::Cloud::Dialogflow::V2::StreamingRecognitionResult::MessageType
Parameter
  • value (::Google::Cloud::Dialogflow::V2::StreamingRecognitionResult::MessageType) — Type of the result message.
Returns
  • (::Google::Cloud::Dialogflow::V2::StreamingRecognitionResult::MessageType) — Type of the result message.

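The sketch below is one way a client might branch on the MessageType value; stop_sending_audio is a hypothetical placeholder for whatever logic half-closes your request stream once END_OF_SINGLE_UTTERANCE arrives.

    # Illustrative dispatch on the result's message type. stop_sending_audio
    # is a placeholder for your own logic that stops feeding audio into the
    # request stream once no more audio is needed.
    def handle_recognition_result(result)
      case result.message_type
      when :TRANSCRIPT
        label = result.is_final ? "final" : "interim"
        puts "#{label}: #{result.transcript}"
      when :END_OF_SINGLE_UTTERANCE
        stop_sending_audio
      end
    end
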
#speech_end_offset

def speech_end_offset() -> ::Google::Protobuf::Duration
Returns
  • (::Google::Protobuf::Duration) — Time offset of the end of this Speech recognition result relative to the beginning of the audio. Only populated for message_type = TRANSCRIPT.

#speech_end_offset=

def speech_end_offset=(value) -> ::Google::Protobuf::Duration
Parameter
  • value (::Google::Protobuf::Duration) — Time offset of the end of this Speech recognition result relative to the beginning of the audio. Only populated for message_type = TRANSCRIPT.
Returns
  • (::Google::Protobuf::Duration) — Time offset of the end of this Speech recognition result relative to the beginning of the audio. Only populated for message_type = TRANSCRIPT.
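
Because the offset is a Google::Protobuf::Duration, turning it into a floating-point number of seconds takes a small conversion; speech_end_seconds below is an illustrative helper, not part of the API.

    # Illustrative helper (not part of the API): convert the Duration offset
    # into a Float number of seconds from the start of the audio.
    def speech_end_seconds(result)
      offset = result.speech_end_offset
      return nil if offset.nil?

      offset.seconds + offset.nanos / 1_000_000_000.0
    end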

#speech_word_info

def speech_word_info() -> ::Array<::Google::Cloud::Dialogflow::V2::SpeechWordInfo>
Returns
  • (::Array<::Google::Cloud::Dialogflow::V2::SpeechWordInfo>) — Word-specific information for the words recognized by Speech in transcript. Populated if and only if message_type = TRANSCRIPT and word info is enabled in the request's InputAudioConfig.

#speech_word_info=

def speech_word_info=(value) -> ::Array<::Google::Cloud::Dialogflow::V2::SpeechWordInfo>
Parameter
  • value (::Array<::Google::Cloud::Dialogflow::V2::SpeechWordInfo>) — Word-specific information for the words recognized by Speech in transcript. Populated if and only if message_type = TRANSCRIPT and word info is enabled in the request's InputAudioConfig.
Returns
  • (::Array<::Google::Cloud::Dialogflow::V2::SpeechWordInfo>) — Word-specific information for the words recognized by Speech in transcript. Populated if and only if message_type = TRANSCRIPT and word info is enabled in the request's InputAudioConfig.

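Assuming word info was enabled in the request's InputAudioConfig, a sketch like the one below can walk the per-word results; the SpeechWordInfo accessors used here (word, start_offset, end_offset) come from that message's own reference page and should be verified there.

    # Illustrative helper (not part of the API): print per-word timings once
    # a transcript is finalized. The SpeechWordInfo accessors used here
    # (word, start_offset, end_offset) are documented on that class.
    def print_word_timings(result)
      return unless result.is_final

      result.speech_word_info.each do |info|
        from = info.start_offset.seconds + info.start_offset.nanos / 1e9
        to   = info.end_offset.seconds   + info.end_offset.nanos / 1e9
        puts format("%-20s %.2fs - %.2fs", info.word, from, to)
      end
    end
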
#transcript

def transcript() -> ::String
Returns
  • (::String) — Transcript text representing the words that the user spoke. Populated if and only if message_type = TRANSCRIPT.

#transcript=

def transcript=(value) -> ::String
Parameter
  • value (::String) — Transcript text representing the words that the user spoke. Populated if and only if message_type = TRANSCRIPT.
Returns
  • (::String) — Transcript text representing the words that the user spoke. Populated if and only if message_type = TRANSCRIPT.