Class StreamingRecognitionResult (0.8.0)

Contains a speech recognition result corresponding to a portion of the audio that is currently being processed, or an indication that this is the end of the single requested utterance.

Example:


  1. transcript: "tube"

  2. transcript: "to be a"

  3. transcript: "to be"

  4. transcript: "to be or not to be" is_final: true

  5. transcript: " that's"

  6. transcript: " that is"

  7. message_type: END_OF_SINGLE_UTTERANCE

  8. transcript: " that is the question" is_final: true

Only two of the responses contain final results (#4 and #8, indicated by is_final: true). Concatenating their transcripts in order yields the full transcript: "to be or not to be that is the question".
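The concatenation described above can be sketched in a few lines. This is an illustrative sketch only: `MessageType`, `Result`, and `full_transcript` are hypothetical local stand-ins that mirror the field names used on this page, not the client library's own classes.

```python
from dataclasses import dataclass
from enum import Enum


class MessageType(Enum):
    # Hypothetical local mirror of the two message types named in the text.
    TRANSCRIPT = "TRANSCRIPT"
    END_OF_SINGLE_UTTERANCE = "END_OF_SINGLE_UTTERANCE"


@dataclass
class Result:
    # Hypothetical stand-in for StreamingRecognitionResult; not the client class.
    message_type: MessageType = MessageType.TRANSCRIPT
    transcript: str = ""
    is_final: bool = False


def full_transcript(results):
    """Join the transcripts of final TRANSCRIPT results in arrival order."""
    return "".join(
        r.transcript
        for r in results
        if r.message_type is MessageType.TRANSCRIPT and r.is_final
    )


# The eight responses from the example, in the order they arrive.
responses = [
    Result(transcript="tube"),
    Result(transcript="to be a"),
    Result(transcript="to be"),
    Result(transcript="to be or not to be", is_final=True),
    Result(transcript=" that's"),
    Result(transcript=" that is"),
    Result(message_type=MessageType.END_OF_SINGLE_UTTERANCE),
    Result(transcript=" that is the question", is_final=True),
]

print(full_transcript(responses))  # to be or not to be that is the question
```

The interim results (#1 to #3, #5, #6) are skipped because later results supersede them. The second final transcript begins with a space, so plain concatenation needs no separator.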

In each response we populate:

  • for TRANSCRIPT: transcript and possibly is_final.

  • for END_OF_SINGLE_UTTERANCE: only message_type.
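Because the populated fields depend on message_type, a consumer would typically check that field first. A minimal sketch follows, assuming hypothetical local stand-ins (`MessageType`, `Result`, `describe`) in place of the real client classes:

```python
from dataclasses import dataclass
from enum import Enum


class MessageType(Enum):
    # Hypothetical local mirror of the two message types named in the text.
    TRANSCRIPT = "TRANSCRIPT"
    END_OF_SINGLE_UTTERANCE = "END_OF_SINGLE_UTTERANCE"


@dataclass
class Result:
    # Hypothetical stand-in for StreamingRecognitionResult; not the client class.
    message_type: MessageType = MessageType.TRANSCRIPT
    transcript: str = ""
    is_final: bool = False


def describe(result):
    """Summarize one result, reading only the fields its type populates."""
    if result.message_type is MessageType.END_OF_SINGLE_UTTERANCE:
        # Only message_type is populated; transcript and is_final carry nothing.
        return "end of single utterance"
    # TRANSCRIPT: transcript is populated, is_final may or may not be true.
    kind = "final" if result.is_final else "interim"
    return f"{kind}: {result.transcript!r}"


print(describe(Result(transcript="to be a")))
print(describe(Result(transcript="to be or not to be", is_final=True)))
print(describe(Result(message_type=MessageType.END_OF_SINGLE_UTTERANCE)))
```

Checking message_type before touching transcript avoids mistaking the empty default transcript of an END_OF_SINGLE_UTTERANCE message for recognized silence.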

    transcript: Transcript text representing the words that the user spoke. Populated if and only if message_type = TRANSCRIPT.

    confidence: The Speech confidence between 0.0 and 1.0 for the current portion of audio. A higher number indicates an estimated greater likelihood that the recognized words are correct. The default of 0.0 is a sentinel value indicating that confidence was not set. This field is typically only provided if is_final is true, and you should not rely on it being accurate or even set.

    speech_word_info: Word-specific information for the words recognized by Speech in transcript. Populated if and only if message_type = TRANSCRIPT and InputAudioConfig.enable_word_info is set.
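Since a confidence of 0.0 is a sentinel for "not set" rather than a real score, calling code may want to convert it to an explicit missing value before comparing it against a threshold. The helper below is a sketch under that assumption; `usable_confidence` is a hypothetical name, not part of the library.

```python
from typing import Optional


def usable_confidence(confidence: float) -> Optional[float]:
    """Map the 0.0 sentinel to None; pass any other value through unchanged."""
    if confidence == 0.0:
        # Sentinel: the service did not set a confidence for this result.
        return None
    return confidence


print(usable_confidence(0.0))   # None
print(usable_confidence(0.87))  # 0.87
```

Returning None keeps an unset value from being read as "zero confidence" in checks such as `score is not None and score > 0.5`. Even a set value is only an estimate, so it is best used as a hint rather than a guarantee.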