public final class StreamingRecognitionResult extends GeneratedMessageV3 implements StreamingRecognitionResultOrBuilder
Contains a speech recognition result corresponding to a portion of the audio
that is currently being processed, or an indication that this is the end of
the single requested utterance.
While end-user audio is being processed, Dialogflow sends a series of
results. Each result may contain a transcript value. A transcript represents
a portion of the utterance. While the recognizer is processing audio,
transcript values may be interim or finalized values. Once a transcript is
finalized, the is_final value is set to true and processing continues for
the next transcript.
If StreamingDetectIntentRequest.query_input.audio.config.single_utterance
was true and the recognizer has completed processing audio, the message_type
value is set to END_OF_SINGLE_UTTERANCE and the following (last) result
contains the last finalized transcript.
The complete end-user utterance is determined by concatenating the finalized
transcript values received for the series of results.
In the following example, single utterance is enabled. If single utterance
were not enabled, result 7 would not occur.
Num | transcript              | message_type            | is_final
1   | "tube"                  | TRANSCRIPT              | false
2   | "to be a"               | TRANSCRIPT              | false
3   | "to be"                 | TRANSCRIPT              | false
4   | "to be or not to be"    | TRANSCRIPT              | true
5   | "that's"                | TRANSCRIPT              | false
6   | "that is"               | TRANSCRIPT              | false
7   | unset                   | END_OF_SINGLE_UTTERANCE | unset
8   | " that is the question" | TRANSCRIPT              | true
Concatenating the finalized transcripts (those with is_final set to true),
the complete utterance becomes "to be or not to be that is the question".
Protobuf type google.cloud.dialogflow.cx.v3.StreamingRecognitionResult
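The concatenation rule described above can be sketched with a minimal, self-contained example. Note that `Result` here is a hypothetical stand-in record for the generated `StreamingRecognitionResult` message (which exposes the same data via `getTranscript()` and `getIsFinal()`), so the sketch runs without the Dialogflow dependency:

```java
import java.util.List;
import java.util.stream.Collectors;

public class TranscriptAssembly {
    // Hypothetical stand-in for StreamingRecognitionResult: in real code you
    // would read result.getTranscript() and result.getIsFinal() instead.
    record Result(String transcript, boolean isFinal) {}

    // Concatenate only the finalized transcripts, as the class docs describe;
    // interim results are superseded by later ones and must be skipped.
    static String completeUtterance(List<Result> results) {
        return results.stream()
                .filter(Result::isFinal)
                .map(Result::transcript)
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        // The series from the example table above (END_OF_SINGLE_UTTERANCE
        // carries no transcript, so it is simply absent here).
        List<Result> series = List.of(
                new Result("tube", false),
                new Result("to be a", false),
                new Result("to be", false),
                new Result("to be or not to be", true),
                new Result("that's", false),
                new Result("that is", false),
                new Result(" that is the question", true));
        System.out.println(completeUtterance(series));
        // prints "to be or not to be that is the question"
    }
}
```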
Inherited Members
com.google.protobuf.GeneratedMessageV3.<ListT>makeMutableCopy(ListT)
Static Fields

CONFIDENCE_FIELD_NUMBER
public static final int CONFIDENCE_FIELD_NUMBER
Field Value
Type | Description
int |

IS_FINAL_FIELD_NUMBER
public static final int IS_FINAL_FIELD_NUMBER
Field Value
Type | Description
int |

LANGUAGE_CODE_FIELD_NUMBER
public static final int LANGUAGE_CODE_FIELD_NUMBER
Field Value
Type | Description
int |

MESSAGE_TYPE_FIELD_NUMBER
public static final int MESSAGE_TYPE_FIELD_NUMBER
Field Value
Type | Description
int |

SPEECH_END_OFFSET_FIELD_NUMBER
public static final int SPEECH_END_OFFSET_FIELD_NUMBER
Field Value
Type | Description
int |

SPEECH_WORD_INFO_FIELD_NUMBER
public static final int SPEECH_WORD_INFO_FIELD_NUMBER
Field Value
Type | Description
int |

STABILITY_FIELD_NUMBER
public static final int STABILITY_FIELD_NUMBER
Field Value
Type | Description
int |

TRANSCRIPT_FIELD_NUMBER
public static final int TRANSCRIPT_FIELD_NUMBER
Field Value
Type | Description
int |
Static Methods
getDefaultInstance()
public static StreamingRecognitionResult getDefaultInstance()
getDescriptor()
public static final Descriptors.Descriptor getDescriptor()
newBuilder()
public static StreamingRecognitionResult.Builder newBuilder()
newBuilder(StreamingRecognitionResult prototype)
public static StreamingRecognitionResult.Builder newBuilder(StreamingRecognitionResult prototype)
public static StreamingRecognitionResult parseDelimitedFrom(InputStream input)
public static StreamingRecognitionResult parseDelimitedFrom(InputStream input, ExtensionRegistryLite extensionRegistry)
parseFrom(byte[] data)
public static StreamingRecognitionResult parseFrom(byte[] data)
Parameter
Name | Description
data | byte[]
parseFrom(byte[] data, ExtensionRegistryLite extensionRegistry)
public static StreamingRecognitionResult parseFrom(byte[] data, ExtensionRegistryLite extensionRegistry)
parseFrom(ByteString data)
public static StreamingRecognitionResult parseFrom(ByteString data)
parseFrom(ByteString data, ExtensionRegistryLite extensionRegistry)
public static StreamingRecognitionResult parseFrom(ByteString data, ExtensionRegistryLite extensionRegistry)
public static StreamingRecognitionResult parseFrom(CodedInputStream input)
public static StreamingRecognitionResult parseFrom(CodedInputStream input, ExtensionRegistryLite extensionRegistry)
public static StreamingRecognitionResult parseFrom(InputStream input)
public static StreamingRecognitionResult parseFrom(InputStream input, ExtensionRegistryLite extensionRegistry)
parseFrom(ByteBuffer data)
public static StreamingRecognitionResult parseFrom(ByteBuffer data)
parseFrom(ByteBuffer data, ExtensionRegistryLite extensionRegistry)
public static StreamingRecognitionResult parseFrom(ByteBuffer data, ExtensionRegistryLite extensionRegistry)
parser()
public static Parser<StreamingRecognitionResult> parser()
Methods
equals(Object obj)
public boolean equals(Object obj)
Parameter
Name | Description
obj  | Object
Overrides
getConfidence()
public float getConfidence()
The Speech confidence between 0.0 and 1.0 for the current portion of audio.
A higher number indicates an estimated greater likelihood that the
recognized words are correct. The default of 0.0 is a sentinel value
indicating that confidence was not set.
This field is typically only provided if is_final is true and you should
not rely on it being accurate or even set.

float confidence = 4;
Returns
Type  | Description
float | The confidence.
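Because the docs state that 0.0 is a sentinel meaning the confidence was never set, callers should distinguish "absent" from a real score before using the value. A minimal sketch of that guard (the helper name is hypothetical, not part of the API):

```java
import java.util.OptionalDouble;

public class ConfidenceCheck {
    // Per the field docs, 0.0 means "confidence was not set", so treat it as
    // absent rather than as a genuine (very low) confidence score.
    static OptionalDouble confidenceIfSet(float confidence) {
        return confidence == 0.0f
                ? OptionalDouble.empty()
                : OptionalDouble.of(confidence);
    }
}
```

In real code the input would come from `result.getConfidence()`; remember the docs also warn not to rely on the value being accurate even when present.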
getDefaultInstanceForType()
public StreamingRecognitionResult getDefaultInstanceForType()
getIsFinal()
public boolean getIsFinal()
If false, the StreamingRecognitionResult represents an interim result that
may change. If true, the recognizer will not return any further hypotheses
about this piece of the audio. May only be populated for
message_type = TRANSCRIPT.

bool is_final = 3;
Returns
Type    | Description
boolean | The isFinal.
getLanguageCode()
public String getLanguageCode()
Detected language code for the transcript.
string language_code = 10;
Returns
Type   | Description
String | The languageCode.
getLanguageCodeBytes()
public ByteString getLanguageCodeBytes()
Detected language code for the transcript.
string language_code = 10;
Returns
Type       | Description
ByteString | The bytes for languageCode.
getMessageType()
public StreamingRecognitionResult.MessageType getMessageType()
Type of the result message.
.google.cloud.dialogflow.cx.v3.StreamingRecognitionResult.MessageType message_type = 1;
getMessageTypeValue()
public int getMessageTypeValue()
Type of the result message.
.google.cloud.dialogflow.cx.v3.StreamingRecognitionResult.MessageType message_type = 1;
Returns
Type | Description
int  | The enum numeric value on the wire for messageType.
getParserForType()
public Parser<StreamingRecognitionResult> getParserForType()
Overrides
getSerializedSize()
public int getSerializedSize()
Returns
Type | Description
int |
Overrides
getSpeechEndOffset()
public Duration getSpeechEndOffset()
Time offset of the end of this Speech recognition result relative to the
beginning of the audio. Only populated for message_type = TRANSCRIPT.

.google.protobuf.Duration speech_end_offset = 8;
Returns
Type     | Description
Duration | The speechEndOffset.
getSpeechEndOffsetOrBuilder()
public DurationOrBuilder getSpeechEndOffsetOrBuilder()
Time offset of the end of this Speech recognition result relative to the
beginning of the audio. Only populated for message_type = TRANSCRIPT.

.google.protobuf.Duration speech_end_offset = 8;
getSpeechWordInfo(int index)
public SpeechWordInfo getSpeechWordInfo(int index)
Word-specific information for the words recognized by Speech in transcript.
Populated if and only if message_type = TRANSCRIPT and
[InputAudioConfig.enable_word_info] is set.

repeated .google.cloud.dialogflow.cx.v3.SpeechWordInfo speech_word_info = 7;
Parameter
Name  | Description
index | int
getSpeechWordInfoCount()
public int getSpeechWordInfoCount()
Word-specific information for the words recognized by Speech in transcript.
Populated if and only if message_type = TRANSCRIPT and
[InputAudioConfig.enable_word_info] is set.

repeated .google.cloud.dialogflow.cx.v3.SpeechWordInfo speech_word_info = 7;
Returns
Type | Description
int |
getSpeechWordInfoList()
public List<SpeechWordInfo> getSpeechWordInfoList()
Word-specific information for the words recognized by Speech in transcript.
Populated if and only if message_type = TRANSCRIPT and
[InputAudioConfig.enable_word_info] is set.

repeated .google.cloud.dialogflow.cx.v3.SpeechWordInfo speech_word_info = 7;
getSpeechWordInfoOrBuilder(int index)
public SpeechWordInfoOrBuilder getSpeechWordInfoOrBuilder(int index)
Word-specific information for the words recognized by Speech in transcript.
Populated if and only if message_type = TRANSCRIPT and
[InputAudioConfig.enable_word_info] is set.

repeated .google.cloud.dialogflow.cx.v3.SpeechWordInfo speech_word_info = 7;
Parameter
Name  | Description
index | int
getSpeechWordInfoOrBuilderList()
public List<? extends SpeechWordInfoOrBuilder> getSpeechWordInfoOrBuilderList()
Word-specific information for the words recognized by Speech in transcript.
Populated if and only if message_type = TRANSCRIPT and
[InputAudioConfig.enable_word_info] is set.

repeated .google.cloud.dialogflow.cx.v3.SpeechWordInfo speech_word_info = 7;
Returns
Type | Description
List<? extends com.google.cloud.dialogflow.cx.v3.SpeechWordInfoOrBuilder> |
getStability()
public float getStability()
An estimate of the likelihood that the speech recognizer will not change
its guess about this interim recognition result:

- If the value is unspecified or 0.0, Dialogflow didn't compute the
  stability. In particular, Dialogflow will only provide stability for
  TRANSCRIPT results with is_final = false.
- Otherwise, the value is in (0.0, 1.0], where 0.0 means completely
  unstable and 1.0 means completely stable.

float stability = 6;
Returns
Type  | Description
float | The stability.
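A common use of stability is deciding whether an interim hypothesis is steady enough to display to the user. A minimal sketch of that gate, assuming an application-chosen threshold (the helper name and threshold value are hypothetical, not part of the API):

```java
public class StabilityGate {
    // Hypothetical application-specific threshold; tune per use case.
    static final float MIN_STABILITY = 0.8f;

    // Finalized text is always safe to show. For interim results, the docs
    // above define stability in (0.0, 1.0], with 0.0 meaning "not computed",
    // so values below the threshold (including the 0.0 sentinel) are skipped.
    static boolean shouldDisplayInterim(boolean isFinal, float stability) {
        if (isFinal) {
            return true;
        }
        return stability >= MIN_STABILITY;
    }
}
```

In real code the inputs would come from `result.getIsFinal()` and `result.getStability()`.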
getTranscript()
public String getTranscript()
Transcript text representing the words that the user spoke. Populated if
and only if message_type = TRANSCRIPT.

string transcript = 2;
Returns
Type   | Description
String | The transcript.
getTranscriptBytes()
public ByteString getTranscriptBytes()
Transcript text representing the words that the user spoke. Populated if
and only if message_type = TRANSCRIPT.

string transcript = 2;
Returns
Type       | Description
ByteString | The bytes for transcript.
getUnknownFields()
public final UnknownFieldSet getUnknownFields()
Overrides
hasSpeechEndOffset()
public boolean hasSpeechEndOffset()
Time offset of the end of this Speech recognition result relative to the
beginning of the audio. Only populated for message_type = TRANSCRIPT.

.google.protobuf.Duration speech_end_offset = 8;
Returns
Type    | Description
boolean | Whether the speechEndOffset field is set.
hashCode()
Returns
Type | Description
int |
Overrides
internalGetFieldAccessorTable()
protected GeneratedMessageV3.FieldAccessorTable internalGetFieldAccessorTable()
Overrides
isInitialized()
public final boolean isInitialized()
Overrides
newBuilderForType()
public StreamingRecognitionResult.Builder newBuilderForType()
newBuilderForType(GeneratedMessageV3.BuilderParent parent)
protected StreamingRecognitionResult.Builder newBuilderForType(GeneratedMessageV3.BuilderParent parent)
Overrides
newInstance(GeneratedMessageV3.UnusedPrivateParameter unused)
protected Object newInstance(GeneratedMessageV3.UnusedPrivateParameter unused)
Returns
Type   | Description
Object |
Overrides
toBuilder()
public StreamingRecognitionResult.Builder toBuilder()
writeTo(CodedOutputStream output)
public void writeTo(CodedOutputStream output)
Overrides