Summary of entries of Classes for speech.
Classes
AdaptationAsyncClient
Service that implements Google Cloud Speech Adaptation API.
AdaptationClient
Service that implements Google Cloud Speech Adaptation API.
ListCustomClassesAsyncPager
A pager for iterating through list_custom_classes requests.
This class thinly wraps an initial ListCustomClassesResponse object, and provides an __aiter__ method to iterate through its custom_classes field.
If there are more pages, the __aiter__ method will make additional ListCustomClasses requests and continue to iterate through the custom_classes field on the corresponding responses.
All the usual ListCustomClassesResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListCustomClassesPager
A pager for iterating through list_custom_classes requests.
This class thinly wraps an initial ListCustomClassesResponse object, and provides an __iter__ method to iterate through its custom_classes field.
If there are more pages, the __iter__ method will make additional ListCustomClasses requests and continue to iterate through the custom_classes field on the corresponding responses.
All the usual ListCustomClassesResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
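The paging behaviour described for these pagers can be sketched in plain Python. This is only an illustration under stated assumptions: SketchPager, PAGES, and the dict-shaped responses are hypothetical stand-ins, not the library's classes, and no network calls are made.

```python
from typing import Callable, Dict, Iterator


class SketchPager:
    """Hypothetical stand-in for a list pager; not the library class."""

    def __init__(self, fetch_page: Callable[[str], Dict], first_response: Dict):
        self._fetch_page = fetch_page
        # Only the most recent response is retained.
        self._response = first_response

    def __getattr__(self, name):
        # Attribute lookups are answered from the most recent response.
        response = self.__dict__.get("_response", {})
        try:
            return response[name]
        except KeyError:
            raise AttributeError(name)

    def __iter__(self) -> Iterator[str]:
        while True:
            yield from self._response["custom_classes"]
            token = self._response["next_page_token"]
            if not token:
                return
            # More pages exist: issue another (simulated) list request.
            self._response = self._fetch_page(token)


# Simulated server pages, keyed by page token (illustrative data only).
PAGES = {
    "": {"custom_classes": ["ships", "ports"], "next_page_token": "p2"},
    "p2": {"custom_classes": ["captains"], "next_page_token": ""},
}

pager = SketchPager(PAGES.__getitem__, PAGES[""])
names = list(pager)
```

After the loop finishes, an attribute such as next_page_token reflects the last page fetched, which mirrors the "only the most recent response is retained" note above.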
ListPhraseSetAsyncPager
A pager for iterating through list_phrase_set requests.
This class thinly wraps an initial ListPhraseSetResponse object, and provides an __aiter__ method to iterate through its phrase_sets field.
If there are more pages, the __aiter__ method will make additional ListPhraseSet requests and continue to iterate through the phrase_sets field on the corresponding responses.
All the usual ListPhraseSetResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListPhraseSetPager
A pager for iterating through list_phrase_set requests.
This class thinly wraps an initial ListPhraseSetResponse object, and provides an __iter__ method to iterate through its phrase_sets field.
If there are more pages, the __iter__ method will make additional ListPhraseSet requests and continue to iterate through the phrase_sets field on the corresponding responses.
All the usual ListPhraseSetResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
SpeechAsyncClient
Service that implements Google Cloud Speech API.
SpeechClient
Service that implements Google Cloud Speech API.
CreateCustomClassRequest
Message sent by the client for the CreateCustomClass method.
CreatePhraseSetRequest
Message sent by the client for the CreatePhraseSet method.
CustomClass
A set of words or phrases that represents a common concept likely to appear in your audio, for example a list of passenger ship names. CustomClass items can be substituted into placeholders that you set in PhraseSet phrases.
ClassItem
An item of the class.
DeleteCustomClassRequest
Message sent by the client for the DeleteCustomClass method.
DeletePhraseSetRequest
Message sent by the client for the DeletePhraseSet method.
GetCustomClassRequest
Message sent by the client for the GetCustomClass method.
GetPhraseSetRequest
Message sent by the client for the GetPhraseSet method.
ListCustomClassesRequest
Message sent by the client for the ListCustomClasses method.
ListCustomClassesResponse
Message returned to the client by the ListCustomClasses method.
ListPhraseSetRequest
Message sent by the client for the ListPhraseSet method.
ListPhraseSetResponse
Message returned to the client by the ListPhraseSet method.
LongRunningRecognizeMetadata
Describes the progress of a long-running LongRunningRecognize call. It is included in the metadata field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.
LongRunningRecognizeRequest
The top-level message sent by the client for the LongRunningRecognize method.
LongRunningRecognizeResponse
The only message returned to the client by the LongRunningRecognize method. It contains the result as zero or more sequential SpeechRecognitionResult messages. It is included in the result.response field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.
PhraseSet
Provides "hints" to the speech recognizer to favor specific words and phrases in the results.
Phrase
A phrase containing words and phrase "hints" so that the speech recognition is more likely to recognize them. This can be used to improve the accuracy for specific words and phrases, for example, if specific commands are typically spoken by the user. This can also be used to add additional words to the vocabulary of the recognizer. See usage limits (https://cloud.google.com/speech-to-text/quotas#content).
List items can also include pre-built or custom classes containing groups of words that represent common concepts that occur in natural language. For example, rather than providing a phrase hint for every month of the year (e.g. "i was born in january", "i was born in february", ...), using the pre-built $MONTH class improves the likelihood of correctly transcribing audio that includes months (e.g. "i was born in $month"). To refer to pre-built classes, use the class' symbol prepended with $, e.g. $MONTH. To refer to custom classes that were defined inline in the request, set the class's custom_class_id to a string unique to all class resources and inline classes. Then use the class' id wrapped in ${...}, e.g. "${my-months}". To refer to custom class resources, use the class' id wrapped in ${} (e.g. ${my-months}).
Speech-to-Text supports three locations: global, us (US North America), and eu (Europe). If you are calling the speech.googleapis.com endpoint, use the global location. To specify a region, use a regional endpoint (https://cloud.google.com/speech-to-text/docs/endpoints) with a matching us or eu location value.
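As a rough illustration of the class placeholder syntax described for Phrase, the sketch below builds phrase hint strings. The helper class_token is hypothetical (it is not part of the library), and my-months is the example id used above.

```python
def class_token(class_id: str, prebuilt: bool = False) -> str:
    """Return the placeholder for a class inside a phrase hint.

    Hypothetical helper for illustration only.
    """
    if prebuilt:
        # Pre-built classes: the symbol prepended with "$".
        return f"${class_id}"
    # Custom classes: the id wrapped in "${" and "}".
    return "${" + class_id + "}"


phrases = [
    f"i was born in {class_token('MONTH', prebuilt=True)}",
    f"i was born in {class_token('my-months')}",
]
```

Here phrases holds one hint that uses the pre-built $MONTH class and one that uses a custom class id.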
RecognitionAudio
Contains audio data in the encoding specified in the RecognitionConfig. Either content or uri must be supplied. Supplying both or neither returns google.rpc.Code.INVALID_ARGUMENT. See content limits (https://cloud.google.com/speech-to-text/quotas#content).
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members. See https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
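The "setting one member clears the others" rule for oneof fields can be illustrated with a toy class. This is a sketch under assumptions: AudioSource is a hypothetical plain-Python stand-in, not the library's RecognitionAudio, and a real message may report a field default rather than None for a cleared member.

```python
class AudioSource:
    """Toy stand-in for a message with a oneof over `content` and `uri`."""

    _ONEOF = ("content", "uri")

    def __init__(self):
        # Bypass our own __setattr__ to create the backing store.
        object.__setattr__(self, "_values", {})

    def __setattr__(self, name, value):
        if name not in self._ONEOF:
            raise AttributeError(name)
        # Setting one member clears every other member of the oneof.
        self._values.clear()
        self._values[name] = value

    def __getattr__(self, name):
        # Called only for names not found normally, i.e. the oneof members.
        if name in AudioSource._ONEOF:
            return self._values.get(name)
        raise AttributeError(name)


audio = AudioSource()
audio.content = b"raw audio bytes"
audio.uri = "gs://example-bucket/audio.flac"  # illustrative URI; clears `content`
```

After the second assignment only uri is set, so at most one member of the oneof holds a value at any time.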
RecognitionConfig
Provides information to the recognizer that specifies how to process the request.
AudioEncoding
The encoding of the audio data sent in the request.
All encodings support only 1 channel (mono) audio, unless the audio_channel_count and enable_separate_recognition_per_channel fields are set.
For best results, the audio source should be captured and transmitted using a lossless encoding (FLAC or LINEAR16). The accuracy of the speech recognition can be reduced if lossy codecs are used to capture or transmit audio, particularly if background noise is present. Lossy codecs include MULAW, AMR, AMR_WB, OGG_OPUS, SPEEX_WITH_HEADER_BYTE, MP3, and WEBM_OPUS.
The FLAC and WAV audio file formats include a header that describes the included audio content. You can request recognition for WAV files that contain either LINEAR16 or MULAW encoded audio. If you send FLAC or WAV audio file format in your request, you do not need to specify an AudioEncoding; the audio encoding format is determined from the file header. If you specify an AudioEncoding when you send FLAC or WAV audio, the encoding configuration must match the encoding described in the audio header; otherwise the request returns a google.rpc.Code.INVALID_ARGUMENT error code.
RecognitionMetadata
Description of audio data to be recognized.
InteractionType
Use case categories that the audio recognition request can be described by.
MicrophoneDistance
Enumerates the types of capture settings describing an audio file.
OriginalMediaType
The original media the speech was recorded on.
RecordingDeviceType
The type of device the speech was recorded with.
RecognizeRequest
The top-level message sent by the client for the Recognize method.
RecognizeResponse
The only message returned to the client by the Recognize method. It contains the result as zero or more sequential SpeechRecognitionResult messages.
SpeakerDiarizationConfig
Config to enable speaker diarization.
SpeechAdaptation
Speech adaptation configuration.
ABNFGrammar
SpeechAdaptationInfo
Information on speech adaptation use in results.
SpeechContext
Provides "hints" to the speech recognizer to favor specific words and phrases in the results.
SpeechRecognitionAlternative
Alternative hypotheses (a.k.a. n-best list).
SpeechRecognitionResult
A speech recognition result corresponding to a portion of the audio.
StreamingRecognitionConfig
Provides information to the recognizer that specifies how to process the request.
VoiceActivityTimeout
Events that a timeout can be set on for voice activity.
StreamingRecognitionResult
A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.
StreamingRecognizeRequest
The top-level message sent by the client for the StreamingRecognize method. Multiple StreamingRecognizeRequest messages are sent. The first message must contain a streaming_config message and must not contain audio_content. All subsequent messages must contain audio_content and must not contain a streaming_config message.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members. See https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
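The ordering rule for StreamingRecognizeRequest messages (configuration first, audio afterwards) can be sketched as a generator. Plain dicts stand in for the request messages here, and build_requests, the config contents, and the audio chunks are hypothetical names and data chosen for illustration.

```python
from typing import Dict, Iterable, Iterator


def build_requests(
    streaming_config: dict, chunks: Iterable[bytes]
) -> Iterator[Dict[str, object]]:
    """Yield request-shaped dicts in the order the stream expects."""
    # First message: configuration only, no audio.
    yield {"streaming_config": streaming_config}
    # Every later message: audio only, no configuration.
    for chunk in chunks:
        yield {"audio_content": chunk}


requests = list(build_requests({"interim_results": True}, [b"chunk-1", b"chunk-2"]))
```

Because each message carries exactly one of the two keys, the sketch also respects the oneof constraint described above.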
StreamingRecognizeResponse
StreamingRecognizeResponse is the only message returned to the client by StreamingRecognize. A series of zero or more StreamingRecognizeResponse messages are streamed back to the client. If there is no recognizable audio, and single_utterance is set to false, then no messages are streamed back to the client.
Here's an example of a series of StreamingRecognizeResponses that might be returned while processing audio:
results { alternatives { transcript: "tube" } stability: 0.01 }
results { alternatives { transcript: "to be a" } stability: 0.01 }
results { alternatives { transcript: "to be" } stability: 0.9 } results { alternatives { transcript: " or not to be" } stability: 0.01 }
results { alternatives { transcript: "to be or not to be" confidence: 0.92 } alternatives { transcript: "to bee or not to bee" } is_final: true }
results { alternatives { transcript: " that's" } stability: 0.01 }
results { alternatives { transcript: " that is" } stability: 0.9 } results { alternatives { transcript: " the question" } stability: 0.01 }
results { alternatives { transcript: " that is the question" confidence: 0.98 } alternatives { transcript: " that was the question" } is_final: true }
Notes:
Only two of the above responses, #4 and #7, contain final results; they are indicated by is_final: true. Concatenating these together generates the full transcript: "to be or not to be that is the question".
The others contain interim results. #3 and #6 contain two interim results: the first portion has a high stability and is less likely to change; the second portion has a low stability and is very likely to change. A UI designer might choose to show only high stability results.
The specific stability and confidence values shown above are only for illustrative purposes. Actual values may vary.
In each response, only one of these fields will be set: error, speech_event_type, or one or more (repeated) results.
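The note about concatenating final results can be checked with a short sketch. It models the seven example responses as plain lists of dicts (not the library's message types), keeps only the transcripts, and assumes the first alternative of each final result is the one to use; final_transcript is a hypothetical helper.

```python
from typing import Any, Dict, List

Result = Dict[str, Any]


def final_transcript(responses: List[List[Result]]) -> str:
    """Join the first alternative of every result marked is_final."""
    parts = []
    for results in responses:
        for result in results:
            if result.get("is_final"):
                # Assumption: the first alternative is the top hypothesis.
                parts.append(result["alternatives"][0])
    return "".join(parts)


# The seven example responses above, one inner list per response.
responses = [
    [{"alternatives": ["tube"], "stability": 0.01}],
    [{"alternatives": ["to be a"], "stability": 0.01}],
    [
        {"alternatives": ["to be"], "stability": 0.9},
        {"alternatives": [" or not to be"], "stability": 0.01},
    ],
    [{"alternatives": ["to be or not to be", "to bee or not to bee"], "is_final": True}],
    [{"alternatives": [" that's"], "stability": 0.01}],
    [
        {"alternatives": [" that is"], "stability": 0.9},
        {"alternatives": [" the question"], "stability": 0.01},
    ],
    [{"alternatives": [" that is the question", " that was the question"], "is_final": True}],
]

transcript = final_transcript(responses)
```

Interim results are skipped entirely in this sketch; a UI that wants to show stable interim text could filter on stability instead.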
SpeechEventType
Indicates the type of speech event.
TranscriptNormalization
Transcription normalization configuration. Use transcription normalization to automatically replace parts of the transcript with phrases of your choosing. For StreamingRecognize, this normalization only applies to stable partial transcripts (stability > 0.8) and final transcripts.
Entry
A single replacement configuration.
TranscriptOutputConfig
Specifies an optional destination for the recognition results.
UpdateCustomClassRequest
Message sent by the client for the UpdateCustomClass method.
UpdatePhraseSetRequest
Message sent by the client for the UpdatePhraseSet method.
WordInfo
Word-specific information for recognized words.
SpeechAsyncClient
Enables speech transcription and resource management.
SpeechClient
Enables speech transcription and resource management.
ListCustomClassesAsyncPager
A pager for iterating through list_custom_classes requests.
This class thinly wraps an initial ListCustomClassesResponse object, and provides an __aiter__ method to iterate through its custom_classes field.
If there are more pages, the __aiter__ method will make additional ListCustomClasses requests and continue to iterate through the custom_classes field on the corresponding responses.
All the usual ListCustomClassesResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListCustomClassesPager
A pager for iterating through list_custom_classes requests.
This class thinly wraps an initial ListCustomClassesResponse object, and provides an __iter__ method to iterate through its custom_classes field.
If there are more pages, the __iter__ method will make additional ListCustomClasses requests and continue to iterate through the custom_classes field on the corresponding responses.
All the usual ListCustomClassesResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListPhraseSetsAsyncPager
A pager for iterating through list_phrase_sets requests.
This class thinly wraps an initial ListPhraseSetsResponse object, and provides an __aiter__ method to iterate through its phrase_sets field.
If there are more pages, the __aiter__ method will make additional ListPhraseSets requests and continue to iterate through the phrase_sets field on the corresponding responses.
All the usual ListPhraseSetsResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListPhraseSetsPager
A pager for iterating through list_phrase_sets requests.
This class thinly wraps an initial ListPhraseSetsResponse object, and provides an __iter__ method to iterate through its phrase_sets field.
If there are more pages, the __iter__ method will make additional ListPhraseSets requests and continue to iterate through the phrase_sets field on the corresponding responses.
All the usual ListPhraseSetsResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListRecognizersAsyncPager
A pager for iterating through list_recognizers requests.
This class thinly wraps an initial ListRecognizersResponse object, and provides an __aiter__ method to iterate through its recognizers field.
If there are more pages, the __aiter__ method will make additional ListRecognizers requests and continue to iterate through the recognizers field on the corresponding responses.
All the usual ListRecognizersResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListRecognizersPager
A pager for iterating through list_recognizers requests.
This class thinly wraps an initial ListRecognizersResponse object, and provides an __iter__ method to iterate through its recognizers field.
If there are more pages, the __iter__ method will make additional ListRecognizers requests and continue to iterate through the recognizers field on the corresponding responses.
All the usual ListRecognizersResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
AutoDetectDecodingConfig
Automatically detected decoding parameters. Supported for the following encodings:
WAV_LINEAR16: 16-bit signed little-endian PCM samples in a WAV container.
WAV_MULAW: 8-bit companded mulaw samples in a WAV container.
WAV_ALAW: 8-bit companded alaw samples in a WAV container.
RFC4867_5_AMR: AMR frames with an rfc4867.5 header.
RFC4867_5_AMRWB: AMR-WB frames with an rfc4867.5 header.
FLAC: FLAC frames in the "native FLAC" container format.
MP3: MPEG audio frames with optional (ignored) ID3 metadata.
OGG_OPUS: Opus audio frames in an Ogg container.
WEBM_OPUS: Opus audio frames in a WebM container.
M4A: M4A audio format.
BatchRecognizeFileMetadata
Metadata about a single file in a batch for BatchRecognize.
BatchRecognizeFileResult
Final results for a single file.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members. See https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
BatchRecognizeMetadata
Operation metadata for BatchRecognize.
TranscriptionMetadataEntry
The abstract base class for a message.
BatchRecognizeRequest
Request message for the BatchRecognize method.
ProcessingStrategy
Possible processing strategies for batch requests.
BatchRecognizeResponse
Response message for BatchRecognize that is packaged into a longrunning Operation (google.longrunning.Operation).
ResultsEntry
The abstract base class for a message.
BatchRecognizeResults
Output type for Cloud Storage of BatchRecognize transcripts. Though this proto isn't returned in this API anywhere, the Cloud Storage transcripts will be this proto serialized and should be parsed as such.
BatchRecognizeTranscriptionMetadata
Metadata about transcription for a single file (for example, progress percent).
CloudStorageResult
Final results written to Cloud Storage.
Config
Message representing the config for the Speech-to-Text API. This includes an optional KMS key (https://cloud.google.com/kms/docs/resource-hierarchy#keys) with which incoming data will be encrypted.
CreateCustomClassRequest
Request message for the CreateCustomClass method.
CreatePhraseSetRequest
Request message for the CreatePhraseSet method.
CreateRecognizerRequest
Request message for the CreateRecognizer method.
CustomClass
CustomClass for biasing in speech recognition. Used to define a set of words or phrases that represents a common concept or theme likely to appear in your audio, for example a list of passenger ship names.
AnnotationsEntry
The abstract base class for a message.
ClassItem
An item of the class.
State
Set of states that define the lifecycle of a CustomClass.
DeleteCustomClassRequest
Request message for the DeleteCustomClass method.
DeletePhraseSetRequest
Request message for the DeletePhraseSet method.
DeleteRecognizerRequest
Request message for the DeleteRecognizer method.
ExplicitDecodingConfig
Explicitly specified decoding parameters.
AudioEncoding
Supported audio data encodings.
GcsOutputConfig
Output configurations for Cloud Storage.
GetConfigRequest
Request message for the GetConfig method.
GetCustomClassRequest
Request message for the GetCustomClass method.
GetPhraseSetRequest
Request message for the GetPhraseSet method.
GetRecognizerRequest
Request message for the GetRecognizer method.
InlineOutputConfig
Output configurations for inline response.
InlineResult
Final results returned inline in the recognition response.
ListCustomClassesRequest
Request message for the ListCustomClasses method.
ListCustomClassesResponse
Response message for the ListCustomClasses method.
ListPhraseSetsRequest
Request message for the ListPhraseSets method.
ListPhraseSetsResponse
Response message for the ListPhraseSets method.
ListRecognizersRequest
Request message for the ListRecognizers method.
ListRecognizersResponse
Response message for the ListRecognizers method.
NativeOutputFileFormatConfig
Output configurations for serialized BatchRecognizeResults protos.
OperationMetadata
Represents the metadata of a long-running operation.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.
OutputFormatConfig
Configuration for the format of the results stored to output.
PhraseSet
PhraseSet for biasing in speech recognition. A PhraseSet is used to provide "hints" to the speech recognizer to favor specific words and phrases in the results.
AnnotationsEntry
The abstract base class for a message.
Phrase
A Phrase contains words and phrase "hints" so that the speech recognition is more likely to recognize them. This can be used to improve the accuracy for specific words and phrases, for example, if specific commands are typically spoken by the user. This can also be used to add additional words to the vocabulary of the recognizer.
List items can also include CustomClass references containing groups of words that represent common concepts that occur in natural language.
State
Set of states that define the lifecycle of a PhraseSet.
RecognitionConfig
Provides information to the Recognizer that specifies how to process the recognition request.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.
RecognitionFeatures
Available recognition features.
MultiChannelMode
Options for how to recognize multi-channel audio.
RecognitionOutputConfig
Configuration options for the output(s) of recognition.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.
RecognitionResponseMetadata
Metadata about the recognition request and response.
RecognizeRequest
Request message for the Recognize method. Either content or uri must be supplied. Supplying both or neither returns INVALID_ARGUMENT (google.rpc.Code.INVALID_ARGUMENT). See content limits (https://cloud.google.com/speech-to-text/quotas#content).
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.
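The rule that exactly one of content or uri must be supplied can be checked on the client before a request is sent. The helper below is a hypothetical pre-flight check, not part of the library; it assumes that "supplied" means "not None" and mirrors the service's INVALID_ARGUMENT outcome with a ValueError.

```python
def pick_audio_source(content=None, uri=None):
    """Return "content" or "uri", whichever single source was supplied.

    Hypothetical helper: raises ValueError when both or neither are given,
    mirroring the INVALID_ARGUMENT error the service is documented to return.
    """
    supplied = [
        name
        for name, value in (("content", content), ("uri", uri))
        if value is not None
    ]
    if len(supplied) != 1:
        got = " and ".join(supplied) or "neither"
        raise ValueError(
            f"INVALID_ARGUMENT: supply exactly one of content or uri (got {got})"
        )
    return supplied[0]


print(pick_audio_source(content=b"\x00\x01"))               # content
print(pick_audio_source(uri="gs://example-bucket/a.wav"))   # uri
```

Failing early on the client saves a round trip, but the service remains the authority on what it accepts.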
RecognizeResponse
Response message for the Recognize method.
Recognizer
A Recognizer message. Stores recognition configuration and metadata.
AnnotationsEntry
The abstract base class for a message.
State
Set of states that define the lifecycle of a Recognizer.
SpeakerDiarizationConfig
Configuration to enable speaker diarization.
SpeechAdaptation
Provides "hints" to the speech recognizer to favor specific words and phrases in the results. PhraseSets can be specified as an inline resource, or a reference to an existing PhraseSet resource.
AdaptationPhraseSet
A biasing PhraseSet, which can be either a string referencing the name of an existing PhraseSets resource, or an inline definition of a PhraseSet.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.
SpeechRecognitionAlternative
Alternative hypotheses (a.k.a. n-best list).
SpeechRecognitionResult
A speech recognition result corresponding to a portion of the audio.
SrtOutputFileFormatConfig
Output configurations for SubRip Text (https://www.matroska.org/technical/subtitles.html#srt-subtitles) formatted subtitle file.
StreamingRecognitionConfig
Provides configuration information for the StreamingRecognize request.
StreamingRecognitionFeatures
Available recognition features specific to streaming recognition requests.
VoiceActivityTimeout
Events that a timeout can be set on for voice activity.
StreamingRecognitionResult
A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.
StreamingRecognizeRequest
Request message for the StreamingRecognize method. Multiple StreamingRecognizeRequest messages are sent in one call.
If the Recognizer referenced by recognizer contains a fully specified request configuration then the stream may only contain messages with only audio set.
Otherwise the first message must contain a recognizer and a streaming_config message that together fully specify the request configuration and must not contain audio. All subsequent messages must only have audio set.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.
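The message ordering described for StreamingRecognizeRequest (one configuration message without audio, then audio-only messages) can be sketched as a generator. Plain dicts stand in for the real request messages here, and the recognizer path, the empty streaming_config, and the chunk size are placeholders chosen for illustration.

```python
def chunked(data, size):
    """Yield consecutive slices of data, each at most size bytes long."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def request_stream(recognizer, streaming_config, audio, chunk_size=4096):
    """Yield dicts shaped like the documented request sequence."""
    # First message: configuration only, no audio.
    yield {"recognizer": recognizer, "streaming_config": streaming_config}
    # Every later message: audio only.
    for chunk in chunked(audio, chunk_size):
        yield {"audio": chunk}


audio_bytes = b"\x00" * 10000  # stand-in for real audio data
requests = list(
    request_stream(
        "projects/example-project/locations/global/recognizers/example",
        {},  # placeholder for a streaming configuration
        audio_bytes,
    )
)
print(len(requests))  # 4: one config message plus three audio chunks
```

A generator fits this pattern because the audio is usually produced incrementally; when the Recognizer already holds a fully specified configuration, the first yield would be dropped and only audio messages sent.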
StreamingRecognizeResponse
StreamingRecognizeResponse is the only message returned to the client by StreamingRecognize. A series of zero or more StreamingRecognizeResponse messages are streamed back to the client. If there is no recognizable audio then no messages are streamed back to the client.
Here are some examples of StreamingRecognizeResponses that might be returned while processing audio:
1. results { alternatives { transcript: "tube" } stability: 0.01 }
2. results { alternatives { transcript: "to be a" } stability: 0.01 }
3. results { alternatives { transcript: "to be" } stability: 0.9 } results { alternatives { transcript: " or not to be" } stability: 0.01 }
4. results { alternatives { transcript: "to be or not to be" confidence: 0.92 } alternatives { transcript: "to bee or not to bee" } is_final: true }
5. results { alternatives { transcript: " that's" } stability: 0.01 }
6. results { alternatives { transcript: " that is" } stability: 0.9 } results { alternatives { transcript: " the question" } stability: 0.01 }
7. results { alternatives { transcript: " that is the question" confidence: 0.98 } alternatives { transcript: " that was the question" } is_final: true }
Notes:
Only two of the above responses, #4 and #7, contain final results; they are indicated by is_final: true. Concatenating these together generates the full transcript: "to be or not to be that is the question".
The others contain interim results. #3 and #6 contain two interim results: the first portion has a high stability and is less likely to change; the second portion has a low stability and is very likely to change. A UI designer might choose to show only high stability results.
The specific stability and confidence values shown above are only for illustrative purposes. Actual values may vary.
In each response, only one of these fields will be set: error, speech_event_type, or one or more (repeated) results.
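The notes above can be sketched in code: final results are concatenated into the full transcript, and interim results are filtered by stability for display. The dataclasses below are simplified stand-ins for the response messages, not the library's types, and the 0.5 stability threshold is an arbitrary value chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alternative:
    transcript: str
    confidence: float = 0.0


@dataclass
class Result:
    alternatives: List[Alternative]
    stability: float = 0.0
    is_final: bool = False


@dataclass
class Response:
    results: List[Result] = field(default_factory=list)


def final_transcript(responses):
    """Concatenate the top alternative of every final result, in order."""
    return "".join(
        result.alternatives[0].transcript
        for response in responses
        for result in response.results
        if result.is_final and result.alternatives
    )


def stable_text(response, threshold=0.5):
    """Join the interim results whose stability is at least the threshold."""
    return "".join(
        result.alternatives[0].transcript
        for result in response.results
        if not result.is_final and result.stability >= threshold
    )


def alt(text, confidence=0.0):
    return Alternative(text, confidence)


# The seven example responses listed above, in the same order.
responses = [
    Response([Result([alt("tube")], stability=0.01)]),
    Response([Result([alt("to be a")], stability=0.01)]),
    Response([Result([alt("to be")], stability=0.9),
              Result([alt(" or not to be")], stability=0.01)]),
    Response([Result([alt("to be or not to be", 0.92),
                      alt("to bee or not to bee")], is_final=True)]),
    Response([Result([alt(" that's")], stability=0.01)]),
    Response([Result([alt(" that is")], stability=0.9),
              Result([alt(" the question")], stability=0.01)]),
    Response([Result([alt(" that is the question", 0.98),
                      alt(" that was the question")], is_final=True)]),
]

print(final_transcript(responses))  # to be or not to be that is the question
print(stable_text(responses[2]))    # to be
```

Taking only the first alternative of each final result is a design choice of this sketch; the remaining alternatives are lower-ranked hypotheses and are ignored here.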
SpeechEventType
Indicates the type of speech event.
TranscriptNormalization
Transcription normalization configuration. Use transcription normalization to automatically replace parts of the transcript with phrases of your choosing. For StreamingRecognize, this normalization only applies to stable partial transcripts (stability > 0.8) and final transcripts.
Entry
A single replacement configuration.
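The replacement behavior described for TranscriptNormalization can be approximated locally as follows. This is a sketch under stated assumptions: each entry is modeled as a (search, replace, case_sensitive) tuple, matching is literal substring matching, and entries apply in order. The service's actual matching rules may differ.

```python
import re


def normalize(transcript, entries):
    """Apply each (search, replace, case_sensitive) entry in order.

    Assumed semantics: literal (non-regex) matching, every occurrence replaced.
    """
    for search, replace, case_sensitive in entries:
        flags = 0 if case_sensitive else re.IGNORECASE
        # A callable replacement keeps backslashes in `replace` literal.
        transcript = re.sub(
            re.escape(search), lambda _match: replace, transcript, flags=flags
        )
    return transcript


def applies_to(is_final, stability):
    """Mirror the streaming rule: final or stable (stability > 0.8) only."""
    return is_final or stability > 0.8


entries = [("doctor", "Dr.", False)]
print(normalize("call Doctor smith", entries))    # call Dr. smith
print(applies_to(is_final=False, stability=0.5))  # False
```

The applies_to check reflects only the streaming condition stated in the description; for non-streaming recognition the description does not mention a stability condition.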
TranslationConfig
Translation configuration. Use to translate the given audio into text for the desired language.
UndeleteCustomClassRequest
Request message for the UndeleteCustomClass method.
UndeletePhraseSetRequest
Request message for the UndeletePhraseSet method.
UndeleteRecognizerRequest
Request message for the UndeleteRecognizer method.
UpdateConfigRequest
Request message for the UpdateConfig method.
UpdateCustomClassRequest
Request message for the UpdateCustomClass method.
UpdatePhraseSetRequest
Request message for the UpdatePhraseSet method.
UpdateRecognizerRequest
Request message for the UpdateRecognizer method.
VttOutputFileFormatConfig
Output configurations for WebVTT (https://www.w3.org/TR/webvtt1/) formatted subtitle file.
WordInfo
Word-specific information for recognized words.
Modules
pagers
API documentation for speech_v1.services.adaptation.pagers module.
pagers
API documentation for speech_v1p1beta1.services.adaptation.pagers module.
pagers
API documentation for speech_v2.services.speech.pagers module.