Summary of the classes for Speech.
Classes
AdaptationAsyncClient
Service that implements Google Cloud Speech Adaptation API.
AdaptationClient
Service that implements Google Cloud Speech Adaptation API.
ListCustomClassesAsyncPager
A pager for iterating through list_custom_classes requests.
This class thinly wraps an initial ListCustomClassesResponse object, and provides an __aiter__ method to iterate through its custom_classes field.
If there are more pages, the __aiter__ method will make additional ListCustomClasses requests and continue to iterate through the custom_classes field on the corresponding responses.
All the usual ListCustomClassesResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListCustomClassesPager
A pager for iterating through list_custom_classes requests.
This class thinly wraps an initial ListCustomClassesResponse object, and provides an __iter__ method to iterate through its custom_classes field.
If there are more pages, the __iter__ method will make additional ListCustomClasses requests and continue to iterate through the custom_classes field on the corresponding responses.
All the usual ListCustomClassesResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListPhraseSetAsyncPager
A pager for iterating through list_phrase_set requests.
This class thinly wraps an initial ListPhraseSetResponse object, and provides an __aiter__ method to iterate through its phrase_sets field.
If there are more pages, the __aiter__ method will make additional ListPhraseSet requests and continue to iterate through the phrase_sets field on the corresponding responses.
All the usual ListPhraseSetResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListPhraseSetPager
A pager for iterating through list_phrase_set requests.
This class thinly wraps an initial ListPhraseSetResponse object, and provides an __iter__ method to iterate through its phrase_sets field.
If there are more pages, the __iter__ method will make additional ListPhraseSet requests and continue to iterate through the phrase_sets field on the corresponding responses.
All the usual ListPhraseSetResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
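The wrap-one-response, fetch-more-on-demand behavior shared by the four pager classes above can be sketched in plain Python. Everything below is a stand-in (the Page dataclass, the Pager class, and the fake token-keyed backend are hypothetical, not the client library's actual types); it only illustrates the pattern the descriptions document.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List

@dataclass
class Page:
    """Stand-in for a ListCustomClassesResponse: one page of results."""
    custom_classes: List[str]
    next_page_token: str = ""  # empty token means there are no more pages

class Pager:
    """Thinly wraps an initial page; makes further list requests on demand."""

    def __init__(self, fetch: Callable[[str], Page], first_page: Page):
        self._fetch = fetch          # issues the next list request for a token
        self._response = first_page  # only the most recent response is retained

    def __getattr__(self, name):
        # All the usual response attributes are available on the pager;
        # lookups are delegated to the most recent response.
        return getattr(self._response, name)

    def __iter__(self) -> Iterator[str]:
        while True:
            yield from self._response.custom_classes
            if not self._response.next_page_token:
                return
            self._response = self._fetch(self._response.next_page_token)

# Fake backend holding two pages, keyed by page token.
pages = {"": Page(["class-a", "class-b"], "tok-1"), "tok-1": Page(["class-c"])}
pager = Pager(lambda token: pages[token], pages[""])
print(list(pager))  # ['class-a', 'class-b', 'class-c']
```

Note that after iteration, attribute lookups such as `pager.next_page_token` reflect the last response fetched, which is exactly the "only the most recent response is retained" caveat above.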
SpeechAsyncClient
Service that implements Google Cloud Speech API.
SpeechClient
Service that implements Google Cloud Speech API.
CreateCustomClassRequest
Message sent by the client for the CreateCustomClass method.
CreatePhraseSetRequest
Message sent by the client for the CreatePhraseSet method.
CustomClass
A set of words or phrases that represents a common concept likely to appear in your audio, for example a list of passenger ship names. CustomClass items can be substituted into placeholders that you set in PhraseSet phrases.
ClassItem
An item of the class.
DeleteCustomClassRequest
Message sent by the client for the DeleteCustomClass method.
DeletePhraseSetRequest
Message sent by the client for the DeletePhraseSet method.
GetCustomClassRequest
Message sent by the client for the GetCustomClass method.
GetPhraseSetRequest
Message sent by the client for the GetPhraseSet method.
ListCustomClassesRequest
Message sent by the client for the ListCustomClasses method.
ListCustomClassesResponse
Message returned to the client by the ListCustomClasses method.
ListPhraseSetRequest
Message sent by the client for the ListPhraseSet method.
ListPhraseSetResponse
Message returned to the client by the ListPhraseSet method.
LongRunningRecognizeMetadata
Describes the progress of a long-running LongRunningRecognize call. It is included in the metadata field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.
LongRunningRecognizeRequest
The top-level message sent by the client for the LongRunningRecognize method.
LongRunningRecognizeResponse
The only message returned to the client by the LongRunningRecognize method. It contains the result as zero or more sequential SpeechRecognitionResult messages. It is included in the result.response field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.
PhraseSet
Provides "hints" to the speech recognizer to favor specific words and phrases in the results.
Phrase
A phrase containing words and phrase "hints" so that the speech recognition is more likely to recognize them. This can be used to improve the accuracy for specific words and phrases, for example, if specific commands are typically spoken by the user. This can also be used to add additional words to the vocabulary of the recognizer. See usage limits <https://cloud.google.com/speech-to-text/quotas#content>__.
List items can also include pre-built or custom classes containing groups of words that represent common concepts that occur in natural language. For example, rather than providing a phrase hint for every month of the year (e.g. "i was born in january", "i was born in february", ...), using the pre-built $MONTH class improves the likelihood of correctly transcribing audio that includes months (e.g. "i was born in $month"). To refer to pre-built classes, use the class' symbol prepended with $, e.g. $MONTH. To refer to custom classes that were defined inline in the request, set the class's custom_class_id to a string unique to all class resources and inline classes, then use the class' id wrapped in ${...}, e.g. "${my-months}". To refer to custom class resources, use the class' id wrapped in ${} (e.g. ${my-months}).
Speech-to-Text supports three locations: global, us (US North America), and eu (Europe). If you are calling the speech.googleapis.com endpoint, use the global location. To specify a region, use a regional endpoint <https://cloud.google.com/speech-to-text/docs/endpoints>__ with matching us or eu location value.
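The two class-reference styles described above can be illustrated with plain dicts that mirror the proto fields (a sketch only; the "my-months" id and boost values are made up for illustration, and the real client would use CustomClass and PhraseSet message types):

```python
# Hypothetical custom class: its id is referenced from phrases as ${my-months}.
custom_class = {
    "custom_class_id": "my-months",
    "items": [{"value": "january"}, {"value": "february"}],
}

phrase_set = {
    "phrases": [
        # Pre-built class: the class symbol prepended with $.
        {"value": "i was born in $MONTH", "boost": 10.0},
        # Custom class resource: the class id wrapped in ${...}.
        {"value": "i was born in ${my-months}", "boost": 10.0},
    ],
}
print(phrase_set["phrases"][1]["value"])  # i was born in ${my-months}
```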
RecognitionAudio
Contains audio data in the encoding specified in the RecognitionConfig. Either content or uri must be supplied. Supplying both or neither returns google.rpc.Code.INVALID_ARGUMENT. See content limits <https://cloud.google.com/speech-to-text/quotas#content>__.
This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
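The oneof semantics ("setting any member automatically clears all other members") can be sketched with a small stand-in class; this is illustrative only, not how proto-plus implements it:

```python
class OneofAudio:
    """Sketch of the content/uri oneof: setting one member clears the other."""
    _MEMBERS = ("content", "uri")

    def __init__(self):
        object.__setattr__(self, "_set", {})  # which oneof member holds a value

    def __setattr__(self, name, value):
        if name in self._MEMBERS:
            self._set.clear()      # setting a member clears all other members
            self._set[name] = value
        else:
            object.__setattr__(self, name, value)

    def __getattr__(self, name):
        if name in OneofAudio._MEMBERS:
            return self._set.get(name)  # unset member reads as None
        raise AttributeError(name)

audio = OneofAudio()
audio.content = b"\x00\x01"
audio.uri = "gs://bucket/audio.flac"  # setting uri clears content
print(audio.content)  # None
```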
RecognitionConfig
Provides information to the recognizer that specifies how to process the request.
AudioEncoding
The encoding of the audio data sent in the request.
All encodings support only 1 channel (mono) audio, unless the audio_channel_count and enable_separate_recognition_per_channel fields are set.
For best results, the audio source should be captured and transmitted using a lossless encoding (FLAC or LINEAR16). The accuracy of the speech recognition can be reduced if lossy codecs are used to capture or transmit audio, particularly if background noise is present. Lossy codecs include MULAW, AMR, AMR_WB, OGG_OPUS, SPEEX_WITH_HEADER_BYTE, MP3, and WEBM_OPUS.
The FLAC and WAV audio file formats include a header that describes the included audio content. You can request recognition for WAV files that contain either LINEAR16 or MULAW encoded audio. If you send FLAC or WAV audio file format in your request, you do not need to specify an AudioEncoding; the audio encoding format is determined from the file header. If you specify an AudioEncoding when you send FLAC or WAV audio, the encoding configuration must match the encoding described in the audio header; otherwise the request returns a google.rpc.Code.INVALID_ARGUMENT error code.
Values:
ENCODING_UNSPECIFIED (0): Not specified.
LINEAR16 (1): Uncompressed 16-bit signed little-endian samples (Linear PCM).
FLAC (2): FLAC (Free Lossless Audio Codec) is the recommended encoding because it is lossless (therefore recognition is not compromised) and requires only about half the bandwidth of LINEAR16. FLAC stream encoding supports 16-bit and 24-bit samples; however, not all fields in STREAMINFO are supported.
MULAW (3): 8-bit samples that compand 14-bit audio samples using G.711 PCMU/mu-law.
AMR (4): Adaptive Multi-Rate Narrowband codec. sample_rate_hertz must be 8000.
AMR_WB (5): Adaptive Multi-Rate Wideband codec. sample_rate_hertz must be 16000.
OGG_OPUS (6): Opus encoded audio frames in Ogg container (OggOpus <https://wiki.xiph.org/OggOpus>__). sample_rate_hertz must be one of 8000, 12000, 16000, 24000, or 48000.
SPEEX_WITH_HEADER_BYTE (7): Although the use of lossy encodings is not recommended, if a very low bitrate encoding is required, OGG_OPUS is highly preferred over Speex encoding. The Speex <https://speex.org/>__ encoding supported by Cloud Speech API has a header byte in each block, as in MIME type audio/x-speex-with-header-byte. It is a variant of the RTP Speex encoding defined in RFC 5574 <https://tools.ietf.org/html/rfc5574>__. The stream is a sequence of blocks, one block per RTP packet. Each block starts with a byte containing the length of the block, in bytes, followed by one or more frames of Speex data, padded to an integral number of bytes (octets) as specified in RFC 5574. In other words, each RTP header is replaced with a single byte containing the block length. Only Speex wideband is supported. sample_rate_hertz must be 16000.
MP3 (8): MP3 audio. MP3 encoding is a Beta feature and only available in v1p1beta1. Supports all standard MP3 bitrates (which range from 32 to 320 kbps). When using this encoding, sample_rate_hertz has to match the sample rate of the file being used.
WEBM_OPUS (9): Opus encoded audio frames in WebM container (OggOpus <https://wiki.xiph.org/OggOpus>__). sample_rate_hertz must be one of 8000, 12000, 16000, 24000, or 48000.
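The per-encoding sample-rate constraints listed above can be captured in a small lookup, handy for validating a config before sending a request. This is a sketch of the stated rules, not the client library's own validation:

```python
# Encodings with a constrained sample_rate_hertz, per the value list above.
# Encodings not listed here (e.g. LINEAR16, FLAC) are not constrained to a
# fixed set of rates by this table.
REQUIRED_RATES = {
    "AMR": {8000},
    "AMR_WB": {16000},
    "SPEEX_WITH_HEADER_BYTE": {16000},
    "OGG_OPUS": {8000, 12000, 16000, 24000, 48000},
    "WEBM_OPUS": {8000, 12000, 16000, 24000, 48000},
}

def check_sample_rate(encoding: str, hertz: int) -> bool:
    """True if the encoding accepts this sample rate."""
    allowed = REQUIRED_RATES.get(encoding)
    return allowed is None or hertz in allowed

print(check_sample_rate("AMR", 8000))       # True
print(check_sample_rate("OGG_OPUS", 44100)) # False
```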
RecognitionMetadata
Description of audio data to be recognized.
InteractionType
Use case categories that the audio recognition request can be described by.
Values:
INTERACTION_TYPE_UNSPECIFIED (0): Use case is either unknown or is something other than one of the other values below.
DISCUSSION (1): Multiple people in a conversation or discussion, for example in a meeting with two or more people actively participating. Typically all the primary people speaking would be in the same room (if not, see PHONE_CALL).
PRESENTATION (2): One or more persons lecturing or presenting to others, mostly uninterrupted.
PHONE_CALL (3): A phone call or video conference in which two or more people, who are not in the same room, are actively participating.
VOICEMAIL (4): A recorded message intended for another person to listen to.
PROFESSIONALLY_PRODUCED (5): Professionally produced audio (e.g. TV show, podcast).
VOICE_SEARCH (6): Transcribe spoken questions and queries into text.
VOICE_COMMAND (7): Transcribe voice commands, such as for controlling a device.
DICTATION (8): Transcribe speech to text to create a written document, such as a text message, email, or report.
MicrophoneDistance
Enumerates the types of capture settings describing an audio file.
Values:
MICROPHONE_DISTANCE_UNSPECIFIED (0): Audio type is not known.
NEARFIELD (1): The audio was captured from a closely placed microphone, e.g. a phone, dictaphone, or handheld microphone. Generally, the speaker is within 1 meter of the microphone.
MIDFIELD (2): The speaker is within 3 meters of the microphone.
FARFIELD (3): The speaker is more than 3 meters away from the microphone.
OriginalMediaType
The original media the speech was recorded on.
Values:
ORIGINAL_MEDIA_TYPE_UNSPECIFIED (0): Unknown original media type.
AUDIO (1): The speech data is an audio recording.
VIDEO (2): The speech data was originally recorded on a video.
RecordingDeviceType
The type of device the speech was recorded with.
Values:
RECORDING_DEVICE_TYPE_UNSPECIFIED (0): The recording device is unknown.
SMARTPHONE (1): Speech was recorded on a smartphone.
PC (2): Speech was recorded using a personal computer or tablet.
PHONE_LINE (3): Speech was recorded over a phone line.
VEHICLE (4): Speech was recorded in a vehicle.
OTHER_OUTDOOR_DEVICE (5): Speech was recorded outdoors.
OTHER_INDOOR_DEVICE (6): Speech was recorded indoors.
RecognizeRequest
The top-level message sent by the client for the Recognize method.
RecognizeResponse
The only message returned to the client by the Recognize method. It contains the result as zero or more sequential SpeechRecognitionResult messages.
SpeakerDiarizationConfig
Config to enable speaker diarization.
SpeechAdaptation
Speech adaptation configuration.
ABNFGrammar
SpeechAdaptationInfo
Information on speech adaptation use in results.
SpeechContext
Provides "hints" to the speech recognizer to favor specific words and phrases in the results.
SpeechRecognitionAlternative
Alternative hypotheses (a.k.a. n-best list).
SpeechRecognitionResult
A speech recognition result corresponding to a portion of the audio.
StreamingRecognitionConfig
Provides information to the recognizer that specifies how to process the request.
VoiceActivityTimeout
Events that a timeout can be set on for voice activity.
StreamingRecognitionResult
A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.
StreamingRecognizeRequest
The top-level message sent by the client for the StreamingRecognize method. Multiple StreamingRecognizeRequest messages are sent. The first message must contain a streaming_config message and must not contain audio_content. All subsequent messages must contain audio_content and must not contain a streaming_config message.
This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
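The required ordering (config first, then audio only) is naturally expressed as a generator. The sketch below uses plain dicts as stand-ins for StreamingRecognizeRequest messages; with the real client you would yield proto messages to the streaming call instead:

```python
from typing import Dict, Iterable, Iterator

def request_stream(streaming_config: Dict,
                   audio_chunks: Iterable[bytes]) -> Iterator[Dict]:
    """Yield requests in the required order: config first, then audio only."""
    # First message: streaming_config and nothing else.
    yield {"streaming_config": streaming_config}
    # All subsequent messages: audio_content and nothing else.
    for chunk in audio_chunks:
        yield {"audio_content": chunk}

requests = list(request_stream({"config": {"language_code": "en-US"}},
                               [b"\x01\x02", b"\x03\x04"]))
print("streaming_config" in requests[0])                 # True
print(all("audio_content" in r for r in requests[1:]))   # True
```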
StreamingRecognizeResponse
StreamingRecognizeResponse is the only message returned to the client by StreamingRecognize. A series of zero or more StreamingRecognizeResponse messages is streamed back to the client. If there is no recognizable audio, and single_utterance is set to false, then no messages are streamed back to the client.
Here's an example of a series of StreamingRecognizeResponses that might be returned while processing audio:
1. results { alternatives { transcript: "tube" } stability: 0.01 }
2. results { alternatives { transcript: "to be a" } stability: 0.01 }
3. results { alternatives { transcript: "to be" } stability: 0.9 } results { alternatives { transcript: " or not to be" } stability: 0.01 }
4. results { alternatives { transcript: "to be or not to be" confidence: 0.92 } alternatives { transcript: "to bee or not to bee" } is_final: true }
5. results { alternatives { transcript: " that's" } stability: 0.01 }
6. results { alternatives { transcript: " that is" } stability: 0.9 } results { alternatives { transcript: " the question" } stability: 0.01 }
7. results { alternatives { transcript: " that is the question" confidence: 0.98 } alternatives { transcript: " that was the question" } is_final: true }
Notes:
- Only two of the above responses (#4 and #7) contain final results; they are indicated by is_final: true. Concatenating these together generates the full transcript: "to be or not to be that is the question".
- The others contain interim results. #3 and #6 contain two interim results: the first portion has a high stability and is less likely to change; the second portion has a low stability and is very likely to change. A UI designer might choose to show only high-stability results.
- The specific stability and confidence values shown above are only for illustrative purposes. Actual values may vary.
- In each response, only one of these fields will be set: error, speech_event_type, or one or more (repeated) results.
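The concatenation rule in the notes above (join the top alternative of every final result) can be sketched with dict stand-ins for the response messages:

```python
# Dict stand-ins for StreamingRecognizeResponse messages; field names mirror
# the proto fields (results, alternatives, transcript, is_final).
responses = [
    {"results": [{"alternatives": [{"transcript": "to be or not to be"}],
                  "is_final": True}]},
    {"results": [{"alternatives": [{"transcript": " that's"}],
                  "is_final": False}]},
    {"results": [{"alternatives": [{"transcript": " that is the question"}],
                  "is_final": True}]},
]

def full_transcript(responses) -> str:
    """Concatenate the top alternative of every final result."""
    return "".join(
        result["alternatives"][0]["transcript"]
        for resp in responses
        for result in resp.get("results", [])
        if result.get("is_final")
    )

print(full_transcript(responses))  # to be or not to be that is the question
```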
SpeechEventType
Indicates the type of speech event.
Values:
SPEECH_EVENT_UNSPECIFIED (0): No speech event specified.
END_OF_SINGLE_UTTERANCE (1): This event indicates that the server has detected the end of the user's speech utterance and expects no additional speech. Therefore, the server will not process additional audio (although it may subsequently return additional results). The client should stop sending additional audio data, half-close the gRPC connection, and wait for any additional results until the server closes the gRPC connection. This event is only sent if single_utterance was set to true, and is not used otherwise.
SPEECH_ACTIVITY_BEGIN (2): This event indicates that the server has detected the beginning of human voice activity in the stream. This event can be returned multiple times if speech starts and stops repeatedly throughout the stream. This event is only sent if voice_activity_events is set to true.
SPEECH_ACTIVITY_END (3): This event indicates that the server has detected the end of human voice activity in the stream. This event can be returned multiple times if speech starts and stops repeatedly throughout the stream. This event is only sent if voice_activity_events is set to true.
SPEECH_ACTIVITY_TIMEOUT (4): This event indicates that the user-set timeout for speech activity begin or end has been exceeded. Upon receiving this event, the client is expected to send a half close. Further audio will not be processed.
TranscriptNormalization
Transcription normalization configuration. Use transcription normalization to automatically replace parts of the transcript with phrases of your choosing. For StreamingRecognize, this normalization only applies to stable partial transcripts (stability > 0.8) and final transcripts.
Entry
A single replacement configuration.
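The replace-parts-of-the-transcript behavior described above can be sketched as a simple search-and-replace pass over Entry objects, here modeled as dicts. The "search"/"replace" field names are assumptions mirroring the description, and the sketch ignores case-sensitivity options the real API may offer:

```python
from typing import Dict, List

def normalize(transcript: str, entries: List[Dict[str, str]]) -> str:
    """Apply each replacement entry to the transcript, in order."""
    for entry in entries:
        transcript = transcript.replace(entry["search"], entry["replace"])
    return transcript

entries = [{"search": "gonna", "replace": "going to"}]
print(normalize("I'm gonna call you", entries))  # I'm going to call you
```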
TranscriptOutputConfig
Specifies an optional destination for the recognition results.
UpdateCustomClassRequest
Message sent by the client for the UpdateCustomClass method.
UpdatePhraseSetRequest
Message sent by the client for the UpdatePhraseSet method.
WordInfo
Word-specific information for recognized words.
AdaptationAsyncClient
Service that implements Google Cloud Speech Adaptation API.
AdaptationClient
Service that implements Google Cloud Speech Adaptation API.
ListCustomClassesAsyncPager
A pager for iterating through list_custom_classes
requests.
This class thinly wraps an initial
ListCustomClassesResponse object, and
provides an __aiter__
method to iterate through its
custom_classes
field.
If there are more pages, the __aiter__
method will make additional
ListCustomClasses
requests and continue to iterate
through the custom_classes
field on the
corresponding responses.
All the usual ListCustomClassesResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListCustomClassesPager
A pager for iterating through list_custom_classes
requests.
This class thinly wraps an initial
ListCustomClassesResponse object, and
provides an __iter__
method to iterate through its
custom_classes
field.
If there are more pages, the __iter__
method will make additional
ListCustomClasses
requests and continue to iterate
through the custom_classes
field on the
corresponding responses.
All the usual ListCustomClassesResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListPhraseSetAsyncPager
A pager for iterating through list_phrase_set
requests.
This class thinly wraps an initial
ListPhraseSetResponse object, and
provides an __aiter__
method to iterate through its
phrase_sets
field.
If there are more pages, the __aiter__
method will make additional
ListPhraseSet
requests and continue to iterate
through the phrase_sets
field on the
corresponding responses.
All the usual ListPhraseSetResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListPhraseSetPager
A pager for iterating through list_phrase_set
requests.
This class thinly wraps an initial
ListPhraseSetResponse object, and
provides an __iter__
method to iterate through its
phrase_sets
field.
If there are more pages, the __iter__
method will make additional
ListPhraseSet
requests and continue to iterate
through the phrase_sets
field on the
corresponding responses.
All the usual ListPhraseSetResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
SpeechAsyncClient
Service that implements Google Cloud Speech API.
SpeechClient
Service that implements Google Cloud Speech API.
CreateCustomClassRequest
Message sent by the client for the CreateCustomClass
method.
CreatePhraseSetRequest
Message sent by the client for the CreatePhraseSet
method.
CustomClass
A set of words or phrases that represents a common concept likely to appear in your audio, for example a list of passenger ship names. CustomClass items can be substituted into placeholders that you set in PhraseSet phrases.
ClassItem
An item of the class.
DeleteCustomClassRequest
Message sent by the client for the DeleteCustomClass
method.
DeletePhraseSetRequest
Message sent by the client for the DeletePhraseSet
method.
GetCustomClassRequest
Message sent by the client for the GetCustomClass
method.
GetPhraseSetRequest
Message sent by the client for the GetPhraseSet
method.
ListCustomClassesRequest
Message sent by the client for the ListCustomClasses
method.
ListCustomClassesResponse
Message returned to the client by the ListCustomClasses
method.
ListPhraseSetRequest
Message sent by the client for the ListPhraseSet
method.
ListPhraseSetResponse
Message returned to the client by the ListPhraseSet
method.
LongRunningRecognizeMetadata
Describes the progress of a long-running LongRunningRecognize
call. It is included in the metadata
field of the Operation
returned by the GetOperation
call of the
google::longrunning::Operations
service.
LongRunningRecognizeRequest
The top-level message sent by the client for the
LongRunningRecognize
method.
LongRunningRecognizeResponse
The only message returned to the client by the
LongRunningRecognize
method. It contains the result as zero or
more sequential SpeechRecognitionResult
messages. It is included
in the result.response
field of the Operation
returned by
the GetOperation
call of the google::longrunning::Operations
service.
PhraseSet
Provides "hints" to the speech recognizer to favor specific words and phrases in the results.
Phrase
A phrases containing words and phrase "hints" so that the speech
recognition is more likely to recognize them. This can be used to
improve the accuracy for specific words and phrases, for example, if
specific commands are typically spoken by the user. This can also be
used to add additional words to the vocabulary of the recognizer.
See usage
limits <https://cloud.google.com/speech-to-text/quotas#content>
__.
List items can also include pre-built or custom classes containing
groups of words that represent common concepts that occur in natural
language. For example, rather than providing a phrase hint for every
month of the year (e.g. "i was born in january", "i was born in
febuary", ...), use the pre-built $MONTH
class improves the
likelihood of correctly transcribing audio that includes months
(e.g. "i was born in $month"). To refer to pre-built classes, use
the class' symbol prepended with $
e.g. $MONTH
. To refer to
custom classes that were defined inline in the request, set the
class's custom_class_id
to a string unique to all class
resources and inline classes. Then use the class' id wrapped in
$\ {...}
e.g. "${my-months}". To refer to custom classes
resources, use the class' id wrapped in ${}
(e.g.
${my-months}
).
Speech-to-Text supports three locations: global
, us
(US
North America), and eu
(Europe). If you are calling the
speech.googleapis.com
endpoint, use the global
location. To
specify a region, use a regional
endpoint <https://cloud.google.com/speech-to-text/docs/endpoints>
__
with matching us
or eu
location value.
RecognitionAudio
Contains audio data in the encoding specified in the
RecognitionConfig
. Either content
or uri
must be
supplied. Supplying both or neither returns
google.rpc.Code.INVALID_ARGUMENT][google.rpc.Code.INVALID_ARGUMENT]
.
See content
limits <https://cloud.google.com/speech-to-text/quotas#content>
__.
This message has oneof
_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
RecognitionConfig
Provides information to the recognizer that specifies how to process the request.
AudioEncoding
The encoding of the audio data sent in the request.
All encodings support only 1 channel (mono) audio, unless the
audio_channel_count
and
enable_separate_recognition_per_channel
fields are set.
For best results, the audio source should be captured and
transmitted using a lossless encoding (FLAC
or LINEAR16
).
The accuracy of the speech recognition can be reduced if lossy
codecs are used to capture or transmit audio, particularly if
background noise is present. Lossy codecs include MULAW
,
AMR
, AMR_WB
, OGG_OPUS
, SPEEX_WITH_HEADER_BYTE
,
MP3
, and WEBM_OPUS
.
The FLAC
and WAV
audio file formats include a header that
describes the included audio content. You can request recognition
for WAV
files that contain either LINEAR16
or MULAW
encoded audio. If you send FLAC
or WAV
audio file format in
your request, you do not need to specify an AudioEncoding
; the
audio encoding format is determined from the file header. If you
specify an AudioEncoding
when you send send FLAC
or WAV
audio, the encoding configuration must match the encoding described
in the audio header; otherwise the request returns an
google.rpc.Code.INVALID_ARGUMENT][google.rpc.Code.INVALID_ARGUMENT]
error code.
Values:
ENCODING_UNSPECIFIED (0):
Not specified.
LINEAR16 (1):
Uncompressed 16-bit signed little-endian
samples (Linear PCM).
FLAC (2):
FLAC
(Free Lossless Audio Codec) is the recommended
encoding because it is lossless--therefore recognition is
not compromised--and requires only about half the bandwidth
of LINEAR16
. FLAC
stream encoding supports 16-bit
and 24-bit samples, however, not all fields in
STREAMINFO
are supported.
MULAW (3):
8-bit samples that compand 14-bit audio
samples using G.711 PCMU/mu-law.
AMR (4):
Adaptive Multi-Rate Narrowband codec. sample_rate_hertz
must be 8000.
AMR_WB (5):
Adaptive Multi-Rate Wideband codec. sample_rate_hertz
must be 16000.
OGG_OPUS (6):
Opus encoded audio frames in Ogg container
(OggOpus <https://wiki.xiph.org/OggOpus>
).
sample_rate_hertz
must be one of 8000, 12000, 16000,
24000, or 48000.
SPEEX_WITH_HEADER_BYTE (7):
Although the use of lossy encodings is not recommended, if a
very low bitrate encoding is required, OGG_OPUS
is
highly preferred over Speex encoding. The
Speex <https://speex.org/>
encoding supported by Cloud
Speech API has a header byte in each block, as in MIME type
audio/x-speex-with-header-byte
. It is a variant of the
RTP Speex encoding defined in RFC
5574 <https://tools.ietf.org/html/rfc5574>
__. The stream is
a sequence of blocks, one block per RTP packet. Each block
starts with a byte containing the length of the block, in
bytes, followed by one or more frames of Speex data, padded
to an integral number of bytes (octets) as specified in RFC
- In other words, each RTP header is replaced with a
single byte containing the block length. Only Speex wideband
is supported.
sample_rate_hertz
must be 16000. MP3 (8): MP3 audio. MP3 encoding is a Beta feature and only available in v1p1beta1. Support all standard MP3 bitrates (which range from 32-320 kbps). When using this encoding,sample_rate_hertz
has to match the sample rate of the file being used. WEBM_OPUS (9): Opus encoded audio frames in WebM container (OggOpus <https://wiki.xiph.org/OggOpus>
__).sample_rate_hertz
must be one of 8000, 12000, 16000, 24000, or 48000.
RecognitionMetadata
Description of audio data to be recognized.
InteractionType
Use case categories that the audio recognition request can be described by.
Values: INTERACTION_TYPE_UNSPECIFIED (0): Use case is either unknown or is something other than one of the other values below. DISCUSSION (1): Multiple people in a conversation or discussion. For example in a meeting with two or more people actively participating. Typically all the primary people speaking would be in the same room (if not, see PHONE_CALL) PRESENTATION (2): One or more persons lecturing or presenting to others, mostly uninterrupted. PHONE_CALL (3): A phone-call or video-conference in which two or more people, who are not in the same room, are actively participating. VOICEMAIL (4): A recorded message intended for another person to listen to. PROFESSIONALLY_PRODUCED (5): Professionally produced audio (eg. TV Show, Podcast). VOICE_SEARCH (6): Transcribe spoken questions and queries into text. VOICE_COMMAND (7): Transcribe voice commands, such as for controlling a device. DICTATION (8): Transcribe speech to text to create a written document, such as a text-message, email or report.
MicrophoneDistance
Enumerates the types of capture settings describing an audio file.
Values: MICROPHONE_DISTANCE_UNSPECIFIED (0): Audio type is not known. NEARFIELD (1): The audio was captured from a closely placed microphone. Eg. phone, dictaphone, or handheld microphone. Generally if there speaker is within 1 meter of the microphone. MIDFIELD (2): The speaker if within 3 meters of the microphone. FARFIELD (3): The speaker is more than 3 meters away from the microphone.
OriginalMediaType
The original media the speech was recorded on.
Values: ORIGINAL_MEDIA_TYPE_UNSPECIFIED (0): Unknown original media type. AUDIO (1): The speech data is an audio recording. VIDEO (2): The speech data originally recorded on a video.
RecordingDeviceType
The type of device the speech was recorded with.
Values: RECORDING_DEVICE_TYPE_UNSPECIFIED (0): The recording device is unknown. SMARTPHONE (1): Speech was recorded on a smartphone. PC (2): Speech was recorded using a personal computer or tablet. PHONE_LINE (3): Speech was recorded over a phone line. VEHICLE (4): Speech was recorded in a vehicle. OTHER_OUTDOOR_DEVICE (5): Speech was recorded outdoors. OTHER_INDOOR_DEVICE (6): Speech was recorded indoors.
RecognizeRequest
The top-level message sent by the client for the Recognize
method.
RecognizeResponse
The only message returned to the client by the Recognize
method.
It contains the result as zero or more sequential
SpeechRecognitionResult
messages.
SpeakerDiarizationConfig
Config to enable speaker diarization.
SpeechAdaptation
Speech adaptation configuration.
ABNFGrammar
SpeechAdaptationInfo
Information on speech adaptation use in results
SpeechContext
Provides "hints" to the speech recognizer to favor specific words and phrases in the results.
SpeechRecognitionAlternative
Alternative hypotheses (a.k.a. n-best list).
SpeechRecognitionResult
A speech recognition result corresponding to a portion of the audio.
StreamingRecognitionConfig
Provides information to the recognizer that specifies how to process the request.
VoiceActivityTimeout
Events that a timeout can be set on for voice activity.
StreamingRecognitionResult
A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.
StreamingRecognizeRequest
The top-level message sent by the client for the
StreamingRecognize
method. Multiple
StreamingRecognizeRequest
messages are sent. The first message
must contain a streaming_config
message and must not contain
audio_content
. All subsequent messages must contain
audio_content
and must not contain a streaming_config
message.
This message has oneof_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
StreamingRecognizeResponse
StreamingRecognizeResponse
is the only message returned to the
client by StreamingRecognize
. A series of zero or more
StreamingRecognizeResponse
messages are streamed back to the
client. If there is no recognizable audio, and single_utterance
is set to false, then no messages are streamed back to the client.
Here's an example of a series of StreamingRecognizeResponse messages that might be returned while processing audio:

1. results { alternatives { transcript: "tube" } stability: 0.01 }
2. results { alternatives { transcript: "to be a" } stability: 0.01 }
3. results { alternatives { transcript: "to be" } stability: 0.9 } results { alternatives { transcript: " or not to be" } stability: 0.01 }
4. results { alternatives { transcript: "to be or not to be" confidence: 0.92 } alternatives { transcript: "to bee or not to bee" } is_final: true }
5. results { alternatives { transcript: " that's" } stability: 0.01 }
6. results { alternatives { transcript: " that is" } stability: 0.9 } results { alternatives { transcript: " the question" } stability: 0.01 }
7. results { alternatives { transcript: " that is the question" confidence: 0.98 } alternatives { transcript: " that was the question" } is_final: true }

Notes:
- Only two of the above responses (#4 and #7) contain final results; they are indicated by is_final: true. Concatenating these together generates the full transcript: "to be or not to be that is the question".
- The others contain interim results. #3 and #6 contain two interim results: the first portion has a high stability and is less likely to change; the second portion has a low stability and is very likely to change. A UI designer might choose to show only high-stability results.
- The specific stability and confidence values shown above are only for illustrative purposes. Actual values may vary.
- In each response, only one of these fields will be set: error, speech_event_type, or one or more (repeated) results.
SpeechEventType
Indicates the type of speech event.
Values:
SPEECH_EVENT_UNSPECIFIED (0):
No speech event specified.
END_OF_SINGLE_UTTERANCE (1):
This event indicates that the server has detected the end of
the user's speech utterance and expects no additional
speech. Therefore, the server will not process additional
audio (although it may subsequently return additional
results). The client should stop sending additional audio
data, half-close the gRPC connection, and wait for any
additional results until the server closes the gRPC
connection. This event is only sent if single_utterance
was set to true
, and is not used otherwise.
SPEECH_ACTIVITY_BEGIN (2):
This event indicates that the server has detected the
beginning of human voice activity in the stream. This event
can be returned multiple times if speech starts and stops
repeatedly throughout the stream. This event is only sent if
voice_activity_events
is set to true.
SPEECH_ACTIVITY_END (3):
This event indicates that the server has detected the end of
human voice activity in the stream. This event can be
returned multiple times if speech starts and stops
repeatedly throughout the stream. This event is only sent if
voice_activity_events
is set to true.
SPEECH_ACTIVITY_TIMEOUT (4):
This event indicates that the user-set
timeout for speech activity begin or end has been
exceeded. Upon receiving this event, the client
is expected to send a half close. Further audio
will not be processed.
TranscriptNormalization
Transcription normalization configuration. Use transcription normalization to automatically replace parts of the transcript with phrases of your choosing. For StreamingRecognize, this normalization only applies to stable partial transcripts (stability > 0.8) and final transcripts.
Entry
A single replacement configuration.
TranscriptOutputConfig
Specifies an optional destination for the recognition results.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
UpdateCustomClassRequest
Message sent by the client for the UpdateCustomClass
method.
UpdatePhraseSetRequest
Message sent by the client for the UpdatePhraseSet
method.
WordInfo
Word-specific information for recognized words.
SpeechAsyncClient
Enables speech transcription and resource management.
SpeechClient
Enables speech transcription and resource management.
ListCustomClassesAsyncPager
A pager for iterating through list_custom_classes
requests.
This class thinly wraps an initial
ListCustomClassesResponse object, and
provides an __aiter__
method to iterate through its
custom_classes
field.
If there are more pages, the __aiter__
method will make additional
ListCustomClasses
requests and continue to iterate
through the custom_classes
field on the
corresponding responses.
All the usual ListCustomClassesResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListCustomClassesPager
A pager for iterating through list_custom_classes
requests.
This class thinly wraps an initial
ListCustomClassesResponse object, and
provides an __iter__
method to iterate through its
custom_classes
field.
If there are more pages, the __iter__
method will make additional
ListCustomClasses
requests and continue to iterate
through the custom_classes
field on the
corresponding responses.
All the usual ListCustomClassesResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListPhraseSetsAsyncPager
A pager for iterating through list_phrase_sets
requests.
This class thinly wraps an initial
ListPhraseSetsResponse object, and
provides an __aiter__
method to iterate through its
phrase_sets
field.
If there are more pages, the __aiter__
method will make additional
ListPhraseSets
requests and continue to iterate
through the phrase_sets
field on the
corresponding responses.
All the usual ListPhraseSetsResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListPhraseSetsPager
A pager for iterating through list_phrase_sets
requests.
This class thinly wraps an initial
ListPhraseSetsResponse object, and
provides an __iter__
method to iterate through its
phrase_sets
field.
If there are more pages, the __iter__
method will make additional
ListPhraseSets
requests and continue to iterate
through the phrase_sets
field on the
corresponding responses.
All the usual ListPhraseSetsResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListRecognizersAsyncPager
A pager for iterating through list_recognizers
requests.
This class thinly wraps an initial
ListRecognizersResponse object, and
provides an __aiter__
method to iterate through its
recognizers
field.
If there are more pages, the __aiter__
method will make additional
ListRecognizers
requests and continue to iterate
through the recognizers
field on the
corresponding responses.
All the usual ListRecognizersResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
ListRecognizersPager
A pager for iterating through list_recognizers
requests.
This class thinly wraps an initial
ListRecognizersResponse object, and
provides an __iter__
method to iterate through its
recognizers
field.
If there are more pages, the __iter__
method will make additional
ListRecognizers
requests and continue to iterate
through the recognizers
field on the
corresponding responses.
All the usual ListRecognizersResponse attributes are available on the pager. If multiple requests are made, only the most recent response is retained, and thus used for attribute lookup.
AutoDetectDecodingConfig
Automatically detected decoding parameters. Supported for the following encodings:
WAV_LINEAR16: 16-bit signed little-endian PCM samples in a WAV container.
WAV_MULAW: 8-bit companded mulaw samples in a WAV container.
WAV_ALAW: 8-bit companded alaw samples in a WAV container.
RFC4867_5_AMR: AMR frames with an rfc4867.5 header.
RFC4867_5_AMRWB: AMR-WB frames with an rfc4867.5 header.
FLAC: FLAC frames in the "native FLAC" container format.
MP3: MPEG audio frames with optional (ignored) ID3 metadata.
OGG_OPUS: Opus audio frames in an Ogg container.
WEBM_OPUS: Opus audio frames in a WebM container.
M4A: M4A audio format.
BatchRecognizeFileMetadata
Metadata about a single file in a batch for BatchRecognize.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
BatchRecognizeFileResult
Final results for a single file.
This message has oneof_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
BatchRecognizeMetadata
Operation metadata for BatchRecognize.
TranscriptionMetadataEntry
The abstract base class for a message.
BatchRecognizeRequest
Request message for the BatchRecognize method.
ProcessingStrategy
Possible processing strategies for batch requests.
Values: PROCESSING_STRATEGY_UNSPECIFIED (0): Default value for the processing strategy. The request is processed as soon as it is received. DYNAMIC_BATCHING (1): If selected, processes the request during lower utilization periods for a price discount. The request is fulfilled within 24 hours.
BatchRecognizeResponse
Response message for
BatchRecognize that
is packaged into a long-running
[Operation][google.longrunning.Operation].
ResultsEntry
The abstract base class for a message.
BatchRecognizeResults
Output type for Cloud Storage of BatchRecognize transcripts. Though this proto isn't returned in this API anywhere, the Cloud Storage transcripts will be this proto serialized and should be parsed as such.
BatchRecognizeTranscriptionMetadata
Metadata about transcription for a single file (for example, progress percent).
CloudStorageResult
Final results written to Cloud Storage.
Config
Message representing the config for the Speech-to-Text API. This
includes an optional `KMS
key <https://cloud.google.com/kms/docs/resource-hierarchy#keys>`__
with which incoming data will be encrypted.
CreateCustomClassRequest
Request message for the CreateCustomClass method.
CreatePhraseSetRequest
Request message for the CreatePhraseSet method.
CreateRecognizerRequest
Request message for the CreateRecognizer method.
CustomClass
CustomClass for biasing in speech recognition. Used to define a set of words or phrases that represents a common concept or theme likely to appear in your audio, for example a list of passenger ship names.
AnnotationsEntry
The abstract base class for a message.
ClassItem
An item of the class.
State
Set of states that define the lifecycle of a CustomClass.
Values: STATE_UNSPECIFIED (0): Unspecified state. This is only used/useful for distinguishing unset values. ACTIVE (2): The normal and active state. DELETED (4): This CustomClass has been deleted.
DeleteCustomClassRequest
Request message for the DeleteCustomClass method.
DeletePhraseSetRequest
Request message for the DeletePhraseSet method.
DeleteRecognizerRequest
Request message for the DeleteRecognizer method.
ExplicitDecodingConfig
Explicitly specified decoding parameters.
AudioEncoding
Supported audio data encodings.
Values: AUDIO_ENCODING_UNSPECIFIED (0): Default value. This value is unused. LINEAR16 (1): Headerless 16-bit signed little-endian PCM samples. MULAW (2): Headerless 8-bit companded mulaw samples. ALAW (3): Headerless 8-bit companded alaw samples.
GcsOutputConfig
Output configurations for Cloud Storage.
GetConfigRequest
Request message for the GetConfig method.
GetCustomClassRequest
Request message for the GetCustomClass method.
GetPhraseSetRequest
Request message for the GetPhraseSet method.
GetRecognizerRequest
Request message for the GetRecognizer method.
InlineOutputConfig
Output configurations for inline response.
InlineResult
Final results returned inline in the recognition response.
ListCustomClassesRequest
Request message for the ListCustomClasses method.
ListCustomClassesResponse
Response message for the ListCustomClasses method.
ListPhraseSetsRequest
Request message for the ListPhraseSets method.
ListPhraseSetsResponse
Response message for the ListPhraseSets method.
ListRecognizersRequest
Request message for the ListRecognizers method.
ListRecognizersResponse
Response message for the ListRecognizers method.
NativeOutputFileFormatConfig
Output configurations for serialized BatchRecognizeResults
protos.
OperationMetadata
Represents the metadata of a long-running operation.
This message has oneof_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
OutputFormatConfig
Configuration for the format of the results stored to output
.
PhraseSet
PhraseSet for biasing in speech recognition. A PhraseSet is used to provide "hints" to the speech recognizer to favor specific words and phrases in the results.
AnnotationsEntry
The abstract base class for a message.
Phrase
A Phrase contains words and phrase "hints" so that the speech recognition is more likely to recognize them. This can be used to improve the accuracy for specific words and phrases, for example, if specific commands are typically spoken by the user. This can also be used to add additional words to the vocabulary of the recognizer.
List items can also include CustomClass references containing groups of words that represent common concepts that occur in natural language.
State
Set of states that define the lifecycle of a PhraseSet.
Values: STATE_UNSPECIFIED (0): Unspecified state. This is only used/useful for distinguishing unset values. ACTIVE (2): The normal and active state. DELETED (4): This PhraseSet has been deleted.
RecognitionConfig
Provides information to the Recognizer that specifies how to process the recognition request.
This message has oneof_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
RecognitionFeatures
Available recognition features.
MultiChannelMode
Options for how to recognize multi-channel audio.
Values:
MULTI_CHANNEL_MODE_UNSPECIFIED (0):
Default value for the multi-channel mode. If
the audio contains multiple channels, only the
first channel will be transcribed; other
channels will be ignored.
SEPARATE_RECOGNITION_PER_CHANNEL (1):
If selected, each channel in the provided audio is
transcribed independently. This cannot be selected if the
selected model is
latest_short
.
RecognitionOutputConfig
Configuration options for the output(s) of recognition.
This message has oneof_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
RecognitionResponseMetadata
Metadata about the recognition request and response.
RecognizeRequest
Request message for the Recognize method. Either
content or uri must be supplied. Supplying both or neither
returns [INVALID_ARGUMENT][google.rpc.Code.INVALID_ARGUMENT]. See
`content
limits <https://cloud.google.com/speech-to-text/quotas#content>`__.
This message has oneof_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
RecognizeResponse
Response message for the Recognize method.
Recognizer
A Recognizer message. Stores recognition configuration and metadata.
AnnotationsEntry
The abstract base class for a message.
State
Set of states that define the lifecycle of a Recognizer.
Values: STATE_UNSPECIFIED (0): The default value. This value is used if the state is omitted. ACTIVE (2): The Recognizer is active and ready for use. DELETED (4): This Recognizer has been deleted.
SpeakerDiarizationConfig
Configuration to enable speaker diarization.
SpeechAdaptation
Provides "hints" to the speech recognizer to favor specific words and phrases in the results. PhraseSets can be specified as an inline resource, or a reference to an existing PhraseSet resource.
AdaptationPhraseSet
A biasing PhraseSet, which can be either a string referencing the name of an existing PhraseSets resource, or an inline definition of a PhraseSet.
This message has oneof_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
SpeechRecognitionAlternative
Alternative hypotheses (a.k.a. n-best list).
SpeechRecognitionResult
A speech recognition result corresponding to a portion of the audio.
SrtOutputFileFormatConfig
Output configurations for `SubRip
Text <https://www.matroska.org/technical/subtitles.html#srt-subtitles>`__
formatted subtitle files.
StreamingRecognitionConfig
Provides configuration information for the StreamingRecognize request.
StreamingRecognitionFeatures
Available recognition features specific to streaming recognition requests.
VoiceActivityTimeout
Events that a timeout can be set on for voice activity.
StreamingRecognitionResult
A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.
StreamingRecognizeRequest
Request message for the StreamingRecognize method. Multiple StreamingRecognizeRequest messages are sent in one call.
If the Recognizer referenced by recognizer contains a fully specified request configuration, then the stream may only contain messages with only audio set.
Otherwise, the first message must contain a recognizer and a streaming_config message that together fully specify the request configuration, and must not contain audio. All subsequent messages must only have audio set.
This message has oneof_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
StreamingRecognizeResponse
StreamingRecognizeResponse
is the only message returned to the
client by StreamingRecognize
. A series of zero or more
StreamingRecognizeResponse
messages are streamed back to the
client. If there is no recognizable audio then no messages are
streamed back to the client.
Here are some examples of StreamingRecognizeResponse messages that might be returned while processing audio:

1. results { alternatives { transcript: "tube" } stability: 0.01 }
2. results { alternatives { transcript: "to be a" } stability: 0.01 }
3. results { alternatives { transcript: "to be" } stability: 0.9 } results { alternatives { transcript: " or not to be" } stability: 0.01 }
4. results { alternatives { transcript: "to be or not to be" confidence: 0.92 } alternatives { transcript: "to bee or not to bee" } is_final: true }
5. results { alternatives { transcript: " that's" } stability: 0.01 }
6. results { alternatives { transcript: " that is" } stability: 0.9 } results { alternatives { transcript: " the question" } stability: 0.01 }
7. results { alternatives { transcript: " that is the question" confidence: 0.98 } alternatives { transcript: " that was the question" } is_final: true }

Notes:
- Only two of the above responses (#4 and #7) contain final results; they are indicated by is_final: true. Concatenating these together generates the full transcript: "to be or not to be that is the question".
- The others contain interim results. #3 and #6 contain two interim results: the first portion has a high stability and is less likely to change; the second portion has a low stability and is very likely to change. A UI designer might choose to show only high-stability results.
- The specific stability and confidence values shown above are only for illustrative purposes. Actual values may vary.
- In each response, only one of these fields will be set: error, speech_event_type, or one or more (repeated) results.
SpeechEventType
Indicates the type of speech event.
Values:
SPEECH_EVENT_TYPE_UNSPECIFIED (0):
No speech event specified.
END_OF_SINGLE_UTTERANCE (1):
This event indicates that the server has detected the end of
the user's speech utterance and expects no additional
speech. Therefore, the server will not process additional
audio and will close the gRPC bidirectional stream. This
event is only sent if there was a force cutoff due to
silence being detected early. This event is only available
through the latest_short
model.
SPEECH_ACTIVITY_BEGIN (2):
This event indicates that the server has detected the
beginning of human voice activity in the stream. This event
can be returned multiple times if speech starts and stops
repeatedly throughout the stream. This event is only sent if
voice_activity_events
is set to true.
SPEECH_ACTIVITY_END (3):
This event indicates that the server has detected the end of
human voice activity in the stream. This event can be
returned multiple times if speech starts and stops
repeatedly throughout the stream. This event is only sent if
voice_activity_events
is set to true.
TranscriptNormalization
Transcription normalization configuration. Use transcription normalization to automatically replace parts of the transcript with phrases of your choosing. For StreamingRecognize, this normalization only applies to stable partial transcripts (stability > 0.8) and final transcripts.
Entry
A single replacement configuration.
TranslationConfig
Translation configuration. Use to translate the given audio into text for the desired language.
UndeleteCustomClassRequest
Request message for the UndeleteCustomClass method.
UndeletePhraseSetRequest
Request message for the UndeletePhraseSet method.
UndeleteRecognizerRequest
Request message for the UndeleteRecognizer method.
UpdateConfigRequest
Request message for the UpdateConfig method.
UpdateCustomClassRequest
Request message for the UpdateCustomClass method.
UpdatePhraseSetRequest
Request message for the UpdatePhraseSet method.
UpdateRecognizerRequest
Request message for the UpdateRecognizer method.
VttOutputFileFormatConfig
Output configurations for
`WebVTT <https://www.w3.org/TR/webvtt1/>`__
formatted subtitle files.
WordInfo
Word-specific information for recognized words.
Modules
pagers
API documentation for speech_v1.services.adaptation.pagers
module.
pagers
API documentation for speech_v1p1beta1.services.adaptation.pagers
module.
pagers
API documentation for speech_v2.services.speech.pagers
module.