API documentation for the speech_v2.types package.
Classes
AccessMetadata
The access metadata for a particular region. This can be applied if the org policy for the given project disallows a particular region.
AutoDetectDecodingConfig
Automatically detected decoding parameters. Supported for the following encodings:
WAV_LINEAR16: 16-bit signed little-endian PCM samples in a WAV container.
WAV_MULAW: 8-bit companded mulaw samples in a WAV container.
WAV_ALAW: 8-bit companded alaw samples in a WAV container.
RFC4867_5_AMR: AMR frames with an rfc4867.5 header.
RFC4867_5_AMRWB: AMR-WB frames with an rfc4867.5 header.
FLAC: FLAC frames in the "native FLAC" container format.
MP3: MPEG audio frames with optional (ignored) ID3 metadata.
OGG_OPUS: Opus audio frames in an Ogg container.
WEBM_OPUS: Opus audio frames in a WebM container.
MP4_AAC: AAC audio frames in an MP4 container.
M4A_AAC: AAC audio frames in an M4A container.
MOV_AAC: AAC audio frames in an MOV container.
BatchRecognizeFileMetadata
Metadata about a single file in a batch for BatchRecognize.
BatchRecognizeFileResult
Final results for a single file.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.
Reference: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
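As a rough illustration of the oneof behavior described above, the following plain-Python sketch models a message with a single oneof group. It does not use proto-plus or the client library: the class OneofSketch and the method which_oneof are hypothetical, and the member names inline_result and cloud_storage_result are assumptions modeled on the InlineResult and CloudStorageResult types listed on this page. The sketch returns None for an unset member, which may differ from what the real library returns.

```python
class OneofSketch:
    """Toy model of a message with one oneof group; not the proto-plus class."""

    # Assumed member names, for illustration only.
    _MEMBERS = ("inline_result", "cloud_storage_result")

    def __init__(self):
        self._active = None  # name of the member currently set, if any
        self._value = None   # value of that member

    def __setattr__(self, name, value):
        if name in self._MEMBERS:
            # Setting any member replaces whichever member was set before.
            object.__setattr__(self, "_active", name)
            object.__setattr__(self, "_value", value)
        else:
            object.__setattr__(self, name, value)

    def __getattr__(self, name):
        # Reached only when normal lookup fails, which is the case for members.
        if name in OneofSketch._MEMBERS:
            return self._value if self._active == name else None
        raise AttributeError(name)

    def which_oneof(self):
        """Return the name of the member currently set, or None."""
        return self._active


msg = OneofSketch()
msg.inline_result = {"transcript": "hello"}
msg.cloud_storage_result = {"uri": "gs://example-bucket/out.json"}  # clears inline_result
print(msg.which_oneof())  # cloud_storage_result
print(msg.inline_result)  # None
```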
BatchRecognizeMetadata
Operation metadata for BatchRecognize.
BatchRecognizeRequest
Request message for the BatchRecognize method.
BatchRecognizeResponse
Response message for BatchRecognize that is packaged into a long-running Operation (google.longrunning.Operation).
BatchRecognizeResults
Output type for Cloud Storage of BatchRecognize transcripts. Though this proto isn't returned in this API anywhere, the Cloud Storage transcripts will be this proto serialized and should be parsed as such.
BatchRecognizeTranscriptionMetadata
Metadata about transcription for a single file (for example, progress percent).
CloudStorageResult
Final results written to Cloud Storage.
Config
Message representing the config for the Speech-to-Text API. This includes an optional KMS key (https://cloud.google.com/kms/docs/resource-hierarchy#keys) with which incoming data will be encrypted.
CreateCustomClassRequest
Request message for the CreateCustomClass method.
CreatePhraseSetRequest
Request message for the CreatePhraseSet method.
CreateRecognizerRequest
Request message for the CreateRecognizer method.
CustomClass
CustomClass for biasing in speech recognition. Used to define a set of words or phrases that represents a common concept or theme likely to appear in your audio, for example a list of passenger ship names.
DeleteCustomClassRequest
Request message for the DeleteCustomClass method.
DeletePhraseSetRequest
Request message for the DeletePhraseSet method.
DeleteRecognizerRequest
Request message for the DeleteRecognizer method.
ExplicitDecodingConfig
Explicitly specified decoding parameters.
GcsOutputConfig
Output configurations for Cloud Storage.
GetConfigRequest
Request message for the GetConfig method.
GetCustomClassRequest
Request message for the GetCustomClass method.
GetPhraseSetRequest
Request message for the GetPhraseSet method.
GetRecognizerRequest
Request message for the GetRecognizer method.
InlineOutputConfig
Output configurations for inline response.
InlineResult
Final results returned inline in the recognition response.
LanguageMetadata
The metadata about locales available in a given region. Currently this is just the models that are available for each locale.
ListCustomClassesRequest
Request message for the ListCustomClasses method.
ListCustomClassesResponse
Response message for the ListCustomClasses method.
ListPhraseSetsRequest
Request message for the ListPhraseSets method.
ListPhraseSetsResponse
Response message for the ListPhraseSets method.
ListRecognizersRequest
Request message for the ListRecognizers method.
ListRecognizersResponse
Response message for the ListRecognizers method.
LocationsMetadata
Main metadata for the Locations API for STT V2. Currently this is just the metadata about locales, models, and features.
ModelFeature
Represents a single feature of a model. If the feature is recognizer, the release_state of the feature represents the release_state of the model.
ModelFeatures
Represents the collection of features belonging to a model.
ModelMetadata
The metadata about the models in a given region for a specific locale. Currently this is just the features of the model.
NativeOutputFileFormatConfig
Output configurations for serialized BatchRecognizeResults protos.
OperationMetadata
Represents the metadata of a long-running operation.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.
Reference: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
OutputFormatConfig
Configuration for the format of the results stored to output.
PhraseSet
PhraseSet for biasing in speech recognition. A PhraseSet is used to provide "hints" to the speech recognizer to favor specific words and phrases in the results.
RecognitionConfig
Provides information to the Recognizer that specifies how to process the recognition request.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.
Reference: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
RecognitionFeatures
Available recognition features.
RecognitionOutputConfig
Configuration options for the output(s) of recognition.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.
Reference: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
RecognitionResponseMetadata
Metadata about the recognition request and response.
RecognizeRequest
Request message for the Recognize method. Either content or uri must be supplied. Supplying both or neither returns INVALID_ARGUMENT (google.rpc.Code.INVALID_ARGUMENT). See content limits (https://cloud.google.com/speech-to-text/quotas#content).
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.
Reference: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
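The rule that exactly one of content or uri must be supplied to RecognizeRequest can be pictured as a small pre-flight check. The sketch below is plain Python and not part of the client library; check_audio_source is a hypothetical helper, and the service itself performs the real validation.

```python
from typing import Optional


def check_audio_source(content: Optional[bytes] = None, uri: Optional[str] = None) -> str:
    """Return "OK" if exactly one of content/uri is supplied, else "INVALID_ARGUMENT"."""
    has_content = content is not None
    has_uri = uri is not None
    # Exactly one source must be present: both or neither is rejected.
    if has_content != has_uri:
        return "OK"
    return "INVALID_ARGUMENT"


print(check_audio_source(content=b"\x00\x01"))  # OK
print(check_audio_source())                     # INVALID_ARGUMENT
```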
RecognizeResponse
Response message for the Recognize method.
Recognizer
A Recognizer message. Stores recognition configuration and metadata.
SpeakerDiarizationConfig
Configuration to enable speaker diarization.
SpeechAdaptation
Provides "hints" to the speech recognizer to favor specific words and phrases in the results. PhraseSets can be specified as an inline resource, or a reference to an existing PhraseSet resource.
SpeechRecognitionAlternative
Alternative hypotheses (a.k.a. n-best list).
SpeechRecognitionResult
A speech recognition result corresponding to a portion of the audio.
SrtOutputFileFormatConfig
Output configurations for SubRip Text (https://www.matroska.org/technical/subtitles.html#srt-subtitles) formatted subtitle file.
StreamingRecognitionConfig
Provides configuration information for the StreamingRecognize request.
StreamingRecognitionFeatures
Available recognition features specific to streaming recognition requests.
StreamingRecognitionResult
A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.
StreamingRecognizeRequest
Request message for the StreamingRecognize method. Multiple StreamingRecognizeRequest messages are sent in one call.
If the Recognizer referenced by recognizer contains a fully specified request configuration then the stream may only contain messages with only audio set.
Otherwise the first message must contain a recognizer and a streaming_config message that together fully specify the request configuration and must not contain audio. All subsequent messages must only have audio set.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.
Reference: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
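The message ordering for StreamingRecognizeRequest, in the case where the first message carries the configuration, can be sketched as a generator. This is a minimal illustration using plain dicts in place of real request messages; build_requests and follows_ordering are hypothetical helpers, and the recognizer path is a placeholder.

```python
from typing import Dict, Iterable, Iterator, List


def build_requests(recognizer: str, streaming_config: Dict,
                   audio_chunks: Iterable[bytes]) -> Iterator[Dict]:
    """Yield request-like dicts: one config message first, then audio-only messages."""
    # First message: recognizer plus streaming_config, and no audio.
    yield {"recognizer": recognizer, "streaming_config": streaming_config}
    # Every later message carries only audio.
    for chunk in audio_chunks:
        yield {"audio": chunk}


def follows_ordering(messages: List[Dict]) -> bool:
    """Check that the first message is config-only and the rest are audio-only."""
    if not messages:
        return False
    first, rest = messages[0], messages[1:]
    first_ok = ("recognizer" in first and "streaming_config" in first
                and "audio" not in first)
    rest_ok = all(set(m) == {"audio"} for m in rest)
    return first_ok and rest_ok


requests = list(build_requests(
    "projects/PROJECT_ID/locations/global/recognizers/RECOGNIZER_ID",  # placeholder
    {"language_codes": ["en-US"]},  # stand-in for a streaming_config message
    [b"chunk-1", b"chunk-2"],
))
print(follows_ordering(requests))  # True
```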
StreamingRecognizeResponse
StreamingRecognizeResponse is the only message returned to the client by StreamingRecognize. A series of zero or more StreamingRecognizeResponse messages are streamed back to the client. If there is no recognizable audio then no messages are streamed back to the client.
Here are some examples of StreamingRecognizeResponses that might be returned while processing audio:
1. results { alternatives { transcript: "tube" } stability: 0.01 }
2. results { alternatives { transcript: "to be a" } stability: 0.01 }
3. results { alternatives { transcript: "to be" } stability: 0.9 } results { alternatives { transcript: " or not to be" } stability: 0.01 }
4. results { alternatives { transcript: "to be or not to be" confidence: 0.92 } alternatives { transcript: "to bee or not to bee" } is_final: true }
5. results { alternatives { transcript: " that's" } stability: 0.01 }
6. results { alternatives { transcript: " that is" } stability: 0.9 } results { alternatives { transcript: " the question" } stability: 0.01 }
7. results { alternatives { transcript: " that is the question" confidence: 0.98 } alternatives { transcript: " that was the question" } is_final: true }
Notes:
Only two of the above responses, #4 and #7, contain final results; they are indicated by is_final: true. Concatenating these together generates the full transcript: "to be or not to be that is the question".
The others contain interim results. #3 and #6 contain two interim results: the first portion has a high stability and is less likely to change; the second portion has a low stability and is very likely to change. A UI designer might choose to show only high stability results.
The specific stability and confidence values shown above are only for illustrative purposes. Actual values may vary.
In each response, only one of these fields will be set: error, speech_event_type, or one or more (repeated) results.
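The notes above can be made concrete with a small plain-Python sketch that replays the seven example responses. The Result and Response dataclasses are simplified stand-ins for the real messages (only the top alternative's transcript is kept), and the 0.5 stability threshold is an arbitrary assumption for the example, not a value from the API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Result:
    transcript: str          # transcript of the top alternative
    stability: float = 0.0
    is_final: bool = False


@dataclass
class Response:
    results: List[Result] = field(default_factory=list)


# The seven example responses, in order.
RESPONSES = [
    Response([Result("tube", stability=0.01)]),
    Response([Result("to be a", stability=0.01)]),
    Response([Result("to be", stability=0.9), Result(" or not to be", stability=0.01)]),
    Response([Result("to be or not to be", is_final=True)]),
    Response([Result(" that's", stability=0.01)]),
    Response([Result(" that is", stability=0.9), Result(" the question", stability=0.01)]),
    Response([Result(" that is the question", is_final=True)]),
]


def full_transcript(responses: List[Response]) -> str:
    """Concatenate the transcripts of final results only."""
    return "".join(r.transcript for resp in responses for r in resp.results if r.is_final)


def stable_interim(response: Response, threshold: float = 0.5) -> str:
    """Return the interim text a UI might show: only the high-stability portions."""
    return "".join(
        r.transcript for r in response.results
        if not r.is_final and r.stability >= threshold
    )


print(full_transcript(RESPONSES))    # to be or not to be that is the question
print(stable_interim(RESPONSES[2]))  # to be
```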
TranscriptNormalization
Transcription normalization configuration. Use transcription normalization to automatically replace parts of the transcript with phrases of your choosing. For StreamingRecognize, this normalization only applies to stable partial transcripts (stability > 0.8) and final transcripts.
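The streaming condition above (normalization applies only to final transcripts and to partial transcripts with stability > 0.8) can be sketched as follows. This is an illustration only: normalize is a hypothetical helper, the (search, replace) pair format is an assumption, and plain str.replace stands in for whatever matching the service actually performs.

```python
from typing import Iterable, Tuple


def normalize(transcript: str, entries: Iterable[Tuple[str, str]],
              stability: float = 0.0, is_final: bool = False) -> str:
    """Apply (search, replace) pairs only to final or stable (stability > 0.8) text."""
    if not (is_final or stability > 0.8):
        # Unstable partial transcripts are left untouched.
        return transcript
    for search, replace in entries:
        transcript = transcript.replace(search, replace)
    return transcript


entries = [("doctor", "Dr.")]
print(normalize("call doctor smith", entries, stability=0.5))  # call doctor smith
print(normalize("call doctor smith", entries, stability=0.9))  # call Dr. smith
```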
TranslationConfig
Translation configuration. Use to translate the given audio into text for the desired language.
UndeleteCustomClassRequest
Request message for the UndeleteCustomClass method.
UndeletePhraseSetRequest
Request message for the UndeletePhraseSet method.
UndeleteRecognizerRequest
Request message for the UndeleteRecognizer method.
UpdateConfigRequest
Request message for the UpdateConfig method.
UpdateCustomClassRequest
Request message for the UpdateCustomClass method.
UpdatePhraseSetRequest
Request message for the UpdatePhraseSet method.
UpdateRecognizerRequest
Request message for the UpdateRecognizer method.
VttOutputFileFormatConfig
Output configurations for WebVTT (https://www.w3.org/TR/webvtt1/) formatted subtitle file.
WordInfo
Word-specific information for recognized words.