API documentation for the speech_v2.types package.
Classes
AutoDetectDecodingConfig
Automatically detected decoding parameters. Supported for the following encodings:
WAV_LINEAR16: 16-bit signed little-endian PCM samples in a WAV container.
WAV_MULAW: 8-bit companded mulaw samples in a WAV container.
WAV_ALAW: 8-bit companded alaw samples in a WAV container.
RFC4867_5_AMR: AMR frames with an rfc4867.5 header.
RFC4867_5_AMRWB: AMR-WB frames with an rfc4867.5 header.
FLAC: FLAC frames in the "native FLAC" container format.
MP3: MPEG audio frames with optional (ignored) ID3 metadata.
OGG_OPUS: Opus audio frames in an Ogg container.
WEBM_OPUS: Opus audio frames in a WebM container.
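All of the formats listed above start with a self-describing header, which is what makes automatic detection possible in principle. The following is a minimal sketch of container sniffing from the leading bytes of a file. It is not the service's detection logic; the function name `sniff_container` and its return labels are hypothetical, and it only identifies the container family (it does not tell LINEAR16 from MULAW or ALAW inside a WAV file, or confirm that an Ogg or WebM file carries Opus).

```python
def sniff_container(header: bytes) -> str:
    """Guess the container family from the first bytes of an audio file.

    Hypothetical helper for illustration only; labels are not API values.
    """
    # WAV: RIFF chunk whose form type is "WAVE".
    if header.startswith(b"RIFF") and header[8:12] == b"WAVE":
        return "WAV"
    # RFC 4867 section 5 storage format magic numbers.
    if header.startswith(b"#!AMR-WB\n"):
        return "AMR-WB"
    if header.startswith(b"#!AMR\n"):
        return "AMR"
    # Native FLAC stream marker.
    if header.startswith(b"fLaC"):
        return "FLAC"
    # Ogg page capture pattern.
    if header.startswith(b"OggS"):
        return "OGG"
    # EBML header, shared by WebM and Matroska.
    if header.startswith(b"\x1a\x45\xdf\xa3"):
        return "WEBM"
    # MP3: optional ID3v2 tag, or an MPEG frame sync (11 set bits).
    if header.startswith(b"ID3"):
        return "MP3"
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "MP3"
    return "UNKNOWN"
```

Reading the first 12 bytes of a file is enough for every check in this sketch.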
BatchRecognizeFileMetadata
Metadata about a single file in a batch for BatchRecognize.
BatchRecognizeFileResult
Final results for a single file.
BatchRecognizeMetadata
Operation metadata for BatchRecognize.
BatchRecognizeRequest
Request message for the BatchRecognize method.
BatchRecognizeResponse
Response message for BatchRecognize that is packaged into a long-running Operation (google.longrunning.Operation).
BatchRecognizeResults
Output type for Cloud Storage of BatchRecognize transcripts. This proto is not returned anywhere in this API, but the transcripts written to Cloud Storage are serialized instances of it and should be parsed as such.
BatchRecognizeTranscriptionMetadata
Metadata about transcription for a single file (for example, progress percent).
Config
Message representing the config for the Speech-to-Text API. This includes an optional KMS key (https://cloud.google.com/kms/docs/resource-hierarchy#keys) with which incoming data will be encrypted.
CreateCustomClassRequest
Request message for the CreateCustomClass method.
CreatePhraseSetRequest
Request message for the CreatePhraseSet method.
CreateRecognizerRequest
Request message for the CreateRecognizer method.
CustomClass
CustomClass for biasing in speech recognition. Used to define a set of words or phrases that represents a common concept or theme likely to appear in your audio, for example a list of passenger ship names.
DeleteCustomClassRequest
Request message for the DeleteCustomClass method.
DeletePhraseSetRequest
Request message for the DeletePhraseSet method.
DeleteRecognizerRequest
Request message for the DeleteRecognizer method.
ExplicitDecodingConfig
Explicitly specified decoding parameters.
GcsOutputConfig
Output configurations for Cloud Storage.
GetConfigRequest
Request message for the GetConfig method.
GetCustomClassRequest
Request message for the GetCustomClass method.
GetPhraseSetRequest
Request message for the GetPhraseSet method.
GetRecognizerRequest
Request message for the GetRecognizer method.
InlineOutputConfig
Output configurations for inline response.
ListCustomClassesRequest
Request message for the ListCustomClasses method.
ListCustomClassesResponse
Response message for the ListCustomClasses method.
ListPhraseSetsRequest
Request message for the ListPhraseSets method.
ListPhraseSetsResponse
Response message for the ListPhraseSets method.
ListRecognizersRequest
Request message for the ListRecognizers method.
ListRecognizersResponse
Response message for the ListRecognizers method.
OperationMetadata
Represents the metadata of a long-running operation.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members. See https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields for details.
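The oneof rule described here (at most one member set, and setting one clears the rest) applies to every message on this page that carries the same note. A minimal sketch of that behavior in plain Python follows. The `OneofGroup` class is a hypothetical stand-in, not part of proto-plus or this package, and the member names `content` and `uri` are borrowed from RecognizeRequest purely for illustration.

```python
class OneofGroup:
    """Toy model of a protobuf oneof: only the last-set member is kept."""

    def __init__(self, *members):
        self._members = frozenset(members)
        self._active = None   # name of the member currently set, if any
        self._value = None

    def set(self, name, value):
        if name not in self._members:
            raise KeyError(name)
        # Setting any member replaces whatever was set before.
        self._active, self._value = name, value

    def which(self):
        """Return the name of the member that is set, or None."""
        return self._active

    def get(self, name):
        if name not in self._members:
            raise KeyError(name)
        return self._value if name == self._active else None


source = OneofGroup("content", "uri")
source.set("content", b"\x00\x01")
source.set("uri", "gs://example-bucket/audio.wav")  # hypothetical URI
print(source.which())         # uri
print(source.get("content"))  # None
```

With the real message classes the same effect would be observed by assigning one member field and then another; the earlier assignment is discarded.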
PhraseSet
PhraseSet for biasing in speech recognition. A PhraseSet is used to provide "hints" to the speech recognizer to favor specific words and phrases in the results.
RecognitionConfig
Provides information to the Recognizer that specifies how to process the recognition request.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members. See https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields for details.
RecognitionFeatures
Available recognition features.
RecognitionOutputConfig
Configuration options for the output(s) of recognition.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members. See https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields for details.
RecognitionResponseMetadata
Metadata about the recognition request and response.
RecognizeRequest
Request message for the Recognize method. Either content or uri must be supplied. Supplying both or neither returns INVALID_ARGUMENT (google.rpc.Code.INVALID_ARGUMENT). See content limits (https://cloud.google.com/speech-to-text/quotas#content).
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members. See https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields for details.
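Because setting one oneof member clears the other, a mistake such as passing both content and uri is easiest to catch on the raw inputs, before the request message is built. The sketch below shows one way to do that. The helper `check_audio_source` is hypothetical and raises a local `ValueError`; it does not call the API or reproduce the server's error type.

```python
from typing import Optional


def check_audio_source(content: Optional[bytes] = None,
                       uri: Optional[str] = None) -> str:
    """Return which audio source is supplied, or raise if it is not exactly one.

    Hypothetical client-side pre-check mirroring the documented rule that
    supplying both or neither of content and uri is INVALID_ARGUMENT.
    """
    has_content = bool(content)
    has_uri = bool(uri)
    if has_content == has_uri:
        # True == True (both) or False == False (neither).
        raise ValueError("INVALID_ARGUMENT: supply exactly one of content or uri")
    return "content" if has_content else "uri"
```

A caller would run this check first and then populate only the field it returns.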
RecognizeResponse
Response message for the Recognize method.
Recognizer
A Recognizer message. Stores recognition configuration and metadata.
SpeakerDiarizationConfig
Configuration to enable speaker diarization.
SpeechAdaptation
Provides "hints" to the speech recognizer to favor specific words and phrases in the results. PhraseSets can be specified as an inline resource, or a reference to an existing PhraseSet resource.
SpeechRecognitionAlternative
Alternative hypotheses (a.k.a. n-best list).
SpeechRecognitionResult
A speech recognition result corresponding to a portion of the audio.
StreamingRecognitionConfig
Provides configuration information for the StreamingRecognize request.
StreamingRecognitionFeatures
Available recognition features specific to streaming recognition requests.
StreamingRecognitionResult
A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.
StreamingRecognizeRequest
Request message for the StreamingRecognize method. Multiple StreamingRecognizeRequest messages are sent. The first message must contain a recognizer and optionally a streaming_config message and must not contain audio. All subsequent messages must contain audio and must not contain a streaming_config message.
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members. See https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields for details.
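The ordering rule for StreamingRecognizeRequest messages (a first message with the recognizer and an optional streaming_config but no audio, then audio-only messages) can be sketched as a generator. Plain dictionaries stand in for the real message class here, and the function name `request_stream` and the recognizer path in the usage lines are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator, Optional


def request_stream(recognizer: str,
                   audio_chunks: Iterable[bytes],
                   streaming_config: Optional[Dict[str, Any]] = None
                   ) -> Iterator[Dict[str, Any]]:
    """Yield request-shaped dicts in the order the method expects."""
    # First message: recognizer, optional streaming_config, never audio.
    first: Dict[str, Any] = {"recognizer": recognizer}
    if streaming_config is not None:
        first["streaming_config"] = streaming_config
    yield first
    # Every later message: audio only, never streaming_config.
    for chunk in audio_chunks:
        yield {"audio": chunk}


requests = list(request_stream(
    "projects/example-project/locations/global/recognizers/example",  # hypothetical
    [b"chunk-1", b"chunk-2"],
    streaming_config={},
))
print([sorted(r) for r in requests])
# [['recognizer', 'streaming_config'], ['audio'], ['audio']]
```

A generator fits this pattern because the audio usually arrives incrementally, so requests can be produced as chunks become available.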
StreamingRecognizeResponse
StreamingRecognizeResponse is the only message returned to the client by StreamingRecognize. A series of zero or more StreamingRecognizeResponse messages are streamed back to the client. If there is no recognizable audio then no messages are streamed back to the client.
Here are some examples of StreamingRecognizeResponses that might be returned while processing audio:
1. results { alternatives { transcript: "tube" } stability: 0.01 }
2. results { alternatives { transcript: "to be a" } stability: 0.01 }
3. results { alternatives { transcript: "to be" } stability: 0.9 } results { alternatives { transcript: " or not to be" } stability: 0.01 }
4. results { alternatives { transcript: "to be or not to be" confidence: 0.92 } alternatives { transcript: "to bee or not to bee" } is_final: true }
5. results { alternatives { transcript: " that's" } stability: 0.01 }
6. results { alternatives { transcript: " that is" } stability: 0.9 } results { alternatives { transcript: " the question" } stability: 0.01 }
7. results { alternatives { transcript: " that is the question" confidence: 0.98 } alternatives { transcript: " that was the question" } is_final: true }
Notes:
Only two of the above responses, #4 and #7, contain final results; they are indicated by is_final: true. Concatenating these together generates the full transcript: "to be or not to be that is the question".
The others contain interim results. #3 and #6 contain two interim results: the first portion has a high stability and is less likely to change; the second portion has a low stability and is very likely to change. A UI designer might choose to show only high stability results.
The specific stability and confidence values shown above are only for illustrative purposes. Actual values may vary.
In each response, only one of these fields will be set: error, speech_event_type, or one or more (repeated) results.
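The note about concatenating final results can be checked with a short sketch. The seven example responses are transcribed below as plain dictionaries, which is an assumption made for illustration since real responses are message objects with attribute access. The helper `final_transcript` is hypothetical; it keeps the first alternative of each result marked is_final and joins them in arrival order.

```python
# The seven example responses, as plain dicts (illustrative stand-ins).
RESPONSES = [
    {"results": [{"alternatives": [{"transcript": "tube"}], "stability": 0.01}]},
    {"results": [{"alternatives": [{"transcript": "to be a"}], "stability": 0.01}]},
    {"results": [
        {"alternatives": [{"transcript": "to be"}], "stability": 0.9},
        {"alternatives": [{"transcript": " or not to be"}], "stability": 0.01},
    ]},
    {"results": [{
        "alternatives": [
            {"transcript": "to be or not to be", "confidence": 0.92},
            {"transcript": "to bee or not to bee"},
        ],
        "is_final": True,
    }]},
    {"results": [{"alternatives": [{"transcript": " that's"}], "stability": 0.01}]},
    {"results": [
        {"alternatives": [{"transcript": " that is"}], "stability": 0.9},
        {"alternatives": [{"transcript": " the question"}], "stability": 0.01},
    ]},
    {"results": [{
        "alternatives": [
            {"transcript": " that is the question", "confidence": 0.98},
            {"transcript": " that was the question"},
        ],
        "is_final": True,
    }]},
]


def final_transcript(responses):
    """Join the top alternative of every final result, in arrival order."""
    parts = []
    for response in responses:
        for result in response.get("results", []):
            if result.get("is_final") and result.get("alternatives"):
                parts.append(result["alternatives"][0]["transcript"])
    # Transcripts already carry their own leading spaces, so join with "".
    return "".join(parts)


print(final_transcript(RESPONSES))
# to be or not to be that is the question
```

Interim results are skipped entirely here; a UI that wants live feedback could additionally display interim results whose stability is high, as the notes suggest.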
UndeleteCustomClassRequest
Request message for the UndeleteCustomClass method.
UndeletePhraseSetRequest
Request message for the UndeletePhraseSet method.
UndeleteRecognizerRequest
Request message for the UndeleteRecognizer method.
UpdateConfigRequest
Request message for the UpdateConfig method.
UpdateCustomClassRequest
Request message for the UpdateCustomClass method.
UpdatePhraseSetRequest
Request message for the UpdatePhraseSet method.
UpdateRecognizerRequest
Request message for the UpdateRecognizer method.
WordInfo
Word-specific information for recognized words.