Package types (2.17.1)

API documentation for speech_v2.types package.

Classes

AutoDetectDecodingConfig

Automatically detected decoding parameters. Supported for the following encodings:

- WAV_LINEAR16: 16-bit signed little-endian PCM samples in a WAV container.
- WAV_MULAW: 8-bit companded mulaw samples in a WAV container.
- WAV_ALAW: 8-bit companded alaw samples in a WAV container.
- RFC4867_5_AMR: AMR frames with an rfc4867.5 header.
- RFC4867_5_AMRWB: AMR-WB frames with an rfc4867.5 header.
- FLAC: FLAC frames in the "native FLAC" container format.
- MP3: MPEG audio frames with optional (ignored) ID3 metadata.
- OGG_OPUS: Opus audio frames in an Ogg container.
- WEBM_OPUS: Opus audio frames in a WebM container.

BatchRecognizeFileMetadata

Metadata about a single file in a batch for BatchRecognize.

BatchRecognizeRequest

Request message for the BatchRecognize method.

BatchRecognizeResponse

Response message for BatchRecognize that is packaged into a long-running Operation (google.longrunning.Operation).

BatchRecognizeTranscriptionMetadata

Metadata about transcription for a single file (for example, progress percent).

Config

Message representing the config for the Speech-to-Text API. This includes an optional `KMS key <https://cloud.google.com/kms/docs/resource-hierarchy#keys>`__ with which incoming data will be encrypted.

CreateCustomClassRequest

Request message for the CreateCustomClass method.

CreatePhraseSetRequest

Request message for the CreatePhraseSet method.

CreateRecognizerRequest

Request message for the CreateRecognizer method.

CustomClass

CustomClass for biasing in speech recognition. Used to define a set of words or phrases that represents a common concept or theme likely to appear in your audio, for example a list of passenger ship names.

DeleteCustomClassRequest

Request message for the DeleteCustomClass method.

DeletePhraseSetRequest

Request message for the DeletePhraseSet method.

DeleteRecognizerRequest

Request message for the DeleteRecognizer method.

ExplicitDecodingConfig

Explicitly specified decoding parameters.

GetConfigRequest

Request message for the GetConfig method.

GetCustomClassRequest

Request message for the GetCustomClass method.

GetPhraseSetRequest

Request message for the GetPhraseSet method.

GetRecognizerRequest

Request message for the GetRecognizer method.

ListCustomClassesRequest

Request message for the ListCustomClasses method.

ListCustomClassesResponse

Response message for the ListCustomClasses method.

ListPhraseSetsRequest

Request message for the ListPhraseSets method.

ListPhraseSetsResponse

Response message for the ListPhraseSets method.

ListRecognizersRequest

Request message for the ListRecognizers method.

ListRecognizersResponse

Response message for the ListRecognizers method.

OperationMetadata

Represents the metadata of a long-running operation.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
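The clearing behavior described above can be illustrated with a small stand-alone sketch. It does not use the library: ``OneofGroup`` and the member names ``first_member`` and ``second_member`` are hypothetical stand-ins chosen for illustration, and the real message classes may behave differently in detail.

```python
class OneofGroup:
    """Toy model of a oneof: at most one member holds a value at a time."""

    def __init__(self, *members):
        self._members = tuple(members)
        self._active = None  # name of the member currently set, if any
        self._value = None

    def _check(self, member):
        if member not in self._members:
            raise KeyError(f"{member!r} is not a member of this oneof")

    def set(self, member, value):
        self._check(member)
        # Only one slot exists, so setting a member drops the previous one.
        self._active = member
        self._value = value

    def get(self, member):
        self._check(member)
        return self._value if member == self._active else None

    def which(self):
        """Return the name of the member that is set, or None."""
        return self._active


# Hypothetical member names, for illustration only.
group = OneofGroup("first_member", "second_member")
group.set("first_member", "value A")
group.set("second_member", "value B")  # clears first_member

print(group.which())
print(group.get("first_member"))
```

After the second ``set`` call, only ``second_member`` holds a value and ``first_member`` reads as unset.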

PhraseSet

PhraseSet for biasing in speech recognition. A PhraseSet is used to provide "hints" to the speech recognizer to favor specific words and phrases in the results.

RecognitionConfig

Provides information to the Recognizer that specifies how to process the recognition request.

RecognitionFeatures

Available recognition features.

RecognitionResponseMetadata

Metadata about the recognition request and response.

RecognizeRequest

Request message for the Recognize method. Either content or uri must be supplied. Supplying both or neither returns INVALID_ARGUMENT (google.rpc.Code.INVALID_ARGUMENT). See `content limits <https://cloud.google.com/speech-to-text/quotas#content>`__.
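The "exactly one of content or uri" rule can be mirrored in a client-side check before a request is sent. The following is a minimal sketch, not part of the library: the helper name ``check_audio_source`` and the example bucket URI are assumptions, and the service performs its own validation regardless.

```python
from typing import Optional


def check_audio_source(content: Optional[bytes] = None,
                       uri: Optional[str] = None) -> str:
    """Return which audio source is set; raise if both or neither is set."""
    has_content = content is not None
    has_uri = uri is not None
    # Both set or both unset is the case the service rejects.
    if has_content == has_uri:
        raise ValueError(
            "INVALID_ARGUMENT: supply exactly one of content or uri")
    return "content" if has_content else "uri"


print(check_audio_source(content=b"\x00\x01"))
# Hypothetical URI, used only as a placeholder.
print(check_audio_source(uri="gs://example-bucket/audio.wav"))
```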

RecognizeResponse

Response message for the Recognize method.

Recognizer

A Recognizer message. Stores recognition configuration and metadata.

SpeakerDiarizationConfig

Configuration to enable speaker diarization.

SpeechAdaptation

Provides "hints" to the speech recognizer to favor specific words and phrases in the results. Phrase sets can be specified as an inline resource, or a reference to an existing phrase set resource.

SpeechRecognitionAlternative

Alternative hypotheses (a.k.a. n-best list).

SpeechRecognitionResult

A speech recognition result corresponding to a portion of the audio.

StreamingRecognitionConfig

Provides configuration information for the StreamingRecognize request.

StreamingRecognitionFeatures

Available recognition features specific to streaming recognition requests.

StreamingRecognitionResult

A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.

StreamingRecognizeRequest

Request message for the StreamingRecognize method. Multiple StreamingRecognizeRequest messages are sent. The first message must contain a recognizer and optionally a streaming_config message and must not contain audio. All subsequent messages must contain audio and must not contain a streaming_config message.
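The message ordering described above can be sketched as a generator. Plain dictionaries stand in for StreamingRecognizeRequest messages here; the function name ``build_requests``, the placeholder recognizer name, and the contents of the config dictionary are assumptions made for illustration, not the library's API.

```python
from typing import Any, Dict, Iterable, Iterator, Optional


def build_requests(recognizer: str,
                   audio_chunks: Iterable[bytes],
                   streaming_config: Optional[Dict[str, Any]] = None
                   ) -> Iterator[Dict[str, Any]]:
    """Yield request stand-ins in the order the method expects."""
    # First message: recognizer, optional streaming_config, no audio.
    first: Dict[str, Any] = {"recognizer": recognizer}
    if streaming_config is not None:
        first["streaming_config"] = streaming_config
    yield first
    # Every later message: audio only, never a streaming_config.
    for chunk in audio_chunks:
        yield {"audio": chunk}


requests = list(build_requests(
    "projects/example-project/locations/global/recognizers/example",  # placeholder
    [b"chunk-1", b"chunk-2"],
    streaming_config={"placeholder_option": True},  # hypothetical content
))
print(len(requests))
```

Using a generator keeps the config message and the audio messages in a single stream, so the ordering constraint is enforced in one place.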

StreamingRecognizeResponse

StreamingRecognizeResponse is the only message returned to the client by StreamingRecognize. A series of zero or more StreamingRecognizeResponse messages are streamed back to the client. If there is no recognizable audio then no messages are streamed back to the client.

Here are some examples of StreamingRecognizeResponse messages that might be returned while processing audio:

1. results { alternatives { transcript: "tube" } stability: 0.01 }
2. results { alternatives { transcript: "to be a" } stability: 0.01 }
3. results { alternatives { transcript: "to be" } stability: 0.9 } results { alternatives { transcript: " or not to be" } stability: 0.01 }
4. results { alternatives { transcript: "to be or not to be" confidence: 0.92 } alternatives { transcript: "to bee or not to bee" } is_final: true }
5. results { alternatives { transcript: " that's" } stability: 0.01 }
6. results { alternatives { transcript: " that is" } stability: 0.9 } results { alternatives { transcript: " the question" } stability: 0.01 }
7. results { alternatives { transcript: " that is the question" confidence: 0.98 } alternatives { transcript: " that was the question" } is_final: true }

Notes:

- Only two of the above responses, #4 and #7, contain final results; they are indicated by is_final: true. Concatenating these together generates the full transcript: "to be or not to be that is the question".
- The others contain interim results. #3 and #6 contain two interim results: the first portion has a high stability and is less likely to change; the second portion has a low stability and is very likely to change. A UI designer might choose to show only high stability results.
- The specific stability and confidence values shown above are only for illustrative purposes. Actual values may vary.
- In each response, only one of these fields will be set: error, speech_event_type, or one or more (repeated) results.
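The notes above can be made concrete with a short sketch that replays the seven example responses as plain dictionaries (stand-ins for the real message objects). The helper names, the choice of the first alternative as the preferred one, and the 0.5 stability threshold are assumptions made for illustration.

```python
def _result(transcripts, **fields):
    """Build one result stand-in; the first transcript is the top alternative."""
    alternatives = [{"transcript": text} for text in transcripts]
    return {"alternatives": alternatives, **fields}


# The seven example responses, in order.
responses = [
    {"results": [_result(["tube"], stability=0.01)]},
    {"results": [_result(["to be a"], stability=0.01)]},
    {"results": [_result(["to be"], stability=0.9),
                 _result([" or not to be"], stability=0.01)]},
    {"results": [_result(["to be or not to be", "to bee or not to bee"],
                         is_final=True)]},
    {"results": [_result([" that's"], stability=0.01)]},
    {"results": [_result([" that is"], stability=0.9),
                 _result([" the question"], stability=0.01)]},
    {"results": [_result([" that is the question", " that was the question"],
                         is_final=True)]},
]


def final_transcript(stream):
    """Concatenate the top alternative of every final result."""
    parts = []
    for response in stream:
        for result in response.get("results", []):
            if result.get("is_final") and result["alternatives"]:
                parts.append(result["alternatives"][0]["transcript"])
    return "".join(parts)


def stable_interim(response, threshold=0.5):
    """Return interim transcripts whose stability meets the threshold."""
    return [
        result["alternatives"][0]["transcript"]
        for result in response.get("results", [])
        if not result.get("is_final")
        and result.get("stability", 0.0) >= threshold
    ]


print(final_transcript(responses))  # to be or not to be that is the question
print(stable_interim(responses[2]))  # ['to be']
```

A UI that shows only high-stability text would render the output of ``stable_interim`` for each interim response and replace it with the final result once is_final arrives.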