Class InputAudioConfig (0.8.0)

InputAudioConfig(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Instructs the speech recognizer on how to process the audio content.
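
For example, a minimal construction sketch, assuming 16-bit linear PCM
audio sampled at 16 kHz (the field values are illustrative; each
attribute is documented below)::

    from google.cloud.dialogflowcx_v3beta1 import types

    # Illustrative values: the encoding and sample rate must match the
    # audio content actually sent with the query.
    config = types.InputAudioConfig(
        audio_encoding=types.AudioEncoding.AUDIO_ENCODING_LINEAR_16,
        sample_rate_hertz=16000,
        enable_word_info=True,
        phrase_hints=["Dialogflow", "CX"],
    )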

Attributes

audio_encoding (google.cloud.dialogflowcx_v3beta1.types.AudioEncoding):
    Required. Audio encoding of the audio content to process.
sample_rate_hertz (int):
    Sample rate (in Hertz) of the audio content sent in the
    query. Refer to `Cloud Speech API
    documentation <https://cloud.google.com/speech-to-text/docs/basics>`__
    for more details.
enable_word_info (bool):
    Optional. If ``true``, Dialogflow returns
    [SpeechWordInfo][google.cloud.dialogflow.cx.v3beta1.SpeechWordInfo]
    in
    [StreamingRecognitionResult][google.cloud.dialogflow.cx.v3beta1.StreamingRecognitionResult]
    with information about the recognized speech words, e.g.
    start and end time offsets. If ``false`` or unspecified,
    Speech doesn't return any word-level information.
phrase_hints (Sequence[str]):
    Optional. A list of strings containing words and phrases
    that the speech recognizer should recognize with higher
    likelihood.

    See `the Cloud Speech
    documentation <https://cloud.google.com/speech-to-text/docs/basics#phrase-hints>`__
    for more details.
model (str):
    Optional. Which Speech model to select for the given
    request. Select the model best suited to your domain to get
    the best results. If a model is not explicitly specified,
    we auto-select a model based on the parameters in the
    InputAudioConfig. If an enhanced speech model is enabled for
    the agent and an enhanced version of the specified model for
    the language does not exist, then the speech is recognized
    using the standard version of the specified model. Refer to
    `Cloud Speech API
    documentation <https://cloud.google.com/speech-to-text/docs/basics#select-model>`__
    for more details.
model_variant (google.cloud.dialogflowcx_v3beta1.types.SpeechModelVariant):
    Optional. Which variant of the [Speech
    model][google.cloud.dialogflow.cx.v3beta1.InputAudioConfig.model]
    to use.
single_utterance (bool):
    Optional. If ``false`` (default), recognition does not cease
    until the client closes the stream. If ``true``, the
    recognizer will detect a single spoken utterance in input
    audio. Recognition ceases when it detects the audio's voice
    has stopped or paused. In this case, once a detected intent
    is received, the client should close the stream and start a
    new request with a new stream as needed. Note: This setting
    is relevant only for streaming methods.
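
Since ``single_utterance`` is relevant only for streaming methods, here
is a sketch of how this config might be embedded in the first request of
a streaming detect-intent call (the session path is a placeholder and
the field values are illustrative)::

    from google.cloud.dialogflowcx_v3beta1 import types

    # Placeholder session path for illustration only.
    session = (
        "projects/my-project/locations/global/"
        "agents/my-agent/sessions/my-session"
    )

    config = types.InputAudioConfig(
        audio_encoding=types.AudioEncoding.AUDIO_ENCODING_LINEAR_16,
        sample_rate_hertz=16000,
        single_utterance=True,  # stop after one detected utterance
        model_variant=types.SpeechModelVariant.USE_BEST_AVAILABLE,
    )

    # The first streaming request carries the audio config; subsequent
    # requests carry the raw audio bytes.
    first_request = types.StreamingDetectIntentRequest(
        session=session,
        query_input=types.QueryInput(
            audio=types.AudioInput(config=config),
            language_code="en-US",
        ),
    )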

Inheritance

builtins.object > proto.message.Message > InputAudioConfig