RecognitionMetadata(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Description of audio data to be recognized.
Attributes
Name | Type | Description |
---|---|---|
interaction_type | google.cloud.speech_v1.types.RecognitionMetadata.InteractionType | The use case most closely describing the audio content to be recognized. |
industry_naics_code_of_audio | int | The industry vertical to which this speech recognition request most closely applies. This is most indicative of the topics contained in the audio. Use the 6-digit NAICS code to identify the industry vertical; see https://www.naics.com/search/. |
microphone_distance | google.cloud.speech_v1.types.RecognitionMetadata.MicrophoneDistance | The audio type that most closely describes the audio being recognized. |
original_media_type | google.cloud.speech_v1.types.RecognitionMetadata.OriginalMediaType | The original media the speech was recorded on. |
recording_device_type | google.cloud.speech_v1.types.RecognitionMetadata.RecordingDeviceType | The type of device the speech was recorded with. |
recording_device_name | str | The device used to make the recording. Examples: 'Nexus 5X', 'Polycom SoundStation IP 6000', 'POTS', 'VoIP', or 'Cardioid Microphone'. |
original_mime_type | str | MIME type of the original audio file, for example audio/m4a, audio/x-alaw-basic, audio/mp3, or audio/3gpp. A list of possible audio MIME types is maintained at http://www.iana.org/assignments/media-types/media-types.xhtml#audio |
audio_topic | str | Description of the content, e.g. "Recordings of federal supreme court hearings from 2012". |
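The attributes above can be collected into a plain dictionary before being handed to the constructor's mapping parameter. The following is a minimal sketch using only the standard library; the helper build_metadata_mapping, the KNOWN_FIELDS set, and the sample NAICS value are hypothetical names and data introduced for illustration, and the commented-out constructor call assumes the google-cloud-speech package is installed.

```python
# Field names taken from the attribute table; everything else here is illustrative.
KNOWN_FIELDS = {
    "interaction_type",
    "industry_naics_code_of_audio",
    "microphone_distance",
    "original_media_type",
    "recording_device_type",
    "recording_device_name",
    "original_mime_type",
    "audio_topic",
}


def build_metadata_mapping(**fields):
    """Hypothetical helper: return a dict of metadata fields, rejecting unknown names."""
    unknown = sorted(set(fields) - KNOWN_FIELDS)
    if unknown:
        raise ValueError(f"unknown RecognitionMetadata fields: {unknown}")
    naics = fields.get("industry_naics_code_of_audio")
    # The table asks for a 6-digit NAICS code; this only checks the digit count.
    if naics is not None and not (100000 <= naics <= 999999):
        raise ValueError("industry_naics_code_of_audio should be a 6-digit code")
    return dict(fields)


mapping = build_metadata_mapping(
    interaction_type=3,        # PHONE_CALL
    microphone_distance=1,     # NEARFIELD
    recording_device_type=3,   # PHONE_LINE
    recording_device_name="VoIP",
    original_mime_type="audio/mp3",
    industry_naics_code_of_audio=517111,  # illustrative 6-digit value only
)

# Assumed usage with the real class (requires google-cloud-speech):
#   from google.cloud.speech_v1.types import RecognitionMetadata
#   metadata = RecognitionMetadata(mapping)
```

Validating field names up front mirrors the default ignore_unknown_fields=False in the signature, under which unrecognized keys are not silently dropped.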
Classes
InteractionType
InteractionType(value)
Use case categories that the audio recognition request can be described by.
Values:
INTERACTION_TYPE_UNSPECIFIED (0): Use case is either unknown or is something other than one of the other values below.
DISCUSSION (1): Multiple people in a conversation or discussion, for example a meeting with two or more people actively participating. Typically all the primary speakers would be in the same room (if not, see PHONE_CALL).
PRESENTATION (2): One or more persons lecturing or presenting to others, mostly uninterrupted.
PHONE_CALL (3): A phone call or video conference in which two or more people, who are not in the same room, are actively participating.
VOICEMAIL (4): A recorded message intended for another person to listen to.
PROFESSIONALLY_PRODUCED (5): Professionally produced audio (e.g. TV show, podcast).
VOICE_SEARCH (6): Transcribe spoken questions and queries into text.
VOICE_COMMAND (7): Transcribe voice commands, such as for controlling a device.
DICTATION (8): Transcribe speech to text to create a written document, such as a text message, email, or report.
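To illustrate how the InteractionType(value) form resolves a number to a named member, here is a small sketch built on the standard library's enum.IntEnum. The class below is a local stand-in that copies the documented names and numbers; it is not the library class, and it assumes the real enum follows the same lookup behavior.

```python
from enum import IntEnum


class InteractionType(IntEnum):
    """Local stand-in mirroring the documented values (not the library class)."""
    INTERACTION_TYPE_UNSPECIFIED = 0
    DISCUSSION = 1
    PRESENTATION = 2
    PHONE_CALL = 3
    VOICEMAIL = 4
    PROFESSIONALLY_PRODUCED = 5
    VOICE_SEARCH = 6
    VOICE_COMMAND = 7
    DICTATION = 8


# Look up a member by its numeric value, as in InteractionType(value).
by_value = InteractionType(3)

# Look up a member by its name.
by_name = InteractionType["VOICEMAIL"]

print(by_value.name, int(by_name))  # prints "PHONE_CALL 4"
```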
MicrophoneDistance
MicrophoneDistance(value)
Enumerates the types of capture settings describing an audio file.
Values:
MICROPHONE_DISTANCE_UNSPECIFIED (0): Audio type is not known.
NEARFIELD (1): The audio was captured from a closely placed microphone, e.g. a phone, dictaphone, or handheld microphone. Generally, the speaker is within 1 meter of the microphone.
MIDFIELD (2): The speaker is within 3 meters of the microphone.
FARFIELD (3): The speaker is more than 3 meters away from the microphone.
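The distance thresholds above can be sketched as a small classifier. This is an illustration only: the MicrophoneDistance class is a local stand-in mirroring the documented values, classify_distance is a hypothetical helper, and treating exactly 1 meter as NEARFIELD and exactly 3 meters as MIDFIELD is an assumption about how "within" should be read.

```python
from enum import IntEnum


class MicrophoneDistance(IntEnum):
    """Local stand-in mirroring the documented values (not the library class)."""
    MICROPHONE_DISTANCE_UNSPECIFIED = 0
    NEARFIELD = 1
    MIDFIELD = 2
    FARFIELD = 3


def classify_distance(meters):
    """Hypothetical helper: map a speaker-to-microphone distance to an enum value."""
    if meters is None or meters < 0:
        # Unknown or nonsensical distance: leave the field unspecified.
        return MicrophoneDistance.MICROPHONE_DISTANCE_UNSPECIFIED
    if meters <= 1:
        return MicrophoneDistance.NEARFIELD
    if meters <= 3:
        return MicrophoneDistance.MIDFIELD
    return MicrophoneDistance.FARFIELD


for distance in (0.3, 2, 5, None):
    print(distance, classify_distance(distance).name)
```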
OriginalMediaType
OriginalMediaType(value)
The original media the speech was recorded on.
Values:
ORIGINAL_MEDIA_TYPE_UNSPECIFIED (0): Unknown original media type.
AUDIO (1): The speech data is an audio recording.
VIDEO (2): The speech data was originally recorded on a video.
RecordingDeviceType
RecordingDeviceType(value)
The type of device the speech was recorded with.
Values:
RECORDING_DEVICE_TYPE_UNSPECIFIED (0): The recording device is unknown.
SMARTPHONE (1): Speech was recorded on a smartphone.
PC (2): Speech was recorded using a personal computer or tablet.
PHONE_LINE (3): Speech was recorded over a phone line.
VEHICLE (4): Speech was recorded in a vehicle.
OTHER_OUTDOOR_DEVICE (5): Speech was recorded outdoors.
OTHER_INDOOR_DEVICE (6): Speech was recorded indoors.