Google Cloud Speech v1p1beta1 API - Namespace Google.Cloud.Speech.V1P1Beta1 (3.0.0-beta07)

Classes

Adaptation

Service that implements Google Cloud Speech Adaptation API.

Adaptation.AdaptationBase

Base class for server-side implementations of Adaptation.

Adaptation.AdaptationClient

Client for Adaptation.

AdaptationClient

Adaptation client wrapper, for convenient use.

AdaptationClientBuilder

Builder class for AdaptationClient to provide simple configuration of credentials, endpoint etc.
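
The builder follows the standard Google.Api.Gax client-builder pattern; a minimal sketch (the endpoint and credentials path shown are illustrative options, not required defaults):

```csharp
using Google.Cloud.Speech.V1P1Beta1;

// With no properties set, Build() uses Application Default Credentials
// and the default endpoint.
AdaptationClient client = new AdaptationClientBuilder
{
    // Endpoint = "us-speech.googleapis.com",    // optional regional endpoint
    // CredentialsPath = "service-account.json", // optional explicit key file
}.Build();
```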

AdaptationClientImpl

Adaptation client wrapper implementation, for convenient use.

AdaptationSettings

Settings for AdaptationClient instances.

CreateCustomClassRequest

Message sent by the client for the CreateCustomClass method.

CreatePhraseSetRequest

Message sent by the client for the CreatePhraseSet method.

CustomClass

A set of words or phrases that represents a common concept likely to appear in your audio, for example a list of passenger ship names. CustomClass items can be substituted into placeholders that you set in PhraseSet phrases.

CustomClass.Types

Container for nested types declared in the CustomClass message type.

CustomClass.Types.ClassItem

An item of the class.

CustomClassName

Resource name for the CustomClass resource.

DeleteCustomClassRequest

Message sent by the client for the DeleteCustomClass method.

DeletePhraseSetRequest

Message sent by the client for the DeletePhraseSet method.

GetCustomClassRequest

Message sent by the client for the GetCustomClass method.

GetPhraseSetRequest

Message sent by the client for the GetPhraseSet method.

LanguageCodes

A helper class forming a hierarchy of supported language codes, via nested classes. All language codes are eventually represented as string constants. This is simply a code-convenient form of the table at https://cloud.google.com/speech/docs/languages. It is regenerated regularly, but not guaranteed to be complete at any moment in time; if the language you wish to use is present in the table but not covered here, please use the listed language code as a hard-coded string until this class catches up.
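
Because every code is a plain string constant, the class can be used anywhere a BCP-47 language code string is expected; a small sketch (the sample rate is illustrative):

```csharp
using Google.Cloud.Speech.V1P1Beta1;

// LanguageCodes members are string constants such as "en-US",
// so they drop into any property that takes a language code.
var config = new RecognitionConfig
{
    LanguageCode = LanguageCodes.English.UnitedStates, // "en-US"
    Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
    SampleRateHertz = 16000,
};
```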

LanguageCodes.Afrikaans

Language codes for Afrikaans.

LanguageCodes.Amharic

Language codes for Amharic.

LanguageCodes.Arabic

Language codes for Arabic.

LanguageCodes.Armenian

Language codes for Armenian.

LanguageCodes.Azerbaijani

Language codes for Azerbaijani.

LanguageCodes.Basque

Language codes for Basque.

LanguageCodes.Bengali

Language codes for Bengali.

LanguageCodes.Bulgarian

Language codes for Bulgarian.

LanguageCodes.Catalan

Language codes for Catalan.

LanguageCodes.ChineseCantonese

Language codes for Chinese, Cantonese.

LanguageCodes.ChineseMandarin

Language codes for Chinese, Mandarin.

LanguageCodes.Croatian

Language codes for Croatian.

LanguageCodes.Czech

Language codes for Czech.

LanguageCodes.Danish

Language codes for Danish.

LanguageCodes.Dutch

Language codes for Dutch.

LanguageCodes.English

Language codes for English.

LanguageCodes.Filipino

Language codes for Filipino.

LanguageCodes.Finnish

Language codes for Finnish.

LanguageCodes.French

Language codes for French.

LanguageCodes.Galician

Language codes for Galician.

LanguageCodes.Georgian

Language codes for Georgian.

LanguageCodes.German

Language codes for German.

LanguageCodes.Greek

Language codes for Greek.

LanguageCodes.Gujarati

Language codes for Gujarati.

LanguageCodes.Hebrew

Language codes for Hebrew.

LanguageCodes.Hindi

Language codes for Hindi.

LanguageCodes.Hungarian

Language codes for Hungarian.

LanguageCodes.Icelandic

Language codes for Icelandic.

LanguageCodes.Indonesian

Language codes for Indonesian.

LanguageCodes.Italian

Language codes for Italian.

LanguageCodes.Japanese

Language codes for Japanese.

LanguageCodes.Javanese

Language codes for Javanese.

LanguageCodes.Kannada

Language codes for Kannada.

LanguageCodes.Khmer

Language codes for Khmer.

LanguageCodes.Korean

Language codes for Korean.

LanguageCodes.Lao

Language codes for Lao.

LanguageCodes.Latvian

Language codes for Latvian.

LanguageCodes.Lithuanian

Language codes for Lithuanian.

LanguageCodes.Malay

Language codes for Malay.

LanguageCodes.Malayalam

Language codes for Malayalam.

LanguageCodes.Marathi

Language codes for Marathi.

LanguageCodes.Nepali

Language codes for Nepali.

LanguageCodes.NorwegianBokmal

Language codes for Norwegian Bokmål.

LanguageCodes.Persian

Language codes for Persian.

LanguageCodes.Polish

Language codes for Polish.

LanguageCodes.Portuguese

Language codes for Portuguese.

LanguageCodes.Romanian

Language codes for Romanian.

LanguageCodes.Russian

Language codes for Russian.

LanguageCodes.Serbian

Language codes for Serbian.

LanguageCodes.Sinhala

Language codes for Sinhala.

LanguageCodes.Slovak

Language codes for Slovak.

LanguageCodes.Slovenian

Language codes for Slovenian.

LanguageCodes.Spanish

Language codes for Spanish.

LanguageCodes.Sundanese

Language codes for Sundanese.

LanguageCodes.Swahili

Language codes for Swahili.

LanguageCodes.Swedish

Language codes for Swedish.

LanguageCodes.Tamil

Language codes for Tamil.

LanguageCodes.Telugu

Language codes for Telugu.

LanguageCodes.Thai

Language codes for Thai.

LanguageCodes.Turkish

Language codes for Turkish.

LanguageCodes.Ukrainian

Language codes for Ukrainian.

LanguageCodes.Urdu

Language codes for Urdu.

LanguageCodes.Vietnamese

Language codes for Vietnamese.

LanguageCodes.Zulu

Language codes for Zulu.

ListCustomClassesRequest

Message sent by the client for the ListCustomClasses method.

ListCustomClassesResponse

Message returned to the client by the ListCustomClasses method.

ListPhraseSetRequest

Message sent by the client for the ListPhraseSet method.

ListPhraseSetResponse

Message returned to the client by the ListPhraseSet method.

LongRunningRecognizeMetadata

Describes the progress of a long-running LongRunningRecognize call. It is included in the metadata field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.

LongRunningRecognizeRequest

The top-level message sent by the client for the LongRunningRecognize method.

LongRunningRecognizeResponse

The only message returned to the client by the LongRunningRecognize method. It contains the result as zero or more sequential SpeechRecognitionResult messages. It is included in the result.response field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.

PhraseSet

Provides "hints" to the speech recognizer to favor specific words and phrases in the results.

PhraseSet.Types

Container for nested types declared in the PhraseSet message type.

PhraseSet.Types.Phrase

A phrase containing words and phrase "hints" so that the speech recognizer is more likely to recognize them. This can be used to improve the accuracy for specific words and phrases, for example, if specific commands are typically spoken by the user. This can also be used to add additional words to the recognizer's vocabulary. See usage limits.

List items can also include pre-built or custom classes containing groups of words that represent common concepts that occur in natural language. For example, rather than providing a phrase hint for every month of the year (e.g. "i was born in january", "i was born in february", ...), using the pre-built $MONTH class improves the likelihood of correctly transcribing audio that includes months (e.g. "i was born in $month"). To refer to pre-built classes, use the class's symbol prefixed with $ (e.g. $MONTH). To refer to custom classes that were defined inline in the request, set the class's custom_class_id to a string unique across all class resources and inline classes, then use that id wrapped in ${...} (e.g. "${my-months}"). To refer to custom class resources, use the class's id wrapped in ${} (e.g. ${my-months}).

Speech-to-Text supports three locations: global, us (US North America), and eu (Europe). If you are calling the speech.googleapis.com endpoint, use the global location. To specify a region, use a regional endpoint with matching us or eu location value.
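
The referencing conventions above can be sketched as follows (the boost value and the custom class id are illustrative):

```csharp
using Google.Cloud.Speech.V1P1Beta1;

var phraseSet = new PhraseSet
{
    Phrases =
    {
        // Pre-built class, referenced by its $-prefixed symbol.
        new PhraseSet.Types.Phrase { Value = "i was born in $MONTH", Boost = 10f },
        // Custom class resource, referenced by its id wrapped in ${}.
        new PhraseSet.Types.Phrase { Value = "i was born in ${my-months}" },
    },
};
```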

PhraseSetName

Resource name for the PhraseSet resource.

RecognitionAudio

Contains audio data in the encoding specified in the RecognitionConfig. Exactly one of content or uri must be supplied; supplying both or neither returns google.rpc.Code.INVALID_ARGUMENT. See content limits.
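
A sketch of the two mutually exclusive audio sources (the file name and bucket are illustrative):

```csharp
using System.IO;
using Google.Cloud.Speech.V1P1Beta1;
using Google.Protobuf;

// Inline audio bytes: set Content only.
var inlineAudio = new RecognitionAudio
{
    Content = ByteString.CopyFrom(File.ReadAllBytes("audio.raw")),
};

// Audio stored in Cloud Storage: set Uri only.
var storageAudio = new RecognitionAudio
{
    Uri = "gs://my-bucket/audio.flac",
};
```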

RecognitionConfig

Provides information to the recognizer that specifies how to process the request.

RecognitionConfig.Types

Container for nested types declared in the RecognitionConfig message type.

RecognitionMetadata

Description of audio data to be recognized.

RecognitionMetadata.Types

Container for nested types declared in the RecognitionMetadata message type.

RecognizeRequest

The top-level message sent by the client for the Recognize method.

RecognizeResponse

The only message returned to the client by the Recognize method. It contains the result as zero or more sequential SpeechRecognitionResult messages.

SpeakerDiarizationConfig

Config to enable speaker diarization.

Speech

Service that implements Google Cloud Speech API.

Speech.SpeechBase

Base class for server-side implementations of Speech.

Speech.SpeechClient

Client for Speech.

SpeechAdaptation

Speech adaptation configuration.

SpeechAdaptation.Types

Container for nested types declared in the SpeechAdaptation message type.

SpeechAdaptation.Types.ABNFGrammar

An Augmented Backus-Naur Form (ABNF) grammar specification for speech adaptation.

SpeechAdaptationInfo

Information on speech adaptation use in the results.

SpeechClient

Speech client wrapper, for convenient use.

SpeechClient.StreamingRecognizeStream

Bidirectional streaming methods for StreamingRecognize(CallSettings, BidirectionalStreamingSettings).

SpeechClientBuilder

Builder class for SpeechClient to provide simple configuration of credentials, endpoint etc.

SpeechClientImpl

Speech client wrapper implementation, for convenient use.

SpeechContext

Provides "hints" to the speech recognizer to favor specific words and phrases in the results.

SpeechRecognitionAlternative

Alternative hypotheses (a.k.a. n-best list).

SpeechRecognitionResult

A speech recognition result corresponding to a portion of the audio.

SpeechSettings

Settings for SpeechClient instances.

StreamingRecognitionConfig

Provides information to the recognizer that specifies how to process the request.

StreamingRecognitionConfig.Types

Container for nested types declared in the StreamingRecognitionConfig message type.

StreamingRecognitionConfig.Types.VoiceActivityTimeout

Events that a timeout can be set on for voice activity.

StreamingRecognitionResult

A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.

StreamingRecognizeRequest

The top-level message sent by the client for the StreamingRecognize method. Multiple StreamingRecognizeRequest messages are sent. The first message must contain a streaming_config message and must not contain audio_content. All subsequent messages must contain audio_content and must not contain a streaming_config message.
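
The required ordering can be sketched as follows (client setup and the audio chunk source are illustrative, and the code assumes an async context):

```csharp
using Google.Cloud.Speech.V1P1Beta1;
using Google.Protobuf;

SpeechClient speech = SpeechClient.Create();
SpeechClient.StreamingRecognizeStream stream = speech.StreamingRecognize();

// First message: streaming_config only, no audio_content.
await stream.WriteAsync(new StreamingRecognizeRequest
{
    StreamingConfig = new StreamingRecognitionConfig
    {
        Config = new RecognitionConfig
        {
            Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
            SampleRateHertz = 16000,
            LanguageCode = "en-US",
        },
        InterimResults = true,
    },
});

// Subsequent messages: audio_content only, no streaming_config.
foreach (byte[] chunk in audioChunks) // audioChunks: your audio source
{
    await stream.WriteAsync(new StreamingRecognizeRequest
    {
        AudioContent = ByteString.CopyFrom(chunk),
    });
}
await stream.WriteCompleteAsync();
```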

StreamingRecognizeResponse

StreamingRecognizeResponse is the only message returned to the client by StreamingRecognize. A series of zero or more StreamingRecognizeResponse messages are streamed back to the client. If there is no recognizable audio, and single_utterance is set to false, then no messages are streamed back to the client.

Here's an example of a series of StreamingRecognizeResponses that might be returned while processing audio:

  1. results { alternatives { transcript: "tube" } stability: 0.01 }

  2. results { alternatives { transcript: "to be a" } stability: 0.01 }

  3. results { alternatives { transcript: "to be" } stability: 0.9 } results { alternatives { transcript: " or not to be" } stability: 0.01 }

  4. results { alternatives { transcript: "to be or not to be" confidence: 0.92 } alternatives { transcript: "to bee or not to bee" } is_final: true }

  5. results { alternatives { transcript: " that's" } stability: 0.01 }

  6. results { alternatives { transcript: " that is" } stability: 0.9 } results { alternatives { transcript: " the question" } stability: 0.01 }

  7. results { alternatives { transcript: " that is the question" confidence: 0.98 } alternatives { transcript: " that was the question" } is_final: true }

Notes:

  • Only two of the above responses #4 and #7 contain final results; they are indicated by is_final: true. Concatenating these together generates the full transcript: "to be or not to be that is the question".

  • The others contain interim results. #3 and #6 contain two interim results: the first portion has a high stability and is less likely to change; the second portion has a low stability and is very likely to change. A UI designer might choose to show only high stability results.

  • The specific stability and confidence values shown above are only for illustrative purposes. Actual values may vary.

  • In each response, only one of these fields will be set: error, speech_event_type, or one or more (repeated) results.
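
A sketch of consuming such a stream, keeping only final results as the notes above describe (`stream` is assumed to be an open SpeechClient.StreamingRecognizeStream, in an async context):

```csharp
using System.Text;
using Google.Cloud.Speech.V1P1Beta1;

var transcript = new StringBuilder();
await foreach (StreamingRecognizeResponse response in stream.GetResponseStream())
{
    foreach (StreamingRecognitionResult result in response.Results)
    {
        // Concatenating only is_final results yields the full transcript.
        if (result.IsFinal && result.Alternatives.Count > 0)
        {
            transcript.Append(result.Alternatives[0].Transcript);
        }
    }
}
```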

StreamingRecognizeResponse.Types

Container for nested types declared in the StreamingRecognizeResponse message type.

TranscriptNormalization

Transcription normalization configuration. Use transcription normalization to automatically replace parts of the transcript with phrases of your choosing. For StreamingRecognize, this normalization only applies to stable partial transcripts (stability > 0.8) and final transcripts.
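
A sketch of a single replacement entry (the property names assume the Entry message's search/replace/case_sensitive fields; the values are illustrative):

```csharp
using Google.Cloud.Speech.V1P1Beta1;

var normalization = new TranscriptNormalization
{
    Entries =
    {
        new TranscriptNormalization.Types.Entry
        {
            Search = "carefour",   // text to find in the transcript
            Replace = "Carrefour", // replacement text
            CaseSensitive = false,
        },
    },
};
```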

TranscriptNormalization.Types

Container for nested types declared in the TranscriptNormalization message type.

TranscriptNormalization.Types.Entry

A single replacement configuration.

TranscriptOutputConfig

Specifies an optional destination for the recognition results.

UpdateCustomClassRequest

Message sent by the client for the UpdateCustomClass method.

UpdatePhraseSetRequest

Message sent by the client for the UpdatePhraseSet method.

WordInfo

Word-specific information for recognized words.

Enums

CustomClassName.ResourceNameType

The possible contents of CustomClassName.

PhraseSetName.ResourceNameType

The possible contents of PhraseSetName.

RecognitionAudio.AudioSourceOneofCase

Enum of possible cases for the "audio_source" oneof.

RecognitionConfig.Types.AudioEncoding

The encoding of the audio data sent in the request.

All encodings support only 1 channel (mono) audio, unless the audio_channel_count and enable_separate_recognition_per_channel fields are set.

For best results, the audio source should be captured and transmitted using a lossless encoding (FLAC or LINEAR16). The accuracy of the speech recognition can be reduced if lossy codecs are used to capture or transmit audio, particularly if background noise is present. Lossy codecs include MULAW, AMR, AMR_WB, OGG_OPUS, SPEEX_WITH_HEADER_BYTE, MP3, and WEBM_OPUS.

The FLAC and WAV audio file formats include a header that describes the included audio content. You can request recognition for WAV files that contain either LINEAR16 or MULAW encoded audio. If you send FLAC or WAV audio file format in your request, you do not need to specify an AudioEncoding; the audio encoding format is determined from the file header. If you specify an AudioEncoding when you send FLAC or WAV audio, the encoding configuration must match the encoding described in the audio header; otherwise the request returns a google.rpc.Code.INVALID_ARGUMENT error code.
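
A sketch of both cases (the sample rate and language are illustrative):

```csharp
using Google.Cloud.Speech.V1P1Beta1;

// Raw PCM: the encoding and sample rate must be specified explicitly.
var rawConfig = new RecognitionConfig
{
    Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
    SampleRateHertz = 16000,
    LanguageCode = "en-US",
};

// FLAC/WAV: the encoding can be left unspecified and read from the file header.
var headerConfig = new RecognitionConfig
{
    Encoding = RecognitionConfig.Types.AudioEncoding.EncodingUnspecified,
    LanguageCode = "en-US",
};
```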

RecognitionMetadata.Types.InteractionType

Use-case categories that describe an audio recognition request.

RecognitionMetadata.Types.MicrophoneDistance

Enumerates the types of capture settings describing an audio file.

RecognitionMetadata.Types.OriginalMediaType

The original media the speech was recorded on.

RecognitionMetadata.Types.RecordingDeviceType

The type of device the speech was recorded with.

StreamingRecognizeRequest.StreamingRequestOneofCase

Enum of possible cases for the "streaming_request" oneof.

StreamingRecognizeResponse.Types.SpeechEventType

Indicates the type of speech event.

TranscriptOutputConfig.OutputTypeOneofCase

Enum of possible cases for the "output_type" oneof.