The interfaces provided are listed below, along with usage samples.
SpeechClient
Service Description: Service that implements Google Cloud Speech API.
Sample for SpeechClient:
try (SpeechClient speechClient = SpeechClient.create()) {
  RecognitionConfig config = RecognitionConfig.newBuilder().build();
  RecognitionAudio audio = RecognitionAudio.newBuilder().build();
  RecognizeResponse response = speechClient.recognize(config, audio);
}
AdaptationClient
Service Description: Service that implements Google Cloud Speech Adaptation API.
Sample for AdaptationClient:
try (AdaptationClient adaptationClient = AdaptationClient.create()) {
  LocationName parent = LocationName.of("[PROJECT]", "[LOCATION]");
  PhraseSet phraseSet = PhraseSet.newBuilder().build();
  String phraseSetId = "phraseSetId959902180";
  PhraseSet response = adaptationClient.createPhraseSet(parent, phraseSet, phraseSetId);
}
Classes
AdaptationClient
Service Description: Service that implements Google Cloud Speech Adaptation API.
This class provides the ability to make remote calls to the backing service through method calls that map to API methods. Sample code to get started:
try (AdaptationClient adaptationClient = AdaptationClient.create()) {
  LocationName parent = LocationName.of("[PROJECT]", "[LOCATION]");
  PhraseSet phraseSet = PhraseSet.newBuilder().build();
  String phraseSetId = "phraseSetId959902180";
  PhraseSet response = adaptationClient.createPhraseSet(parent, phraseSet, phraseSetId);
}
Note: close() needs to be called on the AdaptationClient object to clean up resources such as threads. In the example above, try-with-resources is used, which automatically calls close().
The surface of this class includes several types of Java methods for each of the API's methods:
- A "flattened" method. With this type of method, the fields of the request type have been converted into function parameters. It may be the case that not all fields are available as parameters, and not every API method will have a flattened method entry point.
- A "request object" method. This type of method only takes one parameter, a request object, which must be constructed before the call. Not every API method will have a request object method.
- A "callable" method. This type of method takes no parameters and returns an immutable API callable object, which can be used to initiate calls to the service.
See the individual methods for example code.
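The three method styles described above can be pictured with a toy model. The sketch below is self-contained and uses only hypothetical names (MethodStylesSketch, EchoRequest, echo, echoCallable); none of them belong to the Speech client library, and the generated clients may differ in detail.

```java
import java.util.function.Function;

// Toy illustration only: none of these names exist in the Speech client library.
final class MethodStylesSketch {
  // Toy request type with two fields.
  static final class EchoRequest {
    final String text;
    final int times;

    EchoRequest(String text, int times) {
      this.text = text;
      this.times = times;
    }
  }

  // "Callable" style: takes no parameters and returns an object that performs the call.
  static Function<EchoRequest, String> echoCallable() {
    return request -> {
      StringBuilder out = new StringBuilder();
      for (int i = 0; i < request.times; i++) {
        out.append(request.text);
      }
      return out.toString();
    };
  }

  // "Request object" style: a single parameter that the caller builds beforehand.
  static String echo(EchoRequest request) {
    return echoCallable().apply(request);
  }

  // "Flattened" style: the request fields become method parameters.
  static String echo(String text, int times) {
    return echo(new EchoRequest(text, times));
  }

  public static void main(String[] args) {
    System.out.println(echo("ab", 2));
    System.out.println(echo(new EchoRequest("ab", 2)));
    System.out.println(echoCallable().apply(new EchoRequest("ab", 2)));
  }
}
```

All three calls in main reach the same underlying operation; the styles differ only in how the request is supplied.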
Many parameters require resource names to be formatted in a particular way. To assist with these names, this class includes a format method for each type of name, and additionally a parse method to extract the individual identifiers contained within names that are returned.
This class can be customized by passing in a custom instance of AdaptationSettings to create(). For example:
To customize credentials:
AdaptationSettings adaptationSettings =
    AdaptationSettings.newBuilder()
        .setCredentialsProvider(FixedCredentialsProvider.create(myCredentials))
        .build();
AdaptationClient adaptationClient = AdaptationClient.create(adaptationSettings);
To customize the endpoint:
AdaptationSettings adaptationSettings =
    AdaptationSettings.newBuilder().setEndpoint(myEndpoint).build();
AdaptationClient adaptationClient = AdaptationClient.create(adaptationSettings);
Please refer to the GitHub repository's samples for more quickstart code snippets.
AdaptationClient.ListCustomClassesFixedSizeCollection
AdaptationClient.ListCustomClassesPage
AdaptationClient.ListCustomClassesPagedResponse
AdaptationClient.ListPhraseSetFixedSizeCollection
AdaptationClient.ListPhraseSetPage
AdaptationClient.ListPhraseSetPagedResponse
AdaptationGrpc
Service that implements Google Cloud Speech Adaptation API.
AdaptationGrpc.AdaptationBlockingStub
Service that implements Google Cloud Speech Adaptation API.
AdaptationGrpc.AdaptationFutureStub
Service that implements Google Cloud Speech Adaptation API.
AdaptationGrpc.AdaptationImplBase
Service that implements Google Cloud Speech Adaptation API.
AdaptationGrpc.AdaptationStub
Service that implements Google Cloud Speech Adaptation API.
AdaptationSettings
Settings class to configure an instance of AdaptationClient.
The default instance has everything set to sensible defaults:
- The default service address (speech.googleapis.com) and default port (443) are used.
- Credentials are acquired automatically through Application Default Credentials.
- Retries are configured for idempotent methods but not for non-idempotent methods.
The builder of this class is recursive, so contained classes are themselves builders. When build() is called, the tree of builders is called to create the complete settings object.
For example, to set the total timeout of createPhraseSet to 30 seconds:
AdaptationSettings.Builder adaptationSettingsBuilder = AdaptationSettings.newBuilder();
adaptationSettingsBuilder
    .createPhraseSetSettings()
    .setRetrySettings(
        adaptationSettingsBuilder
            .createPhraseSetSettings()
            .getRetrySettings()
            .toBuilder()
            .setTotalTimeout(Duration.ofSeconds(30))
            .build());
AdaptationSettings adaptationSettings = adaptationSettingsBuilder.build();
AdaptationSettings.Builder
Builder for AdaptationSettings.
CreateCustomClassRequest
Message sent by the client for the CreateCustomClass method.
Protobuf type google.cloud.speech.v1p1beta1.CreateCustomClassRequest
CreateCustomClassRequest.Builder
Message sent by the client for the CreateCustomClass method.
Protobuf type google.cloud.speech.v1p1beta1.CreateCustomClassRequest
CreatePhraseSetRequest
Message sent by the client for the CreatePhraseSet method.
Protobuf type google.cloud.speech.v1p1beta1.CreatePhraseSetRequest
CreatePhraseSetRequest.Builder
Message sent by the client for the CreatePhraseSet method.
Protobuf type google.cloud.speech.v1p1beta1.CreatePhraseSetRequest
CustomClass
A set of words or phrases that represents a common concept likely to appear in your audio, for example a list of passenger ship names. CustomClass items can be substituted into placeholders that you set in PhraseSet phrases.
Protobuf type google.cloud.speech.v1p1beta1.CustomClass
CustomClass.Builder
A set of words or phrases that represents a common concept likely to appear in your audio, for example a list of passenger ship names. CustomClass items can be substituted into placeholders that you set in PhraseSet phrases.
Protobuf type google.cloud.speech.v1p1beta1.CustomClass
CustomClass.ClassItem
An item of the class.
Protobuf type google.cloud.speech.v1p1beta1.CustomClass.ClassItem
CustomClass.ClassItem.Builder
An item of the class.
Protobuf type google.cloud.speech.v1p1beta1.CustomClass.ClassItem
CustomClassName
CustomClassName.Builder
Builder for projects/{project}/locations/{location}/customClasses/{custom_class}.
DeleteCustomClassRequest
Message sent by the client for the DeleteCustomClass method.
Protobuf type google.cloud.speech.v1p1beta1.DeleteCustomClassRequest
DeleteCustomClassRequest.Builder
Message sent by the client for the DeleteCustomClass method.
Protobuf type google.cloud.speech.v1p1beta1.DeleteCustomClassRequest
DeletePhraseSetRequest
Message sent by the client for the DeletePhraseSet method.
Protobuf type google.cloud.speech.v1p1beta1.DeletePhraseSetRequest
DeletePhraseSetRequest.Builder
Message sent by the client for the DeletePhraseSet method.
Protobuf type google.cloud.speech.v1p1beta1.DeletePhraseSetRequest
GetCustomClassRequest
Message sent by the client for the GetCustomClass method.
Protobuf type google.cloud.speech.v1p1beta1.GetCustomClassRequest
GetCustomClassRequest.Builder
Message sent by the client for the GetCustomClass method.
Protobuf type google.cloud.speech.v1p1beta1.GetCustomClassRequest
GetPhraseSetRequest
Message sent by the client for the GetPhraseSet method.
Protobuf type google.cloud.speech.v1p1beta1.GetPhraseSetRequest
GetPhraseSetRequest.Builder
Message sent by the client for the GetPhraseSet method.
Protobuf type google.cloud.speech.v1p1beta1.GetPhraseSetRequest
ListCustomClassesRequest
Message sent by the client for the ListCustomClasses method.
Protobuf type google.cloud.speech.v1p1beta1.ListCustomClassesRequest
ListCustomClassesRequest.Builder
Message sent by the client for the ListCustomClasses method.
Protobuf type google.cloud.speech.v1p1beta1.ListCustomClassesRequest
ListCustomClassesResponse
Message returned to the client by the ListCustomClasses method.
Protobuf type google.cloud.speech.v1p1beta1.ListCustomClassesResponse
ListCustomClassesResponse.Builder
Message returned to the client by the ListCustomClasses method.
Protobuf type google.cloud.speech.v1p1beta1.ListCustomClassesResponse
ListPhraseSetRequest
Message sent by the client for the ListPhraseSet method.
Protobuf type google.cloud.speech.v1p1beta1.ListPhraseSetRequest
ListPhraseSetRequest.Builder
Message sent by the client for the ListPhraseSet method.
Protobuf type google.cloud.speech.v1p1beta1.ListPhraseSetRequest
ListPhraseSetResponse
Message returned to the client by the ListPhraseSet method.
Protobuf type google.cloud.speech.v1p1beta1.ListPhraseSetResponse
ListPhraseSetResponse.Builder
Message returned to the client by the ListPhraseSet method.
Protobuf type google.cloud.speech.v1p1beta1.ListPhraseSetResponse
LocationName
LocationName.Builder
Builder for projects/{project}/locations/{location}.
LongRunningRecognizeMetadata
Describes the progress of a long-running LongRunningRecognize call. It is included in the metadata field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.
Protobuf type google.cloud.speech.v1p1beta1.LongRunningRecognizeMetadata
LongRunningRecognizeMetadata.Builder
Describes the progress of a long-running LongRunningRecognize call. It is included in the metadata field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.
Protobuf type google.cloud.speech.v1p1beta1.LongRunningRecognizeMetadata
LongRunningRecognizeRequest
The top-level message sent by the client for the LongRunningRecognize method.
Protobuf type google.cloud.speech.v1p1beta1.LongRunningRecognizeRequest
LongRunningRecognizeRequest.Builder
The top-level message sent by the client for the LongRunningRecognize method.
Protobuf type google.cloud.speech.v1p1beta1.LongRunningRecognizeRequest
LongRunningRecognizeResponse
The only message returned to the client by the LongRunningRecognize method. It contains the result as zero or more sequential SpeechRecognitionResult messages. It is included in the result.response field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.
Protobuf type google.cloud.speech.v1p1beta1.LongRunningRecognizeResponse
LongRunningRecognizeResponse.Builder
The only message returned to the client by the LongRunningRecognize method. It contains the result as zero or more sequential SpeechRecognitionResult messages. It is included in the result.response field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.
Protobuf type google.cloud.speech.v1p1beta1.LongRunningRecognizeResponse
PhraseSet
Provides "hints" to the speech recognizer to favor specific words and phrases in the results.
Protobuf type google.cloud.speech.v1p1beta1.PhraseSet
PhraseSet.Builder
Provides "hints" to the speech recognizer to favor specific words and phrases in the results.
Protobuf type google.cloud.speech.v1p1beta1.PhraseSet
PhraseSet.Phrase
A phrase containing words and phrase "hints" so that the speech recognition is more likely to recognize them. This can be used to improve the accuracy for specific words and phrases, for example, if specific commands are typically spoken by the user. This can also be used to add additional words to the vocabulary of the recognizer. See usage limits.
List items can also include pre-built or custom classes containing groups of words that represent common concepts that occur in natural language. For example, rather than providing a phrase hint for every month of the year (e.g. "i was born in january", "i was born in february", ...), using the pre-built $MONTH class improves the likelihood of correctly transcribing audio that includes months (e.g. "i was born in $month").
To refer to pre-built classes, use the class's symbol prepended with $, e.g. $MONTH. To refer to custom classes that were defined inline in the request, set the class's custom_class_id to a string unique to all class resources and inline classes. Then use the class's id wrapped in ${...}, e.g. "${my-months}". To refer to custom class resources, use the class's id wrapped in ${} (e.g. ${my-months}).
Speech-to-Text supports three locations: global, us (US North America), and eu (Europe). If you are calling the speech.googleapis.com endpoint, use the global location. To specify a region, use a regional endpoint with matching us or eu location value.
Protobuf type google.cloud.speech.v1p1beta1.PhraseSet.Phrase
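The two reference syntaxes described above, $SYMBOL for pre-built classes and ${id} for custom classes, can be told apart with a small scanner. This is a minimal local sketch, not part of the client library: the class name PhraseClassRefs is hypothetical, and the assumption that pre-built symbols consist of uppercase letters, digits, and underscores is ours rather than something the description states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; the service parses phrases on its own side.
final class PhraseClassRefs {
  // First alternative: ${id} (custom class). Second: $SYMBOL (pre-built class).
  // The SYMBOL character set is an assumption made for this sketch.
  private static final Pattern REF =
      Pattern.compile("\\$\\{([^}]+)\\}|\\$([A-Z][A-Z0-9_]*)");

  // Lists the class references found in a phrase, in order of appearance.
  static List<String> find(String phrase) {
    List<String> refs = new ArrayList<>();
    Matcher m = REF.matcher(phrase);
    while (m.find()) {
      refs.add(m.group(1) != null ? "custom:" + m.group(1) : "prebuilt:" + m.group(2));
    }
    return refs;
  }

  public static void main(String[] args) {
    System.out.println(find("i was born in $MONTH"));
    System.out.println(find("we sailed on the ${my-ships} in $MONTH"));
  }
}
```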
PhraseSet.Phrase.Builder
A phrase containing words and phrase "hints" so that the speech recognition is more likely to recognize them. This can be used to improve the accuracy for specific words and phrases, for example, if specific commands are typically spoken by the user. This can also be used to add additional words to the vocabulary of the recognizer. See usage limits.
List items can also include pre-built or custom classes containing groups of words that represent common concepts that occur in natural language. For example, rather than providing a phrase hint for every month of the year (e.g. "i was born in january", "i was born in february", ...), using the pre-built $MONTH class improves the likelihood of correctly transcribing audio that includes months (e.g. "i was born in $month").
To refer to pre-built classes, use the class's symbol prepended with $, e.g. $MONTH. To refer to custom classes that were defined inline in the request, set the class's custom_class_id to a string unique to all class resources and inline classes. Then use the class's id wrapped in ${...}, e.g. "${my-months}". To refer to custom class resources, use the class's id wrapped in ${} (e.g. ${my-months}).
Speech-to-Text supports three locations: global, us (US North America), and eu (Europe). If you are calling the speech.googleapis.com endpoint, use the global location. To specify a region, use a regional endpoint with matching us or eu location value.
Protobuf type google.cloud.speech.v1p1beta1.PhraseSet.Phrase
PhraseSetName
PhraseSetName.Builder
Builder for projects/{project}/locations/{location}/phraseSets/{phrase_set}.
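To make the name pattern above concrete, here is a minimal stand-alone sketch of formatting and parsing a phrase set resource name. PhraseSetNameSketch and its methods are hypothetical stand-ins written for illustration; the generated PhraseSetName class may expose a different surface.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a generated resource-name helper; not the library's class.
final class PhraseSetNameSketch {
  private static final Pattern PATTERN =
      Pattern.compile("projects/([^/]+)/locations/([^/]+)/phraseSets/([^/]+)");

  // Builds the full resource name from its individual identifiers.
  static String format(String project, String location, String phraseSet) {
    return String.format(
        "projects/%s/locations/%s/phraseSets/%s", project, location, phraseSet);
  }

  // Splits a full resource name back into {project, location, phraseSet}.
  static String[] parse(String name) {
    Matcher m = PATTERN.matcher(name);
    if (!m.matches()) {
      throw new IllegalArgumentException("Not a phrase set name: " + name);
    }
    return new String[] {m.group(1), m.group(2), m.group(3)};
  }

  public static void main(String[] args) {
    String name = format("my-project", "global", "my-phrases");
    System.out.println(name);
    System.out.println(String.join(", ", parse(name)));
  }
}
```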
RecognitionAudio
Contains audio data in the encoding specified in the RecognitionConfig. Either content or uri must be supplied. Supplying both or neither returns google.rpc.Code.INVALID_ARGUMENT. See content limits.
Protobuf type google.cloud.speech.v1p1beta1.RecognitionAudio
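The "exactly one of content or uri" rule can be checked locally before sending a request. The sketch below is an assumption-laden illustration: AudioSourceCheck is a hypothetical helper, and it treats an empty byte array or empty string as "not supplied". The service performs the authoritative validation.

```java
// Hypothetical local pre-check; the service itself performs the real validation.
final class AudioSourceCheck {
  // Returns true when exactly one of content or uri is present.
  // Treating empty values as absent is an assumption of this sketch.
  static boolean hasExactlyOneSource(byte[] content, String uri) {
    boolean hasContent = content != null && content.length > 0;
    boolean hasUri = uri != null && !uri.isEmpty();
    return hasContent ^ hasUri;
  }

  public static void main(String[] args) {
    System.out.println(hasExactlyOneSource(new byte[] {1, 2}, null));
    System.out.println(hasExactlyOneSource(null, "gs://bucket/audio.flac"));
    System.out.println(hasExactlyOneSource(null, null));
  }
}
```

A caller could run such a check to fail fast instead of waiting for an INVALID_ARGUMENT response.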
RecognitionAudio.Builder
Contains audio data in the encoding specified in the RecognitionConfig. Either content or uri must be supplied. Supplying both or neither returns google.rpc.Code.INVALID_ARGUMENT. See content limits.
Protobuf type google.cloud.speech.v1p1beta1.RecognitionAudio
RecognitionConfig
Provides information to the recognizer that specifies how to process the request.
Protobuf type google.cloud.speech.v1p1beta1.RecognitionConfig
RecognitionConfig.Builder
Provides information to the recognizer that specifies how to process the request.
Protobuf type google.cloud.speech.v1p1beta1.RecognitionConfig
RecognitionMetadata
Description of audio data to be recognized.
Protobuf type google.cloud.speech.v1p1beta1.RecognitionMetadata
RecognitionMetadata.Builder
Description of audio data to be recognized.
Protobuf type google.cloud.speech.v1p1beta1.RecognitionMetadata
RecognizeRequest
The top-level message sent by the client for the Recognize method.
Protobuf type google.cloud.speech.v1p1beta1.RecognizeRequest
RecognizeRequest.Builder
The top-level message sent by the client for the Recognize method.
Protobuf type google.cloud.speech.v1p1beta1.RecognizeRequest
RecognizeResponse
The only message returned to the client by the Recognize method. It contains the result as zero or more sequential SpeechRecognitionResult messages.
Protobuf type google.cloud.speech.v1p1beta1.RecognizeResponse
RecognizeResponse.Builder
The only message returned to the client by the Recognize method. It contains the result as zero or more sequential SpeechRecognitionResult messages.
Protobuf type google.cloud.speech.v1p1beta1.RecognizeResponse
SpeakerDiarizationConfig
Config to enable speaker diarization.
Protobuf type google.cloud.speech.v1p1beta1.SpeakerDiarizationConfig
SpeakerDiarizationConfig.Builder
Config to enable speaker diarization.
Protobuf type google.cloud.speech.v1p1beta1.SpeakerDiarizationConfig
SpeechAdaptation
Speech adaptation configuration.
Protobuf type google.cloud.speech.v1p1beta1.SpeechAdaptation
SpeechAdaptation.Builder
Speech adaptation configuration.
Protobuf type google.cloud.speech.v1p1beta1.SpeechAdaptation
SpeechAdaptationProto
SpeechClient
Service Description: Service that implements Google Cloud Speech API.
This class provides the ability to make remote calls to the backing service through method calls that map to API methods. Sample code to get started:
try (SpeechClient speechClient = SpeechClient.create()) {
  RecognitionConfig config = RecognitionConfig.newBuilder().build();
  RecognitionAudio audio = RecognitionAudio.newBuilder().build();
  RecognizeResponse response = speechClient.recognize(config, audio);
}
Note: close() needs to be called on the SpeechClient object to clean up resources such as threads. In the example above, try-with-resources is used, which automatically calls close().
The surface of this class includes several types of Java methods for each of the API's methods:
- A "flattened" method. With this type of method, the fields of the request type have been converted into function parameters. It may be the case that not all fields are available as parameters, and not every API method will have a flattened method entry point.
- A "request object" method. This type of method only takes one parameter, a request object, which must be constructed before the call. Not every API method will have a request object method.
- A "callable" method. This type of method takes no parameters and returns an immutable API callable object, which can be used to initiate calls to the service.
See the individual methods for example code.
Many parameters require resource names to be formatted in a particular way. To assist with these names, this class includes a format method for each type of name, and additionally a parse method to extract the individual identifiers contained within names that are returned.
This class can be customized by passing in a custom instance of SpeechSettings to create(). For example:
To customize credentials:
SpeechSettings speechSettings =
    SpeechSettings.newBuilder()
        .setCredentialsProvider(FixedCredentialsProvider.create(myCredentials))
        .build();
SpeechClient speechClient = SpeechClient.create(speechSettings);
To customize the endpoint:
SpeechSettings speechSettings = SpeechSettings.newBuilder().setEndpoint(myEndpoint).build();
SpeechClient speechClient = SpeechClient.create(speechSettings);
Please refer to the GitHub repository's samples for more quickstart code snippets.
SpeechContext
Provides "hints" to the speech recognizer to favor specific words and phrases in the results.
Protobuf type google.cloud.speech.v1p1beta1.SpeechContext
SpeechContext.Builder
Provides "hints" to the speech recognizer to favor specific words and phrases in the results.
Protobuf type google.cloud.speech.v1p1beta1.SpeechContext
SpeechGrpc
Service that implements Google Cloud Speech API.
SpeechGrpc.SpeechBlockingStub
Service that implements Google Cloud Speech API.
SpeechGrpc.SpeechFutureStub
Service that implements Google Cloud Speech API.
SpeechGrpc.SpeechImplBase
Service that implements Google Cloud Speech API.
SpeechGrpc.SpeechStub
Service that implements Google Cloud Speech API.
SpeechProto
SpeechRecognitionAlternative
Alternative hypotheses (a.k.a. n-best list).
Protobuf type google.cloud.speech.v1p1beta1.SpeechRecognitionAlternative
SpeechRecognitionAlternative.Builder
Alternative hypotheses (a.k.a. n-best list).
Protobuf type google.cloud.speech.v1p1beta1.SpeechRecognitionAlternative
SpeechRecognitionResult
A speech recognition result corresponding to a portion of the audio.
Protobuf type google.cloud.speech.v1p1beta1.SpeechRecognitionResult
SpeechRecognitionResult.Builder
A speech recognition result corresponding to a portion of the audio.
Protobuf type google.cloud.speech.v1p1beta1.SpeechRecognitionResult
SpeechResourceProto
SpeechSettings
Settings class to configure an instance of SpeechClient.
The default instance has everything set to sensible defaults:
- The default service address (speech.googleapis.com) and default port (443) are used.
- Credentials are acquired automatically through Application Default Credentials.
- Retries are configured for idempotent methods but not for non-idempotent methods.
The builder of this class is recursive, so contained classes are themselves builders. When build() is called, the tree of builders is called to create the complete settings object.
For example, to set the total timeout of recognize to 30 seconds:
SpeechSettings.Builder speechSettingsBuilder = SpeechSettings.newBuilder();
speechSettingsBuilder
    .recognizeSettings()
    .setRetrySettings(
        speechSettingsBuilder
            .recognizeSettings()
            .getRetrySettings()
            .toBuilder()
            .setTotalTimeout(Duration.ofSeconds(30))
            .build());
SpeechSettings speechSettings = speechSettingsBuilder.build();
SpeechSettings.Builder
Builder for SpeechSettings.
StreamingRecognitionConfig
Provides information to the recognizer that specifies how to process the request.
Protobuf type google.cloud.speech.v1p1beta1.StreamingRecognitionConfig
StreamingRecognitionConfig.Builder
Provides information to the recognizer that specifies how to process the request.
Protobuf type google.cloud.speech.v1p1beta1.StreamingRecognitionConfig
StreamingRecognitionResult
A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.
Protobuf type google.cloud.speech.v1p1beta1.StreamingRecognitionResult
StreamingRecognitionResult.Builder
A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.
Protobuf type google.cloud.speech.v1p1beta1.StreamingRecognitionResult
StreamingRecognizeRequest
The top-level message sent by the client for the StreamingRecognize method. Multiple StreamingRecognizeRequest messages are sent. The first message must contain a streaming_config message and must not contain audio_content. All subsequent messages must contain audio_content and must not contain a streaming_config message.
Protobuf type google.cloud.speech.v1p1beta1.StreamingRecognizeRequest
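The ordering rule above (config first, audio afterwards) can be expressed as a simple check over a sequence of requests. The sketch below models each request with two booleans instead of the generated protobuf class; StreamingOrderCheck and Req are hypothetical names, and treating an empty sequence as invalid is a choice made here because no first message carries the config.

```java
import java.util.Arrays;
import java.util.List;

// Simplified local model; not the generated StreamingRecognizeRequest class.
final class StreamingOrderCheck {
  // Stand-in for one request: records which of the two fields it carries.
  static final class Req {
    final boolean hasStreamingConfig;
    final boolean hasAudioContent;

    Req(boolean hasStreamingConfig, boolean hasAudioContent) {
      this.hasStreamingConfig = hasStreamingConfig;
      this.hasAudioContent = hasAudioContent;
    }
  }

  // True when the first request carries only the config and every later
  // request carries only audio.
  static boolean isValidSequence(List<Req> requests) {
    if (requests.isEmpty()) {
      return false; // assumption of this sketch: a stream needs a config message
    }
    for (int i = 0; i < requests.size(); i++) {
      Req r = requests.get(i);
      boolean expectConfig = (i == 0);
      if (r.hasStreamingConfig != expectConfig || r.hasAudioContent == expectConfig) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<Req> ok = Arrays.asList(new Req(true, false), new Req(false, true));
    List<Req> audioFirst = Arrays.asList(new Req(false, true), new Req(false, true));
    System.out.println(isValidSequence(ok));
    System.out.println(isValidSequence(audioFirst));
  }
}
```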
StreamingRecognizeRequest.Builder
The top-level message sent by the client for the StreamingRecognize method. Multiple StreamingRecognizeRequest messages are sent. The first message must contain a streaming_config message and must not contain audio_content. All subsequent messages must contain audio_content and must not contain a streaming_config message.
Protobuf type google.cloud.speech.v1p1beta1.StreamingRecognizeRequest
StreamingRecognizeResponse
StreamingRecognizeResponse is the only message returned to the client by StreamingRecognize. A series of zero or more StreamingRecognizeResponse messages are streamed back to the client. If there is no recognizable audio, and single_utterance is set to false, then no messages are streamed back to the client.
Here's an example of a series of StreamingRecognizeResponses that might be returned while processing audio:
1. results { alternatives { transcript: "tube" } stability: 0.01 }
2. results { alternatives { transcript: "to be a" } stability: 0.01 }
3. results { alternatives { transcript: "to be" } stability: 0.9 } results { alternatives { transcript: " or not to be" } stability: 0.01 }
4. results { alternatives { transcript: "to be or not to be" confidence: 0.92 } alternatives { transcript: "to bee or not to bee" } is_final: true }
5. results { alternatives { transcript: " that's" } stability: 0.01 }
6. results { alternatives { transcript: " that is" } stability: 0.9 } results { alternatives { transcript: " the question" } stability: 0.01 }
7. results { alternatives { transcript: " that is the question" confidence: 0.98 } alternatives { transcript: " that was the question" } is_final: true }
Notes:
- Only two of the above responses, #4 and #7, contain final results; they are indicated by is_final: true. Concatenating these together generates the full transcript: "to be or not to be that is the question".
- The others contain interim results. #3 and #6 contain two interim results: the first portion has a high stability and is less likely to change; the second portion has a low stability and is very likely to change. A UI designer might choose to show only high stability results.
- The specific stability and confidence values shown above are only for illustrative purposes. Actual values may vary.
- In each response, only one of these fields will be set: error, speech_event_type, or one or more (repeated) results.
Protobuf type google.cloud.speech.v1p1beta1.StreamingRecognizeResponse
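The notes above suggest two client-side behaviors: concatenating final results into the full transcript, and showing only high-stability interim results. The following sketch reproduces the seven example responses with a simplified local model (top alternative only). StreamingTranscriptSketch and Result are hypothetical names, not the generated classes, and the 0.8 stability threshold used in main is an arbitrary choice for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Simplified local model of streaming results; not the generated protobuf classes.
final class StreamingTranscriptSketch {
  static final class Result {
    final String transcript; // top alternative only
    final double stability;
    final boolean isFinal;

    Result(String transcript, double stability, boolean isFinal) {
      this.transcript = transcript;
      this.stability = stability;
      this.isFinal = isFinal;
    }
  }

  // Concatenates the transcript of every final result, in arrival order.
  static String fullTranscript(List<List<Result>> responses) {
    StringBuilder text = new StringBuilder();
    for (List<Result> response : responses) {
      for (Result r : response) {
        if (r.isFinal) {
          text.append(r.transcript);
        }
      }
    }
    return text.toString();
  }

  // Interim text a UI might display: non-final results at or above the threshold.
  static String stableInterim(List<Result> response, double minStability) {
    StringBuilder text = new StringBuilder();
    for (Result r : response) {
      if (!r.isFinal && r.stability >= minStability) {
        text.append(r.transcript);
      }
    }
    return text.toString();
  }

  // The seven example responses, reduced to their top alternatives.
  static List<List<Result>> sample() {
    return Arrays.asList(
        Arrays.asList(new Result("tube", 0.01, false)),
        Arrays.asList(new Result("to be a", 0.01, false)),
        Arrays.asList(new Result("to be", 0.9, false), new Result(" or not to be", 0.01, false)),
        Arrays.asList(new Result("to be or not to be", 0.0, true)),
        Arrays.asList(new Result(" that's", 0.01, false)),
        Arrays.asList(new Result(" that is", 0.9, false), new Result(" the question", 0.01, false)),
        Arrays.asList(new Result(" that is the question", 0.0, true)));
  }

  public static void main(String[] args) {
    System.out.println(fullTranscript(sample())); // to be or not to be that is the question
    System.out.println(stableInterim(sample().get(2), 0.8)); // to be
  }
}
```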
StreamingRecognizeResponse.Builder
StreamingRecognizeResponse is the only message returned to the client by StreamingRecognize. A series of zero or more StreamingRecognizeResponse messages are streamed back to the client. If there is no recognizable audio, and single_utterance is set to false, then no messages are streamed back to the client.
Here's an example of a series of StreamingRecognizeResponses that might be returned while processing audio:
1. results { alternatives { transcript: "tube" } stability: 0.01 }
2. results { alternatives { transcript: "to be a" } stability: 0.01 }
3. results { alternatives { transcript: "to be" } stability: 0.9 } results { alternatives { transcript: " or not to be" } stability: 0.01 }
4. results { alternatives { transcript: "to be or not to be" confidence: 0.92 } alternatives { transcript: "to bee or not to bee" } is_final: true }
5. results { alternatives { transcript: " that's" } stability: 0.01 }
6. results { alternatives { transcript: " that is" } stability: 0.9 } results { alternatives { transcript: " the question" } stability: 0.01 }
7. results { alternatives { transcript: " that is the question" confidence: 0.98 } alternatives { transcript: " that was the question" } is_final: true }
Notes:
- Only two of the above responses, #4 and #7, contain final results; they are indicated by is_final: true. Concatenating these together generates the full transcript: "to be or not to be that is the question".
- The others contain interim results. #3 and #6 contain two interim results: the first portion has a high stability and is less likely to change; the second portion has a low stability and is very likely to change. A UI designer might choose to show only high stability results.
- The specific stability and confidence values shown above are only for illustrative purposes. Actual values may vary.
- In each response, only one of these fields will be set: error, speech_event_type, or one or more (repeated) results.
Protobuf type google.cloud.speech.v1p1beta1.StreamingRecognizeResponse
TranscriptNormalization
Transcription normalization configuration. Use transcription normalization to automatically replace parts of the transcript with phrases of your choosing. For StreamingRecognize, this normalization only applies to stable partial transcripts (stability > 0.8) and final transcripts.
Protobuf type google.cloud.speech.v1p1beta1.TranscriptNormalization
TranscriptNormalization.Builder
Transcription normalization configuration. Use transcription normalization to automatically replace parts of the transcript with phrases of your choosing. For StreamingRecognize, this normalization only applies to stable partial transcripts (stability > 0.8) and final transcripts.
Protobuf type google.cloud.speech.v1p1beta1.TranscriptNormalization
TranscriptNormalization.Entry
A single replacement configuration.
Protobuf type google.cloud.speech.v1p1beta1.TranscriptNormalization.Entry
TranscriptNormalization.Entry.Builder
A single replacement configuration.
Protobuf type google.cloud.speech.v1p1beta1.TranscriptNormalization.Entry
TranscriptOutputConfig
Specifies an optional destination for the recognition results.
Protobuf type google.cloud.speech.v1p1beta1.TranscriptOutputConfig
TranscriptOutputConfig.Builder
Specifies an optional destination for the recognition results.
Protobuf type google.cloud.speech.v1p1beta1.TranscriptOutputConfig
UpdateCustomClassRequest
Message sent by the client for the UpdateCustomClass method.
Protobuf type google.cloud.speech.v1p1beta1.UpdateCustomClassRequest
UpdateCustomClassRequest.Builder
Message sent by the client for the UpdateCustomClass method.
Protobuf type google.cloud.speech.v1p1beta1.UpdateCustomClassRequest
UpdatePhraseSetRequest
Message sent by the client for the UpdatePhraseSet method.
Protobuf type google.cloud.speech.v1p1beta1.UpdatePhraseSetRequest
UpdatePhraseSetRequest.Builder
Message sent by the client for the UpdatePhraseSet method.
Protobuf type google.cloud.speech.v1p1beta1.UpdatePhraseSetRequest
WordInfo
Word-specific information for recognized words.
Protobuf type google.cloud.speech.v1p1beta1.WordInfo
WordInfo.Builder
Word-specific information for recognized words.
Protobuf type google.cloud.speech.v1p1beta1.WordInfo
Interfaces
CreateCustomClassRequestOrBuilder
CreatePhraseSetRequestOrBuilder
CustomClass.ClassItemOrBuilder
CustomClassOrBuilder
DeleteCustomClassRequestOrBuilder
DeletePhraseSetRequestOrBuilder
GetCustomClassRequestOrBuilder
GetPhraseSetRequestOrBuilder
ListCustomClassesRequestOrBuilder
ListCustomClassesResponseOrBuilder
ListPhraseSetRequestOrBuilder
ListPhraseSetResponseOrBuilder
LongRunningRecognizeMetadataOrBuilder
LongRunningRecognizeRequestOrBuilder
LongRunningRecognizeResponseOrBuilder
PhraseSet.PhraseOrBuilder
PhraseSetOrBuilder
RecognitionAudioOrBuilder
RecognitionConfigOrBuilder
RecognitionMetadataOrBuilder
RecognizeRequestOrBuilder
RecognizeResponseOrBuilder
SpeakerDiarizationConfigOrBuilder
SpeechAdaptationOrBuilder
SpeechContextOrBuilder
SpeechRecognitionAlternativeOrBuilder
SpeechRecognitionResultOrBuilder
StreamingRecognitionConfigOrBuilder
StreamingRecognitionResultOrBuilder
StreamingRecognizeRequestOrBuilder
StreamingRecognizeResponseOrBuilder
TranscriptNormalization.EntryOrBuilder
TranscriptNormalizationOrBuilder
TranscriptOutputConfigOrBuilder
UpdateCustomClassRequestOrBuilder
UpdatePhraseSetRequestOrBuilder
WordInfoOrBuilder
Enums
RecognitionAudio.AudioSourceCase
RecognitionConfig.AudioEncoding
The encoding of the audio data sent in the request.
All encodings support only 1 channel (mono) audio, unless the audio_channel_count and enable_separate_recognition_per_channel fields are set.
For best results, the audio source should be captured and transmitted using a lossless encoding (FLAC or LINEAR16). The accuracy of the speech recognition can be reduced if lossy codecs are used to capture or transmit audio, particularly if background noise is present. Lossy codecs include MULAW, AMR, AMR_WB, OGG_OPUS, SPEEX_WITH_HEADER_BYTE, MP3, and WEBM_OPUS.
The FLAC and WAV audio file formats include a header that describes the included audio content. You can request recognition for WAV files that contain either LINEAR16 or MULAW encoded audio. If you send FLAC or WAV audio file format in your request, you do not need to specify an AudioEncoding; the audio encoding format is determined from the file header. If you specify an AudioEncoding when you send FLAC or WAV audio, the encoding configuration must match the encoding described in the audio header; otherwise the request returns a google.rpc.Code.INVALID_ARGUMENT error code.
Protobuf enum google.cloud.speech.v1p1beta1.RecognitionConfig.AudioEncoding
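The header rule for FLAC and WAV files can be summarized as: an unspecified encoding defers to the header, and a specified encoding must agree with it. The sketch below mirrors that rule locally. HeaderEncodingSketch is a hypothetical helper with a reduced enum, and it uses null to stand for "not specified", which is an assumption of this sketch rather than how the generated enum represents it.

```java
// Hypothetical local model of the header-matching rule; the service enforces the real one.
final class HeaderEncodingSketch {
  // Reduced set of encodings, enough to illustrate the rule.
  enum Encoding { FLAC, LINEAR16, MULAW }

  // headerEncoding: what the FLAC or WAV header declares.
  // requested: the encoding set in the config, or null when none is specified.
  static Encoding resolve(Encoding headerEncoding, Encoding requested) {
    if (requested == null) {
      return headerEncoding; // determined from the file header
    }
    if (requested != headerEncoding) {
      throw new IllegalArgumentException(
          "INVALID_ARGUMENT: config says " + requested + " but header says " + headerEncoding);
    }
    return requested;
  }

  public static void main(String[] args) {
    System.out.println(resolve(Encoding.FLAC, null));
    System.out.println(resolve(Encoding.LINEAR16, Encoding.LINEAR16));
  }
}
```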
RecognitionMetadata.InteractionType
Use case categories that the audio recognition request can be described by.
Protobuf enum google.cloud.speech.v1p1beta1.RecognitionMetadata.InteractionType
RecognitionMetadata.MicrophoneDistance
Enumerates the types of capture settings describing an audio file.
Protobuf enum google.cloud.speech.v1p1beta1.RecognitionMetadata.MicrophoneDistance
RecognitionMetadata.OriginalMediaType
The original media the speech was recorded on.
Protobuf enum google.cloud.speech.v1p1beta1.RecognitionMetadata.OriginalMediaType
RecognitionMetadata.RecordingDeviceType
The type of device the speech was recorded with.
Protobuf enum google.cloud.speech.v1p1beta1.RecognitionMetadata.RecordingDeviceType
StreamingRecognizeRequest.StreamingRequestCase
StreamingRecognizeResponse.SpeechEventType
Indicates the type of speech event.
Protobuf enum google.cloud.speech.v1p1beta1.StreamingRecognizeResponse.SpeechEventType