Package com.google.cloud.speech.v1beta1 (4.6.0)

Classes

AsyncRecognizeMetadata

Describes the progress of a long-running AsyncRecognize call. It is included in the metadata field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.

Protobuf type google.cloud.speech.v1beta1.AsyncRecognizeMetadata

AsyncRecognizeMetadata.Builder

Describes the progress of a long-running AsyncRecognize call. It is included in the metadata field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.

Protobuf type google.cloud.speech.v1beta1.AsyncRecognizeMetadata

AsyncRecognizeRequest

The top-level message sent by the client for the AsyncRecognize method.

Protobuf type google.cloud.speech.v1beta1.AsyncRecognizeRequest

AsyncRecognizeRequest.Builder

The top-level message sent by the client for the AsyncRecognize method.

Protobuf type google.cloud.speech.v1beta1.AsyncRecognizeRequest

AsyncRecognizeResponse

The only message returned to the client by AsyncRecognize. It contains the result as zero or more sequential SpeechRecognitionResult messages. It is included in the result.response field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.
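Because the response arrives inside a long-running Operation rather than directly, a caller usually polls until the operation reports completion. The following is a minimal, library-free sketch of such a loop. `OperationPollSketch`, `Snapshot`, and `awaitResponse` are hypothetical names that only stand in for the Operation and its GetOperation call; they are not part of this package, and error results are not modeled.

```java
import java.util.function.Supplier;

final class OperationPollSketch {
    /** Hypothetical stand-in for an Operation: a done flag plus the response payload. */
    static final class Snapshot<T> {
        final boolean done;
        final T response;

        Snapshot(boolean done, T response) {
            this.done = done;
            this.response = response;
        }
    }

    /**
     * Calls getOperation up to maxPolls times, sleeping between attempts,
     * and returns the response of the first snapshot that reports done.
     */
    static <T> T awaitResponse(Supplier<Snapshot<T>> getOperation, int maxPolls, long sleepMillis) {
        for (int attempt = 0; attempt < maxPolls; attempt++) {
            Snapshot<T> snapshot = getOperation.get();
            if (snapshot.done) {
                return snapshot.response;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop polling.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while polling", e);
            }
        }
        throw new IllegalStateException("operation not done after " + maxPolls + " polls");
    }
}
```

In real code the supplier would wrap whatever call fetches the current Operation, and the sleep interval would normally back off rather than stay fixed.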

Protobuf type google.cloud.speech.v1beta1.AsyncRecognizeResponse

AsyncRecognizeResponse.Builder

The only message returned to the client by AsyncRecognize. It contains the result as zero or more sequential SpeechRecognitionResult messages. It is included in the result.response field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.

Protobuf type google.cloud.speech.v1beta1.AsyncRecognizeResponse

RecognitionAudio

Contains audio data in the encoding specified in the RecognitionConfig. Either content or uri must be supplied. Supplying both or neither returns google.rpc.Code.INVALID_ARGUMENT. See audio limits.
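The "exactly one of content or uri" rule can be checked on raw inputs before a request is built. The sketch below is library-free and illustrative only: `AudioSourceCheck`, `Source`, and `requireExactlyOne` are hypothetical names, and the exception merely mirrors the INVALID_ARGUMENT outcome described above.

```java
final class AudioSourceCheck {
    /** Hypothetical labels for which audio source was supplied. */
    enum Source { CONTENT, URI }

    /**
     * Returns which source is present, or throws if both or neither are given,
     * mirroring the rule that exactly one of content or uri must be supplied.
     */
    static Source requireExactlyOne(byte[] content, String uri) {
        boolean hasContent = content != null && content.length > 0;
        boolean hasUri = uri != null && !uri.isEmpty();
        if (hasContent == hasUri) {
            // Both true (two sources) or both false (no source).
            throw new IllegalArgumentException(
                hasContent ? "supply content or uri, not both" : "supply content or uri");
        }
        return hasContent ? Source.CONTENT : Source.URI;
    }
}
```

Treating an empty byte array or empty string as "not supplied" is an assumption of this sketch, not something the description above states.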

Protobuf type google.cloud.speech.v1beta1.RecognitionAudio

RecognitionAudio.Builder

Contains audio data in the encoding specified in the RecognitionConfig. Either content or uri must be supplied. Supplying both or neither returns google.rpc.Code.INVALID_ARGUMENT. See audio limits.

Protobuf type google.cloud.speech.v1beta1.RecognitionAudio

RecognitionConfig

Provides information to the recognizer that specifies how to process the request.

Protobuf type google.cloud.speech.v1beta1.RecognitionConfig

RecognitionConfig.Builder

Provides information to the recognizer that specifies how to process the request.

Protobuf type google.cloud.speech.v1beta1.RecognitionConfig

SpeechContext

Provides "hints" to the speech recognizer to favor specific words and phrases in the results.

Protobuf type google.cloud.speech.v1beta1.SpeechContext

SpeechContext.Builder

Provides "hints" to the speech recognizer to favor specific words and phrases in the results.

Protobuf type google.cloud.speech.v1beta1.SpeechContext

SpeechGrpc

Service that implements Google Cloud Speech API.

SpeechGrpc.SpeechBlockingStub

Service that implements Google Cloud Speech API.

SpeechGrpc.SpeechFutureStub

Service that implements Google Cloud Speech API.

SpeechGrpc.SpeechImplBase

Service that implements Google Cloud Speech API.

SpeechGrpc.SpeechStub

Service that implements Google Cloud Speech API.

SpeechProto

SpeechRecognitionAlternative

Alternative hypotheses (a.k.a. n-best list).

Protobuf type google.cloud.speech.v1beta1.SpeechRecognitionAlternative

SpeechRecognitionAlternative.Builder

Alternative hypotheses (a.k.a. n-best list).

Protobuf type google.cloud.speech.v1beta1.SpeechRecognitionAlternative

SpeechRecognitionResult

A speech recognition result corresponding to a portion of the audio.

Protobuf type google.cloud.speech.v1beta1.SpeechRecognitionResult

SpeechRecognitionResult.Builder

A speech recognition result corresponding to a portion of the audio.

Protobuf type google.cloud.speech.v1beta1.SpeechRecognitionResult

StreamingRecognitionConfig

Provides information to the recognizer that specifies how to process the request.

Protobuf type google.cloud.speech.v1beta1.StreamingRecognitionConfig

StreamingRecognitionConfig.Builder

Provides information to the recognizer that specifies how to process the request.

Protobuf type google.cloud.speech.v1beta1.StreamingRecognitionConfig

StreamingRecognitionResult

A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.

Protobuf type google.cloud.speech.v1beta1.StreamingRecognitionResult

StreamingRecognitionResult.Builder

A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.

Protobuf type google.cloud.speech.v1beta1.StreamingRecognitionResult

StreamingRecognizeRequest

The top-level message sent by the client for the StreamingRecognize method. Multiple StreamingRecognizeRequest messages are sent. The first message must contain a streaming_config message and must not contain audio data. All subsequent messages must contain audio data and must not contain a streaming_config message.
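The ordering rule above (configuration first, audio afterwards) can be expressed as a small check over the kinds of messages in a stream. This is a library-free sketch: `StreamingOrderCheck`, `Kind`, and `isValidSequence` are hypothetical names that model only which of the two payloads each message carries.

```java
import java.util.List;

final class StreamingOrderCheck {
    /** Hypothetical labels for the payload carried by one request message. */
    enum Kind { STREAMING_CONFIG, AUDIO_CONTENT }

    /**
     * Returns true when the first message carries the streaming config and
     * every later message carries audio. A lone config message passes this
     * check; whether the service accepts such a stream is not stated here.
     */
    static boolean isValidSequence(List<Kind> messages) {
        if (messages.isEmpty() || messages.get(0) != Kind.STREAMING_CONFIG) {
            return false;
        }
        for (int i = 1; i < messages.size(); i++) {
            if (messages.get(i) != Kind.AUDIO_CONTENT) {
                return false;
            }
        }
        return true;
    }
}
```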

Protobuf type google.cloud.speech.v1beta1.StreamingRecognizeRequest

StreamingRecognizeRequest.Builder

The top-level message sent by the client for the StreamingRecognize method. Multiple StreamingRecognizeRequest messages are sent. The first message must contain a streaming_config message and must not contain audio data. All subsequent messages must contain audio data and must not contain a streaming_config message.

Protobuf type google.cloud.speech.v1beta1.StreamingRecognizeRequest

StreamingRecognizeResponse

StreamingRecognizeResponse is the only message returned to the client by StreamingRecognize. A series of one or more StreamingRecognizeResponse messages is streamed back to the client. Here is an example of a series of ten StreamingRecognizeResponse messages that might be returned while processing audio:

1. endpointer_type: START_OF_SPEECH
2. results { alternatives { transcript: "tube" } stability: 0.01 } result_index: 0
3. results { alternatives { transcript: "to be a" } stability: 0.01 } result_index: 0
4. results { alternatives { transcript: "to be" } stability: 0.9 } results { alternatives { transcript: " or not to be" } stability: 0.01 } result_index: 0
5. results { alternatives { transcript: "to be or not to be" confidence: 0.92 } alternatives { transcript: "to bee or not to bee" } is_final: true } result_index: 0
6. results { alternatives { transcript: " that's" } stability: 0.01 } result_index: 1
7. results { alternatives { transcript: " that is" } stability: 0.9 } results { alternatives { transcript: " the question" } stability: 0.01 } result_index: 1
8. endpointer_type: END_OF_SPEECH
9. results { alternatives { transcript: " that is the question" confidence: 0.98 } alternatives { transcript: " that was the question" } is_final: true } result_index: 1
10. endpointer_type: END_OF_AUDIO

Notes:

- Only two of the above responses, #5 and #9, contain final results; they are indicated by is_final: true. Concatenating these generates the full transcript: "to be or not to be that is the question".
- The others contain interim results. #4 and #7 each contain two interim results: the first portion has high stability and is less likely to change; the second portion has low stability and is very likely to change. A UI designer might choose to show only high-stability results.
- The specific stability and confidence values shown above are for illustration only. Actual values may vary.
- The result_index indicates the portion of audio that has had final results returned and is no longer being processed. For example, the results in #6 and later correspond to the portion of audio after "to be or not to be".
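The notes above describe two client-side behaviors: joining final results into a transcript, and showing only high-stability interim text. The sketch below models both without the library. `TranscriptSketch`, `Result`, `finalTranscript`, and `stableInterim` are hypothetical names, and each `Result` carries only the first alternative's transcript, which is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

final class TranscriptSketch {
    /** Hypothetical stand-in for one entry of `results` (top alternative only). */
    static final class Result {
        final String transcript;
        final boolean isFinal;
        final double stability;

        Result(String transcript, boolean isFinal, double stability) {
            this.transcript = transcript;
            this.isFinal = isFinal;
            this.stability = stability;
        }
    }

    /** Concatenates, in arrival order, the transcripts of all results marked final. */
    static String finalTranscript(List<List<Result>> responses) {
        StringBuilder transcript = new StringBuilder();
        for (List<Result> response : responses) {
            for (Result result : response) {
                if (result.isFinal) {
                    transcript.append(result.transcript);
                }
            }
        }
        return transcript.toString();
    }

    /** Returns the interim transcripts in one response whose stability meets the threshold. */
    static List<String> stableInterim(List<Result> response, double minStability) {
        List<String> shown = new ArrayList<>();
        for (Result result : response) {
            if (!result.isFinal && result.stability >= minStability) {
                shown.add(result.transcript);
            }
        }
        return shown;
    }
}
```

Fed the results from responses #5 and #9, `finalTranscript` yields "to be or not to be that is the question". With a threshold of 0.5, `stableInterim` keeps only "to be" from response #4. The threshold value is an arbitrary choice for illustration.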

Protobuf type google.cloud.speech.v1beta1.StreamingRecognizeResponse

StreamingRecognizeResponse.Builder

StreamingRecognizeResponse is the only message returned to the client by StreamingRecognize. A series of one or more StreamingRecognizeResponse messages is streamed back to the client. Here is an example of a series of ten StreamingRecognizeResponse messages that might be returned while processing audio:

1. endpointer_type: START_OF_SPEECH
2. results { alternatives { transcript: "tube" } stability: 0.01 } result_index: 0
3. results { alternatives { transcript: "to be a" } stability: 0.01 } result_index: 0
4. results { alternatives { transcript: "to be" } stability: 0.9 } results { alternatives { transcript: " or not to be" } stability: 0.01 } result_index: 0
5. results { alternatives { transcript: "to be or not to be" confidence: 0.92 } alternatives { transcript: "to bee or not to bee" } is_final: true } result_index: 0
6. results { alternatives { transcript: " that's" } stability: 0.01 } result_index: 1
7. results { alternatives { transcript: " that is" } stability: 0.9 } results { alternatives { transcript: " the question" } stability: 0.01 } result_index: 1
8. endpointer_type: END_OF_SPEECH
9. results { alternatives { transcript: " that is the question" confidence: 0.98 } alternatives { transcript: " that was the question" } is_final: true } result_index: 1
10. endpointer_type: END_OF_AUDIO

Notes:

- Only two of the above responses, #5 and #9, contain final results; they are indicated by is_final: true. Concatenating these generates the full transcript: "to be or not to be that is the question".
- The others contain interim results. #4 and #7 each contain two interim results: the first portion has high stability and is less likely to change; the second portion has low stability and is very likely to change. A UI designer might choose to show only high-stability results.
- The specific stability and confidence values shown above are for illustration only. Actual values may vary.
- The result_index indicates the portion of audio that has had final results returned and is no longer being processed. For example, the results in #6 and later correspond to the portion of audio after "to be or not to be".

Protobuf type google.cloud.speech.v1beta1.StreamingRecognizeResponse

SyncRecognizeRequest

The top-level message sent by the client for the SyncRecognize method.

Protobuf type google.cloud.speech.v1beta1.SyncRecognizeRequest

SyncRecognizeRequest.Builder

The top-level message sent by the client for the SyncRecognize method.

Protobuf type google.cloud.speech.v1beta1.SyncRecognizeRequest

SyncRecognizeResponse

The only message returned to the client by the SyncRecognize method. It contains the result as zero or more sequential SpeechRecognitionResult messages.

Protobuf type google.cloud.speech.v1beta1.SyncRecognizeResponse

SyncRecognizeResponse.Builder

The only message returned to the client by the SyncRecognize method. It contains the result as zero or more sequential SpeechRecognitionResult messages.

Protobuf type google.cloud.speech.v1beta1.SyncRecognizeResponse

Interfaces

AsyncRecognizeMetadataOrBuilder

AsyncRecognizeRequestOrBuilder

AsyncRecognizeResponseOrBuilder

RecognitionAudioOrBuilder

RecognitionConfigOrBuilder

SpeechContextOrBuilder

SpeechRecognitionAlternativeOrBuilder

SpeechRecognitionResultOrBuilder

StreamingRecognitionConfigOrBuilder

StreamingRecognitionResultOrBuilder

StreamingRecognizeRequestOrBuilder

StreamingRecognizeResponseOrBuilder

SyncRecognizeRequestOrBuilder

SyncRecognizeResponseOrBuilder

Enums

RecognitionAudio.AudioSourceCase

RecognitionConfig.AudioEncoding

Audio encoding of the data sent in the audio message. All encodings support only 1 channel (mono) audio. Only FLAC includes a header that describes the bytes of audio that follow the header. The other encodings are raw audio bytes with no header. For best results, the audio source should be captured and transmitted using a lossless encoding (FLAC or LINEAR16). Recognition accuracy may be reduced if lossy codecs (such as AMR, AMR_WB and MULAW) are used to capture or transmit the audio, particularly if background noise is present.
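The properties stated above (which encodings are lossless, and which one carries a header) can be captured in a small lookup. The enum below is a library-free sketch: `EncodingSketch` is a hypothetical name, it lists only the encodings named in the description, and the real AudioEncoding enum may define further values.

```java
enum EncodingSketch {
    // (lossless, hasHeader) as described in the text above.
    LINEAR16(true, false),
    FLAC(true, true),
    MULAW(false, false),
    AMR(false, false),
    AMR_WB(false, false);

    final boolean lossless;
    final boolean hasHeader;

    EncodingSketch(boolean lossless, boolean hasHeader) {
        this.lossless = lossless;
        this.hasHeader = hasHeader;
    }

    /** True for the lossy codecs, which the description says may reduce accuracy. */
    boolean mayReduceAccuracy() {
        return !lossless;
    }
}
```

A client could use such a table to warn when a capture pipeline is configured with a lossy codec.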

Protobuf enum google.cloud.speech.v1beta1.RecognitionConfig.AudioEncoding

StreamingRecognizeRequest.StreamingRequestCase

StreamingRecognizeResponse.EndpointerType

Indicates the type of endpointer event.

Protobuf enum google.cloud.speech.v1beta1.StreamingRecognizeResponse.EndpointerType

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-03-05 UTC.