public static final class PredictionServiceGrpc.PredictionServiceBlockingStub extends AbstractBlockingStub<PredictionServiceGrpc.PredictionServiceBlockingStub>
A stub that allows clients to make synchronous RPC calls to service PredictionService.
A service for online predictions and explanations.
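A blocking stub is obtained from the generated `PredictionServiceGrpc.newBlockingStub(Channel)` factory. The sketch below is a minimal illustration, assuming the grpc-java and google-cloud-aiplatform dependencies are on the classpath; the regional endpoint host is a placeholder, and a real client would also attach credentials via the inherited `withCallCredentials(...)`.

```java
import com.google.cloud.aiplatform.v1.PredictionServiceGrpc;
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

import java.util.concurrent.TimeUnit;

public class BlockingStubExample {
  public static void main(String[] args) throws InterruptedException {
    // Regional Vertex AI API endpoint (placeholder region).
    ManagedChannel channel = ManagedChannelBuilder
        .forAddress("us-central1-aiplatform.googleapis.com", 443)
        .useTransportSecurity()
        .build();

    // Synchronous stub: each call blocks until the server responds.
    // withDeadlineAfter is inherited from AbstractStub.
    PredictionServiceGrpc.PredictionServiceBlockingStub stub =
        PredictionServiceGrpc.newBlockingStub(channel)
            .withDeadlineAfter(60, TimeUnit.SECONDS);

    // ... issue calls such as stub.predict(request) here ...

    channel.shutdown().awaitTermination(5, TimeUnit.SECONDS);
  }
}
```

Note that the inherited `withX` configurators return a new stub instance rather than mutating the existing one, so the result of each call must be kept.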
Inheritance
java.lang.Object >
io.grpc.stub.AbstractStub >
io.grpc.stub.AbstractBlockingStub >
PredictionServiceGrpc.PredictionServiceBlockingStub
Inherited Members
io.grpc.stub.AbstractBlockingStub.<T>newStub(io.grpc.stub.AbstractStub.StubFactory<T>,io.grpc.Channel)
io.grpc.stub.AbstractBlockingStub.<T>newStub(io.grpc.stub.AbstractStub.StubFactory<T>,io.grpc.Channel,io.grpc.CallOptions)
io.grpc.stub.AbstractStub.<T>withOption(io.grpc.CallOptions.Key<T>,T)
io.grpc.stub.AbstractStub.build(io.grpc.Channel,io.grpc.CallOptions)
io.grpc.stub.AbstractStub.getCallOptions()
io.grpc.stub.AbstractStub.getChannel()
io.grpc.stub.AbstractStub.withCallCredentials(io.grpc.CallCredentials)
io.grpc.stub.AbstractStub.withChannel(io.grpc.Channel)
io.grpc.stub.AbstractStub.withCompression(java.lang.String)
io.grpc.stub.AbstractStub.withDeadline(io.grpc.Deadline)
io.grpc.stub.AbstractStub.withDeadlineAfter(long,java.util.concurrent.TimeUnit)
io.grpc.stub.AbstractStub.withExecutor(java.util.concurrent.Executor)
io.grpc.stub.AbstractStub.withInterceptors(io.grpc.ClientInterceptor...)
io.grpc.stub.AbstractStub.withMaxInboundMessageSize(int)
io.grpc.stub.AbstractStub.withMaxOutboundMessageSize(int)
io.grpc.stub.AbstractStub.withWaitForReady()
Methods
protected PredictionServiceGrpc.PredictionServiceBlockingStub build(Channel channel, CallOptions callOptions)
Parameters
Name | Description
channel | io.grpc.Channel
callOptions | io.grpc.CallOptions
Overrides
io.grpc.stub.AbstractStub.build(io.grpc.Channel,io.grpc.CallOptions)
public Iterator<HttpBody> chatCompletions(ChatCompletionsRequest request)
Exposes an OpenAI-compatible endpoint for chat completions.
Returns
Type | Description
Iterator<com.google.api.HttpBody> |
public CountTokensResponse countTokens(CountTokensRequest request)
Perform a token count.
public DirectPredictResponse directPredict(DirectPredictRequest request)
Perform a unary online prediction request to a gRPC model server for
Vertex first-party products and frameworks.
public DirectRawPredictResponse directRawPredict(DirectRawPredictRequest request)
Perform a unary online prediction request to a gRPC model server for
custom containers.
public ExplainResponse explain(ExplainRequest request)
Perform an online explanation.
If deployed_model_id is specified, the corresponding DeployedModel must have explanation_spec populated. If deployed_model_id is not specified, all DeployedModels must have explanation_spec populated.
public GenerateContentResponse generateContent(GenerateContentRequest request)
Generate content with multimodal inputs.
public PredictResponse predict(PredictRequest request)
Perform an online prediction.
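A blocking `predict` call can be sketched as follows. This is a hedged example: the endpoint resource name and the instance payload are placeholders, and `stub` is assumed to be an already-built, authenticated `PredictionServiceBlockingStub`.

```java
import com.google.cloud.aiplatform.v1.PredictRequest;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.protobuf.Value;

// Instances are protobuf Values; the shape expected depends on the deployed model.
PredictRequest request = PredictRequest.newBuilder()
    .setEndpoint("projects/PROJECT/locations/us-central1/endpoints/ENDPOINT_ID")
    .addInstances(Value.newBuilder().setStringValue("example instance").build())
    .build();

// Blocks until the prediction completes or the stub's deadline expires.
PredictResponse response = stub.predict(request);
response.getPredictionsList().forEach(System.out::println);
```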
public HttpBody rawPredict(RawPredictRequest request)
Perform an online prediction with an arbitrary HTTP payload.
The response includes the following HTTP headers:
X-Vertex-AI-Endpoint-Id: ID of the Endpoint that served this prediction.
X-Vertex-AI-Deployed-Model-Id: ID of the Endpoint's DeployedModel that served this prediction.
Returns
Type | Description
com.google.api.HttpBody |
public Iterator<StreamingPredictResponse> serverStreamingPredict(StreamingPredictRequest request)
Perform a server-side streaming online prediction request for Vertex
LLM streaming.
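On a blocking stub, server-streaming methods return a plain `java.util.Iterator`; `hasNext()` blocks until the next message arrives or the stream completes. A minimal consumption sketch, assuming `stub` and `streamingRequest` are built elsewhere:

```java
import com.google.cloud.aiplatform.v1.StreamingPredictResponse;

import java.util.Iterator;

Iterator<StreamingPredictResponse> responses =
    stub.serverStreamingPredict(streamingRequest);
while (responses.hasNext()) {
  // Each next() blocks until the server emits the next chunk.
  StreamingPredictResponse chunk = responses.next();
  // ... process the streamed chunk as it arrives ...
}
```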
public Iterator<GenerateContentResponse> streamGenerateContent(GenerateContentRequest request)
Generate content with multimodal inputs with streaming support.
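A streaming generation call can be sketched like this; the model resource name and prompt are placeholders, and `stub` is assumed to be authenticated:

```java
import com.google.cloud.aiplatform.v1.Content;
import com.google.cloud.aiplatform.v1.GenerateContentRequest;
import com.google.cloud.aiplatform.v1.GenerateContentResponse;
import com.google.cloud.aiplatform.v1.Part;

import java.util.Iterator;

GenerateContentRequest request = GenerateContentRequest.newBuilder()
    .setModel("projects/PROJECT/locations/us-central1/publishers/google/models/MODEL_ID")
    .addContents(Content.newBuilder()
        .setRole("user")
        .addParts(Part.newBuilder().setText("Tell me a short story.").build())
        .build())
    .build();

// Each element of the iterator carries an incremental slice of the response.
Iterator<GenerateContentResponse> stream = stub.streamGenerateContent(request);
while (stream.hasNext()) {
  GenerateContentResponse partial = stream.next();
  // ... append partial candidates to the accumulated output ...
}
```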
public Iterator<HttpBody> streamRawPredict(StreamRawPredictRequest request)
Perform a streaming online prediction with an arbitrary HTTP payload.
Returns
Type | Description
Iterator<com.google.api.HttpBody> |