Interface PredictionServiceGrpc.AsyncService (3.54.0)

public static interface PredictionServiceGrpc.AsyncService

A service for online predictions and explanations.

Methods

chatCompletions(ChatCompletionsRequest request, StreamObserver<HttpBody> responseObserver)

public default void chatCompletions(ChatCompletionsRequest request, StreamObserver<HttpBody> responseObserver)

Exposes an OpenAI-compatible endpoint for chat completions.

Parameters
Name Description
request ChatCompletionsRequest
responseObserver io.grpc.stub.StreamObserver<com.google.api.HttpBody>

countTokens(CountTokensRequest request, StreamObserver<CountTokensResponse> responseObserver)

public default void countTokens(CountTokensRequest request, StreamObserver<CountTokensResponse> responseObserver)

Perform token counting.

Parameters
Name Description
request CountTokensRequest
responseObserver io.grpc.stub.StreamObserver<CountTokensResponse>

directPredict(DirectPredictRequest request, StreamObserver<DirectPredictResponse> responseObserver)

public default void directPredict(DirectPredictRequest request, StreamObserver<DirectPredictResponse> responseObserver)

Perform a unary online prediction request to a gRPC model server for Vertex first-party products and frameworks.

Parameters
Name Description
request DirectPredictRequest
responseObserver io.grpc.stub.StreamObserver<DirectPredictResponse>

directRawPredict(DirectRawPredictRequest request, StreamObserver<DirectRawPredictResponse> responseObserver)

public default void directRawPredict(DirectRawPredictRequest request, StreamObserver<DirectRawPredictResponse> responseObserver)

Perform a unary online prediction request to a gRPC model server for custom containers.

Parameters
Name Description
request DirectRawPredictRequest
responseObserver io.grpc.stub.StreamObserver<DirectRawPredictResponse>

explain(ExplainRequest request, StreamObserver<ExplainResponse> responseObserver)

public default void explain(ExplainRequest request, StreamObserver<ExplainResponse> responseObserver)

Perform an online explanation. If deployed_model_id is specified, the corresponding DeployedModel must have explanation_spec populated. If deployed_model_id is not specified, all DeployedModels must have explanation_spec populated.

Parameters
Name Description
request ExplainRequest
responseObserver io.grpc.stub.StreamObserver<ExplainResponse>
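
The explanation_spec rule above can be read as a simple precondition check. The following is a minimal sketch of that rule only, not the service's actual validation: DeployedModel here is a hypothetical two-field class (the real message is a generated protobuf type), and ExplainPrecondition and canExplain are illustrative names. How an unknown ID or an Endpoint with no DeployedModels is handled is an assumption.

```java
import java.util.List;

// Hypothetical, simplified view of a DeployedModel: only the two facts the rule needs.
final class DeployedModel {
    final String id;
    final boolean hasExplanationSpec;

    DeployedModel(String id, boolean hasExplanationSpec) {
        this.id = id;
        this.hasExplanationSpec = hasExplanationSpec;
    }
}

final class ExplainPrecondition {
    // Returns true when an explain call would satisfy the documented rule.
    static boolean canExplain(List<DeployedModel> deployedModels, String deployedModelId) {
        if (deployedModelId != null && !deployedModelId.isEmpty()) {
            // deployed_model_id specified: only that DeployedModel needs explanation_spec.
            for (DeployedModel model : deployedModels) {
                if (model.id.equals(deployedModelId)) {
                    return model.hasExplanationSpec;
                }
            }
            return false; // assumption: an unknown ID is treated as not explainable
        }
        // deployed_model_id not specified: every DeployedModel needs explanation_spec.
        if (deployedModels.isEmpty()) {
            return false; // assumption: nothing deployed means nothing to explain
        }
        for (DeployedModel model : deployedModels) {
            if (!model.hasExplanationSpec) {
                return false;
            }
        }
        return true;
    }
}
```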

generateContent(GenerateContentRequest request, StreamObserver<GenerateContentResponse> responseObserver)

public default void generateContent(GenerateContentRequest request, StreamObserver<GenerateContentResponse> responseObserver)

Generate content with multimodal inputs.

Parameters
Name Description
request GenerateContentRequest
responseObserver io.grpc.stub.StreamObserver<GenerateContentResponse>

predict(PredictRequest request, StreamObserver<PredictResponse> responseObserver)

public default void predict(PredictRequest request, StreamObserver<PredictResponse> responseObserver)

Perform an online prediction.

Parameters
Name Description
request PredictRequest
responseObserver io.grpc.stub.StreamObserver<PredictResponse>
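
A unary method such as predict receives one request and answers through responseObserver. The sketch below illustrates the usual grpc-java server-side contract: call onNext exactly once and then onCompleted, or call onError instead. To compile without the gRPC and Vertex AI libraries it uses stand-ins throughout, all of them assumptions: the local StreamObserver mirrors the three methods of io.grpc.stub.StreamObserver, AsyncService is reduced to a single method, and PredictRequest, PredictResponse, DoublingPredictionService and RecordingObserver are hypothetical simplified classes. In generated grpc-java code the default methods usually answer with an UNIMPLEMENTED status; the stand-in only approximates that with onError.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in mirroring the three methods of io.grpc.stub.StreamObserver.
interface StreamObserver<V> {
    void onNext(V value);
    void onError(Throwable t);
    void onCompleted();
}

// Hypothetical, simplified messages (the real ones are generated protobuf classes).
final class PredictRequest {
    final String endpoint;
    final List<Double> instances;

    PredictRequest(String endpoint, List<Double> instances) {
        this.endpoint = endpoint;
        this.instances = instances;
    }
}

final class PredictResponse {
    final List<Double> predictions;

    PredictResponse(List<Double> predictions) {
        this.predictions = predictions;
    }
}

// Hypothetical stand-in for PredictionServiceGrpc.AsyncService, reduced to predict.
interface AsyncService {
    default void predict(PredictRequest request, StreamObserver<PredictResponse> responseObserver) {
        // Assumption: an unimplemented method reports an error to the caller.
        responseObserver.onError(new UnsupportedOperationException("predict is not implemented"));
    }
}

// Toy implementation: the "model" doubles every instance.
final class DoublingPredictionService implements AsyncService {
    @Override
    public void predict(PredictRequest request, StreamObserver<PredictResponse> responseObserver) {
        if (request.endpoint == null || request.endpoint.isEmpty()) {
            responseObserver.onError(new IllegalArgumentException("endpoint is required"));
            return; // after onError, make no further calls on the observer
        }
        List<Double> predictions = new ArrayList<>();
        for (double x : request.instances) {
            predictions.add(2 * x);
        }
        responseObserver.onNext(new PredictResponse(predictions)); // exactly one response
        responseObserver.onCompleted();                            // then close the call
    }
}

// Helper that records what the service sent, useful for trying the sketch out.
final class RecordingObserver<V> implements StreamObserver<V> {
    final List<V> values = new ArrayList<>();
    Throwable error;
    boolean completed;

    @Override public void onNext(V value) { values.add(value); }
    @Override public void onError(Throwable t) { error = t; }
    @Override public void onCompleted() { completed = true; }
}
```

Passing a RecordingObserver to DoublingPredictionService.predict shows the two possible outcomes: one recorded value followed by completion, or a recorded error with no values.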

rawPredict(RawPredictRequest request, StreamObserver<HttpBody> responseObserver)

public default void rawPredict(RawPredictRequest request, StreamObserver<HttpBody> responseObserver)

Perform an online prediction with an arbitrary HTTP payload. The response includes the following HTTP headers:

  • X-Vertex-AI-Endpoint-Id: ID of the Endpoint that served this prediction.
  • X-Vertex-AI-Deployed-Model-Id: ID of the Endpoint's DeployedModel that served this prediction.

Parameters
Name Description
request RawPredictRequest
responseObserver io.grpc.stub.StreamObserver<com.google.api.HttpBody>
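
As an illustration of the two response headers listed above, the sketch below pulls them out of a generic header map. It assumes the caller already has the headers as a Map<String, String>; how they are obtained depends on the transport and is not shown. ServedBy and fromHeaders are hypothetical names, not part of this library. The lookup is case-insensitive because HTTP header names are case-insensitive.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical holder for the two IDs carried in the rawPredict response headers.
final class ServedBy {
    final String endpointId;
    final String deployedModelId;

    ServedBy(String endpointId, String deployedModelId) {
        this.endpointId = endpointId;
        this.deployedModelId = deployedModelId;
    }

    // Reads both headers; a field is null when its header is absent.
    static ServedBy fromHeaders(Map<String, String> headers) {
        // Copy into a case-insensitive map so "x-vertex-ai-endpoint-id" also matches.
        Map<String, String> byName = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        byName.putAll(headers);
        return new ServedBy(
                byName.get("X-Vertex-AI-Endpoint-Id"),
                byName.get("X-Vertex-AI-Deployed-Model-Id"));
    }
}
```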

serverStreamingPredict(StreamingPredictRequest request, StreamObserver<StreamingPredictResponse> responseObserver)

public default void serverStreamingPredict(StreamingPredictRequest request, StreamObserver<StreamingPredictResponse> responseObserver)

Perform a server-side streaming online prediction request for Vertex LLM streaming.

Parameters
Name Description
request StreamingPredictRequest
responseObserver io.grpc.stub.StreamObserver<StreamingPredictResponse>

streamDirectPredict(StreamObserver<StreamDirectPredictResponse> responseObserver)

public default StreamObserver<StreamDirectPredictRequest> streamDirectPredict(StreamObserver<StreamDirectPredictResponse> responseObserver)

Perform a streaming online prediction request to a gRPC model server for Vertex first-party products and frameworks.

Parameter
Name Description
responseObserver io.grpc.stub.StreamObserver<StreamDirectPredictResponse>
Returns
Type Description
io.grpc.stub.StreamObserver<StreamDirectPredictRequest>
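
Methods that return a StreamObserver, such as streamDirectPredict, follow the bidirectional-streaming shape: the service receives the response observer and hands back a request observer that the framework feeds with incoming messages. The following is a rough sketch of that shape under stated assumptions: the local StreamObserver mirrors io.grpc.stub.StreamObserver so the code compiles without gRPC, the request and response classes are hypothetical single-field simplifications of the generated protobuf types, and SquaringStreamService and IntCollector are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in mirroring the three methods of io.grpc.stub.StreamObserver.
interface StreamObserver<V> {
    void onNext(V value);
    void onError(Throwable t);
    void onCompleted();
}

// Hypothetical, simplified messages carrying a single number each.
final class StreamDirectPredictRequest {
    final int input;
    StreamDirectPredictRequest(int input) { this.input = input; }
}

final class StreamDirectPredictResponse {
    final int output;
    StreamDirectPredictResponse(int output) { this.output = output; }
}

// Toy service: answers every streamed request with the square of its input.
final class SquaringStreamService {
    public StreamObserver<StreamDirectPredictRequest> streamDirectPredict(
            final StreamObserver<StreamDirectPredictResponse> responseObserver) {
        // The returned observer is called once per incoming request message.
        return new StreamObserver<StreamDirectPredictRequest>() {
            @Override
            public void onNext(StreamDirectPredictRequest request) {
                int squared = request.input * request.input;
                responseObserver.onNext(new StreamDirectPredictResponse(squared));
            }

            @Override
            public void onError(Throwable t) {
                // The client side failed or cancelled; this sketch sends nothing further.
            }

            @Override
            public void onCompleted() {
                // The client finished sending, so close the response stream as well.
                responseObserver.onCompleted();
            }
        };
    }
}

// Helper that collects the streamed outputs.
final class IntCollector implements StreamObserver<StreamDirectPredictResponse> {
    final List<Integer> outputs = new ArrayList<>();
    boolean completed;

    @Override public void onNext(StreamDirectPredictResponse value) { outputs.add(value.output); }
    @Override public void onError(Throwable t) { }
    @Override public void onCompleted() { completed = true; }
}
```

Responding inside onNext keeps requests and responses interleaved; an implementation could equally buffer requests and reply only in onCompleted.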

streamDirectRawPredict(StreamObserver<StreamDirectRawPredictResponse> responseObserver)

public default StreamObserver<StreamDirectRawPredictRequest> streamDirectRawPredict(StreamObserver<StreamDirectRawPredictResponse> responseObserver)

Perform a streaming online prediction request to a gRPC model server for custom containers.

Parameter
Name Description
responseObserver io.grpc.stub.StreamObserver<StreamDirectRawPredictResponse>
Returns
Type Description
io.grpc.stub.StreamObserver<StreamDirectRawPredictRequest>

streamGenerateContent(GenerateContentRequest request, StreamObserver<GenerateContentResponse> responseObserver)

public default void streamGenerateContent(GenerateContentRequest request, StreamObserver<GenerateContentResponse> responseObserver)

Generate content with multimodal inputs with streaming support.

Parameters
Name Description
request GenerateContentRequest
responseObserver io.grpc.stub.StreamObserver<GenerateContentResponse>
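
The signature of streamGenerateContent matches generateContent, but as a server-streaming method it may call onNext several times before onCompleted. The sketch below shows only that calling pattern and performs no real content generation. Everything in it is a stand-in: the local StreamObserver mirrors io.grpc.stub.StreamObserver, the request and response classes are hypothetical single-field simplifications of the generated protobuf types, and WordStreamingService and ChunkCollector are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in mirroring the three methods of io.grpc.stub.StreamObserver.
interface StreamObserver<V> {
    void onNext(V value);
    void onError(Throwable t);
    void onCompleted();
}

// Hypothetical, simplified messages holding plain text.
final class GenerateContentRequest {
    final String prompt;
    GenerateContentRequest(String prompt) { this.prompt = prompt; }
}

final class GenerateContentResponse {
    final String text;
    GenerateContentResponse(String text) { this.text = text; }
}

// Toy service: streams the prompt back one word per response message.
final class WordStreamingService {
    public void streamGenerateContent(GenerateContentRequest request,
                                      StreamObserver<GenerateContentResponse> responseObserver) {
        for (String word : request.prompt.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                responseObserver.onNext(new GenerateContentResponse(word)); // one chunk
            }
        }
        responseObserver.onCompleted(); // signals that no more chunks follow
    }
}

// Helper that collects the streamed chunks in arrival order.
final class ChunkCollector implements StreamObserver<GenerateContentResponse> {
    final List<String> chunks = new ArrayList<>();
    boolean completed;

    @Override public void onNext(GenerateContentResponse value) { chunks.add(value.text); }
    @Override public void onError(Throwable t) { }
    @Override public void onCompleted() { completed = true; }
}
```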

streamRawPredict(StreamRawPredictRequest request, StreamObserver<HttpBody> responseObserver)

public default void streamRawPredict(StreamRawPredictRequest request, StreamObserver<HttpBody> responseObserver)

Perform a streaming online prediction with an arbitrary HTTP payload.

Parameters
Name Description
request StreamRawPredictRequest
responseObserver io.grpc.stub.StreamObserver<com.google.api.HttpBody>

streamingPredict(StreamObserver<StreamingPredictResponse> responseObserver)

public default StreamObserver<StreamingPredictRequest> streamingPredict(StreamObserver<StreamingPredictResponse> responseObserver)

Perform a streaming online prediction request for Vertex first-party products and frameworks.

Parameter
Name Description
responseObserver io.grpc.stub.StreamObserver<StreamingPredictResponse>
Returns
Type Description
io.grpc.stub.StreamObserver<StreamingPredictRequest>

streamingRawPredict(StreamObserver<StreamingRawPredictResponse> responseObserver)

public default StreamObserver<StreamingRawPredictRequest> streamingRawPredict(StreamObserver<StreamingRawPredictResponse> responseObserver)

Perform a streaming online prediction request through gRPC.

Parameter
Name Description
responseObserver io.grpc.stub.StreamObserver<StreamingRawPredictResponse>
Returns
Type Description
io.grpc.stub.StreamObserver<StreamingRawPredictRequest>