public static interface PredictionServiceGrpc.AsyncService
A service for online predictions and explanations.
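As a minimal sketch of how this interface can be served (class name, port, and echo behavior are illustrative; the com.google.cloud.aiplatform.v1 types are assumed here, substitute v1beta1 if that is the surface in use, and bindService(AsyncService) is the helper grpc-java generates alongside this interface):

import com.google.cloud.aiplatform.v1.PredictRequest;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceGrpc;
import io.grpc.Server;
import io.grpc.ServerBuilder;
import io.grpc.stub.StreamObserver;

// Hypothetical service implementing the AsyncService interface directly.
final class EchoPredictionService implements PredictionServiceGrpc.AsyncService {

  @Override
  public void predict(PredictRequest request,
                      StreamObserver<PredictResponse> responseObserver) {
    // Echo back an empty prediction; a real server would run the model here.
    responseObserver.onNext(PredictResponse.getDefaultInstance());
    responseObserver.onCompleted();
  }

  public static void main(String[] args) throws Exception {
    // bindService(...) turns the AsyncService into a ServerServiceDefinition.
    Server server = ServerBuilder.forPort(9090)
        .addService(PredictionServiceGrpc.bindService(new EchoPredictionService()))
        .build()
        .start();
    server.awaitTermination();
  }
}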
Methods
public default void chatCompletions(ChatCompletionsRequest request, StreamObserver<HttpBody> responseObserver)
Exposes an OpenAI-compatible endpoint for chat completions.
Parameters
Name                Description
request             ChatCompletionsRequest
responseObserver    io.grpc.stub.StreamObserver<com.google.api.HttpBody>
public default void countTokens(CountTokensRequest request, StreamObserver<CountTokensResponse> responseObserver)
Perform a token counting request.
public default void directPredict(DirectPredictRequest request, StreamObserver<DirectPredictResponse> responseObserver)
Perform a unary online prediction request to a gRPC model server for
Vertex first-party products and frameworks.
public default void directRawPredict(DirectRawPredictRequest request, StreamObserver<DirectRawPredictResponse> responseObserver)
Perform a unary online prediction request to a gRPC model server for
custom containers.
public default void explain(ExplainRequest request, StreamObserver<ExplainResponse> responseObserver)
Perform an online explanation.
If deployed_model_id is specified, the corresponding DeployedModel must have
explanation_spec populated. If deployed_model_id is not specified, all
DeployedModels must have explanation_spec populated.
public default void generateContent(GenerateContentRequest request, StreamObserver<GenerateContentResponse> responseObserver)
Generate content with multimodal inputs.
public default void predict(PredictRequest request, StreamObserver<PredictResponse> responseObserver)
Perform an online prediction.
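While this page documents the server-side interface, the same RPC can be invoked from the client through the generated async stub. The sketch below assumes a pre-configured ManagedChannel with credentials; the project, location, and endpoint IDs are placeholders:

import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictRequest;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceGrpc;
import io.grpc.ManagedChannel;
import io.grpc.stub.StreamObserver;

final class AsyncPredictClient {

  // channel is assumed to be a pre-configured ManagedChannel with credentials attached.
  static void predictAsync(ManagedChannel channel) {
    PredictionServiceGrpc.PredictionServiceStub stub = PredictionServiceGrpc.newStub(channel);

    PredictRequest request = PredictRequest.newBuilder()
        .setEndpoint(EndpointName.of("my-project", "us-central1", "1234567890").toString())
        .build();

    // The async stub delivers the single unary response through the observer callbacks.
    stub.predict(request, new StreamObserver<PredictResponse>() {
      @Override public void onNext(PredictResponse response) {
        System.out.println("Predictions: " + response.getPredictionsCount());
      }
      @Override public void onError(Throwable t) {
        t.printStackTrace();
      }
      @Override public void onCompleted() {
        System.out.println("Done");
      }
    });
  }
}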
public default void rawPredict(RawPredictRequest request, StreamObserver<HttpBody> responseObserver)
Perform an online prediction with an arbitrary HTTP payload.
The response includes the following HTTP headers:
X-Vertex-AI-Endpoint-Id: ID of the Endpoint that served this prediction.
X-Vertex-AI-Deployed-Model-Id: ID of the Endpoint's DeployedModel that served this prediction.
Parameters
Name                Description
request             RawPredictRequest
responseObserver    io.grpc.stub.StreamObserver<com.google.api.HttpBody>
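A sketch of an AsyncService implementation answering rawPredict; the class name and JSON payload are illustrative, and com.google.api.HttpBody carries both the content type and the raw response bytes:

import com.google.api.HttpBody;
import com.google.cloud.aiplatform.v1.PredictionServiceGrpc;
import com.google.cloud.aiplatform.v1.RawPredictRequest;
import com.google.protobuf.ByteString;
import io.grpc.stub.StreamObserver;

final class RawPredictionService implements PredictionServiceGrpc.AsyncService {

  @Override
  public void rawPredict(RawPredictRequest request, StreamObserver<HttpBody> responseObserver) {
    // Any payload and content type can be returned; here a small JSON document.
    HttpBody body = HttpBody.newBuilder()
        .setContentType("application/json")
        .setData(ByteString.copyFromUtf8("{\"predictions\": []}"))
        .build();
    responseObserver.onNext(body);
    responseObserver.onCompleted();
  }
}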
public default void serverStreamingPredict(StreamingPredictRequest request, StreamObserver<StreamingPredictResponse> responseObserver)
Perform a server-side streaming online prediction request for Vertex
LLM streaming.
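As a sketch (class name and response contents are illustrative), a server-streaming implementation may call onNext any number of times before completing the observer:

import com.google.cloud.aiplatform.v1.PredictionServiceGrpc;
import com.google.cloud.aiplatform.v1.StreamingPredictRequest;
import com.google.cloud.aiplatform.v1.StreamingPredictResponse;
import io.grpc.stub.StreamObserver;

final class StreamingPredictionService implements PredictionServiceGrpc.AsyncService {

  @Override
  public void serverStreamingPredict(StreamingPredictRequest request,
                                     StreamObserver<StreamingPredictResponse> responseObserver) {
    // Emit several partial responses, then close the stream.
    for (int i = 0; i < 3; i++) {
      responseObserver.onNext(StreamingPredictResponse.getDefaultInstance());
    }
    responseObserver.onCompleted();
  }
}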
public default StreamObserver<StreamDirectPredictRequest> streamDirectPredict(StreamObserver<StreamDirectPredictResponse> responseObserver)
Perform a streaming online prediction request to a gRPC model server for
Vertex first-party products and frameworks.
public default StreamObserver<StreamDirectRawPredictRequest> streamDirectRawPredict(StreamObserver<StreamDirectRawPredictResponse> responseObserver)
Perform a streaming online prediction request to a gRPC model server for
custom containers.
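Both streamDirect methods follow the bidirectional streaming pattern: the implementation returns a StreamObserver that consumes the client's request stream and writes to the response observer. A minimal sketch with illustrative echo behavior:

import com.google.cloud.aiplatform.v1.PredictionServiceGrpc;
import com.google.cloud.aiplatform.v1.StreamDirectRawPredictRequest;
import com.google.cloud.aiplatform.v1.StreamDirectRawPredictResponse;
import io.grpc.stub.StreamObserver;

final class BidiRawPredictionService implements PredictionServiceGrpc.AsyncService {

  @Override
  public StreamObserver<StreamDirectRawPredictRequest> streamDirectRawPredict(
      StreamObserver<StreamDirectRawPredictResponse> responseObserver) {
    // The returned observer receives the client's request stream; each request
    // is answered immediately on the response stream.
    return new StreamObserver<StreamDirectRawPredictRequest>() {
      @Override public void onNext(StreamDirectRawPredictRequest request) {
        responseObserver.onNext(StreamDirectRawPredictResponse.getDefaultInstance());
      }
      @Override public void onError(Throwable t) {
        responseObserver.onError(t);
      }
      @Override public void onCompleted() {
        responseObserver.onCompleted();
      }
    };
  }
}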
public default void streamGenerateContent(GenerateContentRequest request, StreamObserver<GenerateContentResponse> responseObserver)
Generate content with multimodal inputs with streaming support.
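On the client side this RPC can be consumed through the generated async stub; the sketch below assumes a pre-configured ManagedChannel, and the model name and prompt text are placeholders:

import com.google.cloud.aiplatform.v1.Content;
import com.google.cloud.aiplatform.v1.GenerateContentRequest;
import com.google.cloud.aiplatform.v1.GenerateContentResponse;
import com.google.cloud.aiplatform.v1.Part;
import com.google.cloud.aiplatform.v1.PredictionServiceGrpc;
import io.grpc.ManagedChannel;
import io.grpc.stub.StreamObserver;

final class StreamingGenerationClient {

  // channel is assumed to be pre-configured with endpoint and credentials.
  static void streamGenerate(ManagedChannel channel, String model) {
    PredictionServiceGrpc.PredictionServiceStub stub = PredictionServiceGrpc.newStub(channel);

    GenerateContentRequest request = GenerateContentRequest.newBuilder()
        .setModel(model)
        .addContents(Content.newBuilder()
            .setRole("user")
            .addParts(Part.newBuilder().setText("Tell me a short story.")))
        .build();

    // Each onNext call carries one partial GenerateContentResponse chunk.
    stub.streamGenerateContent(request, new StreamObserver<GenerateContentResponse>() {
      @Override public void onNext(GenerateContentResponse chunk) {
        chunk.getCandidatesList().forEach(c ->
            c.getContent().getPartsList().forEach(p -> System.out.print(p.getText())));
      }
      @Override public void onError(Throwable t) {
        t.printStackTrace();
      }
      @Override public void onCompleted() {
        System.out.println();
      }
    });
  }
}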
public default StreamObserver<StreamingPredictRequest> streamingPredict(StreamObserver<StreamingPredictResponse> responseObserver)
Perform a streaming online prediction request for Vertex first-party
products and frameworks.
public default StreamObserver<StreamingRawPredictRequest> streamingRawPredict(StreamObserver<StreamingRawPredictResponse> responseObserver)
Perform a streaming online prediction request through gRPC.