Performs a server-side streaming online prediction request, used for Vertex LLM streaming.
Endpoint
POST https://{service-endpoint}/v1/{endpoint}:serverStreamingPredict
where {service-endpoint} is one of the supported service endpoints.
Path parameters
endpoint (string)
Required. The name of the Endpoint requested to serve the prediction. Format: projects/{project}/locations/{location}/endpoints/{endpoint}
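For example, with placeholder IDs my-project, us-central1, and 1234567890, and the regional service endpoint us-central1-aiplatform.googleapis.com, the fully resolved request would be:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/my-project/locations/us-central1/endpoints/1234567890:serverStreamingPredict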
Request body
The request body contains data with the following structure:
inputs[] (object (Tensor))
The prediction input.
parameters (object (Tensor))
The parameters that govern the prediction.
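The following is a minimal sketch of building this request body with the google-cloud-aiplatform Python client, which represents both fields as Tensor messages. The struct_val keys shown here ("prompt", "temperature") are assumptions for a typical text model; the tensor layout your endpoint actually accepts depends on the deployed model.

from google.cloud import aiplatform_v1

request = aiplatform_v1.StreamingPredictRequest(
    # Path parameter: the full resource name of the endpoint.
    endpoint="projects/my-project/locations/us-central1/endpoints/1234567890",
    # inputs[]: the prediction input. The "prompt" key is a
    # model-specific assumption, not part of the API contract.
    inputs=[
        aiplatform_v1.Tensor(
            struct_val={
                "prompt": aiplatform_v1.Tensor(string_val=["Tell me a joke."]),
            }
        )
    ],
    # parameters: the parameters that govern the prediction; the
    # "temperature" key is likewise model-specific.
    parameters=aiplatform_v1.Tensor(
        struct_val={
            "temperature": aiplatform_v1.Tensor(float_val=[0.2]),
        }
    ),
)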
Response body
If successful, the response body contains a stream of StreamingPredictResponse instances.
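As a sketch of consuming the stream with the same client library (the project, location, endpoint ID, and service endpoint hostname are placeholders), each yielded StreamingPredictResponse carries a chunk of the model output:

from google.cloud import aiplatform_v1

# The client must target the regional service endpoint that hosts
# the endpoint resource.
client = aiplatform_v1.PredictionServiceClient(
    client_options={"api_endpoint": "us-central1-aiplatform.googleapis.com"}
)

request = aiplatform_v1.StreamingPredictRequest(
    endpoint="projects/my-project/locations/us-central1/endpoints/1234567890",
    inputs=[
        aiplatform_v1.Tensor(
            struct_val={"prompt": aiplatform_v1.Tensor(string_val=["Tell me a joke."])}
        )
    ],
)

# server_streaming_predict returns an iterable of StreamingPredictResponse
# messages; iterating over it receives output as it is generated.
for response in client.server_streaming_predict(request=request):
    print(response.outputs)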