Perform a server-side streaming online prediction request for Vertex LLM streaming.
HTTP request
POST https://{service-endpoint}/v1/{endpoint}:serverStreamingPredict
Where {service-endpoint} is one of the supported service endpoints.
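For illustration, a minimal sketch of assembling the request URL from a regional service endpoint and an endpoint resource name. The service endpoint, project, location, and endpoint IDs below are hypothetical placeholders, not values from this document:

```python
# Sketch: build the serverStreamingPredict URL from its parts.
# SERVICE_ENDPOINT and ENDPOINT are hypothetical placeholder values.
SERVICE_ENDPOINT = "us-central1-aiplatform.googleapis.com"
ENDPOINT = "projects/my-project/locations/us-central1/endpoints/1234567890"

def streaming_predict_url(service_endpoint: str, endpoint: str) -> str:
    """Return the full POST URL for the serverStreamingPredict method."""
    return f"https://{service_endpoint}/v1/{endpoint}:serverStreamingPredict"

url = streaming_predict_url(SERVICE_ENDPOINT, ENDPOINT)
print(url)
```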
Path parameters
| Parameters | |
|---|---|
| `endpoint` | Required. The name of the Endpoint requested to serve the prediction. Format: |
Request body
The request body contains data with the following structure:
JSON representation

    {
      "inputs": [
        {
          object (…)
        }
      ],
      "parameters": {
        object (…)
      }
    }
| Fields | |
|---|---|
| `inputs[]` | The prediction input. |
| `parameters` | The parameters that govern the prediction. |
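A sketch of assembling the request body as a Python dict and serializing it to JSON. The two top-level fields, `inputs` and `parameters`, come from the table above; the inner tensor field (`stringVal`) is an illustrative assumption, not the full tensor schema:

```python
import json

# Sketch: build a request body with the two documented fields,
# "inputs" (a list) and "parameters". The inner field name
# "stringVal" is an illustrative assumption, not the full schema.
def build_request_body(prompts, params):
    return {
        "inputs": [{"stringVal": [p]} for p in prompts],  # one tensor per input
        "parameters": params,
    }

body = build_request_body(["Hello"], {"temperature": 0.2})
print(json.dumps(body, indent=2))
```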
Response body
If successful, the response body contains a stream of StreamingPredictResponse
instances.
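Because the response is a stream of StreamingPredictResponse messages rather than a single body, a client consumes chunks as they arrive. A minimal sketch, assuming newline-delimited JSON framing (the actual wire framing depends on the client library and transport, and the `outputs`/`stringVal` shape is a hypothetical example):

```python
import json

# Sketch: consume a stream of newline-delimited JSON response chunks.
# `stream` is any iterable of raw lines (e.g. an HTTP response body);
# one-JSON-object-per-line framing is an assumption for illustration.
def iter_responses(stream):
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)

# Usage with a simulated stream of two response chunks:
simulated = [
    '{"outputs": [{"stringVal": ["Hel"]}]}',
    '{"outputs": [{"stringVal": ["lo"]}]}',
]
chunks = list(iter_responses(simulated))
print(len(chunks))  # number of streamed responses received
```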
Authorization scopes
Requires one of the following OAuth scopes:
https://www.googleapis.com/auth/cloud-platform
https://www.googleapis.com/auth/cloud-platform.read-only
https://www.googleapis.com/auth/cloud-vertex-ai.firstparty.predict
For more information, see the Authentication Overview.
IAM Permissions
Requires the following IAM permission on the endpoint
resource:
aiplatform.endpoints.predict
For more information, see the IAM documentation.