Method: projects.locations.publishers.models.serverStreamingPredict

Perform a server-side streaming online prediction request for Vertex LLM streaming.

HTTP request

POST https://{service-endpoint}/v1beta1/{endpoint}:serverStreamingPredict

Where {service-endpoint} is one of the supported service endpoints.
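
For illustration, the following Python sketch assembles such a URL for a regional service endpoint. The region, project ID, and endpoint ID used here are placeholder assumptions, not values defined by the API:

# All identifiers below are placeholder assumptions, not values
# defined by the API.
REGION = "us-central1"
SERVICE_ENDPOINT = f"{REGION}-aiplatform.googleapis.com"
ENDPOINT = f"projects/my-project/locations/{REGION}/endpoints/1234567890"

url = f"https://{SERVICE_ENDPOINT}/v1beta1/{ENDPOINT}:serverStreamingPredict"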

Path parameters

endpoint

string

Required. The name of the Endpoint requested to serve the prediction. Format: projects/{project}/locations/{location}/endpoints/{endpoint}

Request body

The request body contains data with the following structure:

JSON representation
{
  "inputs": [
    {
      object (Tensor)
    }
  ],
  "parameters": {
    object (Tensor)
  }
}
Fields
inputs[]

object (Tensor)

The prediction input.

parameters

object (Tensor)

The parameters that govern the prediction.
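
Together with the field descriptions above, this is enough to assemble a request body. The sketch below is a hedged example: which Tensor fields (structVal, stringVal, floatVal, intVal) a deployed model actually accepts, and parameter names such as temperature and maxOutputTokens, are assumptions about one particular text model, not part of this method's contract:

# Hedged example body; the Tensor fields and parameter names below are
# assumptions about one particular text model, not fixed by this method.
body = {
    "inputs": [
        {
            "structVal": {
                "prompt": {"stringVal": ["Tell me about streaming prediction."]}
            }
        }
    ],
    "parameters": {
        "structVal": {
            "temperature": {"floatVal": [0.2]},
            "maxOutputTokens": {"intVal": [256]},
        }
    },
}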

Response body

If successful, the response body contains a stream of StreamingPredictResponse instances.
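
A minimal Python sketch of issuing the request and reading the stream with the requests library follows; url and body refer to the earlier sketches, token is an OAuth 2.0 access token with the scope listed under Authorization scopes below, and the loop prints raw chunks rather than assuming any particular framing of the streamed JSON:

import requests

# url and body come from the sketches above; token is an OAuth 2.0 access
# token with the cloud-platform scope (see Authorization scopes below).
resp = requests.post(
    url,
    headers={"Authorization": f"Bearer {token}"},
    json=body,
    stream=True,  # read the response incrementally as chunks arrive
)
resp.raise_for_status()

# Each chunk is a piece of the streamed JSON; production code would parse
# StreamingPredictResponse objects out of the accumulated stream.
for chunk in resp.iter_content(chunk_size=None):
    print(chunk.decode("utf-8"), end="")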

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.
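
As a sketch, Application Default Credentials with this scope can be obtained using the google-auth Python library; the token produced here is the bearer token assumed in the request sketch above:

import google.auth
import google.auth.transport.requests

# Obtain Application Default Credentials restricted to the required scope.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(google.auth.transport.requests.Request())
token = credentials.token  # bearer token used in the request sketch above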

IAM Permissions

Requires the following IAM permission on the endpoint resource:

  • aiplatform.endpoints.predict

For more information, see the IAM documentation.