Performs a server-side streaming online prediction request, used for Vertex LLM streaming.
Endpoint
POST https://{service-endpoint}/v1/{endpoint}:serverStreamingPredict
where {service-endpoint} is one of the supported service endpoints.
Path parameters
endpoint (string)
Required. The name of the Endpoint requested to serve the prediction. Format: projects/{project}/locations/{location}/endpoints/{endpoint}
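For example, with placeholder IDs my-project, us-central1, and 1234567890, and the regional service endpoint us-central1-aiplatform.googleapis.com, the fully resolved request would be:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/my-project/locations/us-central1/endpoints/1234567890:serverStreamingPredict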
Request body
The request body contains data with the following structure:
inputs[] (object (Tensor))
The prediction input.
parameters (object (Tensor))
The parameters that govern the prediction.
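The following is a minimal sketch of building this request body with the google-cloud-aiplatform Python client, which represents both fields as Tensor messages. The struct_val keys shown here ("prompt", "temperature") are assumptions for a typical text model; the tensor layout your endpoint actually accepts depends on the deployed model.

from google.cloud import aiplatform_v1

request = aiplatform_v1.StreamingPredictRequest(
    # Path parameter: the full resource name of the endpoint.
    endpoint="projects/my-project/locations/us-central1/endpoints/1234567890",
    # inputs[]: the prediction input. The "prompt" key is a
    # model-specific assumption, not part of the API contract.
    inputs=[
        aiplatform_v1.Tensor(
            struct_val={
                "prompt": aiplatform_v1.Tensor(string_val=["Tell me a joke."]),
            }
        )
    ],
    # parameters: the parameters that govern the prediction; the
    # "temperature" key is likewise model-specific.
    parameters=aiplatform_v1.Tensor(
        struct_val={
            "temperature": aiplatform_v1.Tensor(float_val=[0.2]),
        }
    ),
)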
Response body
If successful, the response body contains a stream of StreamingPredictResponse instances.
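As a sketch of consuming the stream with the same client library (the project, location, endpoint ID, and service endpoint hostname are placeholders), each yielded StreamingPredictResponse carries a chunk of the model output:

from google.cloud import aiplatform_v1

# The client must target the regional service endpoint that hosts
# the endpoint resource.
client = aiplatform_v1.PredictionServiceClient(
    client_options={"api_endpoint": "us-central1-aiplatform.googleapis.com"}
)

request = aiplatform_v1.StreamingPredictRequest(
    endpoint="projects/my-project/locations/us-central1/endpoints/1234567890",
    inputs=[
        aiplatform_v1.Tensor(
            struct_val={"prompt": aiplatform_v1.Tensor(string_val=["Tell me a joke."])}
        )
    ],
)

# server_streaming_predict returns an iterable of StreamingPredictResponse
# messages; iterating over it receives output as it is generated.
for response in client.server_streaming_predict(request=request):
    print(response.outputs)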