A service for online predictions and explanations.
Equality
Instances of this class created via copy-construction or copy-assignment always compare equal. Instances created with equal `std::shared_ptr<*Connection>` objects compare equal. Objects that compare equal share the same underlying resources.
Performance
Creating a new instance of this class is a relatively expensive operation because new objects establish new connections to the service. In contrast, copy-construction, move-construction, and the corresponding assignment operations are relatively efficient because the copies share all underlying resources.
Thread Safety
Concurrent access to different instances of this class, even if they compare equal, is guaranteed to work. Two or more threads operating on the same instance of this class is not guaranteed to work. Since copy-construction and move-construction are relatively efficient operations, consider creating a separate copy for each thread when using this class from multiple threads.
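The copy-per-thread guidance above can be sketched as follows. The header and namespace names follow the usual google-cloud-cpp layout, but verify them against your installed version:

```cpp
#include "google/cloud/aiplatform/v1/prediction_client.h"
#include <thread>
#include <vector>

namespace aiplatform = ::google::cloud::aiplatform_v1;

// Give each worker thread its own copy of the client. Copies share the
// underlying connection, so copying is cheap, and each thread then operates
// on an instance it exclusively owns.
void RunWorkers(aiplatform::PredictionServiceClient const& client) {
  std::vector<std::thread> workers;
  for (int i = 0; i != 4; ++i) {
    workers.emplace_back([copy = client]() mutable {
      // ... issue RPCs on `copy` without further synchronization ...
    });
  }
  for (auto& t : workers) t.join();
}
```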
Constructors
PredictionServiceClient(PredictionServiceClient const &)
Copy and move support
Parameter

Name | Description
---|---
 | `PredictionServiceClient const &`
PredictionServiceClient(PredictionServiceClient &&)
Copy and move support
Parameter

Name | Description
---|---
 | `PredictionServiceClient &&`
PredictionServiceClient(std::shared_ptr< PredictionServiceConnection >, Options)
Parameters

Name | Description
---|---
`connection` | `std::shared_ptr< PredictionServiceConnection >`
`opts` | `Options`
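A minimal construction sketch. It assumes the library's usual `MakePredictionServiceConnection()` factory function and the `EndpointOption` for selecting a regional service endpoint; the region shown is hypothetical:

```cpp
#include "google/cloud/aiplatform/v1/prediction_client.h"
#include "google/cloud/common_options.h"
#include "google/cloud/options.h"

namespace gc = ::google::cloud;
namespace aiplatform = ::google::cloud::aiplatform_v1;

int main() {
  // Vertex AI is a regional service; direct the connection at the
  // regional endpoint matching your resources.
  auto options = gc::Options{}.set<gc::EndpointOption>(
      "us-central1-aiplatform.googleapis.com");
  auto client = aiplatform::PredictionServiceClient(
      aiplatform::MakePredictionServiceConnection(options));
  // Copies of `client` share this connection and are cheap to make.
}
```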
Operators
operator=(PredictionServiceClient const &)
Copy and move support
Parameter

Name | Description
---|---
 | `PredictionServiceClient const &`

Returns

Type | Description
---|---
`PredictionServiceClient &` |
operator=(PredictionServiceClient &&)
Copy and move support
Parameter

Name | Description
---|---
 | `PredictionServiceClient &&`

Returns

Type | Description
---|---
`PredictionServiceClient &` |
Functions
Predict(std::string const &, std::vector< google::protobuf::Value > const &, google::protobuf::Value const &, Options)
Perform an online prediction.
Parameters

Name | Description
---|---
`endpoint` | `std::string const &` Required. The name of the Endpoint requested to serve the prediction. Format:
`instances` | `std::vector< google::protobuf::Value > const &` Required. The instances that are the input to the prediction call. A DeployedModel may have an upper limit on the number of instances it supports per request; when that limit is exceeded, the prediction call fails for AutoML Models, while for custom-created Models the behavior is as documented by that Model. The schema of any single instance may be specified via the Endpoint's DeployedModels' [Model's][google.cloud.aiplatform.v1.DeployedModel.model] [PredictSchemata's][google.cloud.aiplatform.v1.Model.predict_schemata] [instance_schema_uri][google.cloud.aiplatform.v1.PredictSchemata.instance_schema_uri].
`parameters` | `google::protobuf::Value const &` The parameters that govern the prediction. The schema of the parameters may be specified via the Endpoint's DeployedModels' [Model's][google.cloud.aiplatform.v1.DeployedModel.model] [PredictSchemata's][google.cloud.aiplatform.v1.Model.predict_schemata] [parameters_schema_uri][google.cloud.aiplatform.v1.PredictSchemata.parameters_schema_uri].
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`StatusOr< google::cloud::aiplatform::v1::PredictResponse >` | the result of the RPC. The response message type (google.cloud.aiplatform.v1.PredictResponse) is mapped to a C++ class using the Protobuf mapping rules. If the request fails, the `StatusOr` contains the error details.
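A usage sketch for the field-argument overload. The project, location, and endpoint IDs are hypothetical, and the instance payload must match the model's declared schema:

```cpp
// `client` is a previously constructed PredictionServiceClient.
std::string const endpoint =
    "projects/my-project/locations/us-central1/endpoints/1234567890";

google::protobuf::Value instance;
// ... fill `instance` according to the model's instance_schema_uri ...
google::protobuf::Value parameters;  // may be left empty for many models

auto response = client.Predict(endpoint, {instance}, parameters);
if (!response) throw std::move(response).status();
for (auto const& prediction : response->predictions()) {
  // ... consume each prediction ...
}
```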
Predict(google::cloud::aiplatform::v1::PredictRequest const &, Options)
Perform an online prediction.
Parameters

Name | Description
---|---
`request` | `google::cloud::aiplatform::v1::PredictRequest const &` Unary RPCs, such as the one wrapped by this function, receive a single `request` proto message that includes all the inputs for the RPC.
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`StatusOr< google::cloud::aiplatform::v1::PredictResponse >` | the result of the RPC. The response message type (google.cloud.aiplatform.v1.PredictResponse) is mapped to a C++ class using the Protobuf mapping rules. If the request fails, the `StatusOr` contains the error details.
RawPredict(std::string const &, google::api::HttpBody const &, Options)
Perform an online prediction with an arbitrary HTTP payload.
The response includes the following HTTP headers:

- `X-Vertex-AI-Endpoint-Id`: ID of the [Endpoint][google.cloud.aiplatform.v1.Endpoint] that served this prediction.
- `X-Vertex-AI-Deployed-Model-Id`: ID of the Endpoint's [DeployedModel][google.cloud.aiplatform.v1.DeployedModel] that served this prediction.

Parameters

Name | Description
---|---
`endpoint` | `std::string const &` Required. The name of the Endpoint requested to serve the prediction. Format:
`http_body` | `google::api::HttpBody const &` The prediction input. Supports HTTP headers and arbitrary data payload.
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`StatusOr< google::api::HttpBody >` | the result of the RPC. The response message type (google.api.HttpBody) is mapped to a C++ class using the Protobuf mapping rules. If the request fails, the `StatusOr` contains the error details.
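For example, sending a JSON payload to a hypothetical endpoint (the body format is whatever the deployed model server accepts):

```cpp
// `client` and `endpoint` as in the Predict() sketch above.
google::api::HttpBody body;
body.set_content_type("application/json");
body.set_data(R"({"instances": [{"feature": 1.0}]})");

auto response = client.RawPredict(endpoint, body);
if (!response) throw std::move(response).status();
// The raw bytes of the model server's reply:
std::cout << response->data() << "\n";
```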
RawPredict(google::cloud::aiplatform::v1::RawPredictRequest const &, Options)
Perform an online prediction with an arbitrary HTTP payload.
The response includes the following HTTP headers:

- `X-Vertex-AI-Endpoint-Id`: ID of the [Endpoint][google.cloud.aiplatform.v1.Endpoint] that served this prediction.
- `X-Vertex-AI-Deployed-Model-Id`: ID of the Endpoint's [DeployedModel][google.cloud.aiplatform.v1.DeployedModel] that served this prediction.

Parameters

Name | Description
---|---
`request` | `google::cloud::aiplatform::v1::RawPredictRequest const &` Unary RPCs, such as the one wrapped by this function, receive a single `request` proto message that includes all the inputs for the RPC.
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`StatusOr< google::api::HttpBody >` | the result of the RPC. The response message type (google.api.HttpBody) is mapped to a C++ class using the Protobuf mapping rules. If the request fails, the `StatusOr` contains the error details.
StreamRawPredict(std::string const &, google::api::HttpBody const &, Options)
Perform a streaming online prediction with an arbitrary HTTP payload.
Parameters

Name | Description
---|---
`endpoint` | `std::string const &` Required. The name of the Endpoint requested to serve the prediction. Format:
`http_body` | `google::api::HttpBody const &` The prediction input. Supports HTTP headers and arbitrary data payload.
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`StreamRange< google::api::HttpBody >` | the result of the RPC. The response message type (google.api.HttpBody) is mapped to a C++ class using the Protobuf mapping rules. If the request fails, the stream contains an element holding the error `Status`.
StreamRawPredict(google::cloud::aiplatform::v1::StreamRawPredictRequest const &, Options)
Perform a streaming online prediction with an arbitrary HTTP payload.
Parameters

Name | Description
---|---
`request` | `google::cloud::aiplatform::v1::StreamRawPredictRequest const &` This function receives a single `request` proto message that includes all the inputs for the RPC.
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`StreamRange< google::api::HttpBody >` | the result of the RPC. The response message type (google.api.HttpBody) is mapped to a C++ class using the Protobuf mapping rules. If the request fails, the stream contains an element holding the error `Status`.
DirectPredict(google::cloud::aiplatform::v1::DirectPredictRequest const &, Options)
Perform a unary online prediction request to a gRPC model server for Vertex first-party products and frameworks.

Parameters

Name | Description
---|---
`request` | `google::cloud::aiplatform::v1::DirectPredictRequest const &` Unary RPCs, such as the one wrapped by this function, receive a single `request` proto message that includes all the inputs for the RPC.
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`StatusOr< google::cloud::aiplatform::v1::DirectPredictResponse >` | the result of the RPC. The response message type (google.cloud.aiplatform.v1.DirectPredictResponse) is mapped to a C++ class using the Protobuf mapping rules. If the request fails, the `StatusOr` contains the error details.
DirectRawPredict(google::cloud::aiplatform::v1::DirectRawPredictRequest const &, Options)
Perform a unary online prediction request to a gRPC model server for custom containers.

Parameters

Name | Description
---|---
`request` | `google::cloud::aiplatform::v1::DirectRawPredictRequest const &` Unary RPCs, such as the one wrapped by this function, receive a single `request` proto message that includes all the inputs for the RPC.
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`StatusOr< google::cloud::aiplatform::v1::DirectRawPredictResponse >` | the result of the RPC. The response message type (google.cloud.aiplatform.v1.DirectRawPredictResponse) is mapped to a C++ class using the Protobuf mapping rules. If the request fails, the `StatusOr` contains the error details.
AsyncStreamDirectPredict(Options)
Perform a streaming online prediction request to a gRPC model server for Vertex first-party products and frameworks.
Parameter

Name | Description
---|---
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`std::unique_ptr<::google::cloud::AsyncStreamingReadWriteRpc< google::cloud::aiplatform::v1::StreamDirectPredictRequest, google::cloud::aiplatform::v1::StreamDirectPredictResponse > >` | An object representing the bidirectional streaming RPC. Applications can send multiple request messages and receive multiple response messages through this API. Bidirectional streaming RPCs can impose restrictions on the sequence of request and response messages. Please consult the service documentation for details. The request message type (google.cloud.aiplatform.v1.StreamDirectPredictRequest) and response message type (google.cloud.aiplatform.v1.StreamDirectPredictResponse) are mapped to C++ classes using the Protobuf mapping rules.
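A sketch of driving the returned bidirectional stream. It assumes the future-based `Start()`/`Write()`/`Read()`/`Finish()` interface of `AsyncStreamingReadWriteRpc`; check that interface's documentation for the exact signatures and sequencing rules:

```cpp
namespace v1 = ::google::cloud::aiplatform::v1;

auto stream = client.AsyncStreamDirectPredict();
if (!stream->Start().get()) {
  // The stream failed to start; Finish() reports why.
  throw stream->Finish().get();
}
v1::StreamDirectPredictRequest request;
// ... populate `request` ...
if (stream->Write(request, grpc::WriteOptions{}.set_last_message()).get()) {
  // Read until the optional comes back empty (end of stream).
  for (auto response = stream->Read().get(); response.has_value();
       response = stream->Read().get()) {
    // ... consume *response ...
  }
}
auto status = stream->Finish().get();
```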
AsyncStreamDirectRawPredict(Options)
Perform a streaming online prediction request to a gRPC model server for custom containers.
Parameter

Name | Description
---|---
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`std::unique_ptr<::google::cloud::AsyncStreamingReadWriteRpc< google::cloud::aiplatform::v1::StreamDirectRawPredictRequest, google::cloud::aiplatform::v1::StreamDirectRawPredictResponse > >` | An object representing the bidirectional streaming RPC. Applications can send multiple request messages and receive multiple response messages through this API. Bidirectional streaming RPCs can impose restrictions on the sequence of request and response messages. Please consult the service documentation for details. The request message type (google.cloud.aiplatform.v1.StreamDirectRawPredictRequest) and response message type (google.cloud.aiplatform.v1.StreamDirectRawPredictResponse) are mapped to C++ classes using the Protobuf mapping rules.
AsyncStreamingPredict(Options)
Perform a streaming online prediction request for Vertex first-party products and frameworks.
Parameter

Name | Description
---|---
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`std::unique_ptr<::google::cloud::AsyncStreamingReadWriteRpc< google::cloud::aiplatform::v1::StreamingPredictRequest, google::cloud::aiplatform::v1::StreamingPredictResponse > >` | An object representing the bidirectional streaming RPC. Applications can send multiple request messages and receive multiple response messages through this API. Bidirectional streaming RPCs can impose restrictions on the sequence of request and response messages. Please consult the service documentation for details. The request message type (google.cloud.aiplatform.v1.StreamingPredictRequest) and response message type (google.cloud.aiplatform.v1.StreamingPredictResponse) are mapped to C++ classes using the Protobuf mapping rules.
ServerStreamingPredict(google::cloud::aiplatform::v1::StreamingPredictRequest const &, Options)
Perform a server-side streaming online prediction request for Vertex LLM streaming.
Parameters

Name | Description
---|---
`request` | `google::cloud::aiplatform::v1::StreamingPredictRequest const &` This function receives a single `request` proto message that includes all the inputs for the RPC.
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`StreamRange< google::cloud::aiplatform::v1::StreamingPredictResponse >` | the result of the RPC. The response message type (google.cloud.aiplatform.v1.StreamingPredictResponse) is mapped to a C++ class using the Protobuf mapping rules. If the request fails, the stream contains an element holding the error `Status`.
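Iterating the returned `StreamRange` might look like this; the endpoint name is hypothetical:

```cpp
google::cloud::aiplatform::v1::StreamingPredictRequest request;
request.set_endpoint(
    "projects/my-project/locations/us-central1/endpoints/1234567890");
// ... populate inputs on `request` ...
for (auto& response : client.ServerStreamingPredict(request)) {
  // Each element is a StatusOr; a failed stream surfaces its error here.
  if (!response) throw std::move(response).status();
  // ... consume *response as it arrives ...
}
```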
AsyncStreamingRawPredict(Options)
Perform a streaming online prediction request through gRPC.
Parameter

Name | Description
---|---
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`std::unique_ptr<::google::cloud::AsyncStreamingReadWriteRpc< google::cloud::aiplatform::v1::StreamingRawPredictRequest, google::cloud::aiplatform::v1::StreamingRawPredictResponse > >` | An object representing the bidirectional streaming RPC. Applications can send multiple request messages and receive multiple response messages through this API. Bidirectional streaming RPCs can impose restrictions on the sequence of request and response messages. Please consult the service documentation for details. The request message type (google.cloud.aiplatform.v1.StreamingRawPredictRequest) and response message type (google.cloud.aiplatform.v1.StreamingRawPredictResponse) are mapped to C++ classes using the Protobuf mapping rules.
Explain(std::string const &, std::vector< google::protobuf::Value > const &, google::protobuf::Value const &, std::string const &, Options)
Perform an online explanation.
If deployed_model_id is specified, the corresponding DeployedModel must have [explanation_spec][google.cloud.aiplatform.v1.DeployedModel.explanation_spec] populated. If deployed_model_id is not specified, all DeployedModels must have [explanation_spec][google.cloud.aiplatform.v1.DeployedModel.explanation_spec] populated.

Parameters

Name | Description
---|---
`endpoint` | `std::string const &` Required. The name of the Endpoint requested to serve the explanation. Format:
`instances` | `std::vector< google::protobuf::Value > const &` Required. The instances that are the input to the explanation call. A DeployedModel may have an upper limit on the number of instances it supports per request; when that limit is exceeded, the explanation call fails for AutoML Models, while for custom-created Models the behavior is as documented by that Model. The schema of any single instance may be specified via the Endpoint's DeployedModels' [Model's][google.cloud.aiplatform.v1.DeployedModel.model] [PredictSchemata's][google.cloud.aiplatform.v1.Model.predict_schemata] [instance_schema_uri][google.cloud.aiplatform.v1.PredictSchemata.instance_schema_uri].
`parameters` | `google::protobuf::Value const &` The parameters that govern the prediction. The schema of the parameters may be specified via the Endpoint's DeployedModels' [Model's][google.cloud.aiplatform.v1.DeployedModel.model] [PredictSchemata's][google.cloud.aiplatform.v1.Model.predict_schemata] [parameters_schema_uri][google.cloud.aiplatform.v1.PredictSchemata.parameters_schema_uri].
`deployed_model_id` | `std::string const &` If specified, this ExplainRequest will be served by the chosen DeployedModel, overriding [Endpoint.traffic_split][google.cloud.aiplatform.v1.Endpoint.traffic_split].
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`StatusOr< google::cloud::aiplatform::v1::ExplainResponse >` | the result of the RPC. The response message type (google.cloud.aiplatform.v1.ExplainResponse) is mapped to a C++ class using the Protobuf mapping rules. If the request fails, the `StatusOr` contains the error details.
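A sketch of the field-argument overload; the endpoint name is hypothetical, and `deployed_model_id` may be left empty to let the Endpoint's traffic split choose the DeployedModel:

```cpp
// `client` and `endpoint` as in the Predict() sketch above.
google::protobuf::Value instance;
// ... fill `instance` per the model's instance_schema_uri ...
google::protobuf::Value parameters;

auto response = client.Explain(endpoint, {instance}, parameters,
                               /*deployed_model_id=*/"");
if (!response) throw std::move(response).status();
for (auto const& explanation : response->explanations()) {
  // ... inspect the feature attributions ...
}
```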
Explain(google::cloud::aiplatform::v1::ExplainRequest const &, Options)
Perform an online explanation.
If deployed_model_id is specified, the corresponding DeployedModel must have [explanation_spec][google.cloud.aiplatform.v1.DeployedModel.explanation_spec] populated. If deployed_model_id is not specified, all DeployedModels must have [explanation_spec][google.cloud.aiplatform.v1.DeployedModel.explanation_spec] populated.

Parameters

Name | Description
---|---
`request` | `google::cloud::aiplatform::v1::ExplainRequest const &` Unary RPCs, such as the one wrapped by this function, receive a single `request` proto message that includes all the inputs for the RPC.
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`StatusOr< google::cloud::aiplatform::v1::ExplainResponse >` | the result of the RPC. The response message type (google.cloud.aiplatform.v1.ExplainResponse) is mapped to a C++ class using the Protobuf mapping rules. If the request fails, the `StatusOr` contains the error details.
GenerateContent(std::string const &, std::vector< google::cloud::aiplatform::v1::Content > const &, Options)
Generate content with multimodal inputs.
Parameters

Name | Description
---|---
`model` | `std::string const &` Required. The fully qualified name of the publisher model or tuned model endpoint to use.
`contents` | `std::vector< google::cloud::aiplatform::v1::Content > const &` Required. The content of the current conversation with the model.
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`StatusOr< google::cloud::aiplatform::v1::GenerateContentResponse >` | the result of the RPC. The response message type (google.cloud.aiplatform.v1.GenerateContentResponse) is mapped to a C++ class using the Protobuf mapping rules. If the request fails, the `StatusOr` contains the error details.
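A minimal sketch; the model resource name is hypothetical and must name a publisher model or tuned model endpoint you have access to:

```cpp
std::string const model =
    "projects/my-project/locations/us-central1/publishers/google/models/"
    "gemini-pro";

google::cloud::aiplatform::v1::Content content;
content.set_role("user");
content.add_parts()->set_text("What is an Endpoint in Vertex AI?");

auto response = client.GenerateContent(model, {content});
if (!response) throw std::move(response).status();
for (auto const& candidate : response->candidates()) {
  // ... read the generated candidate.content() ...
}
```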
GenerateContent(google::cloud::aiplatform::v1::GenerateContentRequest const &, Options)
Generate content with multimodal inputs.
Parameters

Name | Description
---|---
`request` | `google::cloud::aiplatform::v1::GenerateContentRequest const &` Unary RPCs, such as the one wrapped by this function, receive a single `request` proto message that includes all the inputs for the RPC.
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`StatusOr< google::cloud::aiplatform::v1::GenerateContentResponse >` | the result of the RPC. The response message type (google.cloud.aiplatform.v1.GenerateContentResponse) is mapped to a C++ class using the Protobuf mapping rules. If the request fails, the `StatusOr` contains the error details.
StreamGenerateContent(std::string const &, std::vector< google::cloud::aiplatform::v1::Content > const &, Options)
Generate content with multimodal inputs with streaming support.
Parameters

Name | Description
---|---
`model` | `std::string const &` Required. The fully qualified name of the publisher model or tuned model endpoint to use.
`contents` | `std::vector< google::cloud::aiplatform::v1::Content > const &` Required. The content of the current conversation with the model.
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`StreamRange< google::cloud::aiplatform::v1::GenerateContentResponse >` | the result of the RPC. The response message type (google.cloud.aiplatform.v1.GenerateContentResponse) is mapped to a C++ class using the Protobuf mapping rules. If the request fails, the stream contains an element holding the error `Status`.
StreamGenerateContent(google::cloud::aiplatform::v1::GenerateContentRequest const &, Options)
Generate content with multimodal inputs with streaming support.
Parameters

Name | Description
---|---
`request` | `google::cloud::aiplatform::v1::GenerateContentRequest const &` This function receives a single `request` proto message that includes all the inputs for the RPC.
`opts` | `Options` Optional. Override the class-level options, such as retry and backoff policies.

Returns

Type | Description
---|---
`StreamRange< google::cloud::aiplatform::v1::GenerateContentResponse >` | the result of the RPC. The response message type (google.cloud.aiplatform.v1.GenerateContentResponse) is mapped to a C++ class using the Protobuf mapping rules. If the request fails, the stream contains an element holding the error `Status`.
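The streaming variant yields partial responses as the model produces them. A sketch, reusing the hypothetical `model` name from the GenerateContent() example:

```cpp
google::cloud::aiplatform::v1::Content content;
content.set_role("user");
content.add_parts()->set_text("Write a short haiku about streams.");

for (auto& chunk : client.StreamGenerateContent(model, {content})) {
  // Each element is a StatusOr; a failed stream surfaces its error here.
  if (!chunk) throw std::move(chunk).status();
  for (auto const& candidate : chunk->candidates()) {
    // ... append the partial candidate text to the running output ...
  }
}
```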