PerformanceRequirements(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Performance requirements for a profile and/or model deployment.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
Attributes

| Name | Description |
|---|---|
| target_ntpot_milliseconds | ``int`` Optional. The target Normalized Time Per Output Token (NTPOT) in milliseconds. NTPOT is calculated as ``request_latency / total_output_tokens``. If not provided, this target will not be enforced. This field is a member of oneof_ ``_target_ntpot_milliseconds``. |
| target_ttft_milliseconds | ``int`` Optional. The target Time To First Token (TTFT) in milliseconds. TTFT is the time it takes to generate the first token for a request. If not provided, this target will not be enforced. This field is a member of oneof_ ``_target_ttft_milliseconds``. |
| target_cost | ``google.cloud.gkerecommender_v1.types.Cost`` Optional. The target cost for running a profile's model server. If not provided, this requirement will not be enforced. |
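
The NTPOT formula and the "not provided means not enforced" rule described above can be illustrated with a small standard-library sketch. It does not use the client library: ``TargetsSketch``, ``ntpot_ms``, and ``meets_targets`` are hypothetical names introduced here, and treating a target as met when the measured value is less than or equal to it is an assumption, not documented service behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TargetsSketch:
    """Hypothetical stand-in mirroring two documented fields (not the real class)."""
    target_ntpot_milliseconds: Optional[int] = None
    target_ttft_milliseconds: Optional[int] = None


def ntpot_ms(request_latency_ms: float, total_output_tokens: int) -> float:
    """Compute NTPOT as request_latency / total_output_tokens."""
    if total_output_tokens <= 0:
        raise ValueError("total_output_tokens must be positive")
    return request_latency_ms / total_output_tokens


def meets_targets(
    targets: TargetsSketch,
    request_latency_ms: float,
    total_output_tokens: int,
    ttft_ms: float,
) -> bool:
    """Check measurements against targets; unset (None) targets are skipped.

    Assumption: a target is met when the measured value is <= the target.
    """
    if targets.target_ntpot_milliseconds is not None:
        measured = ntpot_ms(request_latency_ms, total_output_tokens)
        if measured > targets.target_ntpot_milliseconds:
            return False
    if targets.target_ttft_milliseconds is not None:
        if ttft_ms > targets.target_ttft_milliseconds:
            return False
    return True
```

For instance, a request that takes 2000 ms and produces 50 output tokens has an NTPOT of 40 ms, which would satisfy a ``target_ntpot_milliseconds`` of 50 under the assumption above.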