Interface InferenceParameterOrBuilder (4.55.0)

public interface InferenceParameterOrBuilder extends MessageOrBuilder

Implements

MessageOrBuilder

Methods

getMaxOutputTokens()

public abstract int getMaxOutputTokens()

Optional. Maximum number of output tokens for the generator.

optional int32 max_output_tokens = 1 [(.google.api.field_behavior) = OPTIONAL];

Returns
Type	Description
`int`	The maxOutputTokens.

getTemperature()

public abstract double getTemperature()

Optional. Controls the randomness of LLM predictions. A lower temperature makes predictions less random; a higher temperature makes them more random. If unset (or 0), a default value of 0 is used.

optional double temperature = 2 [(.google.api.field_behavior) = OPTIONAL];

Returns
Type	Description
`double`	The temperature.
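
To illustrate how temperature affects randomness, here is a minimal sketch of a temperature-scaled softmax. It is not the service's implementation. The class `TemperatureSketch`, its `softmax` method, and the example logits are hypothetical, and treating a temperature of 0 as greedy selection is an assumption made for this sketch only.

```java
import java.util.Arrays;

class TemperatureSketch {
    // Converts raw scores (logits) into probabilities, scaled by temperature.
    // Assumptions: logits is non-empty, and temperature <= 0 is treated as
    // greedy selection (all probability mass on the highest-scoring token).
    static double[] softmax(double[] logits, double temperature) {
        double[] out = new double[logits.length];
        int best = 0;
        for (int i = 1; i < logits.length; i++) {
            if (logits[i] > logits[best]) {
                best = i;
            }
        }
        if (temperature <= 0.0) {
            out[best] = 1.0;
            return out;
        }
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            // Subtract the largest logit for numerical stability.
            out[i] = Math.exp((logits[i] - logits[best]) / temperature);
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.5}; // hypothetical scores for three tokens
        System.out.println(Arrays.toString(softmax(logits, 0.2))); // sharply peaked
        System.out.println(Arrays.toString(softmax(logits, 1.0)));
        System.out.println(Arrays.toString(softmax(logits, 2.0))); // flatter
    }
}
```

In this sketch, lowering the temperature concentrates probability on the highest-scoring token, while raising it spreads probability more evenly across tokens.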

getTopK()

public abstract int getTopK()

Optional. Top-k changes how the model selects tokens for output. A top-k of 1 means the selected token is the most probable among all tokens in the model's vocabulary (also called greedy decoding), while a top-k of 3 means that the next token is selected from among the 3 most probable tokens (using temperature). At each token selection step, the K tokens with the highest probabilities are kept as candidates. These candidates are then further filtered based on topP, and the final token is selected using temperature sampling. Specify a lower value for less random responses and a higher value for more random responses. Acceptable values are in the range [1, 40]; the default is 40.

optional int32 top_k = 3 [(.google.api.field_behavior) = OPTIONAL];

Returns
Type	Description
`int`	The topK.
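
As a rough illustration of the top-k step described above, the following sketch keeps only the K most probable tokens. The class `TopKSketch`, its `topK` method, and the token probabilities are hypothetical; the service's actual selection logic is not shown in this reference.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class TopKSketch {
    // Returns the k most probable tokens, ordered from most to least probable.
    static List<String> topK(Map<String, Double> probs, int k) {
        List<Map.Entry<String, Double>> ranked = new ArrayList<>(probs.entrySet());
        ranked.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < Math.min(k, ranked.size()); i++) {
            kept.add(ranked.get(i).getKey());
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical next-token probabilities.
        Map<String, Double> probs = Map.of("A", 0.5, "B", 0.3, "C", 0.15, "D", 0.05);
        System.out.println(topK(probs, 1)); // greedy decoding: only the top token remains
        System.out.println(topK(probs, 3)); // the three most probable tokens remain
    }
}
```

With `k = 1` only one candidate survives, which corresponds to greedy decoding; larger values leave more candidates for the later top-p and temperature steps.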

getTopP()

public abstract double getTopP()

Optional. Top-p changes how the model selects tokens for output. Among the K most probable tokens (see the topK parameter), tokens are selected from most probable to least probable until the sum of their probabilities reaches the top-p value. For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1 and the top-p value is 0.5, then the model selects either A or B as the next token (using temperature) and doesn't consider C. Specify a lower value for less random responses and a higher value for more random responses. Acceptable values are in the range [0.0, 1.0]; the default is 0.95.

optional double top_p = 4 [(.google.api.field_behavior) = OPTIONAL];

Returns
Type	Description
`double`	The topP.
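
The A, B, C example in the description can be reproduced with a small sketch. It assumes candidates are first capped at topK and then accumulated from most to least probable until the cumulative probability reaches topP. The class `TopPSketch`, its `candidates` method, and the small tolerance used in the floating-point comparison are hypothetical and are not part of this library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class TopPSketch {
    // Returns the tokens that remain eligible after applying top-k, then top-p.
    static List<String> candidates(Map<String, Double> probs, int topK, double topP) {
        List<Map.Entry<String, Double>> ranked = new ArrayList<>(probs.entrySet());
        ranked.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> kept = new ArrayList<>();
        double cumulative = 0.0;
        for (Map.Entry<String, Double> entry : ranked) {
            if (kept.size() >= topK) {
                break; // top-k cap reached
            }
            kept.add(entry.getKey());
            cumulative += entry.getValue();
            // Small tolerance (an assumption of this sketch) guards against rounding error.
            if (cumulative >= topP - 1e-12) {
                break; // cumulative probability has reached top-p
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Double> probs = Map.of("A", 0.3, "B", 0.2, "C", 0.1);
        System.out.println(candidates(probs, 40, 0.5)); // prints [A, B]
    }
}
```

The final token would then be drawn from the returned candidates using temperature sampling, which this sketch leaves out.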

hasMaxOutputTokens()

public abstract boolean hasMaxOutputTokens()