Vertex AI SDK
The vertexai module.
vertexai.init(*, project: Optional[str] = None, location: Optional[str] = None, experiment: Optional[str] = None, experiment_description: Optional[str] = None, experiment_tensorboard: Optional[Union[str, google.cloud.aiplatform.tensorboard.tensorboard_resource.Tensorboard, bool]] = None, staging_bucket: Optional[str] = None, credentials: Optional[google.auth.credentials.Credentials] = None, encryption_spec_key_name: Optional[str] = None, network: Optional[str] = None, service_account: Optional[str] = None, api_endpoint: Optional[str] = None, api_transport: Optional[str] = None)
Updates common initialization parameters with provided options.
Parameters
project (str) – The default project to use when making API calls.
location (str) – The default location to use when making API calls. If not set, defaults to us-central1.
experiment (str) – Optional. The experiment name.
experiment_description (str) – Optional. The description of the experiment.
experiment_tensorboard (Union[str, *[tensorboard_resource.Tensorboard](../aiplatform/services.md#google.cloud.aiplatform.Tensorboard), [bool](https://python.readthedocs.io/en/latest/library/functions.html#bool)]*) – Optional. The Vertex AI TensorBoard instance, Tensorboard resource name, or Tensorboard resource ID to use as a backing Tensorboard for the provided experiment.
Example tensorboard resource name format: “projects/123/locations/us-central1/tensorboards/456”
If experiment_tensorboard is provided and experiment is not, the provided experiment_tensorboard will be set as the global Tensorboard. Any subsequent calls to aiplatform.init() with experiment and without experiment_tensorboard will automatically assign the global Tensorboard to the experiment.
If experiment_tensorboard is omitted or set to True or None, the global Tensorboard will be assigned to the experiment. If a global Tensorboard is not set, the default Tensorboard instance will be used, and created if it does not exist.
To disable creating and using Tensorboard with experiment, set experiment_tensorboard to False. Any subsequent calls to aiplatform.init() should include this setting as well.
staging_bucket (str) – The default staging bucket to use to stage artifacts when making API calls. In the form gs://…
credentials (google.auth.credentials.Credentials) – The default custom credentials to use when making API calls. If not provided credentials will be ascertained from the environment.
encryption_spec_key_name (Optional[str]) – Optional. The Cloud KMS resource identifier of the customer managed encryption key used to protect a resource. Has the form: projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the compute resource is created. If set, this resource and all sub-resources will be secured by this key.
network (str) – Optional. The full name of the Compute Engine network to which jobs and resources should be peered. E.g. “projects/12345/global/networks/myVPC”. Private services access must already be configured for the network. If specified, all eligible jobs and resources created will be peered with this VPC.
service_account (str) – Optional. The service account used to launch jobs and deploy models. Jobs that use service_account: BatchPredictionJob, CustomJob, PipelineJob, HyperparameterTuningJob, CustomTrainingJob, CustomPythonPackageTrainingJob, CustomContainerTrainingJob, ModelEvaluationJob.
api_endpoint (str) – Optional. The desired API endpoint, e.g., us-central1-aiplatform.googleapis.com
api_transport (str) – Optional. The transport method which is either ‘grpc’ or ‘rest’. NOTE: “rest” transport functionality is currently in a beta state (preview).
Raises
ValueError – If experiment_description is provided but experiment is not.
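The constraints above can be checked before the call is made. The following is a minimal sketch, assuming a hypothetical helper `build_init_kwargs` (not part of the SDK) and placeholder project and bucket names. The actual `vertexai.init` call is left commented out because it needs the SDK installed and credentials in the environment.

```python
from typing import Any, Dict, Optional


def build_init_kwargs(
    project: str,
    location: Optional[str] = None,
    experiment: Optional[str] = None,
    experiment_description: Optional[str] = None,
    staging_bucket: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for vertexai.init().

    Mirrors two rules from the parameter list above: a description needs an
    experiment, and the staging bucket takes the gs://... form.
    """
    if experiment_description is not None and experiment is None:
        raise ValueError("experiment_description requires experiment")
    if staging_bucket is not None and not staging_bucket.startswith("gs://"):
        raise ValueError("staging_bucket must look like gs://...")

    kwargs: Dict[str, Any] = {"project": project}
    # Only pass options that were actually set, so SDK defaults still apply.
    optional = {
        "location": location,
        "experiment": experiment,
        "experiment_description": experiment_description,
        "staging_bucket": staging_bucket,
    }
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs


# "my-project" and "gs://my-bucket" are placeholders, not real resources.
init_kwargs = build_init_kwargs(
    "my-project", location="us-central1", staging_bucket="gs://my-bucket"
)

# With the SDK installed and credentials available in the environment:
# import vertexai
# vertexai.init(**init_kwargs)
```

Leaving unset options out of the dictionary keeps the SDK's own defaults in effect instead of overriding them with None.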
Classes for working with the Gemini models.
class vertexai.generative_models.Candidate()
Bases: object
A response candidate generated by the model.
class vertexai.generative_models.ChatSession(model: vertexai.generative_models._generative_models._GenerativeModel, *, history: Optional[List[vertexai.generative_models._generative_models.Content]] = None, response_validation: bool = True)
Bases: object
Chat session holds the chat history.
send_message(content: Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part, List[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part]]], *, generation_config: Optional[Union[vertexai.generative_models._generative_models.GenerationConfig, Dict[str, Any]]] = None, safety_settings: Optional[Union[List[vertexai.generative_models._generative_models.SafetySetting], Dict[google.cloud.aiplatform_v1beta1.types.content.HarmCategory, google.cloud.aiplatform_v1beta1.types.content.SafetySetting.HarmBlockThreshold]]] = None, tools: Optional[List[vertexai.generative_models._generative_models.Tool]] = None, stream: bool = False)
Generates content.
Parameters
content – Content to send to the model. Supports a value that can be converted to a Part or a list of such values. Supported types: str, Image, Part, List[Union[str, Image, Part]].
generation_config – Parameters for the generation.
safety_settings – Safety settings as a mapping from HarmCategory to HarmBlockThreshold.
tools – A list of tools (functions) that the model can try calling.
stream – Whether to stream the response.
Returns
A single GenerationResponse object if stream == False. A stream of GenerationResponse objects if stream == True.
Raises
ResponseValidationError – If the response was blocked or is incomplete.
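Because the return shape depends on `stream`, calling code usually branches on it. The sketch below is one way to do that. It assumes each response object exposes a `.text` attribute, which this section does not describe, and it uses stand-in objects so that it runs without the SDK. The helper name `response_text` is hypothetical.

```python
from types import SimpleNamespace
from typing import Any


def response_text(result: Any, stream: bool) -> str:
    """Return the full text for either return shape of send_message().

    Assumption: every response (or streamed chunk) has a `.text` attribute.
    """
    if stream:
        # stream=True: iterate over the chunks and join their text.
        return "".join(chunk.text for chunk in result)
    # stream=False: a single response object.
    return result.text


# Stand-in objects so the sketch runs offline.
single = SimpleNamespace(text="Hello there.")
chunks = [SimpleNamespace(text="Hello "), SimpleNamespace(text="there.")]

print(response_text(single, stream=False))
print(response_text(iter(chunks), stream=True))
```

With a real session the same helper would be called as `response_text(chat.send_message("Hi", stream=True), stream=True)`.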
send_message_async(content: Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part, List[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part]]], *, generation_config: Optional[Union[vertexai.generative_models._generative_models.GenerationConfig, Dict[str, Any]]] = None, safety_settings: Optional[Union[List[vertexai.generative_models._generative_models.SafetySetting], Dict[google.cloud.aiplatform_v1beta1.types.content.HarmCategory, google.cloud.aiplatform_v1beta1.types.content.SafetySetting.HarmBlockThreshold]]] = None, tools: Optional[List[vertexai.generative_models._generative_models.Tool]] = None, stream: bool = False)
Generates content asynchronously.
Parameters
content – Content to send to the model. Supports a value that can be converted to a Part or a list of such values. Supported types: str, Image, Part, List[Union[str, Image, Part]].
generation_config – Parameters for the generation.
safety_settings – Safety settings as a mapping from HarmCategory to HarmBlockThreshold.
tools – A list of tools (functions) that the model can try calling.
stream – Whether to stream the response.
Returns
An awaitable for a single GenerationResponse object if stream == False. An awaitable for a stream of GenerationResponse objects if stream == True.
Raises
ResponseValidationError – If the response was blocked or is incomplete.
class vertexai.generative_models.Content(*, parts: Optional[List[vertexai.generative_models._generative_models.Part]] = None, role: Optional[str] = None)
Bases: object
The multi-part content of a message.
Usage:
```python
response = model.generate_content(contents=[
    Content(role="user", parts=[Part.from_text("Why is sky blue?")])
])
```
class vertexai.generative_models.FinishReason(value)
Bases: proto.enums.Enum
The reason why the model stopped generating tokens. If empty, the model has not stopped generating the tokens.
Values:
FINISH_REASON_UNSPECIFIED (0):
The finish reason is unspecified.
STOP (1):
Natural stop point of the model or provided
stop sequence.
MAX_TOKENS (2):
The maximum number of tokens as specified in
the request was reached.
SAFETY (3):
The token generation was stopped as the
response was flagged for safety reasons. NOTE:
When streaming the Candidate.content will be
empty if content filters blocked the output.
RECITATION (4):
The token generation was stopped as the
response was flagged for unauthorized citations.
OTHER (5):
All other reasons that stopped the token
generation
BLOCKLIST (6):
The token generation was stopped as the
response was flagged for the terms which are
included from the terminology blocklist.
PROHIBITED_CONTENT (7):
The token generation was stopped as the
response was flagged for the prohibited
contents.
SPII (8):
The token generation was stopped as the
response was flagged for Sensitive Personally
Identifiable Information (SPII) contents.
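Callers often group these values into a few outcomes. The sketch below uses a local mirror of the listed values so that it runs without the SDK. The class `FinishReasonMirror`, the function `classify`, and the three-way grouping are illustrative assumptions, not SDK behavior.

```python
from enum import IntEnum


class FinishReasonMirror(IntEnum):
    """Local mirror of the values listed above, for illustration only."""

    FINISH_REASON_UNSPECIFIED = 0
    STOP = 1
    MAX_TOKENS = 2
    SAFETY = 3
    RECITATION = 4
    OTHER = 5
    BLOCKLIST = 6
    PROHIBITED_CONTENT = 7
    SPII = 8


# Reasons whose descriptions say the response was flagged.
_FLAGGED = {
    FinishReasonMirror.SAFETY,
    FinishReasonMirror.RECITATION,
    FinishReasonMirror.BLOCKLIST,
    FinishReasonMirror.PROHIBITED_CONTENT,
    FinishReasonMirror.SPII,
}


def classify(reason: int) -> str:
    """Hypothetical triage of a finish reason into a coarse outcome."""
    reason = FinishReasonMirror(reason)
    if reason is FinishReasonMirror.STOP:
        return "complete"
    if reason is FinishReasonMirror.MAX_TOKENS:
        return "truncated"
    if reason in _FLAGGED:
        return "flagged"
    return "unknown"


print(classify(FinishReasonMirror.MAX_TOKENS))  # prints "truncated"
```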
class vertexai.generative_models.FunctionDeclaration(*, name: str, parameters: Dict[str, Any], description: Optional[str] = None)
Bases: object
A representation of a function declaration.
Usage:
Create function declaration and tool:
```python
from vertexai import generative_models

get_current_weather_func = generative_models.FunctionDeclaration(
    name="get_current_weather",
    description="Get the current weather in a given location",
    parameters={
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
            },
            "unit": {
                "type": "string",
                "enum": [
                    "celsius",
                    "fahrenheit",
                ]
            }
        },
        "required": [
            "location"
        ]
    },
)
weather_tool = generative_models.Tool(
    function_declarations=[get_current_weather_func],
)
```
Use tool in GenerativeModel.generate_content:
```python
from vertexai.generative_models import GenerativeModel, Part

model = GenerativeModel("gemini-pro")
print(model.generate_content(
    "What is the weather like in Boston?",
    # You can specify tools when creating a model to avoid having to send them with every request.
    tools=[weather_tool],
))
```
Use tool in chat:
```python
model = GenerativeModel(
    "gemini-pro",
    # You can specify tools when creating a model to avoid having to send them with every request.
    tools=[weather_tool],
)
chat = model.start_chat()
print(chat.send_message("What is the weather like in Boston?"))
print(chat.send_message(
    Part.from_function_response(
        name="get_current_weather",
        response={
            "content": {"weather_there": "super nice"},
        }
    ),
))
```
Constructs a FunctionDeclaration.
Parameters
name – The name of the function that the model can call.
parameters – Describes the parameters to this function in JSON Schema Object format.
description – Description and purpose of the function. Model uses it to decide how and whether to call the function.
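Since `parameters` is a plain dictionary in JSON Schema Object format, simple mistakes can be caught locally before the declaration is built. This is a minimal sketch with a hypothetical helper, `check_parameters_schema`, that performs two basic checks. It is not a full JSON Schema validator and is not part of the SDK.

```python
from typing import Any, Dict, List


def check_parameters_schema(parameters: Dict[str, Any]) -> List[str]:
    """Hypothetical pre-flight check; returns a list of problems found."""
    problems: List[str] = []
    if parameters.get("type") != "object":
        problems.append('top-level "type" should be "object"')
    properties = parameters.get("properties", {})
    # Every required name should also be declared under "properties".
    for name in parameters.get("required", []):
        if name not in properties:
            problems.append(f'required name "{name}" is not in "properties"')
    return problems


weather_parameters = {
    "type": "object",
    "properties": {
        "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA",
        },
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location"],
}

print(check_parameters_schema(weather_parameters))  # prints []
```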
class vertexai.generative_models.GenerationConfig(*, temperature: Optional[float] = None, top_p: Optional[float] = None, top_k: Optional[int] = None, candidate_count: Optional[int] = None, max_output_tokens: Optional[int] = None, stop_sequences: Optional[List[str]] = None)
Bases: object
Parameters for the generation.
Constructs a GenerationConfig object.
Parameters
temperature – Controls the randomness of predictions. Range: [0.0, 1.0]
top_p – If specified, nucleus sampling will be used. Range: (0.0, 1.0]
top_k – If specified, top-k sampling will be used.
candidate_count – Number of candidates to generate.
max_output_tokens – The maximum number of output tokens to generate per message.
stop_sequences – A list of stop sequences.
Usage:
```python
from vertexai.generative_models import GenerationConfig

response = model.generate_content(
    "Why is sky blue?",
    generation_config=GenerationConfig(
        temperature=0.1,
        top_p=0.95,
        top_k=20,
        candidate_count=1,
        max_output_tokens=100,
        stop_sequences=["\n\n\n"],
    )
)
```
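The method signatures in this reference also accept a `Dict[str, Any]` for `generation_config`. The sketch below builds such a dictionary and enforces the ranges documented above. It assumes the dictionary keys match the constructor parameter names, and the helper `make_generation_config` is hypothetical.

```python
from typing import Any, Dict, Optional


def make_generation_config(
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    max_output_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build the dict form of a generation config.

    Enforces the documented ranges: temperature in [0.0, 1.0] and
    top_p in (0.0, 1.0]. Unset options are omitted from the result.
    """
    config: Dict[str, Any] = {}
    if temperature is not None:
        if not 0.0 <= temperature <= 1.0:
            raise ValueError("temperature must be in [0.0, 1.0]")
        config["temperature"] = temperature
    if top_p is not None:
        if not 0.0 < top_p <= 1.0:
            raise ValueError("top_p must be in (0.0, 1.0]")
        config["top_p"] = top_p
    if max_output_tokens is not None:
        config["max_output_tokens"] = max_output_tokens
    return config


config = make_generation_config(temperature=0.1, top_p=0.95, max_output_tokens=100)
print(config)
```

The resulting dictionary could then be passed as `generation_config=config` to `generate_content` or `send_message`.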
class vertexai.generative_models.GenerationResponse()
Bases: object
The response from the model.
class vertexai.generative_models.GenerativeModel(model_name: str, *, generation_config: Optional[Union[vertexai.generative_models._generative_models.GenerationConfig, Dict[str, Any]]] = None, safety_settings: Optional[Union[List[vertexai.generative_models._generative_models.SafetySetting], Dict[google.cloud.aiplatform_v1beta1.types.content.HarmCategory, google.cloud.aiplatform_v1beta1.types.content.SafetySetting.HarmBlockThreshold]]] = None, tools: Optional[List[vertexai.generative_models._generative_models.Tool]] = None, tool_config: Optional[vertexai.generative_models._generative_models.ToolConfig] = None, system_instruction: Optional[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part, List[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part]]]] = None)
Bases: vertexai.generative_models._generative_models._GenerativeModel
Initializes GenerativeModel.
Usage:
```python
from vertexai.generative_models import GenerativeModel

model = GenerativeModel("gemini-pro")
print(model.generate_content("Hello"))
```
Parameters
model_name – Model Garden model resource name. Alternatively, a tuned model endpoint resource name can be provided.
generation_config – Default generation config to use in generate_content.
safety_settings – Default safety settings to use in generate_content.
tools – Default tools to use in generate_content.
tool_config – Default tool config to use in generate_content.
system_instruction – Default system instruction to use in generate_content. Note: Only text should be used in parts. Content of each part will become a separate paragraph.
count_tokens(contents: Union[List[vertexai.generative_models._generative_models.Content], List[Dict[str, Any]], str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part, List[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part]]])
Counts tokens.
Parameters
contents – Contents to send to the model. Supports either a list of Content objects (passing a multi-turn conversation) or a value that can be converted to a single Content object (passing a single message). Supported types: str, Image, Part, List[Union[str, Image, Part]], List[Content].
Returns
A CountTokensResponse object that has the following attributes:
total_tokens: The total number of tokens counted across all instances from the request.
total_billable_characters: The total number of billable characters counted across all instances from the request.
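A common use of the token count is to check a prompt against a budget before sending it. The sketch below reads only the `total_tokens` attribute described above. The helper `fits_budget`, its `max_input_tokens` parameter, and the stand-in response object are illustrative assumptions.

```python
from types import SimpleNamespace
from typing import Any


def fits_budget(count_response: Any, max_input_tokens: int) -> bool:
    """Return True when the counted prompt stays within a token budget."""
    return count_response.total_tokens <= max_input_tokens


# Stand-in for a CountTokensResponse so the sketch runs offline.
fake_count = SimpleNamespace(total_tokens=12, total_billable_characters=40)

print(fits_budget(fake_count, max_input_tokens=100))  # prints True
```

With a real model the argument would be `model.count_tokens(prompt)`.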
async count_tokens_async(contents: Union[List[vertexai.generative_models._generative_models.Content], List[Dict[str, Any]], str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part, List[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part]]])
Counts tokens asynchronously.
Parameters
contents – Contents to send to the model. Supports either a list of Content objects (passing a multi-turn conversation) or a value that can be converted to a single Content object (passing a single message). Supported types: str, Image, Part, List[Union[str, Image, Part]], List[Content].
Returns
An awaitable for a CountTokensResponse object that has the following attributes:
total_tokens: The total number of tokens counted across all instances from the request.
total_billable_characters: The total number of billable characters counted across all instances from the request.
generate_content(contents: Union[List[vertexai.generative_models._generative_models.Content], List[Dict[str, Any]], str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part, List[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part]]], *, generation_config: Optional[Union[vertexai.generative_models._generative_models.GenerationConfig, Dict[str, Any]]] = None, safety_settings: Optional[Union[List[vertexai.generative_models._generative_models.SafetySetting], Dict[google.cloud.aiplatform_v1beta1.types.content.HarmCategory, google.cloud.aiplatform_v1beta1.types.content.SafetySetting.HarmBlockThreshold]]] = None, tools: Optional[List[vertexai.generative_models._generative_models.Tool]] = None, tool_config: Optional[vertexai.generative_models._generative_models.ToolConfig] = None, stream: bool = False)
Generates content.
Parameters
contents – Contents to send to the model. Supports either a list of Content objects (passing a multi-turn conversation) or a value that can be converted to a single Content object (passing a single message). Supported types: str, Image, Part, List[Union[str, Image, Part]], List[Content].
generation_config – Parameters for the generation.
safety_settings – Safety settings as a mapping from HarmCategory to HarmBlockThreshold.
tools – A list of tools (functions) that the model can try calling.
tool_config – Config shared for all tools provided in the request.
stream – Whether to stream the response.
Returns
A single GenerationResponse object if stream == False. A stream of GenerationResponse objects if stream == True.
async generate_content_async(contents: Union[List[vertexai.generative_models._generative_models.Content], List[Dict[str, Any]], str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part, List[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part]]], *, generation_config: Optional[Union[vertexai.generative_models._generative_models.GenerationConfig, Dict[str, Any]]] = None, safety_settings: Optional[Union[List[vertexai.generative_models._generative_models.SafetySetting], Dict[google.cloud.aiplatform_v1beta1.types.content.HarmCategory, google.cloud.aiplatform_v1beta1.types.content.SafetySetting.HarmBlockThreshold]]] = None, tools: Optional[List[vertexai.generative_models._generative_models.Tool]] = None, tool_config: Optional[vertexai.generative_models._generative_models.ToolConfig] = None, stream: bool = False)
Generates content asynchronously.
Parameters
contents – Contents to send to the model. Supports either a list of Content objects (passing a multi-turn conversation) or a value that can be converted to a single Content object (passing a single message). Supported types: str, Image, Part, List[Union[str, Image, Part]], List[Content].
generation_config – Parameters for the generation.
safety_settings – Safety settings as a mapping from HarmCategory to HarmBlockThreshold.
tools – A list of tools (functions) that the model can try calling.
tool_config – Config shared for all tools provided in the request.
stream – Whether to stream the response.
Returns
An awaitable for a single GenerationResponse object if stream == False. An awaitable for a stream of GenerationResponse objects if stream == True.
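The asynchronous variant is useful when several independent prompts can be in flight at once. The sketch below sends prompts concurrently with `asyncio.gather`. `FakeModel` is a stand-in with the same method name so that the code runs offline, and the `.text` attribute on its responses is an assumption rather than something this section documents.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, List


async def generate_all(model: Any, prompts: List[str]) -> List[Any]:
    """Send several prompts concurrently; results keep the prompt order."""
    tasks = [model.generate_content_async(prompt) for prompt in prompts]
    return await asyncio.gather(*tasks)


class FakeModel:
    """Stand-in for GenerativeModel, for illustration only."""

    async def generate_content_async(self, contents: str) -> SimpleNamespace:
        await asyncio.sleep(0)  # yield control, as a network call would
        return SimpleNamespace(text=f"echo: {contents}")


responses = asyncio.run(generate_all(FakeModel(), ["Hello", "Why is sky blue?"]))
print([r.text for r in responses])  # prints ['echo: Hello', 'echo: Why is sky blue?']
```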
start_chat(*, history: Optional[List[vertexai.generative_models._generative_models.Content]] = None, response_validation: bool = True)
Creates a stateful chat session.
Parameters
history – Previous history to initialize the chat session.
response_validation – Whether to validate responses before adding them to chat history. By default, send_message will raise an error if the request or response is blocked or if the response is incomplete due to going over the max token limit. If set to False, the chat session history will always accumulate the request and response messages even if the response is blocked or incomplete. This can result in an unusable chat session state.
Returns
A ChatSession object.
class vertexai.generative_models.HarmBlockThreshold(value)
Bases: proto.enums.Enum
Probability-based threshold levels for blocking.
Values:
HARM_BLOCK_THRESHOLD_UNSPECIFIED (0):
Unspecified harm block threshold.
BLOCK_LOW_AND_ABOVE (1):
Block low threshold and above (i.e. block
more).
BLOCK_MEDIUM_AND_ABOVE (2):
Block medium threshold and above.
BLOCK_ONLY_HIGH (3):
Block only high threshold (i.e. block less).
BLOCK_NONE (4):
Block none.
class vertexai.generative_models.HarmCategory(value)
Bases: proto.enums.Enum
Harm categories that will block the content.
Values:
HARM_CATEGORY_UNSPECIFIED (0):
The harm category is unspecified.
HARM_CATEGORY_HATE_SPEECH (1):
The harm category is hate speech.
HARM_CATEGORY_DANGEROUS_CONTENT (2):
The harm category is dangerous content.
HARM_CATEGORY_HARASSMENT (3):
The harm category is harassment.
HARM_CATEGORY_SEXUALLY_EXPLICIT (4):
The harm category is sexually explicit
content.
class vertexai.generative_models.Image()
Bases: object
The image that can be sent to a generative model.
property data: bytes
Returns the image data.
static from_bytes(data: bytes)
Loads image from image bytes.
Parameters
data – Image bytes.
Returns
Loaded image as an Image object.
static load_from_file(location: str)
Loads image from file.
Parameters
location – Local path from where to load the image.
Returns
Loaded image as an Image object.
class vertexai.generative_models.Part()
Bases: object
A part of a multi-part Content message.
Usage:
```python
from vertexai.generative_models import Image, Part

text_part = Part.from_text("Why is sky blue?")
image_part = Part.from_image(Image.load_from_file("image.jpg"))
video_part = Part.from_uri(uri="gs://…/video.mp4", mime_type="video/mp4")
function_response_part = Part.from_function_response(
    name="get_current_weather",
    response={
        "content": {"weather_there": "super nice"},
    }
)
response1 = model.generate_content([text_part, image_part])
response2 = model.generate_content(video_part)
response3 = chat.send_message(function_response_part)
```
exception vertexai.generative_models.ResponseValidationError(message: str, request_contents: List[vertexai.generative_models._generative_models.Content], responses: List[vertexai.generative_models._generative_models.GenerationResponse])
Bases: vertexai.generative_models._generative_models.ResponseBlockedError
with_traceback()
Exception.with_traceback(tb) – set self.traceback to tb and return self.
class vertexai.generative_models.SafetySetting(*, category: google.cloud.aiplatform_v1beta1.types.content.HarmCategory, threshold: google.cloud.aiplatform_v1beta1.types.content.SafetySetting.HarmBlockThreshold, method: Optional[google.cloud.aiplatform_v1beta1.types.content.SafetySetting.HarmBlockMethod] = None)
Bases: object
Safety settings.
Parameters
category – Harm category.
threshold – The harm block threshold.
method – Specify if the threshold is used for probability or severity score. If not specified, the threshold is used for probability score.
class HarmBlockMethod(value)
Bases: proto.enums.Enum
Probability vs severity.
Values:
HARM_BLOCK_METHOD_UNSPECIFIED (0):
The harm block method is unspecified.
SEVERITY (1):
The harm block method uses both probability
and severity scores.
PROBABILITY (2):
The harm block method uses the probability
score.
class HarmBlockThreshold(value)
Bases: proto.enums.Enum
Probability-based threshold levels for blocking.
Values:
HARM_BLOCK_THRESHOLD_UNSPECIFIED (0):
Unspecified harm block threshold.
BLOCK_LOW_AND_ABOVE (1):
Block low threshold and above (i.e. block
more).
BLOCK_MEDIUM_AND_ABOVE (2):
Block medium threshold and above.
BLOCK_ONLY_HIGH (3):
Block only high threshold (i.e. block less).
BLOCK_NONE (4):
Block none.
class HarmCategory(value)
Bases: proto.enums.Enum
Harm categories that will block the content.
Values:
HARM_CATEGORY_UNSPECIFIED (0):
The harm category is unspecified.
HARM_CATEGORY_HATE_SPEECH (1):
The harm category is hate speech.
HARM_CATEGORY_DANGEROUS_CONTENT (2):
The harm category is dangerous content.
HARM_CATEGORY_HARASSMENT (3):
The harm category is harassment.
HARM_CATEGORY_SEXUALLY_EXPLICIT (4):
The harm category is sexually explicit
content.
class vertexai.generative_models.Tool(function_declarations: List[vertexai.generative_models._generative_models.FunctionDeclaration])
Bases: object
A collection of functions that the model may use to generate response.
Usage:
Create tool from function declarations:
```python
from vertexai import generative_models

get_current_weather_func = generative_models.FunctionDeclaration(...)
weather_tool = generative_models.Tool(
    function_declarations=[get_current_weather_func],
)
```
Use tool in GenerativeModel.generate_content:
```python
from vertexai.generative_models import GenerativeModel, Part

model = GenerativeModel("gemini-pro")
print(model.generate_content(
    "What is the weather like in Boston?",
    # You can specify tools when creating a model to avoid having to send them with every request.
    tools=[weather_tool],
))
```
Use tool in chat:
```python
model = GenerativeModel(
    "gemini-pro",
    # You can specify tools when creating a model to avoid having to send them with every request.
    tools=[weather_tool],
)
chat = model.start_chat()
print(chat.send_message("What is the weather like in Boston?"))
print(chat.send_message(
    Part.from_function_response(
        name="get_current_weather",
        response={
            "content": {"weather_there": "super nice"},
        }
    ),
))
```
Classes for working with the Gemini models.
class vertexai.preview.generative_models.AutomaticFunctionCallingResponder(max_automatic_function_calls: int = 1)
Bases: object
Responder that automatically responds to model’s function calls.
Initializes the responder.
Parameters
max_automatic_function_calls – Maximum number of automatic function calls.
class vertexai.preview.generative_models.CallableFunctionDeclaration(name: str, function: Callable[[...], Any], parameters: Dict[str, Any], description: Optional[str] = None)
Bases: vertexai.generative_models._generative_models.FunctionDeclaration
A function declaration plus a function.
Constructs a FunctionDeclaration.
Parameters
name – The name of the function that the model can call.
parameters – Describes the parameters to this function in JSON Schema Object format.
description – Description and purpose of the function. Model uses it to decide how and whether to call the function.
classmethod from_func(func: Callable[[...], Any])
Automatically creates a CallableFunctionDeclaration from a Python function.
The function parameter schema is automatically extracted.
Parameters
func – The function from which to extract schema.
Returns
CallableFunctionDeclaration.
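To give a feel for what extracting a schema from a function can involve, here is a standard-library sketch that derives a JSON Schema object from a function signature. It is not the SDK's implementation: the helper `sketch_schema`, the type mapping, and the rule that parameters without defaults are required are all assumptions made for illustration.

```python
import inspect
from typing import Any, Callable, Dict, List

# Hypothetical mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_schema(func: Callable[..., Any]) -> Dict[str, Any]:
    """Derive a JSON Schema object from a function signature (illustration)."""
    properties: Dict[str, Any] = {}
    required: List[str] = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}


def get_current_weather(location: str, unit: str = "celsius") -> str:
    """Get the current weather in a given location."""
    return f"Weather in {location} ({unit})"


schema = sketch_schema(get_current_weather)
print(schema["required"])  # prints ['location']
```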
class vertexai.preview.generative_models.Candidate()
Bases: object
A response candidate generated by the model.
class vertexai.preview.generative_models.ChatSession(model: vertexai.generative_models._generative_models._GenerativeModel, *, history: Optional[List[vertexai.generative_models._generative_models.Content]] = None, response_validation: bool = True, responder: Optional[vertexai.generative_models._generative_models.AutomaticFunctionCallingResponder] = None, raise_on_blocked: Optional[bool] = None)
Bases: vertexai.generative_models._generative_models._PreviewChatSession
Chat session holds the chat history.
send_message(content: Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part, List[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part]]], *, generation_config: Optional[Union[vertexai.generative_models._generative_models.GenerationConfig, Dict[str, Any]]] = None, safety_settings: Optional[Union[List[vertexai.generative_models._generative_models.SafetySetting], Dict[google.cloud.aiplatform_v1beta1.types.content.HarmCategory, google.cloud.aiplatform_v1beta1.types.content.SafetySetting.HarmBlockThreshold]]] = None, tools: Optional[List[vertexai.generative_models._generative_models.Tool]] = None, stream: bool = False)
Generates content.
Parameters
content – Content to send to the model. Supports a value that can be converted to a Part or a list of such values. Supported types: str, Image, Part, List[Union[str, Image, Part]].
generation_config – Parameters for the generation.
safety_settings – Safety settings as a mapping from HarmCategory to HarmBlockThreshold.
tools – A list of tools (functions) that the model can try calling.
stream – Whether to stream the response.
Returns
A single GenerationResponse object if stream == False. A stream of GenerationResponse objects if stream == True.
Raises
ResponseValidationError – If the response was blocked or is incomplete.
send_message_async(content: Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part, List[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part]]], *, generation_config: Optional[Union[vertexai.generative_models._generative_models.GenerationConfig, Dict[str, Any]]] = None, safety_settings: Optional[Union[List[vertexai.generative_models._generative_models.SafetySetting], Dict[google.cloud.aiplatform_v1beta1.types.content.HarmCategory, google.cloud.aiplatform_v1beta1.types.content.SafetySetting.HarmBlockThreshold]]] = None, tools: Optional[List[vertexai.generative_models._generative_models.Tool]] = None, stream: bool = False)
Generates content asynchronously.
Parameters
content – Content to send to the model. Supports a value that can be converted to a Part or a list of such values. Supported types: str, Image, Part, List[Union[str, Image, Part]].
generation_config – Parameters for the generation.
safety_settings – Safety settings as a mapping from HarmCategory to HarmBlockThreshold.
tools – A list of tools (functions) that the model can try calling.
stream – Whether to stream the response.
Returns
An awaitable for a single GenerationResponse object if stream == False. An awaitable for a stream of GenerationResponse objects if stream == True.
Raises
ResponseValidationError – If the response was blocked or is incomplete.
class vertexai.preview.generative_models.Content(*, parts: Optional[List[vertexai.generative_models._generative_models.Part]] = None, role: Optional[str] = None)
Bases: object
The multi-part content of a message.
Usage:
```python
response = model.generate_content(contents=[
    Content(role="user", parts=[Part.from_text("Why is sky blue?")])
])
```
class vertexai.preview.generative_models.FinishReason(value)
Bases: proto.enums.Enum
The reason why the model stopped generating tokens. If empty, the model has not stopped generating the tokens.
Values:
FINISH_REASON_UNSPECIFIED (0):
The finish reason is unspecified.
STOP (1):
Natural stop point of the model or provided
stop sequence.
MAX_TOKENS (2):
The maximum number of tokens as specified in
the request was reached.
SAFETY (3):
The token generation was stopped as the
response was flagged for safety reasons. NOTE:
When streaming the Candidate.content will be
empty if content filters blocked the output.
RECITATION (4):
The token generation was stopped as the
response was flagged for unauthorized citations.
OTHER (5):
All other reasons that stopped the token
generation
BLOCKLIST (6):
The token generation was stopped as the
response was flagged for the terms which are
included from the terminology blocklist.
PROHIBITED_CONTENT (7):
The token generation was stopped as the
response was flagged for the prohibited
contents.
SPII (8):
The token generation was stopped as the
response was flagged for Sensitive Personally
Identifiable Information (SPII) contents.
class vertexai.preview.generative_models.FunctionDeclaration(*, name: str, parameters: Dict[str, Any], description: Optional[str] = None)
Bases: object
A representation of a function declaration.
Usage:
Create function declaration and tool:
```python
get_current_weather_func = generative_models.FunctionDeclaration(
    name="get_current_weather",
    description="Get the current weather in a given location",
    parameters={
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
            },
            "unit": {
                "type": "string",
                "enum": [
                    "celsius",
                    "fahrenheit",
                ]
            }
        },
        "required": [
            "location"
        ]
    },
)
weather_tool = generative_models.Tool(
    function_declarations=[get_current_weather_func],
)
```
Use tool in GenerativeModel.generate_content:
```python
model = GenerativeModel("gemini-pro")
print(model.generate_content(
    "What is the weather like in Boston?",
    # You can specify tools when creating a model to avoid having to send them with every request.
    tools=[weather_tool],
))
```
Use tool in chat:
```python
model = GenerativeModel(
    "gemini-pro",
    # You can specify tools when creating a model to avoid having to send them with every request.
    tools=[weather_tool],
)
chat = model.start_chat()
print(chat.send_message("What is the weather like in Boston?"))
print(chat.send_message(
    Part.from_function_response(
        name="get_current_weather",
        response={
            "content": {"weather_there": "super nice"},
        }
    ),
))
```
Constructs a FunctionDeclaration.
Parameters
name – The name of the function that the model can call.
parameters – Describes the parameters to this function in JSON Schema Object format.
description – Description and purpose of the function. Model uses it to decide how and whether to call the function.
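Because `parameters` is a JSON Schema object, it can be useful to check the arguments a model proposes against it before running any local code. The following is a minimal sketch in plain Python. The helper `check_call_args` is hypothetical, and it only covers top-level `required`, unknown keys, string types, and `enum`; it is not a full JSON Schema validator.

```python
from typing import Any, Dict, List


def check_call_args(parameters: Dict[str, Any], args: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in `args`; an empty list means none were found."""
    problems: List[str] = []
    properties = parameters.get("properties", {})
    # Every name listed under "required" must be present.
    for key in parameters.get("required", []):
        if key not in args:
            problems.append(f"missing required argument: {key}")
    # Every supplied argument must be declared and match its simple constraints.
    for key, value in args.items():
        spec = properties.get(key)
        if spec is None:
            problems.append(f"unknown argument: {key}")
            continue
        if spec.get("type") == "string" and not isinstance(value, str):
            problems.append(f"{key} should be a string")
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"{key} must be one of {spec['enum']}")
    return problems


# Same shape as the weather schema shown in the usage example.
weather_parameters = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location"],
}
```

For production use, a dedicated JSON Schema library would be the more complete choice.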
class vertexai.preview.generative_models.GenerationConfig(*, temperature: Optional[float] = None, top_p: Optional[float] = None, top_k: Optional[int] = None, candidate_count: Optional[int] = None, max_output_tokens: Optional[int] = None, stop_sequences: Optional[List[str]] = None)
Bases: object
Parameters for the generation.
Constructs a GenerationConfig object.
Parameters
temperature – Controls the randomness of predictions. Range: [0.0, 1.0]
top_p – If specified, nucleus sampling will be used. Range: (0.0, 1.0]
top_k – If specified, top-k sampling will be used.
candidate_count – Number of candidates to generate.
max_output_tokens – The maximum number of output tokens to generate per message.
stop_sequences – A list of stop sequences.
Usage:
```python
response = model.generate_content(
    "Why is sky blue?",
    generation_config=GenerationConfig(
        temperature=0.1,
        top_p=0.95,
        top_k=20,
        candidate_count=1,
        max_output_tokens=100,
        stop_sequences=["\n\n\n"],
    )
)
```
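The signatures in this reference also accept a plain `Dict[str, Any]` for `generation_config`. The sketch below builds such a dict and checks the ranges documented above before anything is sent. The helper name `make_generation_config` is hypothetical, and it assumes the dict keys match the constructor's keyword names.

```python
from typing import Any, Dict, List, Optional


def make_generation_config(
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    top_k: Optional[int] = None,
    max_output_tokens: Optional[int] = None,
    stop_sequences: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a plain-dict generation config, checking the documented ranges."""
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be in [0.0, 1.0]")
    if top_p is not None and not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0.0, 1.0]")
    candidates = {
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "max_output_tokens": max_output_tokens,
        "stop_sequences": stop_sequences,
    }
    # Leave out unset values so that defaults apply for them.
    return {key: value for key, value in candidates.items() if value is not None}
```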
class vertexai.preview.generative_models.GenerationResponse()
Bases: object
The response from the model.
class vertexai.preview.generative_models.GenerativeModel(model_name: str, *, generation_config: Optional[Union[vertexai.generative_models._generative_models.GenerationConfig, Dict[str, Any]]] = None, safety_settings: Optional[Union[List[vertexai.generative_models._generative_models.SafetySetting], Dict[google.cloud.aiplatform_v1beta1.types.content.HarmCategory, google.cloud.aiplatform_v1beta1.types.content.SafetySetting.HarmBlockThreshold]]] = None, tools: Optional[List[vertexai.generative_models._generative_models.Tool]] = None, tool_config: Optional[vertexai.generative_models._generative_models.ToolConfig] = None, system_instruction: Optional[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part, List[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part]]]] = None)
Bases: vertexai.preview.generative_models._PreviewGenerativeModel
Initializes GenerativeModel.
Usage:
```python
model = GenerativeModel("gemini-pro")
print(model.generate_content("Hello"))
```
Parameters
model_name – Model Garden model resource name. Alternatively, a tuned model endpoint resource name can be provided.
generation_config – Default generation config to use in generate_content.
safety_settings – Default safety settings to use in generate_content.
tools – Default tools to use in generate_content.
tool_config – Default tool config to use in generate_content.
system_instruction – Default system instruction to use in generate_content. Note: Only text should be used in parts. Content of each part will become a separate paragraph.
count_tokens(contents: Union[List[vertexai.generative_models._generative_models.Content], List[Dict[str, Any]], str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part, List[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part]]])
Counts tokens.
Parameters
contents – Contents to send to the model. Supports either a list of Content objects (passing a multi-turn conversation) or a value that can be converted to a single Content object (passing a single message). Supported types: str, Image, Part; List[Union[str, Image, Part]]; List[Content].
Returns
A CountTokensResponse object that has the following attributes:
total_tokens – The total number of tokens counted across all instances from the request.
total_billable_characters – The total number of billable characters counted across all instances from the request.
async count_tokens_async(contents: Union[List[vertexai.generative_models._generative_models.Content], List[Dict[str, Any]], str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part, List[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part]]])
Counts tokens asynchronously.
Parameters
contents – Contents to send to the model. Supports either a list of Content objects (passing a multi-turn conversation) or a value that can be converted to a single Content object (passing a single message). Supported types: str, Image, Part; List[Union[str, Image, Part]]; List[Content].
Returns
An awaitable for a CountTokensResponse object that has the following attributes:
total_tokens – The total number of tokens counted across all instances from the request.
total_billable_characters – The total number of billable characters counted across all instances from the request.
generate_content(contents: Union[List[vertexai.generative_models._generative_models.Content], List[Dict[str, Any]], str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part, List[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part]]], *, generation_config: Optional[Union[vertexai.generative_models._generative_models.GenerationConfig, Dict[str, Any]]] = None, safety_settings: Optional[Union[List[vertexai.generative_models._generative_models.SafetySetting], Dict[google.cloud.aiplatform_v1beta1.types.content.HarmCategory, google.cloud.aiplatform_v1beta1.types.content.SafetySetting.HarmBlockThreshold]]] = None, tools: Optional[List[vertexai.generative_models._generative_models.Tool]] = None, tool_config: Optional[vertexai.generative_models._generative_models.ToolConfig] = None, stream: bool = False)
Generates content.
Parameters
contents – Contents to send to the model. Supports either a list of Content objects (passing a multi-turn conversation) or a value that can be converted to a single Content object (passing a single message). Supported types: str, Image, Part; List[Union[str, Image, Part]]; List[Content].
generation_config – Parameters for the generation.
safety_settings – Safety settings as a mapping from HarmCategory to HarmBlockThreshold.
tools – A list of tools (functions) that the model can try calling.
tool_config – Config shared for all tools provided in the request.
stream – Whether to stream the response.
Returns
A single GenerationResponse object if stream == False. A stream of GenerationResponse objects if stream == True.
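Since the return shape depends on `stream`, calling code often needs to handle both cases. A minimal sketch follows, using a stand-in class instead of the SDK's response type. It assumes each response exposes its text through a `text` attribute, which this section does not confirm; `FakeResponse` and `collect_text` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Union


@dataclass
class FakeResponse:
    """Hypothetical stand-in for a response object exposing a `text` attribute."""
    text: str


def collect_text(result: Union[FakeResponse, Iterable[FakeResponse]]) -> str:
    """Return the full text whether `result` is one response or a stream of chunks."""
    if hasattr(result, "text"):
        return result.text  # stream == False: a single response
    return "".join(chunk.text for chunk in result)  # stream == True: iterate the chunks
```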
async generate_content_async(contents: Union[List[vertexai.generative_models._generative_models.Content], List[Dict[str, Any]], str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part, List[Union[str, vertexai.generative_models._generative_models.Image, vertexai.generative_models._generative_models.Part]]], *, generation_config: Optional[Union[vertexai.generative_models._generative_models.GenerationConfig, Dict[str, Any]]] = None, safety_settings: Optional[Union[List[vertexai.generative_models._generative_models.SafetySetting], Dict[google.cloud.aiplatform_v1beta1.types.content.HarmCategory, google.cloud.aiplatform_v1beta1.types.content.SafetySetting.HarmBlockThreshold]]] = None, tools: Optional[List[vertexai.generative_models._generative_models.Tool]] = None, tool_config: Optional[vertexai.generative_models._generative_models.ToolConfig] = None, stream: bool = False)
Generates content asynchronously.
Parameters
contents – Contents to send to the model. Supports either a list of Content objects (passing a multi-turn conversation) or a value that can be converted to a single Content object (passing a single message). Supported types: str, Image, Part; List[Union[str, Image, Part]]; List[Content].
generation_config – Parameters for the generation.
safety_settings – Safety settings as a mapping from HarmCategory to HarmBlockThreshold.
tools – A list of tools (functions) that the model can try calling.
tool_config – Config shared for all tools provided in the request.
stream – Whether to stream the response.
Returns
An awaitable for a single GenerationResponse object if stream == False. An awaitable for a stream of GenerationResponse objects if stream == True.
start_chat(*, history: Optional[List[vertexai.generative_models._generative_models.Content]] = None, response_validation: bool = True, responder: Optional[vertexai.generative_models._generative_models.AutomaticFunctionCallingResponder] = None)
Creates a stateful chat session.
Parameters
history – Previous history to initialize the chat session.
response_validation – Whether to validate responses before adding them to chat history. By default, send_message raises an error if the request or response is blocked, or if the response is incomplete because it exceeded the max token limit. If set to False, the chat session history always accumulates the request and response messages, even if the response is blocked or incomplete. This can result in an unusable chat session state.
responder – A responder object that can automatically respond to some model messages. Supported responder classes: AutomaticFunctionCallingResponder.
Returns
A ChatSession object.
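The effect of `response_validation` on history can be pictured with a toy model. This is only a sketch of the behavior described above, not the SDK's implementation: `ChatHistorySketch` is a hypothetical class, and it raises a plain `ValueError` where the SDK would raise its own error type.

```python
from typing import List, Tuple


class ChatHistorySketch:
    """Toy model of how response validation could gate chat history updates."""

    def __init__(self, response_validation: bool = True) -> None:
        self.response_validation = response_validation
        self.history: List[Tuple[str, str]] = []

    def record(self, request: str, response: str, ok: bool) -> None:
        """Store a turn; `ok` is False for a blocked or incomplete response."""
        if self.response_validation and not ok:
            # Validation on: surface the problem and leave history unchanged.
            raise ValueError("response was blocked or incomplete")
        self.history.append((request, response))
```

With validation disabled, a bad turn is stored like any other, which is how the session can end up in an unusable state.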
class vertexai.preview.generative_models.HarmBlockThreshold(value)
Bases: proto.enums.Enum
Probability-based threshold levels for blocking.
Values:
HARM_BLOCK_THRESHOLD_UNSPECIFIED (0):
Unspecified harm block threshold.
BLOCK_LOW_AND_ABOVE (1):
Block low threshold and above (i.e. block
more).
BLOCK_MEDIUM_AND_ABOVE (2):
Block medium threshold and above.
BLOCK_ONLY_HIGH (3):
Block only high threshold (i.e. block less).
BLOCK_NONE (4):
Block none.
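One way to read these thresholds is as a minimum probability level at which content is blocked. The sketch below encodes that reading. The `Probability` levels and the `would_block` helper are hypothetical and exist only for this illustration; the unspecified threshold is left out because its behavior is not described here.

```python
from enum import IntEnum
from typing import Dict, Optional


class Probability(IntEnum):
    """Hypothetical ordered harm-probability levels, used only for this sketch."""
    NEGLIGIBLE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Lowest probability level that each threshold blocks; None blocks nothing.
_MIN_BLOCKED: Dict[str, Optional[Probability]] = {
    "BLOCK_LOW_AND_ABOVE": Probability.LOW,
    "BLOCK_MEDIUM_AND_ABOVE": Probability.MEDIUM,
    "BLOCK_ONLY_HIGH": Probability.HIGH,
    "BLOCK_NONE": None,
}


def would_block(threshold: str, probability: Probability) -> bool:
    """Return True if content at `probability` is blocked under `threshold`."""
    minimum = _MIN_BLOCKED[threshold]  # raises KeyError for names not listed above
    return minimum is not None and probability >= minimum
```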
class vertexai.preview.generative_models.HarmCategory(value)
Bases: proto.enums.Enum
Harm categories that will block the content.
Values:
HARM_CATEGORY_UNSPECIFIED (0):
The harm category is unspecified.
HARM_CATEGORY_HATE_SPEECH (1):
The harm category is hate speech.
HARM_CATEGORY_DANGEROUS_CONTENT (2):
The harm category is dangerous content.
HARM_CATEGORY_HARASSMENT (3):
The harm category is harassment.
HARM_CATEGORY_SEXUALLY_EXPLICIT (4):
The harm category is sexually explicit
content.
class vertexai.preview.generative_models.Image()
Bases: object
The image that can be sent to a generative model.
property data: [bytes](https://python.readthedocs.io/en/latest/library/stdtypes.html#bytes)
Returns the image data.
static from_bytes(data: bytes)
Loads image from image bytes.
Parameters
data – Image bytes.
Returns
Loaded image as an Image object.
static load_from_file(location: str)
Loads image from file.
Parameters
location – Local path from where to load the image.
Returns
Loaded image as an Image object.
class vertexai.preview.generative_models.Part()
Bases: object
A part of a multi-part Content message.
Usage:
```python
text_part = Part.from_text("Why is sky blue?")
image_part = Part.from_image(Image.load_from_file("image.jpg"))
video_part = Part.from_uri(uri="gs://…/video.mp4", mime_type="video/mp4")
function_response_part = Part.from_function_response(
    name="get_current_weather",
    response={
        "content": {"weather_there": "super nice"},
    }
)

response1 = model.generate_content([text_part, image_part])
response2 = model.generate_content(video_part)
response3 = chat.send_message(function_response_part)
```
exception vertexai.preview.generative_models.ResponseBlockedError(message: str, request_contents: List[vertexai.generative_models._generative_models.Content], responses: List[vertexai.generative_models._generative_models.GenerationResponse])
Bases: Exception
with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
exception vertexai.preview.generative_models.ResponseValidationError(message: str, request_contents: List[vertexai.generative_models._generative_models.Content], responses: List[vertexai.generative_models._generative_models.GenerationResponse])
Bases: vertexai.generative_models._generative_models.ResponseBlockedError
with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
class vertexai.preview.generative_models.SafetySetting(*, category: google.cloud.aiplatform_v1beta1.types.content.HarmCategory, threshold: google.cloud.aiplatform_v1beta1.types.content.SafetySetting.HarmBlockThreshold, method: Optional[google.cloud.aiplatform_v1beta1.types.content.SafetySetting.HarmBlockMethod] = None)
Bases: object
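Safety settings: a harm category, the threshold at which content in that category is blocked, and optionally the scoring method the threshold applies to.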
Parameters
category – Harm category.
threshold – The harm block threshold.
method – Specify if the threshold is used for probability or severity score. If not specified, the threshold is used for probability score.
class HarmBlockMethod(value)
Bases: proto.enums.Enum
Probability vs severity.
Values:
HARM_BLOCK_METHOD_UNSPECIFIED (0):
The harm block method is unspecified.
SEVERITY (1):
The harm block method uses both probability
and severity scores.
PROBABILITY (2):
The harm block method uses the probability
score.
class HarmBlockThreshold(value)
Bases: proto.enums.Enum
Probability-based threshold levels for blocking.
Values:
HARM_BLOCK_THRESHOLD_UNSPECIFIED (0):
Unspecified harm block threshold.
BLOCK_LOW_AND_ABOVE (1):
Block low threshold and above (i.e. block
more).
BLOCK_MEDIUM_AND_ABOVE (2):
Block medium threshold and above.
BLOCK_ONLY_HIGH (3):
Block only high threshold (i.e. block less).
BLOCK_NONE (4):
Block none.
class HarmCategory(value)
Bases: proto.enums.Enum
Harm categories that will block the content.
Values:
HARM_CATEGORY_UNSPECIFIED (0):
The harm category is unspecified.
HARM_CATEGORY_HATE_SPEECH (1):
The harm category is hate speech.
HARM_CATEGORY_DANGEROUS_CONTENT (2):
The harm category is dangerous content.
HARM_CATEGORY_HARASSMENT (3):
The harm category is harassment.
HARM_CATEGORY_SEXUALLY_EXPLICIT (4):
The harm category is sexually explicit
content.
class vertexai.preview.generative_models.Tool(function_declarations: List[vertexai.generative_models._generative_models.FunctionDeclaration])
Bases: object
A collection of functions that the model may use to generate response.
Usage:
Create tool from function declarations:
```python
get_current_weather_func = generative_models.FunctionDeclaration(...)
weather_tool = generative_models.Tool(
    function_declarations=[get_current_weather_func],
)
```
Use tool in GenerativeModel.generate_content:
```python
model = GenerativeModel("gemini-pro")
print(model.generate_content(
    "What is the weather like in Boston?",
    # You can specify tools when creating a model to avoid having to send them with every request.
    tools=[weather_tool],
))
```
Use tool in chat:
```python
model = GenerativeModel(
    "gemini-pro",
    # You can specify tools when creating a model to avoid having to send them with every request.
    tools=[weather_tool],
)
chat = model.start_chat()
print(chat.send_message("What is the weather like in Boston?"))
print(chat.send_message(
    Part.from_function_response(
        name="get_current_weather",
        response={
            "content": {"weather_there": "super nice"},
        }
    ),
))
```
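In the chat usage above, the function result is written by hand. In practice the application usually runs a local function and wraps its result. Here is a minimal sketch of that step, assuming a hypothetical registry keyed by the declared function name. `get_current_weather`, `REGISTRY`, and `build_function_response` are illustrative names, and the weather data is made up.

```python
from typing import Any, Callable, Dict


def get_current_weather(location: str, unit: str = "celsius") -> Dict[str, Any]:
    """Hypothetical local implementation; a real one would query a weather service."""
    return {"location": location, "unit": unit, "weather_there": "super nice"}


# Maps declared function names to the Python callables that implement them.
REGISTRY: Dict[str, Callable[..., Dict[str, Any]]] = {
    "get_current_weather": get_current_weather,
}


def build_function_response(name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Run the named function and return keyword arguments shaped like the
    name/response pair passed to Part.from_function_response in the usage above."""
    result = REGISTRY[name](**args)
    return {"name": name, "response": {"content": result}}
```

The returned dict could then be unpacked into `Part.from_function_response(**kwargs)` before sending it back to the chat.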
class vertexai.preview.generative_models.ToolConfig(function_calling_config: vertexai.generative_models._generative_models.ToolConfig.FunctionCallingConfig)
Bases: object
Config shared for all tools provided in the request.
Usage:
Create ToolConfig
```python
tool_config = ToolConfig(
    function_calling_config=ToolConfig.FunctionCallingConfig(
        mode=ToolConfig.FunctionCallingConfig.Mode.ANY,
        # Names here refer to the declared function name, not the Python variable.
        allowed_function_names=["get_current_weather"],
    )
)
```
Use ToolConfig in GenerativeModel.generate_content:
```python
model = GenerativeModel("gemini-pro")
print(model.generate_content(
    "What is the weather like in Boston?",
    # You can specify tools when creating a model to avoid having to send them with every request.
    tools=[weather_tool],
    tool_config=tool_config,
))
```
Use ToolConfig in chat:
```python
model = GenerativeModel(
    "gemini-pro",
    # You can specify tools when creating a model to avoid having to send them with every request.
    tools=[weather_tool],
    tool_config=tool_config,
)
chat = model.start_chat()
print(chat.send_message("What is the weather like in Boston?"))
print(chat.send_message(
    Part.from_function_response(
        name="get_current_weather",
        response={
            "content": {"weather_there": "super nice"},
        }
    ),
))
```
class vertexai.preview.generative_models.grounding()
Bases: object
Grounding namespace.
class GoogleSearchRetrieval(disable_attribution: Optional[bool] = None)
Bases: object
Tool to retrieve public web data for grounding, powered by Google Search.
disable_attribution
Optional. Disable using the result from this tool in detecting grounding attribution. This does not affect how the result is given to the model for generation.
Type
bool
Initializes a Google Search Retrieval tool.
Parameters
disable_attribution (bool) – Optional. Disable using the result from this tool in detecting grounding attribution. This does not affect how the result is given to the model for generation.
class Retrieval(source: vertexai.generative_models._generative_models.grounding.VertexAISearch, disable_attribution: Optional[bool] = None)
Bases: object
Defines a retrieval tool that model can call to access external knowledge.
Initializes a Retrieval tool.
Parameters
source (VertexAISearch) – Set to use data source powered by Vertex AI Search.
disable_attribution (bool) – Optional. Disable using the result from this tool in detecting grounding attribution. This does not affect how the result is given to the model for generation.
class VertexAISearch(datastore: str)
Bases: object
Retrieve from Vertex AI Search datastore for grounding. See https://cloud.google.com/vertex-ai-search-and-conversation
Initializes a Vertex AI Search tool.
Parameters
datastore (str) – Required. Fully qualified Vertex AI Search datastore resource ID, in the form projects/<>/locations/<>/collections/<>/dataStores/<>
Classes for working with language models.
class vertexai.language_models.ChatMessage(content: str, author: str)
Bases: object
A chat message.
content
Content of the message.
Type
str
author
Author of the message.
Type
str
class vertexai.language_models.ChatModel(model_id: str, endpoint_name: Optional[str] = None)
Bases: vertexai.language_models._language_models._ChatModelBase, vertexai.language_models._language_models._TunableChatModelMixin, vertexai.language_models._language_models._RlhfTunableModelMixin
ChatModel represents a language model that is capable of chat.
Examples:
```python
chat_model = ChatModel.from_pretrained("chat-bison@001")
chat = chat_model.start_chat(
    context="My name is Ned. You are my personal assistant. My favorite movies are Lord of the Rings and Hobbit.",
    examples=[
        InputOutputTextPair(
            input_text="Who do you work for?",
            output_text="I work for Ned.",
        ),
        InputOutputTextPair(
            input_text="What do I like?",
            output_text="Ned likes watching movies.",
        ),
    ],
    temperature=0.3,
)
chat.send_message("Do you know any cool events this weekend?")
```
Creates a LanguageModel.
This constructor should not be called directly. Use LanguageModel.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Vertex LLM. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
classmethod from_pretrained(model_name: str)
Loads a _ModelGardenModel.
Parameters
model_name – Name of the model.
Returns
An instance of a class derived from _ModelGardenModel.
Raises
ValueError – If model_name is unknown.
ValueError – If model does not support this class.
classmethod get_tuned_model(tuned_model_name: str)
Loads the specified tuned language model.
list_tuned_model_names()
Lists the names of tuned models.
Returns
A list of tuned models that can be used with the get_tuned_model method.
start_chat(*, context: Optional[str] = None, examples: Optional[List[vertexai.language_models.InputOutputTextPair]] = None, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, top_k: Optional[int] = None, top_p: Optional[float] = None, message_history: Optional[List[vertexai.language_models.ChatMessage]] = None, stop_sequences: Optional[List[str]] = None)
Starts a chat session with the model.
Parameters
context – Context shapes how the model responds throughout the conversation. For example, you can use context to specify words the model can or cannot use, topics to focus on or avoid, or the response format or style
examples – List of structured messages to the model to learn how to respond to the conversation. A list of InputOutputTextPair objects.
max_output_tokens – Max length of the output text in tokens. Range: [1, 1024].
temperature – Controls the randomness of predictions. Range: [0, 1]. Default: 0.
top_k – The number of highest probability vocabulary tokens to keep for top-k-filtering. Range: [1, 40]. Default: 40.
top_p – The cumulative probability threshold for nucleus sampling: only the highest-probability vocabulary tokens whose cumulative probability reaches this value are kept. Range: [0, 1]. Default: 0.95.
message_history – A list of previously sent and received messages.
stop_sequences – Customized stop sequences to stop the decoding process.
Returns
A ChatSession object.
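To make the `top_k` and `top_p` parameters concrete, here is a small sketch of top-k filtering followed by nucleus (top-p) filtering over a toy token distribution. It illustrates the general technique only; the function name and the toy probabilities are made up, and nothing here is claimed about how the service applies these parameters internally.

```python
from typing import Dict, List, Tuple


def filter_top_k_top_p(probs: Dict[str, float], top_k: int, top_p: float) -> Dict[str, float]:
    """Keep the top_k most likely tokens, then the smallest prefix of them whose
    cumulative probability reaches top_p, and renormalize what is left."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:top_k]
    kept: List[Tuple[str, float]] = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Toy distribution over four tokens.
toy = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
filtered = filter_top_k_top_p(toy, top_k=3, top_p=0.7)
```

With these inputs, top-k drops "d", and the nucleus step keeps "a" and "b" because their cumulative probability is the first to reach 0.7; sampling would then draw from the renormalized pair.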
tune_model(training_data: Union[str, pandas.core.frame.DataFrame], *, train_steps: Optional[int] = None, learning_rate_multiplier: Optional[float] = None, tuning_job_location: Optional[str] = None, tuned_model_location: Optional[str] = None, model_display_name: Optional[str] = None, default_context: Optional[str] = None, accelerator_type: Optional[Literal['TPU', 'GPU']] = None, tuning_evaluation_spec: Optional[TuningEvaluationSpec] = None)
Tunes a model based on training data.
This method launches and returns an asynchronous model tuning job.
Usage:
```python
tuning_job = model.tune_model(...)
# ... do some other work
tuned_model = tuning_job.get_tuned_model()  # Blocks until tuning is complete
```
Parameters
training_data – A Pandas DataFrame or a URI pointing to data in JSON lines format. The dataset schema is model-specific. See https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models#dataset_format
train_steps – Number of training batches to tune on (batch size is 8 samples).
learning_rate – Deprecated. Use learning_rate_multiplier instead. Learning rate to use in tuning.
learning_rate_multiplier – Learning rate multiplier to use in tuning.
tuning_job_location – GCP location where the tuning job should be run.
tuned_model_location – GCP location where the tuned model should be deployed.
model_display_name – Custom display name for the tuned model.
default_context – The context to use for all training samples by default.
accelerator_type – Type of accelerator to use. Can be “TPU” or “GPU”.
tuning_evaluation_spec – Specification for the model evaluation during tuning.
Returns
A LanguageModelTuningJob object that represents the tuning job. Calling job.result() blocks until the tuning is complete and returns a LanguageModel object.
Raises
ValueError – If the “tuning_job_location” value is not supported
ValueError – If the “tuned_model_location” value is not supported
RuntimeError – If the model does not support tuning
AttributeError – If any attribute in the “tuning_evaluation_spec” is not supported
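Because each training step processes a batch of 8 samples, `train_steps` can be related to dataset size with simple arithmetic. The helpers below are a hypothetical sketch of that calculation; the dataset size used in any call is the caller's own figure.

```python
BATCH_SIZE = 8  # stated above: each training step consumes a batch of 8 samples


def samples_seen(train_steps: int) -> int:
    """Total training samples processed over `train_steps` steps."""
    return train_steps * BATCH_SIZE


def approx_epochs(train_steps: int, dataset_size: int) -> float:
    """Approximate number of passes over a dataset of `dataset_size` examples."""
    return samples_seen(train_steps) / dataset_size
```

For example, 100 steps process 800 samples, which is about two passes over a 400-example dataset.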
tune_model_rlhf(*, prompt_data: Union[str, pandas.core.frame.DataFrame], preference_data: Union[str, pandas.core.frame.DataFrame], model_display_name: Optional[str] = None, prompt_sequence_length: Optional[int] = None, target_sequence_length: Optional[int] = None, reward_model_learning_rate_multiplier: Optional[float] = None, reinforcement_learning_rate_multiplier: Optional[float] = None, reward_model_train_steps: Optional[int] = None, reinforcement_learning_train_steps: Optional[int] = None, kl_coeff: Optional[float] = None, default_context: Optional[str] = None, tuning_job_location: Optional[str] = None, accelerator_type: Optional[Literal['TPU', 'GPU']] = None, tuning_evaluation_spec: Optional[TuningEvaluationSpec] = None)
Tunes a model using reinforcement learning from human feedback.
This method launches and returns an asynchronous model tuning job.
Usage:
```python
tuning_job = model.tune_model_rlhf(...)
# ... do some other work
tuned_model = tuning_job.get_tuned_model()  # Blocks until tuning is complete
```
Parameters
prompt_data – A Pandas DataFrame or a URI pointing to data in JSON lines format. The dataset schema is model-specific. See https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-text-models-rlhf#prompt-dataset
preference_data – A Pandas DataFrame or a URI pointing to data in JSON lines format. The dataset schema is model-specific. See https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-text-models-rlhf#human-preference-dataset
model_display_name – Custom display name for the tuned model. If not provided, a default name will be created.
prompt_sequence_length – Maximum tokenized sequence length for input text. Higher values increase memory overhead. This value should be at most 8192. Default value is 512.
target_sequence_length – Maximum tokenized sequence length for target text. Higher values increase memory overhead. This value should be at most 1024. Default value is 64.
reward_model_learning_rate_multiplier – Constant used to adjust the base learning rate used when training a reward model. Multiply by a number > 1 to increase the magnitude of updates applied at each training step or multiply by a number < 1 to decrease the magnitude of updates. Default value is 1.0.
reinforcement_learning_rate_multiplier – Constant used to adjust the base learning rate used during reinforcement learning. Multiply by a number > 1 to increase the magnitude of updates applied at each training step or multiply by a number < 1 to decrease the magnitude of updates. Default value is 1.0.
reward_model_train_steps – Number of steps to use when training a reward model. Default value is 1000.
reinforcement_learning_train_steps – Number of reinforcement learning steps to perform when tuning a base model. Default value is 1000.
kl_coeff – Coefficient for KL penalty. This regularizes the policy model and penalizes if it diverges from its initial distribution. If set to 0, the reference language model is not loaded into memory. Default value is 0.1.
default_context – This field lets the model know what task to perform. Base models have been trained over a large set of varied instructions. You can give a simple and intuitive description of the task and the model will follow it, e.g. “Classify this movie review as positive or negative” or “Translate this sentence to Danish”. Do not specify this if your dataset already prepends the instruction to the inputs field.
tuning_job_location – GCP location where the tuning job should be run.
accelerator_type – Type of accelerator to use. Can be “TPU” or “GPU”.
tuning_evaluation_spec – Evaluation settings to use during tuning.
Returns
A LanguageModelTuningJob object that represents the tuning job. Calling job.result() blocks until the tuning is complete and returns a LanguageModel object.
Raises
ValueError – If the “tuning_job_location” value is not supported
RuntimeError – If the model does not support tuning
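The defaults and upper bounds listed above can be captured in a small settings object that is checked before a job is launched. This is a hypothetical sketch: `RlhfSettingsSketch` and `effective_learning_rate` are illustrative names, and the base learning rate is not documented here, so any value passed for it is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RlhfSettingsSketch:
    """Hypothetical container using the defaults and limits listed above."""
    prompt_sequence_length: int = 512
    target_sequence_length: int = 64
    reward_model_learning_rate_multiplier: float = 1.0
    reinforcement_learning_rate_multiplier: float = 1.0
    kl_coeff: float = 0.1

    def validate(self) -> None:
        """Raise ValueError if a sequence length exceeds its documented maximum."""
        if not 0 < self.prompt_sequence_length <= 8192:
            raise ValueError("prompt_sequence_length should be at most 8192")
        if not 0 < self.target_sequence_length <= 1024:
            raise ValueError("target_sequence_length should be at most 1024")


def effective_learning_rate(base_rate: float, multiplier: float) -> float:
    """A multiplier above 1 enlarges each update; a multiplier below 1 shrinks it."""
    return base_rate * multiplier
```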
class vertexai.language_models.ChatSession(model: vertexai.language_models.ChatModel, context: Optional[str] = None, examples: Optional[List[vertexai.language_models.InputOutputTextPair]] = None, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, top_k: Optional[int] = None, top_p: Optional[float] = None, message_history: Optional[List[vertexai.language_models.ChatMessage]] = None, stop_sequences: Optional[List[str]] = None)
Bases: vertexai.language_models._language_models._ChatSessionBase
ChatSession represents a chat session with a language model.
Within a chat session, the model keeps context and remembers the previous conversation.
property message_history: List[vertexai.language_models.ChatMessage]
List of previous messages.
send_message(message: str, *, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, top_k: Optional[int] = None, top_p: Optional[float] = None, stop_sequences: Optional[List[str]] = None, candidate_count: Optional[int] = None, grounding_source: Optional[Union[vertexai.language_models._language_models.WebSearch, vertexai.language_models._language_models.VertexAISearch, vertexai.language_models._language_models.InlineContext]] = None)
Sends message to the language model and gets a response.
Parameters
message – Message to send to the model
max_output_tokens – Max length of the output text in tokens. Range: [1, 1024]. Uses the value specified when calling ChatModel.start_chat by default.
temperature – Controls the randomness of predictions. Range: [0, 1]. Default: 0. Uses the value specified when calling ChatModel.start_chat by default.
top_k – The number of highest probability vocabulary tokens to keep for top-k-filtering. Range: [1, 40]. Default: 40. Uses the value specified when calling ChatModel.start_chat by default.
top_p – The cumulative probability threshold for nucleus sampling: only the highest-probability vocabulary tokens whose cumulative probability reaches this value are kept. Range: [0, 1]. Default: 0.95. Uses the value specified when calling ChatModel.start_chat by default.
stop_sequences – Customized stop sequences to stop the decoding process.
candidate_count – Number of candidates to return.
grounding_source – If specified, grounding feature will be enabled using the grounding source. Default: None.
Returns
A MultiCandidateTextGenerationResponse object that contains the text produced by the model.
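Several parameters fall back to the value given to ChatModel.start_chat when they are not passed per message. A minimal sketch of that fallback logic follows; `resolve_params` and the example defaults are hypothetical and do not reflect the SDK's internals.

```python
from typing import Any, Dict


def resolve_params(session_defaults: Dict[str, Any], **overrides: Any) -> Dict[str, Any]:
    """Merge per-message overrides over session-level defaults, skipping None."""
    merged = dict(session_defaults)  # copy so the session defaults stay untouched
    merged.update({key: value for key, value in overrides.items() if value is not None})
    return merged


# Example session-level values, as might be passed to start_chat.
defaults = {"temperature": 0.3, "top_k": 40, "top_p": 0.95}
```

Calling `resolve_params(defaults, temperature=0.9, top_k=None)` keeps the session's `top_k` and `top_p` and changes only `temperature` for that one message.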
async send_message_async(message: str, *, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, top_k: Optional[int] = None, top_p: Optional[float] = None, stop_sequences: Optional[List[str]] = None, candidate_count: Optional[int] = None, grounding_source: Optional[Union[vertexai.language_models._language_models.WebSearch, vertexai.language_models._language_models.VertexAISearch, vertexai.language_models._language_models.InlineContext]] = None)
Asynchronously sends message to the language model and gets a response.
Parameters
message – Message to send to the model
max_output_tokens – Max length of the output text in tokens. Range: [1, 1024]. Uses the value specified when calling ChatModel.start_chat by default.
temperature – Controls the randomness of predictions. Range: [0, 1]. Default: 0. Uses the value specified when calling ChatModel.start_chat by default.
top_k – The number of highest probability vocabulary tokens to keep for top-k-filtering. Range: [1, 40]. Default: 40. Uses the value specified when calling ChatModel.start_chat by default.
top_p – The cumulative probability threshold for nucleus sampling: only the highest-probability vocabulary tokens whose cumulative probability reaches this value are kept. Range: [0, 1]. Default: 0.95. Uses the value specified when calling ChatModel.start_chat by default.
stop_sequences – Customized stop sequences to stop the decoding process.
candidate_count – Number of candidates to return.
grounding_source – If specified, grounding feature will be enabled using the grounding source. Default: None.
Returns
A MultiCandidateTextGenerationResponse object that contains the text produced by the model.
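Because send_message_async is a coroutine, several independent chat sessions can be queried concurrently. The sketch below shows one way to do that with asyncio.gather. FakeChatSession is a hypothetical stand-in that only mimics the send_message_async call shape so the snippet runs without credentials; with the SDK, the sessions would come from start_chat instead.

```python
import asyncio
from typing import Any, List


class FakeChatSession:
    """Hypothetical stand-in for a chat session; only mimics send_message_async."""

    def __init__(self, name: str) -> None:
        self.name = name

    async def send_message_async(self, message: str, **kwargs: Any) -> str:
        await asyncio.sleep(0)  # Yield control, as a real network call would.
        return f"{self.name} got: {message}"


async def ask_all(sessions: List[Any], message: str) -> List[Any]:
    """Send the same message to several independent sessions concurrently."""
    calls = [s.send_message_async(message, temperature=0.2) for s in sessions]
    # gather returns results in the same order as the awaitables it was given.
    return await asyncio.gather(*calls)


replies = asyncio.run(ask_all([FakeChatSession("a"), FakeChatSession("b")], "hi"))
print(replies)
```

Messages within a single session are probably best sent one at a time, since each reply becomes part of that session's history.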
send_message_streaming(message: str, *, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, top_k: Optional[int] = None, top_p: Optional[float] = None, stop_sequences: Optional[List[str]] = None)
Sends message to the language model and gets a streamed response.
The response is only added to the history once it’s fully read.
Parameters
message – Message to send to the model
max_output_tokens – Max length of the output text in tokens. Range: [1, 1024]. Uses the value specified when calling ChatModel.start_chat by default.
temperature – Controls the randomness of predictions. Range: [0, 1]. Default: 0. Uses the value specified when calling ChatModel.start_chat by default.
top_k – The number of highest probability vocabulary tokens to keep for top-k-filtering. Range: [1, 40]. Default: 40. Uses the value specified when calling ChatModel.start_chat by default.
top_p – The cumulative probability of the highest-probability vocabulary tokens to keep for nucleus sampling. Range: [0, 1]. Default: 0.95. Uses the value specified when calling ChatModel.start_chat by default.
stop_sequences – Customized stop sequences to stop the decoding process. Uses the value specified when calling ChatModel.start_chat by default.
Yields
A stream of TextGenerationResponse objects that contain partial responses produced by the model.
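Since the streamed response is only added to the history once it is fully read, a caller needs to exhaust the stream. The sketch below shows that consumption pattern. FakeChunk and fake_stream are hypothetical stand-ins so the snippet runs offline, and the text attribute on each partial response is an assumption not stated in this section.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a partial response; assumes a `text` attribute."""
    text: str


def fake_stream() -> Iterator[FakeChunk]:
    """Stand-in for the generator returned by a streaming call."""
    for piece in ["Hel", "lo, ", "world"]:
        yield FakeChunk(piece)


def read_stream(chunks: Iterable[FakeChunk]) -> str:
    """Consume every partial response and return the concatenated text."""
    parts = []
    for chunk in chunks:
        parts.append(chunk.text)  # A real caller might print each piece here.
    return "".join(parts)


full_text = read_stream(fake_stream())
print(full_text)
```

With the SDK, the argument to read_stream would be the result of chat.send_message_streaming(...) rather than fake_stream().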
async send_message_streaming_async(message: str, *, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, top_k: Optional[int] = None, top_p: Optional[float] = None, stop_sequences: Optional[List[str]] = None)
Asynchronously sends message to the language model and gets a streamed response.
The response is only added to the history once it’s fully read.
Parameters
message – Message to send to the model
max_output_tokens – Max length of the output text in tokens. Range: [1, 1024]. Uses the value specified when calling ChatModel.start_chat by default.
temperature – Controls the randomness of predictions. Range: [0, 1]. Default: 0. Uses the value specified when calling ChatModel.start_chat by default.
top_k – The number of highest probability vocabulary tokens to keep for top-k-filtering. Range: [1, 40]. Default: 40. Uses the value specified when calling ChatModel.start_chat by default.
top_p – The cumulative probability of the highest-probability vocabulary tokens to keep for nucleus sampling. Range: [0, 1]. Default: 0.95. Uses the value specified when calling ChatModel.start_chat by default.
stop_sequences – Customized stop sequences to stop the decoding process. Uses the value specified when calling ChatModel.start_chat by default.
Yields
A stream of TextGenerationResponse objects that contain partial responses produced by the model.
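The asynchronous variant can be read with async for, assuming the value it yields behaves as an async iterator (as the Yields section suggests). The sketch below uses a hypothetical async generator, fake_stream_async, in place of the SDK call so that it runs without credentials.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_stream_async() -> AsyncIterator[str]:
    """Hypothetical stand-in for an asynchronous stream of partial responses."""
    for piece in ["par", "tial"]:
        await asyncio.sleep(0)  # Simulate waiting for the next piece.
        yield piece


async def collect(stream: AsyncIterator[str]) -> List[str]:
    """Read the stream to the end and return every piece in arrival order."""
    return [piece async for piece in stream]


pieces = asyncio.run(collect(fake_stream_async()))
print(pieces)
```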
class vertexai.language_models.CodeChatModel(model_id: str, endpoint_name: Optional[str] = None)
Bases: vertexai.language_models._language_models._ChatModelBase, vertexai.language_models._language_models._TunableChatModelMixin
CodeChatModel represents a model that is capable of completing code.
Examples
from vertexai.language_models import CodeChatModel
code_chat_model = CodeChatModel.from_pretrained("codechat-bison@001")
code_chat = code_chat_model.start_chat(
    context="I'm writing a large-scale enterprise application.",
    max_output_tokens=128,
    temperature=0.2,
)
code_chat.send_message("Please help write a function to calculate the min of two numbers")
Creates a LanguageModel.
This constructor should not be called directly. Use LanguageModel.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Vertex LLM. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
classmethod from_pretrained(model_name: str)
Loads a _ModelGardenModel.
Parameters
model_name – Name of the model.
Returns
An instance of a class derived from _ModelGardenModel.
Raises
ValueError – If model_name is unknown.
ValueError – If model does not support this class.
classmethod get_tuned_model(tuned_model_name: str)
Loads the specified tuned language model.
list_tuned_model_names()
Lists the names of tuned models.
Returns
A list of tuned models that can be used with the get_tuned_model method.
start_chat(*, context: Optional[str] = None, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, message_history: Optional[List[vertexai.language_models.ChatMessage]] = None, stop_sequences: Optional[List[str]] = None)
Starts a chat session with the code chat model.
Parameters
context – Context shapes how the model responds throughout the conversation. For example, you can use context to specify words the model can or cannot use, topics to focus on or avoid, or the response format or style.
max_output_tokens – Max length of the output text in tokens. Range: [1, 1000].
temperature – Controls the randomness of predictions. Range: [0, 1].
stop_sequences – Customized stop sequences to stop the decoding process.
Returns
A ChatSession object.
tune_model(training_data: Union[str, pandas.core.frame.DataFrame], *, train_steps: Optional[int] = None, learning_rate_multiplier: Optional[float] = None, tuning_job_location: Optional[str] = None, tuned_model_location: Optional[str] = None, model_display_name: Optional[str] = None, default_context: Optional[str] = None, accelerator_type: Optional[Literal['TPU', 'GPU']] = None, tuning_evaluation_spec: Optional[TuningEvaluationSpec] = None)
Tunes a model based on training data.
This method launches and returns an asynchronous model tuning job.
Usage:
tuning_job = model.tune_model(...)
# ... do some other work
tuned_model = tuning_job.get_tuned_model()  # Blocks until tuning is complete
Parameters
training_data – A Pandas DataFrame or a URI pointing to data in JSON lines format. The dataset schema is model-specific. See https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models#dataset_format
train_steps – Number of training batches to tune on (batch size is 8 samples).
learning_rate – Deprecated. Use learning_rate_multiplier instead. Learning rate to use in tuning.
learning_rate_multiplier – Learning rate multiplier to use in tuning.
tuning_job_location – GCP location where the tuning job should be run.
tuned_model_location – GCP location where the tuned model should be deployed.
model_display_name – Custom display name for the tuned model.
default_context – The context to use for all training samples by default.
accelerator_type – Type of accelerator to use. Can be “TPU” or “GPU”.
tuning_evaluation_spec – Specification for the model evaluation during tuning.
Returns
A LanguageModelTuningJob object that represents the tuning job. Calling job.result() blocks until the tuning is complete and returns a LanguageModel object.
Raises
ValueError – If the “tuning_job_location” value is not supported
ValueError – If the “tuned_model_location” value is not supported
RuntimeError – If the model does not support tuning
AttributeError – If any attribute in the “tuning_evaluation_spec” is not supported
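The train_steps description above fixes the batch size at 8 samples, which allows a quick estimate of how much data a tuning run covers. The sketch below does that arithmetic. The helper names and the notion of "epochs" are illustrative assumptions, not SDK parameters, and the estimate assumes every batch holds exactly 8 samples.

```python
import math

BATCH_SIZE = 8  # Stated in the train_steps description above.


def examples_seen(train_steps: int) -> int:
    """Number of training samples processed in `train_steps` batches."""
    return train_steps * BATCH_SIZE


def steps_for_epochs(dataset_size: int, epochs: int) -> int:
    """Smallest train_steps that covers the dataset `epochs` times."""
    return math.ceil(dataset_size * epochs / BATCH_SIZE)


print(examples_seen(100))        # 100 batches of 8 samples
print(steps_for_epochs(500, 3))  # 1500 samples, rounded up to whole batches
```

For example, 100 steps process 800 samples, and covering a 500-sample dataset three times takes 188 steps.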
class vertexai.language_models.CodeChatSession(model: vertexai.language_models.CodeChatModel, context: Optional[str] = None, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, message_history: Optional[List[vertexai.language_models.ChatMessage]] = None, stop_sequences: Optional[List[str]] = None)
Bases: vertexai.language_models._language_models._ChatSessionBase
CodeChatSession represents a chat session with code chat language model.
Within a code chat session, the model keeps context and remembers the previous conversation.
property message_history: List[vertexai.language_models.ChatMessage]
List of previous messages.
send_message(message: str, *, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, stop_sequences: Optional[List[str]] = None, candidate_count: Optional[int] = None)
Sends message to the code chat model and gets a response.
Parameters
message – Message to send to the model
max_output_tokens – Max length of the output text in tokens. Range: [1, 1000]. Uses the value specified when calling CodeChatModel.start_chat by default.
temperature – Controls the randomness of predictions. Range: [0, 1]. Uses the value specified when calling CodeChatModel.start_chat by default.
stop_sequences – Customized stop sequences to stop the decoding process.
candidate_count – Number of candidates to return.
Returns
A MultiCandidateTextGenerationResponse object that contains the text produced by the model.
async send_message_async(message: str, *, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, candidate_count: Optional[int] = None)
Asynchronously sends message to the code chat model and gets a response.
Parameters
message – Message to send to the model
max_output_tokens – Max length of the output text in tokens. Range: [1, 1000]. Uses the value specified when calling CodeChatModel.start_chat by default.
temperature – Controls the randomness of predictions. Range: [0, 1]. Uses the value specified when calling CodeChatModel.start_chat by default.
candidate_count – Number of candidates to return.
Returns
A MultiCandidateTextGenerationResponse object that contains the text produced by the model.
send_message_streaming(message: str, *, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, stop_sequences: Optional[List[str]] = None)
Sends message to the language model and gets a streamed response.
The response is only added to the history once it’s fully read.
Parameters
message – Message to send to the model
max_output_tokens – Max length of the output text in tokens. Range: [1, 1024]. Uses the value specified when calling ChatModel.start_chat by default.
temperature – Controls the randomness of predictions. Range: [0, 1]. Default: 0. Uses the value specified when calling ChatModel.start_chat by default.
stop_sequences – Customized stop sequences to stop the decoding process. Uses the value specified when calling ChatModel.start_chat by default.
Returns
A stream of TextGenerationResponse objects that contain partial responses produced by the model.
send_message_streaming_async(message: str, *, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, stop_sequences: Optional[List[str]] = None)
Asynchronously sends message to the language model and gets a streamed response.
The response is only added to the history once it’s fully read.
Parameters
message – Message to send to the model
max_output_tokens – Max length of the output text in tokens. Range: [1, 1024]. Uses the value specified when calling ChatModel.start_chat by default.
temperature – Controls the randomness of predictions. Range: [0, 1]. Default: 0. Uses the value specified when calling ChatModel.start_chat by default.
stop_sequences – Customized stop sequences to stop the decoding process. Uses the value specified when calling ChatModel.start_chat by default.
Returns
A stream of TextGenerationResponse objects that contain partial responses produced by the model.
class vertexai.language_models.CodeGenerationModel(model_id: str, endpoint_name: Optional[str] = None)
Bases: vertexai.language_models._CodeGenerationModel, vertexai.language_models._language_models._TunableTextModelMixin, vertexai.language_models._language_models._ModelWithBatchPredict
Creates a LanguageModel.
This constructor should not be called directly. Use LanguageModel.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Vertex LLM. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
batch_predict(*, dataset: Union[str, List[str]], destination_uri_prefix: str, model_parameters: Optional[Dict] = None)
Starts a batch prediction job with the model.
Parameters
dataset – The location of the dataset. gs:// and bq:// URIs are supported.
destination_uri_prefix – The URI prefix for the prediction. gs:// and bq:// URIs are supported.
model_parameters – Model-specific parameters to send to the model.
Returns
A BatchPredictionJob object
Raises
ValueError – When source or destination URI is not supported.
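Since batch_predict raises ValueError for unsupported URIs, it can be convenient to check URIs before starting a job. The helper below is a hypothetical client-side pre-check based only on the gs:// and bq:// schemes named above; it is not the SDK's own validation, which may apply further rules.

```python
SUPPORTED_PREFIXES = ("gs://", "bq://")  # Schemes named in the parameter docs above.


def check_batch_uri(uri: str) -> str:
    """Return the URI unchanged if it uses a supported scheme, else raise."""
    # str.startswith accepts a tuple and matches any of its prefixes.
    if not uri.startswith(SUPPORTED_PREFIXES):
        raise ValueError(f"Unsupported URI {uri!r}: expected a gs:// or bq:// URI")
    return uri


# Hypothetical bucket and path, for illustration only.
dataset_uri = check_batch_uri("gs://example-bucket/prompts.jsonl")
print(dataset_uri)
```

A checked URI can then be passed as dataset or destination_uri_prefix in the batch_predict call.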
classmethod from_pretrained(model_name: str)
Loads a _ModelGardenModel.
Parameters
model_name – Name of the model.
Returns
An instance of a class derived from _ModelGardenModel.
Raises
ValueError – If model_name is unknown.
ValueError – If model does not support this class.
classmethod get_tuned_model(tuned_model_name: str)
Loads the specified tuned language model.
list_tuned_model_names()
Lists the names of tuned models.
Returns
A list of tuned models that can be used with the get_tuned_model method.
predict(prefix: str, suffix: Optional[str] = None, *, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, stop_sequences: Optional[List[str]] = None, candidate_count: Optional[int] = None)
Gets model response for a single prompt.
Parameters
prefix – Code before the current point.
suffix – Code after the current point.
max_output_tokens – Max length of the output text in tokens. Range: [1, 1000].
temperature – Controls the randomness of predictions. Range: [0, 1].
stop_sequences – Customized stop sequences to stop the decoding process.
candidate_count – Number of response candidates to return.
Returns
A MultiCandidateTextGenerationResponse object that contains the text produced by the model.
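The prefix and suffix arguments correspond to the code before and after the point where a completion is wanted. The sketch below shows one way to derive them from a source string and a cursor offset. The helper split_at_cursor and the sample source are illustrative assumptions and are not part of the SDK.

```python
from typing import Tuple


def split_at_cursor(source: str, cursor: int) -> Tuple[str, str]:
    """Return (prefix, suffix): the code before and after the cursor offset."""
    if not 0 <= cursor <= len(source):
        raise ValueError("cursor is outside the source text")
    return source[:cursor], source[cursor:]


source = "def add(a, b):\n    \n\nprint(add(1, 2))\n"
# Place the cursor just after the indentation of the empty function body.
cursor = source.index("    ") + 4
prefix, suffix = split_at_cursor(source, cursor)
print(repr(prefix))
print(repr(suffix))
```

The two strings can then be passed as model.predict(prefix=prefix, suffix=suffix), optionally with max_output_tokens and temperature.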
async predict_async(prefix: str, suffix: Optional[str] = None, *, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, stop_sequences: Optional[List[str]] = None, candidate_count: Optional[int] = None)
Asynchronously gets model response for a single prompt.
Parameters
prefix – Code before the current point.
suffix – Code after the current point.
max_output_tokens – Max length of the output text in tokens. Range: [1, 1000].
temperature – Controls the randomness of predictions. Range: [0, 1].
stop_sequences – Customized stop sequences to stop the decoding process.
candidate_count – Number of response candidates to return.
Returns
A MultiCandidateTextGenerationResponse object that contains the text produced by the model.
predict_streaming(prefix: str, suffix: Optional[str] = None, *, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, stop_sequences: Optional[List[str]] = None)
Predicts the code based on previous code.
The result is a stream (generator) of partial responses.
Parameters
prefix – Code before the current point.
suffix – Code after the current point.
max_output_tokens – Max length of the output text in tokens. Range: [1, 1000].
temperature – Controls the randomness of predictions. Range: [0, 1].
stop_sequences – Customized stop sequences to stop the decoding process.
Yields
A stream of TextGenerationResponse objects that contain partial responses produced by the model.
async predict_streaming_async(prefix: str, suffix: Optional[str] = None, *, max_output_tokens: Optional[int] = None, temperature: Optional[float] = None, stop_sequences: Optional[List[str]] = None)
Asynchronously predicts the code based on previous code.
The result is a stream (generator) of partial responses.
Parameters
prefix – Code before the current point.
suffix – Code after the current point.
max_output_tokens – Max length of the output text in tokens. Range: [1, 1000].
temperature – Controls the randomness of predictions. Range: [0, 1].
stop_sequences – Customized stop sequences to stop the decoding process.
Yields
A stream of TextGenerationResponse objects that contain partial responses produced by the model.
tune_model(training_data: Union[str, pandas.core.frame.DataFrame], *, train_steps: Optional[int] = None, learning_rate_multiplier: Optional[float] = None, tuning_job_location: Optional[str] = None, tuned_model_location: Optional[str] = None, model_display_name: Optional[str] = None, tuning_evaluation_spec: Optional[TuningEvaluationSpec] = None, accelerator_type: Optional[Literal['TPU', 'GPU']] = None, max_context_length: Optional[str] = None)
Tunes a model based on training data.
This method launches and returns an asynchronous model tuning job.
Usage:
tuning_job = model.tune_model(...)
# ... do some other work
tuned_model = tuning_job.get_tuned_model()  # Blocks until tuning is complete
Parameters
training_data – A Pandas DataFrame or a URI pointing to data in JSON lines format. The dataset schema is model-specific. See https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models#dataset_format
train_steps – Number of training batches to tune on (batch size is 8 samples).
learning_rate_multiplier – Learning rate multiplier to use in tuning.
tuning_job_location – GCP location where the tuning job should be run.
tuned_model_location – GCP location where the tuned model should be deployed.
model_display_name – Custom display name for the tuned model.
tuning_evaluation_spec – Specification for the model evaluation during tuning.
accelerator_type – Type of accelerator to use. Can be “TPU” or “GPU”.
max_context_length – The max context length used for tuning. Can be either ‘8k’ or ‘32k’
Returns
A LanguageModelTuningJob object that represents the tuning job. Calling job.result() blocks until the tuning is complete and returns a LanguageModel object.
Raises
ValueError – If the “tuning_job_location” value is not supported
ValueError – If the “tuned_model_location” value is not supported
RuntimeError – If the model does not support tuning
class vertexai.language_models.GroundingSource()
Bases: object
class InlineContext(inline_context: str)
Bases: vertexai.language_models._language_models._GroundingSourceBase
InlineContext represents a grounding source using provided inline context.
inline_context
The content used as inline context.
Type
str
class VertexAISearch(data_store_id: str, location: str, project: Optional[str] = None, disable_attribution: bool = False)
Bases: vertexai.language_models._language_models._GroundingSourceBase
VertexAISearch represents a grounding source using a Vertex AI Search data store.
data_store_id
Data store ID of the Vertex AI Search data store.
Type
str
location
GCP multi-region where you have set up your Vertex AI Search data store. Possible values include global, us, and eu. Learn more about Vertex AI Search locations here: https://cloud.google.com/generative-ai-app-builder/docs/locations
Type
str
project
The project where you have set up your Vertex AI Search. If not specified, your Vertex AI Search is assumed to be in your current project.
Type
Optional[str]
disable_attribution
If set to True, skip finding claim attributions (i.e., do not generate grounding citations). Default: False.
Type
bool
class WebSearch(disable_attribution: bool = False)
Bases: vertexai.language_models._language_models._GroundingSourceBase
WebSearch represents a grounding source using public web search.
disable_attribution
If set to True, skip finding claim attributions (i.e., do not generate grounding citations). Default: False.
Type
bool
class vertexai.language_models.InputOutputTextPair(input_text: str, output_text: str)
Bases: object
InputOutputTextPair represents a pair of input and output texts.
class vertexai.language_models.TextEmbedding(values: List[float], statistics: Optional[vertexai.language_models.TextEmbeddingStatistics] = None, _prediction_response: Optional[google.cloud.aiplatform.models.Prediction] = None)
Bases: object
Text embedding vector and statistics.
class vertexai.language_models.TextEmbeddingInput(text: str, task_type: Optional[str] = None, title: Optional[str] = None)
Bases: object
Structural text embedding input.
text
The main text content to embed.
Type
str
task_type
The name of the downstream task the embeddings will be used for. Valid values:
RETRIEVAL_QUERY – Specifies that the given text is a query in a search/retrieval setting.
RETRIEVAL_DOCUMENT – Specifies that the given text is a document from the corpus being searched.
SEMANTIC_SIMILARITY – Specifies that the given text will be used for semantic textual similarity (STS).
CLASSIFICATION – Specifies that the given text will be classified.
CLUSTERING – Specifies that the embeddings will be used for clustering.
QUESTION_ANSWERING – Specifies that the embeddings will be used for question answering.
FACT_VERIFICATION – Specifies that the embeddings will be used for fact verification.
Type
Optional[str]
title
Optional identifier of the text content.
Type
Optional[str]
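Because task_type is a plain string, a misspelled value is easy to pass by accident. The sketch below checks a value against the task types listed above before building the constructor arguments. The helper embedding_input_kwargs and the VALID_TASK_TYPES constant are illustrative assumptions, not part of the SDK.

```python
from typing import Any, Dict, Optional

# Task types copied from the task_type description above.
VALID_TASK_TYPES = frozenset({
    "RETRIEVAL_QUERY",
    "RETRIEVAL_DOCUMENT",
    "SEMANTIC_SIMILARITY",
    "CLASSIFICATION",
    "CLUSTERING",
    "QUESTION_ANSWERING",
    "FACT_VERIFICATION",
})


def embedding_input_kwargs(
    text: str, task_type: Optional[str] = None, title: Optional[str] = None
) -> Dict[str, Any]:
    """Validate task_type and return keyword arguments for TextEmbeddingInput."""
    if task_type is not None and task_type not in VALID_TASK_TYPES:
        raise ValueError(f"Unknown task_type: {task_type!r}")
    return {"text": text, "task_type": task_type, "title": title}


query_kwargs = embedding_input_kwargs("What is life?", task_type="RETRIEVAL_QUERY")
print(query_kwargs)
```

The resulting dictionary matches the constructor signature shown above, so it could be expanded as TextEmbeddingInput(**query_kwargs).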
class vertexai.language_models.TextEmbeddingModel(model_id: str, endpoint_name: Optional[str] = None)
Bases: vertexai.language_models._language_models._LanguageModel
TextEmbeddingModel class calculates embeddings for the given texts.
Examples:
# Getting embedding:
from vertexai.language_models import TextEmbeddingModel
model = TextEmbeddingModel.from_pretrained("textembedding-gecko@001")
embeddings = model.get_embeddings(["What is life?"])
for embedding in embeddings:
    vector = embedding.values
    print(len(vector))
Creates a LanguageModel.
This constructor should not be called directly. Use LanguageModel.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Vertex LLM. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
classmethod from_pretrained(model_name: str)
Loads a _ModelGardenModel.
Parameters
model_name – Name of the model.
Returns
An instance of a class derived from _ModelGardenModel.
Raises
ValueError – If model_name is unknown.
ValueError – If model does not support this class.
get_embeddings(texts: List[Union[str, vertexai.language_models.TextEmbeddingInput]], *, auto_truncate: bool = True, output_dimensionality: Optional[int] = None)
Calculates embeddings for the given texts.
Parameters
texts – A list of texts or TextEmbeddingInput objects to embed.
auto_truncate – Whether to automatically truncate long texts. Default: True.
output_dimensionality – Optional dimensions of embeddings. Range: [1, 768]. Default: None.
Returns
A list of TextEmbedding objects.
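A common next step after get_embeddings is to compare two vectors. The sketch below computes cosine similarity over plain lists of floats, the form in which the values field of TextEmbedding exposes a vector. The function name and the toy vectors are illustrative; cosine similarity is one common choice of metric, not one mandated by this reference.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embedding.values (hypothetical data).
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

With real embeddings, the inputs would be embeddings[0].values and embeddings[1].values from a single get_embeddings call.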
async get_embeddings_async(texts: List[Union[str, vertexai.language_models.TextEmbeddingInput]], *, auto_truncate: bool = True, output_dimensionality: Optional[int] = None)
Asynchronously calculates embeddings for the given texts.
Parameters
texts – A list of texts or TextEmbeddingInput objects to embed.
auto_truncate – Whether to automatically truncate long texts. Default: True.
output_dimensionality – Optional dimensions of embeddings. Range: [1, 768]. Default: None.
Returns
A list of TextEmbedding objects.
class vertexai.language_models.TextGenerationModel(model_id: str, endpoint_name: Optional[str] = None)
Bases: vertexai.language_models._language_models._TextGenerationModel, vertexai.language_models._language_models._TunableTextModelMixin, vertexai.language_models._language_models._ModelWithBatchPredict, vertexai.language_models._language_models._RlhfTunableModelMixin
Creates a LanguageModel.
This constructor should not be called directly. Use LanguageModel.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Vertex LLM. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
batch_predict(*, dataset: Union[str, List[str]], destination_uri_prefix: str, model_parameters: Optional[Dict] = None)
Starts a batch prediction job with the model.
Parameters
dataset – The location of the dataset. gs:// and bq:// URIs are supported.
destination_uri_prefix – The URI prefix for the prediction. gs:// and bq:// URIs are supported.
model_parameters – Model-specific parameters to send to the model.
Returns
A BatchPredictionJob object
Raises
ValueError – When source or destination URI is not supported.
classmethod from_pretrained(model_name: str)
Loads a _ModelGardenModel.
Parameters
model_name – Name of the model.
Returns
An instance of a class derived from _ModelGardenModel.
Raises
ValueError – If model_name is unknown.
ValueError – If model does not support this class.
classmethod get_tuned_model(tuned_model_name: str)
Loads the specified tuned language model.
list_tuned_model_names()
Lists the names of tuned models.
Returns
A list of tuned models that can be used with the get_tuned_model method.
predict(prompt: str, *, max_output_tokens: Optional[int] = 128, temperature: Optional[float] = None, top_k: Optional[int] = None, top_p: Optional[float] = None, stop_sequences: Optional[List[str]] = None, candidate_count: Optional[int] = None, grounding_source: Optional[Union[vertexai.language_models._language_models.WebSearch, vertexai.language_models._language_models.VertexAISearch, vertexai.language_models._language_models.InlineContext]] = None, logprobs: Optional[int] = None, presence_penalty: Optional[float] = None, frequency_penalty: Optional[float] = None, logit_bias: Optional[Dict[int, float]] = None)
Gets model response for a single prompt.
Parameters
prompt – Question to ask the model.
max_output_tokens – Max length of the output text in tokens. Range: [1, 1024].
temperature – Controls the randomness of predictions. Range: [0, 1]. Default: 0.
top_k – The number of highest probability vocabulary tokens to keep for top-k-filtering. Range: [1, 40]. Default: 40.
top_p – The cumulative probability of the highest-probability vocabulary tokens to keep for nucleus sampling. Range: [0, 1]. Default: 0.95.
stop_sequences – Customized stop sequences to stop the decoding process.
candidate_count – Number of response candidates to return.
grounding_source – If specified, grounding feature will be enabled using the grounding source. Default: None.
logprobs – Returns the top logprobs most likely candidate tokens with their log probabilities at each generation step. The chosen tokens and their log probabilities at each step are always returned. The chosen token may or may not be in the top logprobs most likely candidates. The minimum value for logprobs is 0, which means only the chosen tokens and their log probabilities are returned. The maximum value for logprobs is 5.
presence_penalty – Positive values penalize tokens that have appeared in the generated text, thus increasing the possibility of generating more diverse topics. Range: [-2.0, 2.0]
frequency_penalty – Positive values penalize tokens that repeatedly appear in the generated text, thus decreasing the possibility of repeating the same content. Range: [-2.0, 2.0]
logit_bias – Mapping from token IDs (integers) to their bias values (floats). The bias values are added to the logits before sampling. Larger positive bias increases the probability of choosing the token. Smaller negative bias decreases the probability of choosing the token. Range: [-100.0, 100.0]
Returns
A MultiCandidateTextGenerationResponse object that contains the text produced by the model.
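The logit_bias description above says the bias values are added to the logits before sampling. The sketch below shows the effect of that addition on a toy set of logits. The token IDs, logit values, and helper names are hypothetical, and the softmax step is included only to show how the resulting probabilities shift.

```python
import math
from typing import Dict


def apply_logit_bias(logits: Dict[int, float], logit_bias: Dict[int, float]) -> Dict[int, float]:
    """Add each token's bias (0.0 if absent) to its logit."""
    return {tok: logit + logit_bias.get(tok, 0.0) for tok, logit in logits.items()}


def softmax(logits: Dict[int, float]) -> Dict[int, float]:
    """Convert logits to probabilities, subtracting the max for numerical stability."""
    peak = max(logits.values())
    exps = {tok: math.exp(value - peak) for tok, value in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


# Hypothetical token IDs and logits.
logits = {101: 2.0, 102: 1.0, 103: 0.5}
biased = apply_logit_bias(logits, {102: 100.0, 103: -100.0})
probs = softmax(biased)
print(probs)
```

A bias at the top of the documented range makes token 102 all but certain, while a bias at the bottom of the range effectively removes token 103.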
async predict_async(prompt: str, *, max_output_tokens: Optional[int] = 128, temperature: Optional[float] = None, top_k: Optional[int] = None, top_p: Optional[float] = None, stop_sequences: Optional[List[str]] = None, candidate_count: Optional[int] = None, grounding_source: Optional[Union[vertexai.language_models._language_models.WebSearch, vertexai.language_models._language_models.VertexAISearch, vertexai.language_models._language_models.InlineContext]] = None, logprobs: Optional[int] = None, presence_penalty: Optional[float] = None, frequency_penalty: Optional[float] = None, logit_bias: Optional[Dict[int, float]] = None)
Asynchronously gets model response for a single prompt.
Parameters
prompt – Question to ask the model.
max_output_tokens – Max length of the output text in tokens. Range: [1, 1024].
temperature – Controls the randomness of predictions. Range: [0, 1]. Default: 0.
top_k – The number of highest probability vocabulary tokens to keep for top-k-filtering. Range: [1, 40]. Default: 40.
top_p – The cumulative probability of the highest-probability vocabulary tokens to keep for nucleus sampling. Range: [0, 1]. Default: 0.95.
stop_sequences – Customized stop sequences to stop the decoding process.
candidate_count – Number of response candidates to return.
grounding_source – If specified, grounding feature will be enabled using the grounding source. Default: None.
logprobs – Returns the top logprobs most likely candidate tokens with their log probabilities at each generation step. The chosen tokens and their log probabilities at each step are always returned. The chosen token may or may not be in the top logprobs most likely candidates. The minimum value for logprobs is 0, which means only the chosen tokens and their log probabilities are returned. The maximum value for logprobs is 5.
presence_penalty – Positive values penalize tokens that have appeared in the generated text, thus increasing the possibility of generating more diverse topics. Range: [-2.0, 2.0]
frequency_penalty – Positive values penalize tokens that repeatedly appear in the generated text, thus decreasing the possibility of repeating the same content. Range: [-2.0, 2.0]
logit_bias – Mapping from token IDs (integers) to their bias values (floats). The bias values are added to the logits before sampling. Larger positive bias increases the probability of choosing the token. Smaller negative bias decreases the probability of choosing the token. Range: [-100.0, 100.0]
Returns
A MultiCandidateTextGenerationResponse object that contains the text produced by the model.
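This reference does not give the formulas behind presence_penalty and frequency_penalty. The sketch below uses one widely used formulation as an assumption: the presence penalty is subtracted once from the logit of any token that has already appeared, and the frequency penalty is subtracted once per occurrence. The service may compute these differently; the helper name and toy values are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def penalize(
    logits: Dict[str, float],
    generated: List[str],
    presence_penalty: float,
    frequency_penalty: float,
) -> Dict[str, float]:
    """Lower the logits of tokens that already appear in `generated`.

    Assumed formulation (not taken from this reference):
    logit - frequency_penalty * count - presence_penalty * (count > 0)
    """
    counts = Counter(generated)
    adjusted = {}
    for token, logit in logits.items():
        count = counts[token]  # Counter returns 0 for unseen tokens.
        penalty = frequency_penalty * count
        if count > 0:
            penalty += presence_penalty
        adjusted[token] = logit - penalty
    return adjusted


logits = {"cat": 1.0, "dog": 1.0, "fish": 1.0}
adjusted = penalize(logits, ["cat", "cat", "dog"], presence_penalty=0.5, frequency_penalty=0.25)
print(adjusted)
```

Under this assumption, the token seen twice is penalized most, the token seen once less, and the unseen token not at all.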
predict_streaming(prompt: str, *, max_output_tokens: int = 128, temperature: Optional[float] = None, top_k: Optional[int] = None, top_p: Optional[float] = None, stop_sequences: Optional[List[str]] = None, logprobs: Optional[int] = None, presence_penalty: Optional[float] = None, frequency_penalty: Optional[float] = None, logit_bias: Optional[Dict[int, float]] = None)
Gets a streaming model response for a single prompt.
The result is a stream (generator) of partial responses.
Parameters
prompt – Question to ask the model.
max_output_tokens – Max length of the output text in tokens. Range: [1, 1024].
temperature – Controls the randomness of predictions. Range: [0, 1]. Default: 0.
top_k – The number of highest probability vocabulary tokens to keep for top-k-filtering. Range: [1, 40]. Default: 40.
top_p – The cumulative probability of the highest-probability vocabulary tokens to keep for nucleus sampling. Range: [0, 1]. Default: 0.95.
stop_sequences – Customized stop sequences to stop the decoding process.
logprobs – Returns the top logprobs most likely candidate tokens with their log probabilities at each generation step. The chosen tokens and their log probabilities at each step are always returned. The chosen token may or may not be in the top logprobs most likely candidates. The minimum value for logprobs is 0, which means only the chosen tokens and their log probabilities are returned. The maximum value for logprobs is 5.
presence_penalty – Positive values penalize tokens that have appeared in the generated text, thus increasing the possibility of generating more diverse topics. Range: [-2.0, 2.0]
frequency_penalty – Positive values penalize tokens that repeatedly appear in the generated text, thus decreasing the possibility of repeating the same content. Range: [-2.0, 2.0]
logit_bias – Mapping from token IDs (integers) to their bias values (floats). The bias values are added to the logits before sampling. Larger positive bias increases the probability of choosing the token. Smaller negative bias decreases the probability of choosing the token. Range: [-100.0, 100.0]
Yields
A stream of TextGenerationResponse objects that contain partial responses produced by the model.
async predict_streaming_async(prompt: str, *, max_output_tokens: int = 128, temperature: Optional[float] = None, top_k: Optional[int] = None, top_p: Optional[float] = None, stop_sequences: Optional[List[str]] = None, logprobs: Optional[int] = None, presence_penalty: Optional[float] = None, frequency_penalty: Optional[float] = None, logit_bias: Optional[Dict[int, float]] = None)
Asynchronously gets a streaming model response for a single prompt.
The result is a stream (generator) of partial responses.
Parameters
prompt – Question to ask the model.
max_output_tokens – Max length of the output text in tokens. Range: [1, 1024].
temperature – Controls the randomness of predictions. Range: [0, 1]. Default: 0.
top_k – The number of highest probability vocabulary tokens to keep for top-k-filtering. Range: [1, 40]. Default: 40.
top_p – The cumulative probability of the highest-probability vocabulary tokens to keep for nucleus sampling. Range: [0, 1]. Default: 0.95.
stop_sequences – Customized stop sequences to stop the decoding process.
logprobs – Returns the top logprobs most likely candidate tokens with their log probabilities at each generation step. The chosen tokens and their log probabilities at each step are always returned. The chosen token may or may not be in the top logprobs most likely candidates. The minimum value for logprobs is 0, which means only the chosen tokens and their log probabilities are returned. The maximum value for logprobs is 5.
presence_penalty – Positive values penalize tokens that have appeared in the generated text, thus increasing the likelihood of generating more diverse topics. Range: [-2.0, 2.0]
frequency_penalty – Positive values penalize tokens that repeatedly appear in the generated text, thus decreasing the possibility of repeating the same content. Range: [-2.0, 2.0]
logit_bias – Mapping from token IDs (integers) to their bias values (floats). The bias values are added to the logits before sampling. Larger positive bias increases the probability of choosing the token. Smaller negative bias decreases the probability of choosing the token. Range: [-100.0, 100.0]
Yields
A stream of TextGenerationResponse objects that contain partial responses produced by the model.
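A stream of partial responses is typically consumed with `async for`, joining the pieces as they arrive. The sketch below shows that pattern without calling any service: `FakeResponse` and `fake_predict_streaming_async` are hypothetical stand-ins, and only the `text` attribute of the response is modeled.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List


@dataclass
class FakeResponse:
    """Hypothetical stand-in for TextGenerationResponse; only `text` is modeled."""

    text: str


async def fake_predict_streaming_async(prompt: str) -> AsyncIterator[FakeResponse]:
    """Hypothetical stand-in that yields partial responses without any network call."""
    for chunk in ["Partial ", "answer to: ", prompt]:
        await asyncio.sleep(0)  # yield control to the event loop, as real I/O would
        yield FakeResponse(text=chunk)


async def collect_text(prompt: str) -> str:
    """Consume the stream chunk by chunk and join the partial texts."""
    parts: List[str] = []
    async for response in fake_predict_streaming_async(prompt):
        parts.append(response.text)
    return "".join(parts)
```

Assuming the real method is consumed the same way, the loop body would stay unchanged and only the iterated call would become `model.predict_streaming_async(prompt)`.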
tune_model(training_data: Union[str, pandas.core.frame.DataFrame], *, train_steps: Optional[int] = None, learning_rate_multiplier: Optional[float] = None, tuning_job_location: Optional[str] = None, tuned_model_location: Optional[str] = None, model_display_name: Optional[str] = None, tuning_evaluation_spec: Optional[TuningEvaluationSpec] = None, accelerator_type: Optional[Literal['TPU', 'GPU']] = None, max_context_length: Optional[str] = None)
Tunes a model based on training data.
This method launches and returns an asynchronous model tuning job.
Usage:
```
tuning_job = model.tune_model(...)
# ... do some other work
tuned_model = tuning_job.get_tuned_model()  # Blocks until tuning is complete
```
Parameters
training_data – A Pandas DataFrame or a URI pointing to data in JSON lines format. The dataset schema is model-specific. See https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models#dataset_format
train_steps – Number of training batches to tune on (batch size is 8 samples).
learning_rate_multiplier – Learning rate multiplier to use in tuning.
tuning_job_location – GCP location where the tuning job should be run.
tuned_model_location – GCP location where the tuned model should be deployed.
model_display_name – Custom display name for the tuned model.
tuning_evaluation_spec – Specification for the model evaluation during tuning.
accelerator_type – Type of accelerator to use. Can be “TPU” or “GPU”.
max_context_length – The max context length used for tuning. Can be either ‘8k’ or ‘32k’
Returns
A LanguageModelTuningJob object that represents the tuning job. Calling job.result() blocks until the tuning is complete and returns a LanguageModel object.
Raises
ValueError – If the “tuning_job_location” value is not supported
ValueError – If the “tuned_model_location” value is not supported
RuntimeError – If the model does not support tuning
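Because train_steps counts batches of 8 samples, the number of training examples processed follows by simple arithmetic. The helpers below are a hypothetical sketch of that relationship; the notion of "epochs" is an assumption added for illustration and is not a parameter of tune_model.

```python
import math

BATCH_SIZE = 8  # batch size stated in the train_steps description


def samples_seen(train_steps: int, batch_size: int = BATCH_SIZE) -> int:
    """Total number of training samples processed after `train_steps` batches."""
    return train_steps * batch_size


def steps_for_epochs(num_examples: int, epochs: int, batch_size: int = BATCH_SIZE) -> int:
    """Smallest train_steps value that covers the dataset `epochs` times."""
    return math.ceil(num_examples * epochs / batch_size)
```

For example, 100 steps process 800 samples, and covering a 1,000-example dataset three times takes 375 steps.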
tune_model_rlhf(*, prompt_data: Union[str, pandas.core.frame.DataFrame], preference_data: Union[str, pandas.core.frame.DataFrame], model_display_name: Optional[str] = None, prompt_sequence_length: Optional[int] = None, target_sequence_length: Optional[int] = None, reward_model_learning_rate_multiplier: Optional[float] = None, reinforcement_learning_rate_multiplier: Optional[float] = None, reward_model_train_steps: Optional[int] = None, reinforcement_learning_train_steps: Optional[int] = None, kl_coeff: Optional[float] = None, default_context: Optional[str] = None, tuning_job_location: Optional[str] = None, accelerator_type: Optional[Literal['TPU', 'GPU']] = None, tuning_evaluation_spec: Optional[TuningEvaluationSpec] = None)
Tunes a model using reinforcement learning from human feedback.
This method launches and returns an asynchronous model tuning job.
Usage:
```
tuning_job = model.tune_model_rlhf(...)
# ... do some other work
tuned_model = tuning_job.get_tuned_model()  # Blocks until tuning is complete
```
Parameters
prompt_data – A Pandas DataFrame or a URI pointing to data in JSON lines format. The dataset schema is model-specific. See https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-text-models-rlhf#prompt-dataset
preference_data – A Pandas DataFrame or a URI pointing to data in JSON lines format. The dataset schema is model-specific. See https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-text-models-rlhf#human-preference-dataset
model_display_name – Custom display name for the tuned model. If not provided, a default name will be created.
prompt_sequence_length – Maximum tokenized sequence length for input text. Higher values increase memory overhead. This value should be at most 8192. Default value is 512.
target_sequence_length – Maximum tokenized sequence length for target text. Higher values increase memory overhead. This value should be at most 1024. Default value is 64.
reward_model_learning_rate_multiplier – Constant used to adjust the base learning rate used when training a reward model. Multiply by a number > 1 to increase the magnitude of updates applied at each training step or multiply by a number < 1 to decrease the magnitude of updates. Default value is 1.0.
reinforcement_learning_rate_multiplier – Constant used to adjust the base learning rate used during reinforcement learning. Multiply by a number > 1 to increase the magnitude of updates applied at each training step or multiply by a number < 1 to decrease the magnitude of updates. Default value is 1.0.
reward_model_train_steps – Number of steps to use when training a reward model. Default value is 1000.
reinforcement_learning_train_steps – Number of reinforcement learning steps to perform when tuning a base model. Default value is 1000.
kl_coeff – Coefficient for KL penalty. This regularizes the policy model and penalizes if it diverges from its initial distribution. If set to 0, the reference language model is not loaded into memory. Default value is 0.1.
default_context – This field lets the model know what task to perform. Base models have been trained over a large set of varied instructions. You can give a simple and intuitive description of the task and the model will follow it, e.g. “Classify this movie review as positive or negative” or “Translate this sentence to Danish”. Do not specify this if your dataset already prepends the instruction to the inputs field.
tuning_job_location – GCP location where the tuning job should be run.
accelerator_type – Type of accelerator to use. Can be “TPU” or “GPU”.
tuning_evaluation_spec – Evaluation settings to use during tuning.
Returns
A LanguageModelTuningJob object that represents the tuning job. Calling job.result() blocks until the tuning is complete and returns a LanguageModel object.
Raises
ValueError – If the “tuning_job_location” value is not supported
RuntimeError – If the model does not support tuning
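The documented defaults and upper bounds for prompt_sequence_length and target_sequence_length can be checked before a job is launched. This is a hypothetical client-side helper, not part of the SDK; the lower bound of 1 and the choice to raise ValueError are assumptions.

```python
from typing import Optional, Tuple

# Values taken from the parameter descriptions above.
DEFAULT_PROMPT_SEQUENCE_LENGTH = 512
DEFAULT_TARGET_SEQUENCE_LENGTH = 64
MAX_PROMPT_SEQUENCE_LENGTH = 8192
MAX_TARGET_SEQUENCE_LENGTH = 1024


def resolve_sequence_lengths(
    prompt_sequence_length: Optional[int] = None,
    target_sequence_length: Optional[int] = None,
) -> Tuple[int, int]:
    """Apply the documented defaults and reject values above the documented maxima."""
    prompt_len = (
        DEFAULT_PROMPT_SEQUENCE_LENGTH
        if prompt_sequence_length is None
        else prompt_sequence_length
    )
    target_len = (
        DEFAULT_TARGET_SEQUENCE_LENGTH
        if target_sequence_length is None
        else target_sequence_length
    )
    # Assumption: lengths must be at least 1; only the upper bounds are documented.
    if not 1 <= prompt_len <= MAX_PROMPT_SEQUENCE_LENGTH:
        raise ValueError(f"prompt_sequence_length must be in [1, {MAX_PROMPT_SEQUENCE_LENGTH}]")
    if not 1 <= target_len <= MAX_TARGET_SEQUENCE_LENGTH:
        raise ValueError(f"target_sequence_length must be in [1, {MAX_TARGET_SEQUENCE_LENGTH}]")
    return prompt_len, target_len
```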
class vertexai.language_models.TextGenerationResponse(text: str, _prediction_response: typing.Any, is_blocked: bool = False, errors: typing.Tuple[int] = (), safety_attributes: typing.Dict[str, float] =
Bases: object
TextGenerationResponse represents a response of a language model.
text()
The generated text.
Type
str
is_blocked()
Whether the request was blocked.
Type
bool
errors()
The error codes indicate why the response was blocked. Learn more about safety errors here: https://cloud.google.com/vertex-ai/docs/generative-ai/learn/responsible-ai#safety_errors
Type
Tuple[int]
safety_attributes()
Scores for safety attributes. Learn more about the safety attributes here: https://cloud.google.com/vertex-ai/docs/generative-ai/learn/responsible-ai#safety_attribute_descriptions
grounding_metadata()
Metadata for grounding.
Type
Optional[vertexai.language_models._language_models.GroundingMetadata]
property raw_prediction_response: google.cloud.aiplatform.models.Prediction
Raw prediction response.
Classes for working with language models.
class vertexai.language_models._language_models._TunableModelMixin(model_id: str, endpoint_name: Optional[str] = None)
Model that can be tuned with supervised fine tuning (SFT).
Creates a LanguageModel.
This constructor should not be called directly. Use LanguageModel.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Vertex LLM. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
tune_model(training_data: Union[str, pandas.core.frame.DataFrame], *, train_steps: Optional[int] = None, learning_rate: Optional[float] = None, learning_rate_multiplier: Optional[float] = None, tuning_job_location: Optional[str] = None, tuned_model_location: Optional[str] = None, model_display_name: Optional[str] = None, tuning_evaluation_spec: Optional[TuningEvaluationSpec] = None, default_context: Optional[str] = None, accelerator_type: Optional[Literal['TPU', 'GPU']] = None, max_context_length: Optional[str] = None)
Tunes a model based on training data.
This method launches and returns an asynchronous model tuning job.
Usage:
```
tuning_job = model.tune_model(...)
# ... do some other work
tuned_model = tuning_job.get_tuned_model()  # Blocks until tuning is complete
```
Parameters
training_data – A Pandas DataFrame or a URI pointing to data in JSON lines format. The dataset schema is model-specific. See https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models#dataset_format
train_steps – Number of training batches to tune on (batch size is 8 samples).
learning_rate – Deprecated. Use learning_rate_multiplier instead. Learning rate to use in tuning.
learning_rate_multiplier – Learning rate multiplier to use in tuning.
tuning_job_location – GCP location where the tuning job should be run.
tuned_model_location – GCP location where the tuned model should be deployed.
model_display_name – Custom display name for the tuned model.
tuning_evaluation_spec – Specification for the model evaluation during tuning.
default_context – The context to use for all training samples by default.
accelerator_type – Type of accelerator to use. Can be “TPU” or “GPU”.
max_context_length – The max context length used for tuning. Can be either ‘8k’ or ‘32k’
Returns
A LanguageModelTuningJob object that represents the tuning job. Calling job.result() blocks until the tuning is complete and returns a LanguageModel object.
Raises
ValueError – If the “tuning_job_location” value is not supported
ValueError – If the “tuned_model_location” value is not supported
RuntimeError – If the model does not support tuning
class vertexai.preview.VertexModel()
Bases: object
Mixin class that can be used to add Vertex AI remote execution to a custom model.
vertexai.preview.end_run(state: google.cloud.aiplatform_v1.types.execution.Execution.State = State.COMPLETE)
Ends the current experiment run.
```py
aiplatform.start_run('my-run')
...
aiplatform.end_run()
```
vertexai.preview.from_pretrained(*, model_name: Optional[str] = None, custom_job_name: Optional[str] = None, foundation_model_name: Optional[str] = None)
Pulls a model from Model Registry or from a CustomJob ID for retraining.
The returned model is wrapped with a Vertex wrapper for running remote jobs on Vertex, unless an unwrapped model was registered to Model Registry.
Parameters
model_name (str) – Optional. The resource ID or fully qualified resource name of a registered model. Format: “12345678910” or “projects/123/locations/us-central1/models/12345678910@1”. One of model_name, custom_job_name, or foundation_model_name is required.
custom_job_name (str) – Optional. The resource ID or fully qualified resource name of a CustomJob created with Vertex SDK remote training. If the job has completed successfully, this will load the trained model created in the CustomJob. One of model_name, custom_job_name, or foundation_model_name is required.
foundation_model_name (str) – Optional. The name of the foundation model to load. For example: “text-bison@001”. One of model_name, custom_job_name, or foundation_model_name is required.
Returns
local model for uptraining.
Return type
model
Raises
ValueError – If the registered model was not registered through vertexai.preview.register.
ValueError – If the custom job was not created with Vertex SDK remote training.
ValueError – If both or neither of model_name and custom_job_name are provided.
vertexai.preview.get_experiment_df(experiment: Optional[str] = None)
Returns a Pandas DataFrame of the parameters and metrics associated with one experiment.
Example:
```py
aiplatform.init(experiment='exp-1')
aiplatform.start_run(run='run-1')
aiplatform.log_params({'learning_rate': 0.1})
aiplatform.log_metrics({'accuracy': 0.9})

aiplatform.start_run(run='run-2')
aiplatform.log_params({'learning_rate': 0.2})
aiplatform.log_metrics({'accuracy': 0.95})

aiplatform.get_experiment_df()
```
Will result in the following DataFrame:
```
experiment_name | run_name | param.learning_rate | metric.accuracy
exp-1           | run-1    | 0.1                 | 0.9
exp-1           | run-2    | 0.2                 | 0.95
```
Parameters
experiment (str) – Name of the Experiment to filter results. If not set, return results of current active experiment.
Returns
Pandas Dataframe of Experiment with metrics and parameters.
Raises
NotFound – If the experiment does not exist.
ValueError – If the given experiment is associated with a wrong schema.
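The column naming in the resulting DataFrame (a `param.` prefix for logged parameters and a `metric.` prefix for logged metrics) can be sketched with plain dictionaries. `flatten_run` is a hypothetical helper written for illustration; pandas is deliberately left out so the sketch has no dependencies.

```python
from typing import Any, Dict, List


def flatten_run(
    experiment_name: str,
    run_name: str,
    params: Dict[str, Any],
    metrics: Dict[str, Any],
) -> Dict[str, Any]:
    """Build one table row, prefixing parameter and metric keys as in the example table."""
    row: Dict[str, Any] = {"experiment_name": experiment_name, "run_name": run_name}
    row.update({f"param.{key}": value for key, value in params.items()})
    row.update({f"metric.{key}": value for key, value in metrics.items()})
    return row


# The two runs from the example, one row per run.
rows: List[Dict[str, Any]] = [
    flatten_run("exp-1", "run-1", {"learning_rate": 0.1}, {"accuracy": 0.9}),
    flatten_run("exp-1", "run-2", {"learning_rate": 0.2}, {"accuracy": 0.95}),
]
```

Passing `rows` to `pandas.DataFrame` would give a table with the same four columns as the example.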
vertexai.preview.init(*, remote: Optional[bool] = None, autolog: Optional[bool] = None, cluster: Optional[vertexai.preview._workflow.shared.configs.PersistentResourceConfig] = None)
Updates preview global parameters for Vertex remote execution.
Parameters
remote (bool) – Optional. A global flag to indicate whether or not a method will be executed remotely. Default is False. The method level remote flag has higher priority than this global flag.
autolog (bool) – Optional. Whether or not to turn on autologging feature for remote execution. To learn more about the autologging feature, see https://cloud.google.com/vertex-ai/docs/experiments/autolog-data.
cluster (PersistentResourceConfig) – Optional. If passed, check if the cluster exists. If not, create a default one (single node, “n1-standard-4”, no GPU) with the given name. Then use the cluster to run CustomJobs. Default is None. Example usage:
from vertexai.preview.shared.configs import PersistentResourceConfig
cluster = PersistentResourceConfig(
    name="my-cluster-1",
    resource_pools=[
        ResourcePool(replica_count=1),
        ResourcePool(
            machine_type="n1-standard-8",
            replica_count=2,
            accelerator_type="NVIDIA_TESLA_P100",
            accelerator_count=1,
        ),
    ],
)
vertexai.preview.log_classification_metrics(*, labels: Optional[List[str]] = None, matrix: Optional[List[List[int]]] = None, fpr: Optional[List[float]] = None, tpr: Optional[List[float]] = None, threshold: Optional[List[float]] = None, display_name: Optional[str] = None)
Creates an artifact for classification metrics and logs it to the ExperimentRun. Currently supports confusion matrix and ROC curve.
```py
my_run = aiplatform.ExperimentRun('my-run', experiment='my-experiment')
classification_metrics = my_run.log_classification_metrics(
    display_name='my-classification-metrics',
    labels=['cat', 'dog'],
    matrix=[[9, 1], [1, 9]],
    fpr=[0.1, 0.5, 0.9],
    tpr=[0.1, 0.7, 0.9],
    threshold=[0.9, 0.5, 0.1],
)
```
Parameters
labels (List[str]) – Optional. List of label names for the confusion matrix. Must be set if ‘matrix’ is set.
matrix (List[List[int]]) – Optional. Values for the confusion matrix. Must be set if ‘labels’ is set.
fpr (List[float]) – Optional. List of false positive rates for the ROC curve. Must be set if ‘tpr’ or ‘threshold’ is set.
tpr (List[float]) – Optional. List of true positive rates for the ROC curve. Must be set if ‘fpr’ or ‘threshold’ is set.
threshold (List[float]) – Optional. List of thresholds for the ROC curve. Must be set if ‘fpr’ or ‘tpr’ is set.
display_name (str) – Optional. The user-defined name for the classification metric artifact.
Raises
ValueError – If ‘labels’ and ‘matrix’ are not set together or do not have the same length, or if ‘fpr’, ‘tpr’, and ‘threshold’ are not set together or do not have the same length.
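The argument constraints listed under Raises can be expressed as a small pre-flight check. The function below is a hypothetical sketch that mirrors the documented conditions; it is not the SDK's own validation code, and the error messages are invented for the example.

```python
from typing import List, Optional


def validate_classification_metrics(
    labels: Optional[List[str]] = None,
    matrix: Optional[List[List[int]]] = None,
    fpr: Optional[List[float]] = None,
    tpr: Optional[List[float]] = None,
    threshold: Optional[List[float]] = None,
) -> None:
    """Raise ValueError when the documented argument constraints are violated."""
    # Confusion matrix: labels and matrix go together and must have matching lengths.
    if (labels is None) != (matrix is None):
        raise ValueError("'labels' and 'matrix' must be set together")
    if labels is not None and matrix is not None and len(labels) != len(matrix):
        raise ValueError("'labels' and 'matrix' must have the same length")

    # ROC curve: fpr, tpr, and threshold go together and must have matching lengths.
    roc_args = [fpr, tpr, threshold]
    provided = [arg is not None for arg in roc_args]
    if any(provided) and not all(provided):
        raise ValueError("'fpr', 'tpr', and 'threshold' must be set together")
    if all(provided) and len({len(arg) for arg in roc_args}) != 1:
        raise ValueError("'fpr', 'tpr', and 'threshold' must have the same length")
```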
vertexai.preview.log_metrics(metrics: Dict[str, Union[float, int, str]])
Log single or multiple Metrics with specified key and value pairs.
Metrics with the same key will be overwritten.
```py
aiplatform.start_run('my-run', experiment='my-experiment')
aiplatform.log_metrics({'accuracy': 0.9, 'recall': 0.8})
```
Parameters
metrics (Dict[str, Union[float, int, str]]) – Required. Metrics key/value pairs.
vertexai.preview.log_params(params: Dict[str, Union[float, int, str]])
Log single or multiple parameters with specified key and value pairs.
Parameters with the same key will be overwritten.
```py
aiplatform.start_run('my-run')
aiplatform.log_params({'learning_rate': 0.1, 'dropout_rate': 0.2})
```
Parameters
params (Dict[str, Union[float, int, str]]) – Required. Parameter key/value pairs.
vertexai.preview.log_time_series_metrics(metrics: Dict[str, float], step: Optional[int] = None, wall_time: Optional[google.protobuf.timestamp_pb2.Timestamp] = None)
Logs time series metrics to this Experiment Run.
Requires that the experiment or experiment run has a backing Vertex Tensorboard resource.
```py
my_tensorboard = aiplatform.Tensorboard(...)
aiplatform.init(experiment='my-experiment', experiment_tensorboard=my_tensorboard)
aiplatform.start_run('my-run')

# increments steps as logged
for i in range(10):
    aiplatform.log_time_series_metrics({'loss': loss})

# explicitly log steps
for i in range(10):
    aiplatform.log_time_series_metrics({'loss': loss}, step=i)
```
Parameters
metrics (Dict[str, Union[str, float]]) – Required. Dictionary where keys are metric names and values are metric values.
step (int) – Optional. Step index of this data point within the run.
If not provided, the latest step amongst all time series metrics already logged will be used.
wall_time (timestamp_pb2.Timestamp) – Optional. Wall clock timestamp when this data point is generated by the end user.
If not provided, this will be generated based on the value from time.time()
Raises
RuntimeError – If current experiment run doesn’t have a backing Tensorboard resource.
vertexai.preview.register(model: Union[sklearn.base.BaseEstimator, tf.Module, torch.nn.Module], use_gpu: bool = False)
Registers a model and returns a Model representing the registered Model resource.
Parameters
model (Union["sklearn.base.BaseEstimator", "tensorflow.Module", "torch.nn.Module"]) – Required. An OSS model. Supported frameworks: sklearn, tensorflow, pytorch.
use_gpu (bool) – Optional. Whether to use GPU for model serving. Defaults to False.
Returns
Instantiated representation of the registered model resource.
Return type
vertex_model (aiplatform.Model)
Raises
ValueError – if default staging bucket is not set or if the framework is not supported.
vertexai.preview.remote(cls_or_method: Any)
Takes a class or method and adds Vertex remote execution support.
```py
LogisticRegression = vertexai.preview.remote(LogisticRegression)
model = LogisticRegression()
model.fit.vertex.remote_config.staging_bucket = REMOTE_JOB_BUCKET
model.fit.vertex.remote = True
model.fit(X_train, y_train)
```
Parameters
cls_or_method (Any) – Required. A class or method to which Vertex remote execution support will be added.
Returns
A class or method that can be executed remotely.
vertexai.preview.start_run(run: str, *, tensorboard: Optional[Union[google.cloud.aiplatform.tensorboard.tensorboard_resource.Tensorboard, str]] = None, resume=False)
Starts a run in the current session.
```py
aiplatform.init(experiment='my-experiment')
aiplatform.start_run('my-run')
aiplatform.log_params({'learning_rate': 0.1})
```
Use as context manager. Run will be ended on context exit:
```py
aiplatform.init(experiment='my-experiment')
with aiplatform.start_run('my-run') as my_run:
    my_run.log_params({'learning_rate': 0.1})
```
Resume a previously started run:
```py
aiplatform.init(experiment='my-experiment')
with aiplatform.start_run('my-run', resume=True) as my_run:
    my_run.log_params({'learning_rate': 0.1})
```
Parameters
run (str) – Required. Name of the run to assign current session with.
tensorboard (Union[str, tensorboard_resource.Tensorboard]) – Optional. Backing Tensorboard Resource to enable and store time series metrics logged to this Experiment Run using log_time_series_metrics.
If not provided, the default backing Tensorboard of the currently set experiment will be used.
resume (bool) – Whether to resume this run. If False a new run will be created.
Raises
ValueError – if experiment is not set. Or if run execution or metrics artifact is already created but with a different schema.
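The context-manager usage, in which the run is ended when the `with` block exits, can be illustrated with a small stand-in. `RunSketch` and `start_run_sketch` are hypothetical names; the "FAILED" state on an exception is an assumption, since the text above only says that the run is ended on context exit.

```python
from contextlib import contextmanager
from typing import Any, Dict, Iterator


class RunSketch:
    """Hypothetical stand-in for an experiment run; it only tracks params and a state."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.params: Dict[str, Any] = {}
        self.state = "RUNNING"

    def log_params(self, params: Dict[str, Any]) -> None:
        # Parameters with the same key are overwritten, as documented for log_params.
        self.params.update(params)


@contextmanager
def start_run_sketch(name: str) -> Iterator[RunSketch]:
    """Yield a run and mark it ended when the with-block exits."""
    run = RunSketch(name)
    try:
        yield run
    except Exception:
        run.state = "FAILED"  # assumption: an error inside the block marks the run as failed
        raise
    else:
        run.state = "COMPLETE"
```

Inside the block the run is still "RUNNING"; after a normal exit its state is "COMPLETE" without an explicit call to end the run.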
Classes for working with language models.
class vertexai.preview.language_models.ChatMessage(content: str, author: str)
Bases: object
A chat message.
content()
Content of the message.
Type
str
author()
Author of the message.
Type
str
vertexai.preview.language_models.ChatModel()
alias of vertexai.preview.language_models._PreviewChatModel
vertexai.preview.language_models.ChatSession()
alias of vertexai.preview.language_models._PreviewChatSession
vertexai.preview.language_models.CodeChatModel()
alias of vertexai.preview.language_models._PreviewCodeChatModel
vertexai.preview.language_models.CodeChatSession()
alias of vertexai.preview.language_models._PreviewCodeChatSession
vertexai.preview.language_models.CodeGenerationModel()
alias of vertexai.preview.language_models._PreviewCodeGenerationModel
class vertexai.preview.language_models.CountTokensResponse(total_tokens: int, total_billable_characters: int, _count_tokens_response: Any)
Bases: object
The response from a count_tokens request.
total_tokens()
The total number of tokens counted across all instances passed to the request.
Type
int
total_billable_characters()
The total number of billable characters counted across all instances from the request.
Type
int
class vertexai.preview.language_models.EvaluationClassificationMetric(label_name: Optional[str] = None, auPrc: Optional[float] = None, auRoc: Optional[float] = None, logLoss: Optional[float] = None, confidenceMetrics: Optional[List[Dict[str, Any]]] = None, confusionMatrix: Optional[Dict[str, Any]] = None)
Bases: vertexai.language_models._evaluatable_language_models._EvaluationMetricBase
The evaluation metric response for classification metrics.
Parameters
label_name (str) – Optional. The name of the label associated with the metrics. This is only returned when only_summary_metrics=False is passed to evaluate().
auPrc (float) – Optional. The area under the precision recall curve.
auRoc (float) – Optional. The area under the receiver operating characteristic curve.
logLoss (float) – Optional. Logarithmic loss.
confidenceMetrics (List[Dict[str, Any]]) – Optional. This is only returned when only_summary_metrics=False is passed to evaluate().
confusionMatrix (Dict[str, Any]) – Optional. This is only returned when only_summary_metrics=False is passed to evaluate().
property input_dataset_paths: str
The Google Cloud Storage paths to the dataset used for this evaluation.
property task_name: str
The type of evaluation task for the evaluation.
class vertexai.preview.language_models.EvaluationMetric(bleu: Optional[float] = None, rougeLSum: Optional[float] = None)
Bases: vertexai.language_models._evaluatable_language_models._EvaluationMetricBase
The evaluation metric response.
property input_dataset_paths: str
The Google Cloud Storage paths to the dataset used for this evaluation.
property task_name: str
The type of evaluation task for the evaluation.
class vertexai.preview.language_models.EvaluationQuestionAnsweringSpec(ground_truth_data: Union[List[str], str, pandas.DataFrame], task_name: str = 'question-answering')
Bases: vertexai.language_models._evaluatable_language_models._EvaluationTaskSpec
Spec for question answering model evaluation tasks.
class vertexai.preview.language_models.EvaluationTextClassificationSpec(ground_truth_data: Union[List[str], str, pandas.DataFrame], target_column_name: str, class_names: List[str])
Bases: vertexai.language_models._evaluatable_language_models._EvaluationTaskSpec
Spec for text classification model evaluation tasks.
class vertexai.preview.language_models.EvaluationTextGenerationSpec(ground_truth_data: Union[List[str], str, pandas.DataFrame])
Bases: vertexai.language_models._evaluatable_language_models._EvaluationTaskSpec
Spec for text generation model evaluation tasks.
class vertexai.preview.language_models.EvaluationTextSummarizationSpec(ground_truth_data: Union[List[str], str, pandas.DataFrame], task_name: str = 'summarization')
Bases: vertexai.language_models._evaluatable_language_models._EvaluationTaskSpec
Spec for text summarization model evaluation tasks.
class vertexai.preview.language_models.InputOutputTextPair(input_text: str, output_text: str)
Bases: object
InputOutputTextPair represents a pair of input and output texts.
class vertexai.preview.language_models.TextEmbedding(values: List[float], statistics: Optional[vertexai.language_models.TextEmbeddingStatistics] = None, _prediction_response: Optional[google.cloud.aiplatform.models.Prediction] = None)
Bases: object
Text embedding vector and statistics.
class vertexai.preview.language_models.TextEmbeddingInput(text: str, task_type: Optional[str] = None, title: Optional[str] = None)
Bases: object
Structural text embedding input.
text()
The main text content to embed.
Type
str
task_type()
The name of the downstream task the embeddings will be used for. Valid values:
RETRIEVAL_QUERY
Specifies the given text is a query in a search/retrieval setting.
RETRIEVAL_DOCUMENT
Specifies the given text is a document from the corpus being searched.
SEMANTIC_SIMILARITY
Specifies the given text will be used for semantic textual similarity (STS).
CLASSIFICATION
Specifies that the given text will be classified.
CLUSTERING
Specifies that the embeddings will be used for clustering.
QUESTION_ANSWERING
Specifies that the embeddings will be used for question answering.
FACT_VERIFICATION
Specifies that the embeddings will be used for fact verification.
Type
Optional[str]
title()
Optional identifier of the text content.
Type
Optional[str]
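The three fields and the list of valid task_type values can be mirrored in a small stand-in class. `EmbeddingInputSketch` is hypothetical; in particular, rejecting unknown task types on the client is an assumption, and the real class may leave that check to the service.

```python
from dataclasses import dataclass
from typing import Optional

# Valid values as listed in the task_type description.
VALID_TASK_TYPES = frozenset(
    {
        "RETRIEVAL_QUERY",
        "RETRIEVAL_DOCUMENT",
        "SEMANTIC_SIMILARITY",
        "CLASSIFICATION",
        "CLUSTERING",
        "QUESTION_ANSWERING",
        "FACT_VERIFICATION",
    }
)


@dataclass
class EmbeddingInputSketch:
    """Hypothetical stand-in with the same three fields as TextEmbeddingInput."""

    text: str
    task_type: Optional[str] = None
    title: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: unknown task types are rejected eagerly on the client.
        if self.task_type is not None and self.task_type not in VALID_TASK_TYPES:
            raise ValueError(f"Unsupported task_type: {self.task_type!r}")
```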
vertexai.preview.language_models.TextEmbeddingModel()
alias of vertexai.preview.language_models._PreviewTextEmbeddingModel
vertexai.preview.language_models.TextGenerationModel()
alias of vertexai.preview.language_models._PreviewTextGenerationModel
class vertexai.preview.language_models.TextGenerationResponse(text: str, _prediction_response: typing.Any, is_blocked: bool = False, errors: typing.Tuple[int] = (), safety_attributes: typing.Dict[str, float] =
Bases: object
TextGenerationResponse represents a response of a language model.
text()
The generated text.
Type
str
is_blocked()
Whether the request was blocked.
Type
bool
errors()
The error codes indicate why the response was blocked. Learn more about safety errors here: https://cloud.google.com/vertex-ai/docs/generative-ai/learn/responsible-ai#safety_errors
Type
Tuple[int]
safety_attributes()
Scores for safety attributes. Learn more about the safety attributes here: https://cloud.google.com/vertex-ai/docs/generative-ai/learn/responsible-ai#safety_attribute_descriptions
grounding_metadata()
Metadata for grounding.
Type
Optional[vertexai.language_models._language_models.GroundingMetadata]
property raw_prediction_response: google.cloud.aiplatform.models.Prediction
Raw prediction response.
class vertexai.preview.language_models.TuningEvaluationSpec(evaluation_data: Optional[str] = None, evaluation_interval: Optional[int] = None, enable_early_stopping: Optional[bool] = None, enable_checkpoint_selection: Optional[bool] = None, tensorboard: Optional[Union[google.cloud.aiplatform.tensorboard.tensorboard_resource.Tensorboard, str]] = None)
Bases: object
Specification for model evaluation to perform during tuning.
evaluation_data()
GCS URI of the evaluation dataset. This will run model evaluation as part of the tuning job.
Type
Optional[str]
evaluation_interval()
The evaluation will run at every evaluation_interval tuning steps. Default: 20.
Type
Optional[int]
enable_early_stopping()
If True, the tuning may stop early before completing all the tuning steps. Requires evaluation_data.
Type
Optional[bool]
enable_checkpoint_selection()
If set to True, the tuning process returns the best model checkpoint (based on model evaluation). If set to False, the latest model checkpoint is returned. If unset, the selection is only enabled for *-bison@001 models.
Type
Optional[bool]
tensorboard()
Vertex Tensorboard to which the evaluation metrics are written. The Tensorboard must be in the same location as the tuning job.
Type
Optional[Union[google.cloud.aiplatform.tensorboard.tensorboard_resource.Tensorboard, str]]
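Two of the rules above lend themselves to a quick sketch: evaluation runs every evaluation_interval steps (default 20), and enable_early_stopping requires evaluation_data. The helpers are hypothetical, and the assumption that the first evaluation happens at step `evaluation_interval` (rather than at step 0) is made for illustration only.

```python
from typing import List, Optional

DEFAULT_EVALUATION_INTERVAL = 20  # default stated in the evaluation_interval description


def evaluation_steps(train_steps: int, evaluation_interval: Optional[int] = None) -> List[int]:
    """Tuning steps at which evaluation would run, assuming it starts at the first interval."""
    interval = DEFAULT_EVALUATION_INTERVAL if evaluation_interval is None else evaluation_interval
    if interval < 1:
        raise ValueError("evaluation_interval must be a positive integer")
    return list(range(interval, train_steps + 1, interval))


def check_early_stopping(
    evaluation_data: Optional[str],
    enable_early_stopping: Optional[bool],
) -> None:
    """Raise ValueError if early stopping is requested without evaluation data."""
    if enable_early_stopping and not evaluation_data:
        raise ValueError("enable_early_stopping requires evaluation_data")
```

For a 100-step job with the default interval, evaluation would run at steps 20, 40, 60, 80, and 100.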
Classes for working with vision models.
class vertexai.vision_models.Image(image_bytes: Optional[bytes] = None, gcs_uri: Optional[str] = None)
Bases: object
Image.
Creates an Image object.
Parameters
image_bytes – Image file bytes. Image can be in PNG or JPEG format.
gcs_uri – Image URI in Google Cloud Storage.
static load_from_file(location: str)
Loads image from local file or Google Cloud Storage.
Parameters
location – Local path or Google Cloud Storage URI from which to load the image.
Returns
Loaded image as an Image object.
save(location: str)
Saves image to a file.
Parameters
location – Local path where to save the image.
show()
Shows the image.
This method only works when in a notebook environment.
class vertexai.vision_models.ImageCaptioningModel(model_id: str, endpoint_name: Optional[str] = None)
Bases: vertexai._model_garden._model_garden_models._ModelGardenModel
Generates captions from image.
Examples:
model = ImageCaptioningModel.from_pretrained("imagetext@001")
image = Image.load_from_file("image.png")
captions = model.get_captions(
    image=image,
    # Optional:
    number_of_results=1,
    language="en",
)
Creates a _ModelGardenModel.
This constructor should not be called directly. Use {model_class}.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Model Garden Model. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
classmethod from_pretrained(model_name: str)
Loads a _ModelGardenModel.
Parameters
model_name – Name of the model.
Returns
An instance of a class derived from _ModelGardenModel.
Raises
ValueError – If model_name is unknown.
ValueError – If model does not support this class.
get_captions(image: vertexai.vision_models.Image, *, number_of_results: int = 1, language: str = 'en', output_gcs_uri: Optional[str] = None)
Generates captions for a given image.
Parameters
image – The image to get captions for. Size limit: 10 MB.
number_of_results – Number of captions to produce. Range: 1-3.
language – Language to use for captions. Supported languages: “en”, “fr”, “de”, “it”, “es”
output_gcs_uri – Google Cloud Storage uri to store the captioned images.
Returns
A list of image caption strings.
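The documented limits for get_captions (1 to 3 results, five supported languages) can be checked on the client before a request is sent. The following is a minimal sketch only: check_caption_args is a hypothetical helper, not part of the SDK, and the SDK or the service may enforce these limits differently.

```python
# Hypothetical client-side pre-check; not part of the Vertex AI SDK.
# The limits are transcribed from the get_captions parameter descriptions.
SUPPORTED_CAPTION_LANGUAGES = frozenset({"en", "fr", "de", "it", "es"})


def check_caption_args(number_of_results: int = 1, language: str = "en") -> None:
    """Raise ValueError if the arguments fall outside the documented limits."""
    if not 1 <= number_of_results <= 3:
        raise ValueError(
            f"number_of_results must be in the range 1-3, got {number_of_results}"
        )
    if language not in SUPPORTED_CAPTION_LANGUAGES:
        raise ValueError(f"unsupported caption language: {language!r}")
```

Calling check_caption_args(number_of_results=2, language="fr") returns silently, while an out-of-range count or an unlisted language raises ValueError before any request is made.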
class vertexai.vision_models.ImageQnAModel(model_id: str, endpoint_name: Optional[str] = None)
Bases: vertexai._model_garden._model_garden_models._ModelGardenModel
Answers questions about an image.
Examples:
from vertexai.vision_models import Image, ImageQnAModel

model = ImageQnAModel.from_pretrained("imagetext@001")
image = Image.load_from_file("image.png")
answers = model.ask_question(
    image=image,
    question="What color is the car in this image?",
    # Optional:
    number_of_results=1,
)
Creates a _ModelGardenModel.
This constructor should not be called directly. Use {model_class}.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Model Garden Model. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
ask_question(image: vertexai.vision_models.Image, question: str, *, number_of_results: int = 1)
Answers questions about an image.
Parameters
image – The image to ask the question about. Size limit: 10 MB.
question – Question to ask about the image.
number_of_results – Number of answers to produce. Range: 1-3.
Returns
A list of answers.
classmethod from_pretrained(model_name: str)
Loads a _ModelGardenModel.
Parameters
model_name – Name of the model.
Returns
An instance of a class derived from _ModelGardenModel.
Raises
ValueError – If model_name is unknown.
ValueError – If model does not support this class.
class vertexai.vision_models.ImageTextModel(model_id: str, endpoint_name: Optional[str] = None)
Bases: vertexai.vision_models.ImageCaptioningModel, vertexai.vision_models.ImageQnAModel
Generates text from images.
Examples:
from vertexai.vision_models import Image, ImageTextModel

model = ImageTextModel.from_pretrained("imagetext@001")
image = Image.load_from_file("image.png")
captions = model.get_captions(
    image=image,
    # Optional:
    number_of_results=1,
    language="en",
)
answers = model.ask_question(
    image=image,
    question="What color is the car in this image?",
    # Optional:
    number_of_results=1,
)
Creates a _ModelGardenModel.
This constructor should not be called directly. Use {model_class}.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Model Garden Model. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
ask_question(image: vertexai.vision_models.Image, question: str, *, number_of_results: int = 1)
Answers questions about an image.
Parameters
image – The image to ask the question about. Size limit: 10 MB.
question – Question to ask about the image.
number_of_results – Number of answers to produce. Range: 1-3.
Returns
A list of answers.
classmethod from_pretrained(model_name: str)
Loads a _ModelGardenModel.
Parameters
model_name – Name of the model.
Returns
An instance of a class derived from _ModelGardenModel.
Raises
ValueError – If model_name is unknown.
ValueError – If model does not support this class.
get_captions(image: vertexai.vision_models.Image, *, number_of_results: int = 1, language: str = 'en', output_gcs_uri: Optional[str] = None)
Generates captions for a given image.
Parameters
image – The image to get captions for. Size limit: 10 MB.
number_of_results – Number of captions to produce. Range: 1-3.
language – Language to use for captions. Supported languages: “en”, “fr”, “de”, “it”, “es”
output_gcs_uri – Google Cloud Storage uri to store the captioned images.
Returns
A list of image caption strings.
class vertexai.vision_models.MultiModalEmbeddingModel(model_id: str, endpoint_name: Optional[str] = None)
Bases: vertexai._model_garden._model_garden_models._ModelGardenModel
Generates embedding vectors from images and videos.
Examples:
from vertexai.vision_models import Image, MultiModalEmbeddingModel, Video

model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")
image = Image.load_from_file("image.png")
video = Video.load_from_file("video.mp4")
embeddings = model.get_embeddings(
    image=image,
    video=video,
    contextual_text="Hello world",
)
image_embedding = embeddings.image_embedding
video_embeddings = embeddings.video_embeddings
text_embedding = embeddings.text_embedding
Creates a _ModelGardenModel.
This constructor should not be called directly. Use {model_class}.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Model Garden Model. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
classmethod from_pretrained(model_name: str)
Loads a _ModelGardenModel.
Parameters
model_name – Name of the model.
Returns
An instance of a class derived from _ModelGardenModel.
Raises
ValueError – If model_name is unknown.
ValueError – If model does not support this class.
get_embeddings(image: Optional[vertexai.vision_models.Image] = None, video: Optional[vertexai.vision_models.Video] = None, contextual_text: Optional[str] = None, dimension: Optional[int] = None, video_segment_config: Optional[vertexai.vision_models.VideoSegmentConfig] = None)
Gets embedding vectors from the provided image, video, or contextual text.
Parameters
image (Image) – Optional. The image to generate embeddings for. One of image, video, or contextual_text is required.
video (Video) – Optional. The video to generate embeddings for. One of image, video or contextual_text is required.
contextual_text (str) – Optional. Contextual text for your input image or video. If provided, the model will also generate an embedding vector for the provided contextual text. The returned image and text embedding vectors are in the same semantic space with the same dimensionality, and the vectors can be used interchangeably for use cases like searching image by text or searching text by image. One of image, video or contextual_text is required.
dimension (int) – Optional. The number of embedding dimensions. Lower values offer decreased latency when using these embeddings for subsequent tasks, while higher values offer better accuracy. Available values: 128, 256, 512, and 1408 (default).
video_segment_config (VideoSegmentConfig) – Optional. The specific video segments (in seconds) the embeddings are generated for.
Returns
The image, video, and text embedding vectors.
Return type
MultiModalEmbeddingResponse
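Because the returned image and text embedding vectors share one semantic space and one dimensionality, they can be compared directly. The sketch below assumes cosine similarity as the comparison metric (the text above does not name one) and uses short toy lists in place of embeddings.image_embedding and embeddings.text_embedding.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy stand-ins for embeddings.image_embedding and embeddings.text_embedding.
image_embedding = [0.6, 0.8, 0.0]
text_embedding = [0.8, 0.6, 0.0]
score = cosine_similarity(image_embedding, text_embedding)
```

For a text-to-image search, the same function could rank a collection of stored image embeddings against a single text embedding, with higher scores indicating closer matches.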
class vertexai.vision_models.MultiModalEmbeddingResponse(_prediction_response: Any, image_embedding: Optional[List[float]] = None, video_embeddings: Optional[List[vertexai.vision_models.VideoEmbedding]] = None, text_embedding: Optional[List[float]] = None)
Bases: object
The multimodal embedding response.
image_embedding()
Optional. The embedding vector generated from your image.
Type
List[float]
video_embeddings()
Optional. The embedding vectors generated from your video.
Type
List[VideoEmbedding]
text_embedding()
Optional. The embedding vector generated from the contextual text provided for your image or video.
Type
List[float]
class vertexai.vision_models.Video(video_bytes: Optional[bytes] = None, gcs_uri: Optional[str] = None)
Bases: object
Video.
Creates a Video object.
Parameters
video_bytes – Video file bytes. Video can be in AVI, FLV, MKV, MOV, MP4, MPEG, MPG, WEBM, and WMV formats.
gcs_uri – Video URI in Google Cloud Storage.
static load_from_file(location: str)
Loads video from local file or Google Cloud Storage.
Parameters
location – Local path or Google Cloud Storage uri from where to load the video.
Returns
Loaded video as a Video object.
save(location: str)
Saves video to a file.
Parameters
location – Local path where to save the video.
class vertexai.vision_models.VideoEmbedding(start_offset_sec: int, end_offset_sec: int, embedding: List[float])
Bases: object
Embeddings generated from video with offset times.
Creates a VideoEmbedding object.
Parameters
start_offset_sec – Start time offset (in seconds) of generated embeddings.
end_offset_sec – End time offset (in seconds) of generated embeddings.
embedding – Generated embedding for interval.
class vertexai.vision_models.VideoSegmentConfig(start_offset_sec: int = 0, end_offset_sec: int = 120, interval_sec: int = 16)
Bases: object
The specific video segments (in seconds) the embeddings are generated for.
Creates a VideoSegmentConfig object.
Parameters
start_offset_sec – Start time offset (in seconds) to generate embeddings for.
end_offset_sec – End time offset (in seconds) to generate embeddings for.
interval_sec – Interval to divide video for generated embeddings.
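To get a feel for how the three VideoSegmentConfig values interact, the sketch below lists the time windows they would describe. It assumes the video is divided into back-to-back, non-overlapping windows of interval_sec seconds with a shorter final window; that splitting rule is an assumption made for illustration, and segment_windows is a hypothetical helper, not part of the SDK. The start_offset_sec and end_offset_sec values on the returned VideoEmbedding objects remain authoritative.

```python
from typing import List, Tuple


def segment_windows(
    start_offset_sec: int = 0,
    end_offset_sec: int = 120,
    interval_sec: int = 16,
) -> List[Tuple[int, int]]:
    """List (start, end) windows, assuming back-to-back intervals."""
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    if end_offset_sec <= start_offset_sec:
        raise ValueError("end_offset_sec must be greater than start_offset_sec")
    windows = []
    start = start_offset_sec
    while start < end_offset_sec:
        # The last window is clipped so it never runs past end_offset_sec.
        end = min(start + interval_sec, end_offset_sec)
        windows.append((start, end))
        start = end
    return windows


default_windows = segment_windows()
```

Under this assumption the default configuration (0, 120, 16) describes eight windows, the last one covering seconds 112 to 120.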
Classes for working with vision models.
class vertexai.preview.vision_models.GeneratedImage(image_bytes: Optional[bytes], generation_parameters: Dict[str, Any], gcs_uri: Optional[str] = None)
Bases: vertexai.vision_models.Image
Generated image.
Creates a GeneratedImage object.
Parameters
image_bytes – Image file bytes. Image can be in PNG or JPEG format.
generation_parameters – Image generation parameter values.
gcs_uri – Image file Google Cloud Storage uri.
property generation_parameters()
Image generation parameters as a dictionary.
static load_from_file(location: str)
Loads image from file.
Parameters
location – Local path from where to load the image.
Returns
Loaded image as a GeneratedImage object.
save(location: str, include_generation_parameters: bool = True)
Saves image to a file.
Parameters
location – Local path where to save the image.
include_generation_parameters – Whether to include the image generation parameters in the image’s EXIF metadata.
show()
Shows the image.
This method only works when in a notebook environment.
class vertexai.preview.vision_models.Image(image_bytes: Optional[bytes] = None, gcs_uri: Optional[str] = None)
Bases: object
Image.
Creates an Image object.
Parameters
image_bytes – Image file bytes. Image can be in PNG or JPEG format.
gcs_uri – Image URI in Google Cloud Storage.
static load_from_file(location: str)
Loads image from local file or Google Cloud Storage.
Parameters
location – Local path or Google Cloud Storage uri from where to load the image.
Returns
Loaded image as an Image object.
save(location: str)
Saves image to a file.
Parameters
location – Local path where to save the image.
show()
Shows the image.
This method only works when in a notebook environment.
class vertexai.preview.vision_models.ImageCaptioningModel(model_id: str, endpoint_name: Optional[str] = None)
Bases: vertexai._model_garden._model_garden_models._ModelGardenModel
Generates captions from an image.
Examples:
from vertexai.preview.vision_models import Image, ImageCaptioningModel

model = ImageCaptioningModel.from_pretrained("imagetext@001")
image = Image.load_from_file("image.png")
captions = model.get_captions(
    image=image,
    # Optional:
    number_of_results=1,
    language="en",
)
Creates a _ModelGardenModel.
This constructor should not be called directly. Use {model_class}.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Model Garden Model. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
classmethod from_pretrained(model_name: str)
Loads a _ModelGardenModel.
Parameters
model_name – Name of the model.
Returns
An instance of a class derived from _ModelGardenModel.
Raises
ValueError – If model_name is unknown.
ValueError – If model does not support this class.
get_captions(image: vertexai.vision_models.Image, *, number_of_results: int = 1, language: str = 'en', output_gcs_uri: Optional[str] = None)
Generates captions for a given image.
Parameters
image – The image to get captions for. Size limit: 10 MB.
number_of_results – Number of captions to produce. Range: 1-3.
language – Language to use for captions. Supported languages: “en”, “fr”, “de”, “it”, “es”
output_gcs_uri – Google Cloud Storage uri to store the captioned images.
Returns
A list of image caption strings.
class vertexai.preview.vision_models.ImageGenerationModel(model_id: str, endpoint_name: Optional[str] = None)
Bases: vertexai._model_garden._model_garden_models._ModelGardenModel
Generates images from a text prompt.
Examples:
from vertexai.preview.vision_models import ImageGenerationModel

model = ImageGenerationModel.from_pretrained("imagegeneration@002")
response = model.generate_images(
    prompt="Astronaut riding a horse",
    # Optional:
    number_of_images=1,
    seed=0,
)
response[0].show()
response[0].save("image1.png")
Creates a _ModelGardenModel.
This constructor should not be called directly. Use {model_class}.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Model Garden Model. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
edit_image(*, prompt: str, base_image: vertexai.vision_models.Image, mask: Optional[vertexai.vision_models.Image] = None, negative_prompt: Optional[str] = None, number_of_images: int = 1, guidance_scale: Optional[float] = None, edit_mode: Optional[Literal['inpainting-insert', 'inpainting-remove', 'outpainting', 'product-image']] = None, mask_mode: Optional[Literal['background', 'foreground', 'semantic']] = None, segmentation_classes: Optional[List[str]] = None, mask_dilation: Optional[float] = None, product_position: Optional[Literal['fixed', 'reposition']] = None, output_mime_type: Optional[Literal['image/png', 'image/jpeg']] = None, compression_quality: Optional[float] = None, language: Optional[str] = None, seed: Optional[int] = None, output_gcs_uri: Optional[str] = None, safety_filter_level: Optional[Literal['block_most', 'block_some', 'block_few', 'block_fewest']] = None, person_generation: Optional[Literal['dont_allow', 'allow_adult', 'allow_all']] = None)
Edits an existing image based on a text prompt.
Parameters
prompt – Text prompt for the image.
base_image – Base image from which to generate the new image.
mask – Mask for the base image.
negative_prompt – A description of what you want to omit in the generated images.
number_of_images – Number of images to generate. Range: 1..8.
guidance_scale – Controls the strength of the prompt. Suggested values are:
* 0-9 (low strength)
* 10-20 (medium strength)
* 21+ (high strength)
edit_mode – Describes the editing mode for the request. Supported values are:
* inpainting-insert: Fills the mask area based on the text prompt (requires mask and text).
* inpainting-remove: Removes the object(s) in the mask area (requires mask).
* outpainting: Extends the image based on the mask area (requires mask).
* product-image: Changes the background for the predominant product or subject in the image.
segmentation_classes – List of class IDs for segmentation. Maximum of 5 IDs.
mask_mode – Requests automatic generation of the mask (instead of providing a mask as an input). Supported values are:
* background: Automatically generates a mask for all regions except the primary subject(s) of the image.
* foreground: Automatically generates a mask for the primary subject(s) of the image.
* semantic: Segments one or more of the segmentation classes using class ID.
mask_dilation – Defines the dilation percentage of the mask provided. Float between 0 and 1. Defaults to 0.03.
product_position – Defines whether the product should stay fixed or be repositioned. Supported values:
* fixed: Fixed position.
* reposition: Can be moved (default).
output_mime_type – Image format in which to save the output. Supported values:
* image/png: Save as a PNG image.
* image/jpeg: Save as a JPEG image.
compression_quality – Level of compression if the output MIME type is image/jpeg. Float between 0 and 100.
language – Language of the text prompt for the image. Default: None. Supported values are “en” for English, “hi” for Hindi, “ja” for Japanese, “ko” for Korean, and “auto” for automatic language detection.
seed – Image generation random seed.
output_gcs_uri – Google Cloud Storage uri to store the edited images.
safety_filter_level – Adds a filter level to safety filtering. Supported values are:
* “block_most”: Strongest filtering level, most strict blocking.
* “block_some”: Block some problematic prompts and responses.
* “block_few”: Block fewer problematic prompts and responses.
* “block_fewest”: Block very few problematic prompts and responses.
person_generation – Allows generation of people by the model. Supported values are:
* “dont_allow”: Block generation of people.
* “allow_adult”: Generate adults, but not children.
* “allow_all”: Generate adults and children.
Returns
An ImageGenerationResponse object.
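The edit_mode description ties each mode to the inputs it needs, which lends itself to a small client-side check. The sketch below is illustrative only: edit_request_problems is a hypothetical helper, not part of the SDK, and it assumes that requesting an automatically generated mask through mask_mode satisfies the "requires mask" condition, which the description implies but does not state outright.

```python
from typing import List, Optional

# Requirements transcribed from the edit_mode parameter description.
_MODES_REQUIRING_MASK = {"inpainting-insert", "inpainting-remove", "outpainting"}
_KNOWN_MODES = _MODES_REQUIRING_MASK | {"product-image"}


def edit_request_problems(
    edit_mode: Optional[str],
    prompt: str = "",
    has_mask: bool = False,
    mask_mode: Optional[str] = None,
) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    if edit_mode is None:
        return []
    if edit_mode not in _KNOWN_MODES:
        return [f"unknown edit_mode: {edit_mode!r}"]
    problems = []
    # Assumption: an automatically generated mask (mask_mode) counts as a mask.
    if edit_mode in _MODES_REQUIRING_MASK and not (has_mask or mask_mode):
        problems.append(f"{edit_mode} requires a mask")
    if edit_mode == "inpainting-insert" and not prompt.strip():
        problems.append("inpainting-insert requires a text prompt")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every missing input at once.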
classmethod from_pretrained(model_name: str)
Loads a _ModelGardenModel.
Parameters
model_name – Name of the model.
Returns
An instance of a class derived from _ModelGardenModel.
Raises
ValueError – If model_name is unknown.
ValueError – If model does not support this class.
generate_images(prompt: str, *, negative_prompt: Optional[str] = None, number_of_images: int = 1, aspect_ratio: Optional[Literal['1:1', '9:16', '16:9', '4:3', '3:4']] = None, guidance_scale: Optional[float] = None, language: Optional[str] = None, seed: Optional[int] = None, output_gcs_uri: Optional[str] = None, add_watermark: Optional[bool] = True, safety_filter_level: Optional[Literal['block_most', 'block_some', 'block_few', 'block_fewest']] = None, person_generation: Optional[Literal['dont_allow', 'allow_adult', 'allow_all']] = None)
Generates images from a text prompt.
Parameters
prompt – Text prompt for the image.
negative_prompt – A description of what you want to omit in the generated images.
number_of_images – Number of images to generate. Range: 1..8.
aspect_ratio – Changes the aspect ratio of the generated image. Supported values are:
* “1:1”: 1:1 aspect ratio
* “9:16”: 9:16 aspect ratio
* “16:9”: 16:9 aspect ratio
* “4:3”: 4:3 aspect ratio
* “3:4”: 3:4 aspect ratio
guidance_scale – Controls the strength of the prompt. Suggested values are:
* 0-9 (low strength)
* 10-20 (medium strength)
* 21+ (high strength)
language – Language of the text prompt for the image. Default: None. Supported values are “en” for English, “hi” for Hindi, “ja” for Japanese, “ko” for Korean, and “auto” for automatic language detection.
seed – Image generation random seed.
output_gcs_uri – Google Cloud Storage uri to store the generated images.
add_watermark – Adds a watermark to the generated image.
safety_filter_level – Adds a filter level to safety filtering. Supported values are:
* “block_most”: Strongest filtering level, most strict blocking.
* “block_some”: Block some problematic prompts and responses.
* “block_few”: Block fewer problematic prompts and responses.
* “block_fewest”: Block very few problematic prompts and responses.
person_generation – Allows generation of people by the model. Supported values are:
* “dont_allow”: Block generation of people.
* “allow_adult”: Generate adults, but not children.
* “allow_all”: Generate adults and children.
Returns
An ImageGenerationResponse object.
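The suggested guidance_scale bands can be turned into a label, which is handy when logging or comparing generation settings. This is a minimal sketch: guidance_strength is a hypothetical helper, not part of the SDK, and because the documented bands are given for whole numbers only, placing fractional values (for example 9.5 or 20.5) is an assumption of this sketch.

```python
def guidance_strength(guidance_scale: float) -> str:
    """Map a guidance_scale value to its suggested strength band."""
    # Documented bands: 0-9 low, 10-20 medium, 21+ high.
    # Assumption: values strictly between bands fall as the comparisons below decide.
    if guidance_scale < 10:
        return "low"
    if guidance_scale <= 20:
        return "medium"
    return "high"


labels = [guidance_strength(value) for value in (0, 9, 10, 20, 21)]
```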
upscale_image(image: Union[vertexai.vision_models.Image, vertexai.preview.vision_models.GeneratedImage], new_size: Optional[int] = 2048, output_gcs_uri: Optional[str] = None)
Upscales an image.
This supports upscaling images generated through the generate_images() method, or upscaling a new image that is 1024x1024.
Examples:
from vertexai.preview.vision_models import Image, ImageGenerationModel

# Upscale a generated image
model = ImageGenerationModel.from_pretrained("imagegeneration@002")
response = model.generate_images(
    prompt="Astronaut riding a horse",
)
model.upscale_image(image=response[0])

# Upscale a new 1024x1024 image
my_image = Image.load_from_file("my-image.png")
model.upscale_image(image=my_image)
Parameters
image (Union[GeneratedImage, Image]) – Required. The image to upscale.
new_size (int) – The size of the biggest dimension of the upscaled image. Only 2048 and 4096 are currently supported. Results in a 2048x2048 or 4096x4096 image. Defaults to 2048 if not provided.
output_gcs_uri – Google Cloud Storage uri to store the upscaled images.
Returns
An Image object.
class vertexai.preview.vision_models.ImageGenerationResponse(images: List[GeneratedImage])
Bases: object
Image generation response.
images()
The list of generated images.
Type
List[vertexai.preview.vision_models.GeneratedImage]
__getitem__(idx: int)
Gets the generated image by index.
__iter__()
Iterates through the generated images.
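Indexing and iteration mean a response can be used much like a list of images. The sketch below shows the Python protocol involved with a stand-in class; ResponseSketch is hypothetical and holds strings instead of GeneratedImage objects, so it illustrates the behaviour rather than the SDK's implementation.

```python
from typing import Iterator, List


class ResponseSketch:
    """Stand-in that mimics the documented indexing and iteration behaviour."""

    def __init__(self, images: List[str]) -> None:
        self.images = images

    def __getitem__(self, idx: int) -> str:
        # Gets the image at the given index, like response[0].
        return self.images[idx]

    def __iter__(self) -> Iterator[str]:
        # Iterates through the images, like "for image in response".
        return iter(self.images)


response = ResponseSketch(["image-0", "image-1"])
first = response[0]
all_images = list(response)
```

With the real response object, the same pattern allows a loop such as "for image in response" to save each generated image in turn.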
class vertexai.preview.vision_models.ImageQnAModel(model_id: str, endpoint_name: Optional[str] = None)
Bases: vertexai._model_garden._model_garden_models._ModelGardenModel
Answers questions about an image.
Examples:
from vertexai.preview.vision_models import Image, ImageQnAModel

model = ImageQnAModel.from_pretrained("imagetext@001")
image = Image.load_from_file("image.png")
answers = model.ask_question(
    image=image,
    question="What color is the car in this image?",
    # Optional:
    number_of_results=1,
)
Creates a _ModelGardenModel.
This constructor should not be called directly. Use {model_class}.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Model Garden Model. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
ask_question(image: vertexai.vision_models.Image, question: str, *, number_of_results: int = 1)
Answers questions about an image.
Parameters
image – The image to ask the question about. Size limit: 10 MB.
question – Question to ask about the image.
number_of_results – Number of answers to produce. Range: 1-3.
Returns
A list of answers.
classmethod from_pretrained(model_name: str)
Loads a _ModelGardenModel.
Parameters
model_name – Name of the model.
Returns
An instance of a class derived from _ModelGardenModel.
Raises
ValueError – If model_name is unknown.
ValueError – If model does not support this class.
class vertexai.preview.vision_models.ImageTextModel(model_id: str, endpoint_name: Optional[str] = None)
Bases: vertexai.vision_models.ImageCaptioningModel, vertexai.vision_models.ImageQnAModel
Generates text from images.
Examples:
from vertexai.preview.vision_models import Image, ImageTextModel

model = ImageTextModel.from_pretrained("imagetext@001")
image = Image.load_from_file("image.png")
captions = model.get_captions(
    image=image,
    # Optional:
    number_of_results=1,
    language="en",
)
answers = model.ask_question(
    image=image,
    question="What color is the car in this image?",
    # Optional:
    number_of_results=1,
)
Creates a _ModelGardenModel.
This constructor should not be called directly. Use {model_class}.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Model Garden Model. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
ask_question(image: vertexai.vision_models.Image, question: str, *, number_of_results: int = 1)
Answers questions about an image.
Parameters
image – The image to ask the question about. Size limit: 10 MB.
question – Question to ask about the image.
number_of_results – Number of answers to produce. Range: 1-3.
Returns
A list of answers.
classmethod from_pretrained(model_name: str)
Loads a _ModelGardenModel.
Parameters
model_name – Name of the model.
Returns
An instance of a class derived from _ModelGardenModel.
Raises
ValueError – If model_name is unknown.
ValueError – If model does not support this class.
get_captions(image: vertexai.vision_models.Image, *, number_of_results: int = 1, language: str = 'en', output_gcs_uri: Optional[str] = None)
Generates captions for a given image.
Parameters
image – The image to get captions for. Size limit: 10 MB.
number_of_results – Number of captions to produce. Range: 1-3.
language – Language to use for captions. Supported languages: “en”, “fr”, “de”, “it”, “es”
output_gcs_uri – Google Cloud Storage uri to store the captioned images.
Returns
A list of image caption strings.
class vertexai.preview.vision_models.MultiModalEmbeddingModel(model_id: str, endpoint_name: Optional[str] = None)
Bases: vertexai._model_garden._model_garden_models._ModelGardenModel
Generates embedding vectors from images and videos.
Examples:
from vertexai.preview.vision_models import Image, MultiModalEmbeddingModel, Video

model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")
image = Image.load_from_file("image.png")
video = Video.load_from_file("video.mp4")
embeddings = model.get_embeddings(
    image=image,
    video=video,
    contextual_text="Hello world",
)
image_embedding = embeddings.image_embedding
video_embeddings = embeddings.video_embeddings
text_embedding = embeddings.text_embedding
Creates a _ModelGardenModel.
This constructor should not be called directly. Use {model_class}.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Model Garden Model. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
classmethod from_pretrained(model_name: str)
Loads a _ModelGardenModel.
Parameters
model_name – Name of the model.
Returns
An instance of a class derived from _ModelGardenModel.
Raises
ValueError – If model_name is unknown.
ValueError – If model does not support this class.
get_embeddings(image: Optional[vertexai.vision_models.Image] = None, video: Optional[vertexai.vision_models.Video] = None, contextual_text: Optional[str] = None, dimension: Optional[int] = None, video_segment_config: Optional[vertexai.vision_models.VideoSegmentConfig] = None)
Gets embedding vectors from the provided image, video, or contextual text.
Parameters
image (Image) – Optional. The image to generate embeddings for. One of image, video, or contextual_text is required.
video (Video) – Optional. The video to generate embeddings for. One of image, video or contextual_text is required.
contextual_text (str) – Optional. Contextual text for your input image or video. If provided, the model will also generate an embedding vector for the provided contextual text. The returned image and text embedding vectors are in the same semantic space with the same dimensionality, and the vectors can be used interchangeably for use cases like searching image by text or searching text by image. One of image, video or contextual_text is required.
dimension (int) – Optional. The number of embedding dimensions. Lower values offer decreased latency when using these embeddings for subsequent tasks, while higher values offer better accuracy. Available values: 128, 256, 512, and 1408 (default).
video_segment_config (VideoSegmentConfig) – Optional. The specific video segments (in seconds) the embeddings are generated for.
Returns
The image, video, and text embedding vectors.
Return type
MultiModalEmbeddingResponse
class vertexai.preview.vision_models.MultiModalEmbeddingResponse(_prediction_response: Any, image_embedding: Optional[List[float]] = None, video_embeddings: Optional[List[vertexai.vision_models.VideoEmbedding]] = None, text_embedding: Optional[List[float]] = None)
Bases: object
The multimodal embedding response.
image_embedding()
Optional. The embedding vector generated from your image.
Type
List[float]
video_embeddings()
Optional. The embedding vectors generated from your video.
Type
List[VideoEmbedding]
text_embedding()
Optional. The embedding vector generated from the contextual text provided for your image or video.
Type
List[float]
class vertexai.preview.vision_models.Video(video_bytes: Optional[bytes] = None, gcs_uri: Optional[str] = None)
Bases: object
Video.
Creates a Video object.
Parameters
video_bytes – Video file bytes. Video can be in AVI, FLV, MKV, MOV, MP4, MPEG, MPG, WEBM, and WMV formats.
gcs_uri – Video URI in Google Cloud Storage.
static load_from_file(location: str)
Loads video from local file or Google Cloud Storage.
Parameters
location – Local path or Google Cloud Storage uri from where to load the video.
Returns
Loaded video as a Video object.
save(location: str)
Saves video to a file.
Parameters
location – Local path where to save the video.
class vertexai.preview.vision_models.VideoEmbedding(start_offset_sec: int, end_offset_sec: int, embedding: List[float])
Bases: object
Embeddings generated from video with offset times.
Creates a VideoEmbedding object.
Parameters
start_offset_sec – Start time offset (in seconds) of generated embeddings.
end_offset_sec – End time offset (in seconds) of generated embeddings.
embedding – Generated embedding for interval.
class vertexai.preview.vision_models.VideoSegmentConfig(start_offset_sec: int = 0, end_offset_sec: int = 120, interval_sec: int = 16)
Bases: object
The specific video segments (in seconds) the embeddings are generated for.
Creates a VideoSegmentConfig object.
Parameters
start_offset_sec – Start time offset (in seconds) to generate embeddings for.
end_offset_sec – End time offset (in seconds) to generate embeddings for.
interval_sec – Interval to divide video for generated embeddings.
class vertexai.preview.vision_models.WatermarkVerificationModel(model_id: str, endpoint_name: Optional[str] = None)
Bases: vertexai._model_garden._model_garden_models._ModelGardenModel
Verifies whether an image has a watermark.
Creates a _ModelGardenModel.
This constructor should not be called directly. Use {model_class}.from_pretrained(model_name=…) instead.
Parameters
model_id – Identifier of a Model Garden Model. Example: “text-bison@001”
endpoint_name – Vertex Endpoint resource name for the model
classmethod from_pretrained(model_name: str)
Loads a _ModelGardenModel.
Parameters
model_name – Name of the model.
Returns
An instance of a class derived from _ModelGardenModel.
Raises
ValueError – If model_name is unknown.
ValueError – If model does not support this class.
verify_image(image: vertexai.vision_models.Image)
Verifies the watermark of an image.
Parameters
image – The image to verify.
Returns
A WatermarkVerificationResponse, containing the confidence level of the image being watermarked.
class vertexai.preview.vision_models.WatermarkVerificationResponse(_prediction_response: Any, watermark_verification_result: Optional[str] = None)
Bases: object