Vertex Generative AI SDK for Python
The Gen AI Modules in the Vertex SDK help developers use Google’s generative AI Gemini models to build AI-powered features and applications in Vertex.
The modules currently available are: Evaluation, Agent Engines, Prompt Management, and Prompt Optimization. See below for instructions on getting started with each module. For other Gemini features on Vertex, use the Gen AI SDK.
Installation
To install the google-cloud-aiplatform Python package, run the following command:
pip3 install --upgrade --user "google-cloud-aiplatform>=1.114.0"
Imports:
import vertexai
from vertexai import types
Client initialization
client = vertexai.Client(project='my-project', location='us-central1')
Gen AI Evaluation
To run evaluation, first generate model responses from a set of prompts.
import pandas as pd
prompts_df = pd.DataFrame({
"prompt": [
"What is the capital of France?",
"Write a haiku about a cat.",
"Write a Python function to calculate the factorial of a number.",
"Translate 'How are you?' to French.",
],
})
inference_results = client.evals.run_inference(
model="gemini-2.5-flash",
("def factorial(n):\n"
" if n < 0:\n"
" return 'Factorial does not exist for negative numbers'\n"
" elif n == 0:\n"
" return 1\n"
" else:\n"
" fact = 1\n"
" i = 1\n"
" while i <= n:\n"
" fact *= i\n"
" i += 1\n"
" return fact"),
)
inference_results.show()
Then run evaluation by providing the inference results and specifying the metric types.
eval_result = client.evals.evaluate(
dataset=inference_results,
metrics=[
types.RubricMetric.GENERAL_QUALITY,
]
)
eval_result.show()
Agent Engine with Agent Development Kit (ADK)
First, define a function that looks up the exchange rate:
def get_exchange_rate(
currency_from: str = "USD",
currency_to: str = "EUR",
currency_date: str = "latest",
):
"""Retrieves the exchange rate between two currencies on a specified date.
Uses the Frankfurter API (https://api.frankfurter.app/) to obtain
exchange rate data.
Returns:
dict: A dictionary containing the exchange rate information.
Example: {"amount": 1.0, "base": "USD", "date": "2023-11-24",
"rates": {"EUR": 0.95534}}
"""
import requests
response = requests.get(
f"https://api.frankfurter.app/{currency_date}",
params={"from": currency_from, "to": currency_to},
)
return response.json()
Next, define an ADK Agent:
from google.adk.agents import Agent
from vertexai.agent_engines import AdkApp
app = AdkApp(agent=Agent(
model="gemini-2.0-flash", # Required.
name='currency_exchange_agent', # Required.
tools=[get_exchange_rate], # Optional.
))
Test the agent locally using US dollars and Swedish Krona:
async for event in app.async_stream_query(
user_id="user-id",
message="What is the exchange rate from US dollars to SEK today?",
):
print(event)
To deploy the agent to Agent Engine:
remote_app = client.agent_engines.create(
agent=app,
config={
"requirements": ["google-cloud-aiplatform[agent_engines,adk]"],
},
)
You can also run queries against the deployed agent:
async for event in remote_app.async_stream_query(
user_id="user-id",
message="What is the exchange rate from US dollars to SEK today?",
):
print(event)
Prompt Optimization
To do a zero-shot prompt optimization, use the optimize_prompt
method.
prompt = "Generate system instructions for a question-answering assistant"
response = client.prompt_optimizer.optimize_prompt(prompt=prompt)
print(response.raw_text_response)
if response.parsed_response:
print(response.parsed_response.suggested_prompt)
To call the data-driven prompt optimization, call the optimize
method.
In this case however, we need to provide vapo_config
. This config needs to
have either service account or project number and the config path.
Please refer to this tutorial
for more details on config parameter.
import logging
project_number = PROJECT_NUMBER # replace with your project number
service_account = f"{project_number}-compute@developer.gserviceaccount.com"
vapo_config = types.PromptOptimizerVAPOConfig(
config_path="gs://your-bucket/config.json",
service_account_project_number=project_number,
wait_for_completion=False
)
# Set up logging to see the progress of the optimization job
logging.basicConfig(encoding='utf-8', level=logging.INFO, force=True)
result = client.prompt_optimizer.optimize(method="vapo", config=vapo_config)
We can also call optimize method async.
await client.aio.prompt_optimizer.optimize(method="vapo", config=vapo_config)
Prompt Management
The Prompt Management module uses some types from the Gen AI SDK. First, import those types:
from google.genai import types as genai_types
To create and store a Prompt, first define a types.Prompt object and then run create_version
to save it in Vertex.
prompt = types.Prompt(
prompt_data=types.PromptData(
contents=[genai_types.Content(parts=[genai_types.Part(text="Hello, {name}! How are you?")])],
system_instruction=genai_types.Content(parts=[genai_types.Part(text="Please answer in a short sentence.")]),
generation_config=genai_types.GenerationConfig(temperature=0.1),
safety_settings=[genai_types.SafetySetting(
category="HARM_CATEGORY_DANGEROUS_CONTENT",
threshold="BLOCK_MEDIUM_AND_ABOVE",
method="SEVERITY",
)],
variables=[
{"name": genai_types.Part(text="Alice")},
{"name": genai_types.Part(text="Bob")},
],
model="gemini-2.0-flash-001",
),
)
prompt_resource = client.prompt_management.create_version(
prompt=prompt,
)
To retrieve a prompt, provide the prompt_id
:
retrieved_prompt = client.prompt_management.get(prompt_id=prompt_resource.prompt_id)
After creating or retrieving a prompt, you can call generate_content()
with that prompt using the Gen AI SDK.
The following uses a utility function available on Prompt objects to transform a Prompt object into a list of Part objects for use with generate_content
. To run this you need to have the Gen AI SDK installed, which you can do via pip install google-genai
.
from google import genai
from google.genai import types as genai_types
# Create a Client in the Gen AI SDK
genai_client = genai.Client(vertexai=True, project="your-project", location="your-location")
# Call generate_content() with the prompt
response = genai_client.models.generate_content(
model=retrieved_prompt.prompt_data.model,
contents=retrieved_prompt.assemble_contents(),
)
Warning
The following Generative AI modules in the Vertex AI SDK are deprecated as of
June 24, 2025 and will be removed on June 24, 2026:
vertexai.generative_models
, vertexai.language_models
,
vertexai.vision_models
, vertexai.tuning
, vertexai.caching
. Please use the
Google Gen AI SDK to access these
features. See
the migration guide
for details. You can continue using all other Vertex AI SDK modules, as they are
the recommended way to use the API.
Imports:
import vertexai
Initialization:
vertexai.init(project='my-project', location='us-central1')
Basic generation:
from vertexai.generative_models import GenerativeModel
model = GenerativeModel("gemini-pro")
print(model.generate_content("Why is sky blue?"))
Using images and videos
from vertexai.generative_models import GenerativeModel, Image
vision_model = GenerativeModel("gemini-pro-vision")
# Local image
image = Image.load_from_file("image.jpg")
print(vision_model.generate_content(["What is shown in this image?", image]))
# Image from Cloud Storage
image_part = generative_models.Part.from_uri("gs://download.tensorflow.org/example_images/320px-Felis_catus-cat_on_snow.jpg", mime_type="image/jpeg")
print(vision_model.generate_content([image_part, "Describe this image?"]))
# Text and video
video_part = Part.from_uri("gs://cloud-samples-data/video/animals.mp4", mime_type="video/mp4")
print(vision_model.generate_content(["What is in the video? ", video_part]))
Chat
from vertexai.generative_models import GenerativeModel, Image
vision_model = GenerativeModel("gemini-ultra-vision")
vision_chat = vision_model.start_chat()
image = Image.load_from_file("image.jpg")
print(vision_chat.send_message(["I like this image.", image]))
print(vision_chat.send_message("What things do I like?."))
System instructions
from vertexai.generative_models import GenerativeModel
model = GenerativeModel(
"gemini-1.0-pro",
system_instruction=[
"Talk like a pirate.",
"Don't use rude words.",
],
)
print(model.generate_content("Why is sky blue?"))
Function calling
# First, create tools that the model is can use to answer your questions.
# Describe a function by specifying it's schema (JsonSchema format)
get_current_weather_func = generative_models.FunctionDeclaration(
name="get_current_weather",
description="Get the current weather in a given location",
parameters={
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"celsius",
"fahrenheit",
]
}
},
"required": [
"location"
]
},
)
# Tool is a collection of related functions
weather_tool = generative_models.Tool(
function_declarations=[get_current_weather_func],
)
# Use tools in chat:
model = GenerativeModel(
"gemini-pro",
# You can specify tools when creating a model to avoid having to send them with every request.
tools=[weather_tool],
)
chat = model.start_chat()
# Send a message to the model. The model will respond with a function call.
print(chat.send_message("What is the weather like in Boston?"))
# Then send a function response to the model. The model will use it to answer.
print(chat.send_message(
Part.from_function_response(
name="get_current_weather",
response={
"content": {"weather": "super nice"},
}
),
))
Automatic Function calling
Note: The FunctionDeclaration.from_func
converter does not support nested types for parameters. Please provide full FunctionDeclaration
instead.
from vertexai.preview.generative_models import GenerativeModel, Tool, FunctionDeclaration, AutomaticFunctionCallingResponder
# First, create functions that the model can use to answer your questions.
def get_current_weather(location: str, unit: str = "centigrade"):
"""Gets weather in the specified location.
Args:
location: The location for which to get the weather.
unit: Optional. Temperature unit. Can be Centigrade or Fahrenheit. Defaults to Centigrade.
"""
return dict(
location=location,
unit=unit,
weather="Super nice, but maybe a bit hot.",
)
# Infer function schema
get_current_weather_func = FunctionDeclaration.from_func(get_current_weather)
# Tool is a collection of related functions
weather_tool = Tool(
function_declarations=[get_current_weather_func],
)
# Use tools in chat:
model = GenerativeModel(
"gemini-pro",
# You can specify tools when creating a model to avoid having to send them with every request.
tools=[weather_tool],
)
# Activate automatic function calling:
afc_responder = AutomaticFunctionCallingResponder(
# Optional:
max_automatic_function_calls=5,
)
chat = model.start_chat(responder=afc_responder)
# Send a message to the model. The model will respond with a function call.
# The SDK will automatically call the requested function and respond to the model.
# The model will use the function call response to answer the original question.
print(chat.send_message("What is the weather like in Boston?"))
Evaluation
- To perform bring-your-own-response(BYOR) evaluation, provide the model responses in the
response
column in the dataset. If a pairwise metric is used for BYOR evaluation, provide the baseline model responses in thebaseline_model_response
column.
import pandas as pd
from vertexai.evaluation import EvalTask, MetricPromptTemplateExamples
eval_dataset = pd.DataFrame({
"prompt" : [...],
"reference": [...],
"response" : [...],
"baseline_model_response": [...],
})
eval_task = EvalTask(
dataset=eval_dataset,
metrics=[
"bleu",
"rouge_l_sum",
MetricPromptTemplateExamples.Pointwise.FLUENCY,
MetricPromptTemplateExamples.Pairwise.SAFETY
],
experiment="my-experiment",
)
eval_result = eval_task.evaluate(experiment_run_name="eval-experiment-run")
- To perform evaluation with Gemini model inference, specify the
model
parameter with aGenerativeModel
instance. The input column name to the model isprompt
and must be present in the dataset.
from vertexai.evaluation import EvalTask
from vertexai.generative_models import GenerativeModel
eval_dataset = pd.DataFrame({
"reference": [...],
"prompt" : [...],
})
result = EvalTask(
dataset=eval_dataset,
metrics=["exact_match", "bleu", "rouge_1", "rouge_l_sum"],
experiment="my-experiment",
).evaluate(
model=GenerativeModel("gemini-1.5-pro"),
experiment_run_name="gemini-eval-run"
)
- If a
prompt_template
is specified, theprompt
column is not required. Prompts can be assembled from the evaluation dataset, and all prompt template variable names must be present in the dataset columns.
import pandas as pd
from vertexai.evaluation import EvalTask, MetricPromptTemplateExamples
from vertexai.generative_models import GenerativeModel
eval_dataset = pd.DataFrame({
"context" : [...],
"instruction": [...],
})
result = EvalTask(
dataset=eval_dataset,
metrics=[MetricPromptTemplateExamples.Pointwise.SUMMARIZATION_QUALITY],
).evaluate(
model=GenerativeModel("gemini-1.5-pro"),
prompt_template="{instruction}. Article: {context}. Summary:",
)
- To perform evaluation with custom model inference, specify the
model
parameter with a custom inference function. The input column name to the custom inference function isprompt
and must be present in the dataset.
from openai import OpenAI
from vertexai.evaluation import EvalTask, MetricPromptTemplateExamples
client = OpenAI()
def custom_model_fn(input: str) -> str:
response = client.chat.completions.create(
model="gpt-3.5-turbo",
messages=[
{"role": "user", "content": input}
]
)
return response.choices[0].message.content
eval_dataset = pd.DataFrame({
"prompt" : [...],
"reference": [...],
})
result = EvalTask(
dataset=eval_dataset,
metrics=[MetricPromptTemplateExamples.Pointwise.SAFETY],
experiment="my-experiment",
).evaluate(
model=custom_model_fn,
experiment_run_name="gpt-eval-run"
)
- To perform pairwise metric evaluation with model inference step, specify
the
baseline_model
input to aPairwiseMetric
instance and the candidatemodel
input to theEvalTask.evaluate()
function. The input column name to both models isprompt
and must be present in the dataset.
import pandas as pd
from vertexai.evaluation import EvalTask, MetricPromptTemplateExamples, PairwiseMetric
from vertexai.generative_models import GenerativeModel
baseline_model = GenerativeModel("gemini-1.0-pro")
candidate_model = GenerativeModel("gemini-1.5-pro")
pairwise_groundedness = PairwiseMetric(
metric_prompt_template=MetricPromptTemplateExamples.get_prompt_template(
"pairwise_groundedness"
),
baseline_model=baseline_model,
)
eval_dataset = pd.DataFrame({
"prompt" : [...],
})
result = EvalTask(
dataset=eval_dataset,
metrics=[pairwise_groundedness],
experiment="my-pairwise-experiment",
).evaluate(
model=candidate_model,
experiment_run_name="gemini-pairwise-eval-run",
)
Agent Engine
Before you begin, install the packages with
pip3 install --upgrade --user "google-cloud-aiplatform[agent_engines,adk]>=1.111"
First, define a function that looks up the exchange rate:
def get_exchange_rate(
currency_from: str = "USD",
currency_to: str = "EUR",
currency_date: str = "latest",
):
"""Retrieves the exchange rate between two currencies on a specified date.
Uses the Frankfurter API (https://api.frankfurter.app/) to obtain
exchange rate data.
Returns:
dict: A dictionary containing the exchange rate information.
Example: {"amount": 1.0, "base": "USD", "date": "2023-11-24",
"rates": {"EUR": 0.95534}}
"""
import requests
response = requests.get(
f"https://api.frankfurter.app/{currency_date}",
params={"from": currency_from, "to": currency_to},
)
return response.json()
Next, define an ADK Agent:
from google.adk.agents import Agent
from vertexai.agent_engines import AdkApp
app = AdkApp(agent=Agent(
model="gemini-2.0-flash", # Required.
name='currency_exchange_agent', # Required.
tools=[get_exchange_rate], # Optional.
))
Test the agent locally using US dollars and Swedish Krona:
async for event in app.async_stream_query(
user_id="user-id",
message="What is the exchange rate from US dollars to SEK today?",
):
print(event)
To deploy the agent to Agent Engine:
vertexai.init(
project='my-project',
location='us-central1',
staging_bucket="gs://my-staging-bucket",
)
remote_app = vertexai.agent_engines.create(
app,
requirements=["google-cloud-aiplatform[agent_engines,adk]"],
)
You can also run queries against the deployed agent:
async for event in remote_app.async_stream_query(
user_id="user-id",
message="What is the exchange rate from US dollars to SEK today?",
):
print(event)
Documentation
You can find complete documentation for the Vertex AI SDKs and the Gemini model in the Google Cloud documentation
Contributing
See Contributing for more information on contributing to the Vertex AI Python SDK.
License
The contents of this repository are licensed under the Apache License, version 2.0.