Migrate from PaLM API to Gemini API on Vertex AI

This guide shows how to migrate Vertex AI SDK for Python code from using the PaLM APi to using the Gemini API. You can generate text, multi-turn conversations (chat), and code with Gemini. After you migrate, check your responses because the Gemini output might be different from PaLM output. For more information, see the Introduction to multimodal classes in the Vertex AI SDK.

Gemini differences from PaLM

The following are some differences between Gemini and PaLM models:

Their response structures are different. To learn about the Gemini response structure, see the Gemini API model reference response body.
Their safety categories are different. To learn about differences between Gemini and PaLM safety settings, see Key differences between Gemini and other model families.
Gemini can't perform code completion. If you need to create a code completion application, use the code-gecko model. For more information, see Codey code completion model.
For code generation, Gemini has a higher recitation block rate.
The confidence score in Codey code generation models that indicates how confident the model is in its response isn't exposed in Gemini.

Update PaLM code to use Gemini models

The methods on the GenerativeModel class are mostly the same as the methods on the PaLM classes. For example, use GenerativeModel.start_chat to replace the PaLM equivalent, ChatModel.start_chat. However, because Google Cloud is always improving and updating Gemini, you might run into some differences. For more information, see the Python SDK Reference

To migrate from the PaLM API to the Gemini API, the following code modifications are required:

For all PaLM model classes, you use the GenerativeModel class in Gemini.
To use the GenerativeModel class, run the following import statement:

from vertexai.generative_models import GenerativeModel
To load a Gemini model, use the GenerativeModel constructor instead of using the from_pretrained method. For example, to load the Gemini 1.0 Pro model, use GenerativeModel(gemini-1.0-pro).
To generate text in Gemini, use the GenerativeModel.generate_content method instead of the predict method that's used on PaLM models. For example:

   model = GenerativeModel("gemini-1.0-pro-002")
   response = model.generate_content("Write a short poem about the moon")

Gemini and PaLM class comparison

Each PaLM model class is replaced by the GenerativeModel class in Gemini. The following table shows the classes used by the PaLM models and their equivalent class in Gemini.

PaLM model	PaLM model class	Gemini model class
`text-bison`	`TextGenerationModel`	`GenerativeModel`
`chat-bison`	`ChatModel`	`GenerativeModel`
`code-bison`	`CodeGenerationModel`	`GenerativeModel`
`codechat-bison`	`CodeChatModel`	`GenerativeModel`

Common setup instructions

For both PaLM API and Gemini API in Vertex AI, the setup process is the same. For more information, see Introduction to the Vertex AI SDK for Python. The following is a short code sample that installs the Vertex AI SDK for Python.

pip install google-cloud-aiplatform
import vertexai
vertexai.init(project="PROJECT_ID", location="LOCATION")

In this sample code, replace PROJECT_ID with your Google Cloud project ID, and replace LOCATION with the location of your Google Cloud project (for example, us-central1).

Gemini and PaLM code samples

Each of the following pairs of code samples includes PaLM code and, next to it, Gemini code that's been migrated from the PaLM code.

Text generation: basic

The following code samples show the differences between the PaLM API and Gemini API for creating a text generation model.

PaLM Gemini

PaLM	Gemini
`from vertexai.language_models import TextGenerationModel model = TextGenerationModel.from_pretrained("text-bison@002") response = model.predict(prompt="The opposite of hot is") print(response.text) # 'cold.'`	`from vertexai.generative_models import GenerativeModel model = GenerativeModel("gemini-1.0-pro") responses = model.generate_content("The opposite of hot is") for response in responses: print(response.text)`


        
from vertexai.language_models import TextGenerationModel

model = TextGenerationModel.from_pretrained("text-bison@002")

response = model.predict(prompt="The opposite of hot is")
print(response.text) #  'cold.'


        
from vertexai.generative_models import GenerativeModel

model = GenerativeModel("gemini-1.0-pro")

responses = model.generate_content("The opposite of hot is")

for response in responses:
    print(response.text)

Text generation with parameters

The following code samples show the differences between the PaLM API and Gemini API for creating a text generation model, with optional parameters.

PaLM Gemini

PaLM	Gemini
from vertexai.language_models import TextGenerationModel model = TextGenerationModel.from_pretrained("text-bison@002") prompt = """ You are an expert at solving word problems. Solve the following problem: I have three houses, each with three cats. each cat owns 4 mittens, and a hat. Each mitten was knit from 7m of yarn, each hat from 4m. How much yarn was needed to make all the items? Think about it step by step, and show your work. """ response = model.predict( prompt=prompt, temperature=0.1, max_output_tokens=800, top_p=1.0, top_k=40 ) print(response.text)	from vertexai.generative_models import GenerativeModel model = GenerativeModel("gemini-1.0-pro") prompt = """ You are an expert at solving word problems. Solve the following problem: I have three houses, each with three cats. each cat owns 4 mittens, and a hat. Each mitten was knit from 7m of yarn, each hat from 4m. How much yarn was needed to make all the items? Think about it step by step, and show your work. """ responses = model.generate_content( prompt, generation_config={ "temperature": 0.1, "max_output_tokens": 800, "top_p": 1.0, "top_k": 40, } ) for response in responses: print(response.text)


        
from vertexai.language_models import TextGenerationModel

model = TextGenerationModel.from_pretrained("text-bison@002")

prompt = """
You are an expert at solving word problems.

Solve the following problem:

I have three houses, each with three cats.
each cat owns 4 mittens, and a hat. Each mitten was
knit from 7m of yarn, each hat from 4m.
How much yarn was needed to make all the items?

Think about it step by step, and show your work.
"""

response = model.predict(
    prompt=prompt,
    temperature=0.1,
    max_output_tokens=800,
    top_p=1.0,
    top_k=40
)

print(response.text)


        
from vertexai.generative_models import GenerativeModel

model = GenerativeModel("gemini-1.0-pro")

prompt = """
You are an expert at solving word problems.

Solve the following problem:

I have three houses, each with three cats.
each cat owns 4 mittens, and a hat. Each mitten was
knit from 7m of yarn, each hat from 4m.
How much yarn was needed to make all the items?

Think about it step by step, and show your work.
"""

responses = model.generate_content(
    prompt,
    generation_config={
        "temperature": 0.1,
        "max_output_tokens": 800,
        "top_p": 1.0,
        "top_k": 40,
    }
  )

for response in responses:
    print(response.text)

Chat

The following code samples show the differences between the PaLM API and Gemini API for creating a chat model.

PaLM Gemini

PaLM	Gemini
`from vertexai.language_models import ChatModel model = ChatModel.from_pretrained("chat-bison@002") chat = model.start_chat() print( chat.send_message( """ Hello! Can you write a 300 word abstract for a research paper I need to write about the impact of AI on society? """ ) ) print( chat.send_message( """ Could you give me a catchy title for the paper? """ ) )`	`from vertexai.generative_models import GenerativeModel model = GenerativeModel("gemini-1.0-pro") chat = model.start_chat() responses = chat.send_message( """ Hello! Can you write a 300 word abstract for a research paper I need to write about the impact of AI on society? """ ) for response in responses: print(response.text) responses = chat.send_message( """ Could you give me a catchy title for the paper? """ ) for response in responses: print(response.text)`


        
from vertexai.language_models import ChatModel

model = ChatModel.from_pretrained("chat-bison@002")

chat = model.start_chat()

print(
    chat.send_message(
        """
Hello! Can you write a 300 word abstract for a research paper I need to write about the impact of AI on society?
"""
    )
)

print(
    chat.send_message(
        """
Could you give me a catchy title for the paper?
"""
    )
)


        
from vertexai.generative_models import GenerativeModel

model = GenerativeModel("gemini-1.0-pro")

chat = model.start_chat()


responses = chat.send_message(
        """
Hello! Can you write a 300 word abstract for a research paper I need to write about the impact of AI on society?
"""
    )

for response in responses:
   print(response.text)


responses = chat.send_message(
        """
Could you give me a catchy title for the paper?
"""
    )

for response in responses:
   print(response.text)

Code generation

The following code samples show the differences between the PaLM API and Gemini API for generating a function that predicts if a year is a leap year.

Codey Gemini

Codey	Gemini
`from vertexai.language_models import CodeGenerationModel model = CodeGenerationModel.from_pretrained("code-bison@002") response = model.predict( prefix="Write a function that checks if a year is a leap year." ) print(response.text)`	`from vertexai.generative_models import GenerativeModel model = GenerativeModel("gemini-1.0-pro-002") response = model.generate_content("Write a function that checks if a year is a leap year.") print(response.text)`


        
from vertexai.language_models import CodeGenerationModel

model = CodeGenerationModel.from_pretrained("code-bison@002")

response = model.predict(
        prefix="Write a function that checks if a year is a leap year."
    )

print(response.text)


        
from vertexai.generative_models import GenerativeModel

model = GenerativeModel("gemini-1.0-pro-002")

response = model.generate_content("Write a function that checks if a year is a leap year.")

print(response.text)

Migrate prompts to Gemini models

If you have sets of prompts that you previously used with PaLM 2 models, you can optimize them for use with Gemini models by using the Vertex AI prompt optimizer (Preview).

Next steps

See the Google models page for more details on the latest models and features.