Gemini API FAQ

This document provides answers to frequently asked questions (FAQs) about Gemini API, organized into the following categories:

Model comparisons

What is the difference between PaLM and Gemini?

Gemini models are designed for multimodal applications. Gemini models accept prompts that include, for example, text and images, and then return a text response. Gemini also supports function calling, which lets developers pass a description of a function and then the model returns a function and parameters that best matches the description. Developers can then call that function in external APIs and services.

PaLM 2 models are Generally Available (GA). PaLM 2 models are designed for language applications and perform well on use cases such as text summarization and text generation. PaLM 2 also offers full support for MLOps services on Vertex AI, such as auto side-by-side comparison and model monitoring, which aren't available with Gemini.

With Vertex AI Studio, you can customize both Gemini and PaLM 2 models with full data controls and take advantage of Google Cloud's security, safety, privacy, and data governance and compliance support. Prompts and tuning data for both Gemini and PaLM 2 are never used to train or enhance our foundation models.

Why would you choose PaLM over Gemini?

For use cases that exclusively require text input-output (like text summarization, text generation, and Q&A), PaLM 2 models can provide sufficiently high quality responses.

Gemini models are a good fit for use cases that include multimodal input, require function calling, or require complex prompting techniques (like chain-of-thought and complex instruction following).

Is PaLM 2 being deprecated?

There are no plans to deprecate PaLM 2.

What is the difference between Imagen on Vertex AI and Gemini API for vision use cases?

Imagen is a vision model for image generation, editing, captioning, and Q&A use cases. As part of your prompts, Gemini can take multiple images or a video and provide answers about your inputs, where as Imagen can take only a one input image. Gemini doesn't support image generation or image editing.

What is the difference between Vertex AI Codey APIs and Gemini API for coding use cases?

Codey APIs is purpose-built for code generation, code completion, and code chat. The Codey APIs are powered by Gemini and other models developed by Google. You can use the APIs across the software development lifecycle by integrating it into IDEs, CI/CD workflows, dashboards, and other applications. You can also customize the models with your codebase. We don't recommend Gemini 1.0 Pro Vision for code generation.

How do I send a prompt to the Gemini 1.0 Pro or Gemini 1.0 Pro Vision model

There are a number of different methods you can use to send requests to the Gemini API. You can, for example, use Google Cloud console, a programming language SDK, or the REST API to send requests to gemini-1.0-pro (Gemini 1.0 Pro) or gemini-1.0-pro-vision (Gemini 1.0 Pro Vision).

To get started, see Try the Gemini API.

Is fine-tuning available for Gemini?

You can fine-tune version 002 of the stable version of Gemini 1.0 Pro (gemini-1.0-pro-002). For more information, see Overview of model tuning for Gemini.

Safety and data usage

Why are my responses blocked?

Generative AI on Vertex AI uses safety filters to prevent potentially harmful responses. You can adjust this safety filter threshold. For more information, see Responsible AI.

How is my input data used?

Google ensures that its teams are following our AI/ML privacy commitment through robust data governance practices, which include reviews of the data that Google Cloud uses in the development of its products. For details, see Generative AI and Data Governance.

Do you cache my data?

Google can cache a customer's inputs and outputs for Gemini models to accelerate responses to subsequent prompts from the customer. Cached contents are stored for up to 24 hours. By default, data caching is enabled for each Google Cloud project. The same cache settings for a Google Cloud project apply to all regions. You can use the following curl commands to get caching status, disable caching, or re-enable caching. For more information, see Prediction on the Generative AI and Data Governance page. When you disable or re-enable caching, the change applies to all Google Cloud regions. For more information about using Identity and Access Management to grant permissions required to enable or disable caching, see Vertex AI access control with IAM. Expand the following sections to learn how to get the current cache setting, to disable caching, and to enable caching.

Get current caching setting

Run the following command to determine if caching is enabled or disabled for a project. To run this command, a user must be granted one of the following roles: roles/aiplatform.viewer, roles/aiplatform.user, or roles/aiplatform.admin.

PROJECT_ID=PROJECT_ID
# Setup project_id
$ gcloud config set project PROJECT_ID

# GetCacheConfig
$ curl -X GET -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/cacheConfig

# Response if caching is enabled (caching is enabled by default).
{
  "name": "projects/PROJECT_ID/cacheConfig"
}

# Response if caching is disabled.
{
  "name": "projects/PROJECT_ID/cacheConfig"
  "disableCache": true
}
    

Disable caching

Run the following curl command to enable caching for a Google Cloud project. To run this command, a user must be granted the Vertex AI administrator role, roles/aiplatform.admin.

PROJECT_ID=PROJECT_ID
# Setup project_id
$ gcloud config set project PROJECT_ID

# Setup project_id.
$ gcloud config set project ${PROJECT_ID}

# Opt-out of caching.
$ curl -X PATCH -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/cacheConfig -d '{
  "name": "projects/PROJECT_ID/cacheConfig",
  "disableCache": true
}'

# Response.
{
  "name": "projects/PROJECT_ID/locations/us-central1/projects/PROJECT_ID/cacheConfig/operations/${OPERATION_ID}",
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.protobuf.Empty"
  }
}
    

Enable caching

If you disabled caching for a Google Cloud project and want re-enable it, run the following curl command. To run this command, a user must be granted the Vertex AI administrator role, roles/aiplatform.admin.

PROJECT_ID=PROJECT_ID
LOCATION_ID="us-central1"
# Setup project_id
$ gcloud config set project PROJECT_ID

# Setup project_id.
$ gcloud config set project ${PROJECT_ID}

# Opt in to caching.
$ curl -X PATCH     -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/cacheConfig -d '{
  "name": "projects/PROJECT_ID/cacheConfig",
  "disableCache": false
}'

# Response.
{
  "name": "projects/PROJECT_ID/locations/us-central1/projects/PROJECT_ID/cacheConfig/operations/${OPERATION_NUMBER}",
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.protobuf.Empty"
  }
}
    

Migration

How do I migrate Gemini on Google AI Studio to Vertex AI Studio?

Migrating to Google Cloud's Vertex AI platform offers a suite of MLOps tools that streamline the usage, deployment, and monitoring of AI models for efficiency and reliability. To migrate your work to Vertex AI, import and upload your existing data to Vertex AI Studio and use the Vertex AI Gemini API. For more information, see Migrate from Gemini on Google AI to Vertex AI.

How do I switch from PaLM 2 to Vertex AI Gemini API as the underlying model?

You don't need to make any major architectural changes to your applications when switching from PaLM models to Gemini models. From an API perspective, switching from one model to another requires changing a single line of code or updating the SDK. For more information, see Migrate from PaLM API to Vertex AI Gemini API.

Because responses can vary between models, we recommend you do prompt testing to compare the responses of PaLM and Gemini models to check that responses meet your expectations.

Availability and pricing

In what locations is Gemini available?

Gemini 1.0 Pro and Gemini 1.0 Pro Vision are available in the Asia, US, and Europe regions. For more information, see Generative AI on Vertex AI locations.

Is there a free evaluation tier for the Vertex AI Gemini API?

Contact your Google Cloud representative for more information.

What is the pricing for Vertex AI Gemini API?

Pricing information for Gemini models is available in the Multimodal section of the Pricing for Generative AI on Vertex AI.

How do I get access to Gemini Ultra?

Contact your Google account representative to request access.

Quotas

How do I resolve a quota (429) error when making API requests?

There is either excessive demand or the request exceeded your per-project quota. Check that your request rate is less than the quota for your project. To view you project quotas, go to the Quotas page in the Google Cloud console. For more information, see Generative AI on Vertex AI on Vertex AI quota and limits.

How do I increase my project quotas for Gemini?

You can request an increase from the Google Cloud console. For more information, see Generative AI on Vertex AI on Vertex AI quota and limits.