Gemma is a set of lightweight, generative artificial intelligence (AI)
open models. Gemma models are available to run in your
applications and on your hardware, mobile devices, or hosted services. You can
also customize these models using tuning techniques so that they excel at
performing tasks that matter to you and your users. Gemma models are
based on Gemini models and are intended
for the AI development community to extend and take further.

Fine-tuning can help improve a model's performance in specific tasks. Because
models in the Gemma model family are open weight, you can tune any of
them using the AI framework of your choice and the Vertex AI SDK.
You can open a notebook example to fine-tune the Gemma model using
a link available on the Gemma model card in Model Garden.

The following Gemma models are available to use with Vertex AI.
To learn more about and test the Gemma models, see their
Model Garden model cards.

The following are some options for where you can use Gemma:

Vertex AI offers a managed platform for rapidly building and scaling
machine learning projects without needing in-house MLOps expertise. You can use
Vertex AI as the downstream application that serves the
Gemma models. For example, you might port weights from the Keras
implementation of Gemma. Next, you can use Vertex AI to
serve that version of Gemma to get predictions. We recommend using
Vertex AI if you want end-to-end MLOps capabilities, value-added ML
features, and a serverless experience for streamlined development.

To get started with Gemma, see the following notebooks:

Fine-tune Gemma 3 using PEFT and then deploy to Vertex AI from Vertex
Fine-tune Gemma 2 using PEFT and then deploy to Vertex AI from Vertex
Fine-tune Gemma using PEFT and then deploy to Vertex AI from Vertex
Fine-tune Gemma using PEFT and then deploy to Vertex AI from Huggingface
Fine-tune Gemma with Ray on Vertex AI and then deploy to Vertex AI
Run local inference with ShieldGemma 2 with Hugging Face transformers
Run local inference with T5Gemma with Hugging Face transformers

You can use Gemma with other Google Cloud products, such as
Google Kubernetes Engine and Dataflow.

Google Kubernetes Engine (GKE) is the Google Cloud solution
for managed Kubernetes that provides scalability, security, resilience, and cost
effectiveness. We recommend this option if you have existing Kubernetes
investments, your organization has in-house MLOps expertise, or if you need
granular control over complex AI/ML workloads with unique security, data
pipeline, and resource management requirements. To learn more, see the Gemma
tutorials in the GKE documentation.

You can use Gemma models with Dataflow for
sentiment analysis.
Use Dataflow to run inference pipelines that use the
Gemma models. To learn more, see
Run inference pipelines with Gemma open models.

You can use Gemma with Colaboratory to create your Gemma
solution. In Colab, you can use Gemma with framework
options such as PyTorch and JAX.

Gemma models are available in several sizes so you can build
generative AI solutions based on your available computing resources, the
capabilities you need, and where you want to run them. Each model is available
in a tuned and an untuned version:

Pretrained - This version of the model wasn't trained on any specific tasks
or instructions beyond the Gemma core data training set. We don't
recommend using this model without performing some tuning.
Instruction-tuned - This version of the model was trained with human language
interactions so that it can participate in a conversation, similar to a basic
chat bot.
Mix fine-tuned - This version of the model is fine-tuned on a mixture of
academic datasets and accepts natural language prompts.

Lower parameter sizes mean lower resource requirements and more deployment
flexibility. Gemma has been tested using Google's purpose-built v5e TPU
hardware and NVIDIA's L4 (G2 Standard), A100 (A2 Standard), and
H100 (A3 High) GPU hardware.
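
To make the link between parameter count and resource requirements concrete, the following is a rough sketch in Python that estimates the memory needed to hold a model's weights at several numeric precisions. The function name `weight_memory_gb` and the precision table are illustrative assumptions, not part of any Gemma or Vertex AI API. The estimate covers weights only; runtime overhead such as activations and the key-value cache adds to it.

```python
# Bytes used to store one parameter at common numeric precisions.
# These are the standard storage sizes for each format.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, dtype: str = "bfloat16") -> float:
    """Approximate memory for the model weights alone, in GB (10**9 bytes).

    Billions of parameters multiplied by bytes per parameter gives
    gigabytes directly. Hypothetical helper for illustration only.
    """
    return params_billions * BYTES_PER_PARAM[dtype]


if __name__ == "__main__":
    # Parameter counts taken from the Gemma 3 sizes listed on this page.
    for name, size in [("1B", 1), ("4B", 4), ("12B", 12), ("27B", 27)]:
        estimates = ", ".join(
            f"{dtype}: {weight_memory_gb(size, dtype):g} GB"
            for dtype in BYTES_PER_PARAM
        )
        print(f"Gemma 3 {name} -> {estimates}")
```

Under these assumptions, a 27 billion parameter model needs about 54 GB for bfloat16 weights, while a 1 billion parameter model needs about 2 GB, which is why the smaller sizes fit on laptops and mobile devices.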
| Model name | Use cases | Model Garden model card |
|---|---|---|
| Gemma 3n | Capable of multimodal input, handling text, image, video, and audio input, and generating text outputs. | Go to the Gemma 3n model card |
| Gemma 3 | Best for text generation and image understanding tasks, including question answering, summarization, and reasoning. | Go to the Gemma 3 model card |
| Gemma 2 | Best for text generation, summarization, and extraction. | Go to the Gemma 2 model card |
| Gemma | Best for text generation, summarization, and extraction. | Go to the Gemma model card |
| CodeGemma | Best for code generation and completion. | Go to the CodeGemma model card |
| PaliGemma 2 | Best for image captioning tasks and visual question and answering tasks. | Go to the PaliGemma 2 model card |
| PaliGemma | Best for image captioning tasks and visual question and answering tasks. | Go to the PaliGemma model card |
| ShieldGemma 2 | Checks the safety of synthetic and natural images to help you build robust datasets and models. | Go to the ShieldGemma 2 model card |
| TxGemma | Best for therapeutic prediction tasks, including classification, regression, or generation, and reasoning tasks. | Go to the TxGemma model card |
| MedGemma | Gemma 3 variants that are trained for performance on medical text and image comprehension. | Go to the MedGemma model card |
| MedSigLIP | SigLIP variant that is trained to encode medical images and text into a common embedding space. | Go to the MedSigLIP model card |
| T5Gemma | Well-suited for a variety of generative tasks, including question answering, summarization, and reasoning. | Go to the T5Gemma model card |
Use Gemma with Vertex AI
Use Gemma in other Google Cloud products
Use Gemma with GKE
Use Gemma with Dataflow
Use Gemma with Colab
Gemma model sizes and capabilities
| Model name | Parameter size | Input | Output | Tuned versions | Intended platforms |
|---|---|---|---|---|---|
| Gemma 3n | | | | | |
| Gemma 3n E4B | 4 billion effective parameters | Text, image and audio | Text | | Mobile devices and laptops |
| Gemma 3n E2B | 2 billion effective parameters | Text, image and audio | Text | | Mobile devices and laptops |
| Gemma 3 | | | | | |
| Gemma 27B | 27 billion | Text and image | Text | | Large servers or server clusters |
| Gemma 12B | 12 billion | Text and image | Text | | Higher-end desktop computers and servers |
| Gemma 4B | 4 billion | Text and image | Text | | Desktop computers and small servers |
| Gemma 1B | 1 billion | Text | Text | | Mobile devices and laptops |
| Gemma 2 | | | | | |
| Gemma 27B | 27 billion | Text | Text | | Large servers or server clusters |
| Gemma 9B | 9 billion | Text | Text | | Higher-end desktop computers and servers |
| Gemma 2B | 2 billion | Text | Text | | Mobile devices and laptops |
| Gemma | | | | | |
| Gemma 7B | 7 billion | Text | Text | | Desktop computers and small servers |
| Gemma 2B | 2.2 billion | Text | Text | | Mobile devices and laptops |
| CodeGemma | | | | | |
| CodeGemma 7B | 7 billion | Text | Text | | Desktop computers and small servers |
| CodeGemma 2B | 2 billion | Text | Text | | Desktop computers and small servers |
| PaliGemma 2 | | | | | |
| PaliGemma 28B | 28 billion | Text and image | Text | | Large servers or server clusters |
| PaliGemma 10B | 10 billion | Text and image | Text | | Higher-end desktop computers and servers |
| PaliGemma 3B | 3 billion | Text and image | Text | | Desktop computers and small servers |
| PaliGemma | | | | | |
| PaliGemma 3B | 3 billion | Text and image | Text | | Desktop computers and small servers |
| ShieldGemma 2 | | | | | |
| ShieldGemma 2 | 4 billion | Text and image | Text | | Desktop computers and small servers |
| TxGemma | | | | | |
| TxGemma 27B | 27 billion | Text | Text | | Large servers or server clusters |
| TxGemma 9B | 9 billion | Text | Text | | Higher-end desktop computers and servers |
| TxGemma 2B | 2 billion | Text | Text | | Mobile devices and laptops |
| MedGemma | | | | | |
| MedGemma 27B | 27 billion | Text and image | Text | | Large servers or server clusters |
| MedGemma 4B | 4 billion | Text and image | Text | | Desktop computers and small servers |
| MedSigLIP | | | | | |
| MedSigLIP | 800 million | Text and image | Embedding | | Mobile devices and laptops |
| T5Gemma | | | | | |
| T5Gemma 9B-9B | 18 billion | Text | Text | | Mobile devices and laptops |
| T5Gemma 9B-2B | 11 billion | Text | Text | | Mobile devices and laptops |
| T5Gemma 2B-2B | 4 billion | Text | Text | | Mobile devices and laptops |
| T5Gemma XL-XL | 4 billion | Text | Text | | Mobile devices and laptops |
| T5Gemma M-L | 2 billion | Text | Text | | Mobile devices and laptops |
| T5Gemma L-L | 1 billion | Text | Text | | Mobile devices and laptops |
| T5Gemma B-B | 0.6 billion | Text | Text | | Mobile devices and laptops |
| T5Gemma S-S | 0.3 billion | Text | Text | | Mobile devices and laptops |
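
As one way to use the size and capability data above when choosing a model, the following Python sketch filters a handful of rows by required input types and a parameter budget. The `ModelRow` class, the `candidates` function, and the "Gemma 3" prefix added to the row names for disambiguation are hypothetical and exist only for this illustration. The rows are copied from the Gemma 3n and Gemma 3 entries on this page, and the "effective" parameter counts of Gemma 3n are treated as plain counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRow:
    name: str
    params_billions: float
    inputs: frozenset
    platform: str


# A subset of the rows listed above (Gemma 3n and Gemma 3 only).
ROWS = [
    ModelRow("Gemma 3n E4B", 4, frozenset({"text", "image", "audio"}),
             "Mobile devices and laptops"),
    ModelRow("Gemma 3n E2B", 2, frozenset({"text", "image", "audio"}),
             "Mobile devices and laptops"),
    ModelRow("Gemma 3 27B", 27, frozenset({"text", "image"}),
             "Large servers or server clusters"),
    ModelRow("Gemma 3 12B", 12, frozenset({"text", "image"}),
             "Higher-end desktop computers and servers"),
    ModelRow("Gemma 3 4B", 4, frozenset({"text", "image"}),
             "Desktop computers and small servers"),
    ModelRow("Gemma 3 1B", 1, frozenset({"text"}),
             "Mobile devices and laptops"),
]


def candidates(required_inputs, max_params_billions):
    """Return rows that accept all required inputs and fit the budget,
    largest model first."""
    needed = frozenset(required_inputs)
    matches = [
        row for row in ROWS
        if needed <= row.inputs and row.params_billions <= max_params_billions
    ]
    return sorted(matches, key=lambda row: row.params_billions, reverse=True)


if __name__ == "__main__":
    # Models that accept text and image input with at most 12 billion parameters.
    for row in candidates({"text", "image"}, 12):
        print(f"{row.name}: {row.params_billions}B, {row.platform}")
```

Sorting largest first reflects a common preference for the most capable model that still fits the budget; reverse the sort if the smallest footprint matters more.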
What's next
Use Gemma open models
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies.
Last updated 2025-08-18 UTC.