[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-09-03。"],[],[],null,["# Use Gemma open models\n\nGemma is a set of lightweight, generative artificial intelligence (AI)\nopen models. Gemma models are available to run in your\napplications and on your hardware, mobile devices, or hosted services. You can\nalso customize these models using tuning techniques so that they excel at\nperforming tasks that matter to you and your users. Gemma models are\nbased on [Gemini](/vertex-ai/generative-ai/docs/overview) models and are intended\nfor the AI development community to extend and take further.\n\nFine-tuning can help improve a model's performance in specific tasks. Because\nmodels in the Gemma model family are open weight, you can tune any of\nthem using the AI framework of your choice and the Vertex AI SDK.\nYou can open a notebook example to fine-tune the Gemma model using\na link available on the Gemma model card in Model Garden.\n\nThe following Gemma models are available to use with Vertex AI.\nTo learn more about and test the Gemma models, see their\nModel Garden model cards.\n\nThe following are some options for where you can use Gemma:\n\nUse Gemma with Vertex AI\n------------------------\n\nVertex AI offers a managed platform for rapidly building and scaling\nmachine learning projects without needing in-house MLOps expertise. You can use\nVertex AI as the downstream application that serves the\nGemma models. For example, you might port weights from the Keras\nimplementation of Gemma. Next, you can use Vertex AI to\nserve that version of Gemma to get predictions. 
To get started with Gemma, see the following notebooks:

- [Serve Gemma 3n in Vertex AI](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma3n_deployment_on_vertex.ipynb)

- [Serve Gemma 3 in Vertex AI](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma3_deployment_on_vertex.ipynb)

- [Serve Gemma 2 in Vertex AI](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma2_deployment_on_vertex.ipynb)

- [Serve Gemma in Vertex AI](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma_deployment_on_vertex.ipynb)

- [Fine-tune Gemma 3 using PEFT and then deploy to Vertex AI from Vertex](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma3_finetuning_on_vertex.ipynb)

- [Fine-tune Gemma 2 using PEFT and then deploy to Vertex AI from Vertex](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma2_finetuning_on_vertex.ipynb)

- [Fine-tune Gemma using PEFT and then deploy to Vertex AI from Vertex](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma_finetuning_on_vertex.ipynb)

- [Fine-tune Gemma using PEFT and then deploy to Vertex AI from Hugging Face](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_pytorch_gemma_peft_finetuning_hf.ipynb)

- [Fine-tune Gemma using KerasNLP and then deploy to Vertex AI](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma_kerasnlp_to_vertexai.ipynb)

- [Fine-tune Gemma with Ray on Vertex AI and then deploy to Vertex AI](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma_fine_tuning_batch_deployment_on_rov.ipynb)

- [Run local inference with ShieldGemma 2 with Hugging Face transformers](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_shieldgemma2_local_inference.ipynb)

- [Serve MedGemma in Vertex AI](https://github.com/Google-Health/medgemma/blob/main/notebooks/quick_start_with_model_garden.ipynb)

- [Serve MedSigLIP in Vertex AI](https://github.com/Google-Health/medsiglip/blob/main/notebooks/quick_start_with_model_garden.ipynb)

- [Run local inference with T5Gemma with Hugging Face transformers](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_t5gemma_local_inference.ipynb)

Use Gemma in other Google Cloud products
----------------------------------------

You can use Gemma with other Google Cloud products, such as
Google Kubernetes Engine and Dataflow.

### Use Gemma with GKE

Google Kubernetes Engine (GKE) is the Google Cloud solution for managed
Kubernetes that provides scalability, security, resilience, and cost
effectiveness. We recommend this option if you have existing Kubernetes
investments, if your organization has in-house MLOps expertise, or if you
need granular control over complex AI/ML workloads with unique security,
data pipeline, and resource management requirements. To learn more, see the
following tutorials in the GKE documentation:

- [Serve Gemma with vLLM](/kubernetes-engine/docs/tutorials/serve-gemma-gpu-vllm)
- [Serve Gemma with TGI](/kubernetes-engine/docs/tutorials/serve-gemma-gpu-tgi)
- [Serve Gemma with Triton and TensorRT-LLM](/kubernetes-engine/docs/tutorials/serve-gemma-gpu-tensortllm)
- [Serve Gemma with JetStream](/kubernetes-engine/docs/tutorials/serve-gemma-tpu-jetstream)

### Use Gemma with Dataflow

You can use Gemma models with Dataflow for tasks such as
[sentiment analysis](https://en.wikipedia.org/wiki/Sentiment_analysis), by
running inference pipelines that use the Gemma models. To learn more, see
[Run inference pipelines with Gemma open models](/dataflow/docs/machine-learning/gemma).

Use Gemma with Colab
--------------------

You can use Gemma with Colaboratory to create your Gemma solution. In Colab,
you can use Gemma with framework options such as PyTorch and JAX. To learn
more, see the following guides; a minimal Keras example follows this list:

- [Get started with Gemma using Keras](https://ai.google.dev/gemma/docs/get_started).
- [Get started with Gemma using PyTorch](https://ai.google.dev/gemma/docs/pytorch_gemma).
- [Basic tuning with Gemma using Keras](https://ai.google.dev/gemma/docs/lora_tuning).
- [Distributed tuning with Gemma using Keras](https://ai.google.dev/gemma/docs/distributed_tuning).
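The following sketch shows the Keras path in its simplest form: load a
published Gemma preset with KerasNLP and generate text. It assumes the
`keras-nlp` package is installed and that you've accepted the Gemma terms and
configured Kaggle credentials so the weights can download; `gemma_2b_en` is
one published preset name:

```python
# Minimal sketch: load a Gemma preset with KerasNLP and generate text.
# Assumes keras-nlp is installed and Kaggle credentials are configured
# so the Gemma weights can be downloaded.
import keras_nlp

# "gemma_2b_en" is one of the published Gemma presets.
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")

# Generate a completion; max_length bounds the total token count
# (prompt plus generated tokens).
print(gemma_lm.generate("Why is the sky blue?", max_length=64))
```

From the same loaded model you can move on to tuning; the Keras tuning guides
linked above walk through LoRA-based fine-tuning step by step.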
Gemma model sizes and capabilities
----------------------------------

Gemma models are available in several sizes so you can build generative AI
solutions based on your available computing resources, the capabilities you
need, and where you want to run them. Each model is available in tuned and
untuned versions:

- **Pretrained** - This version of the model wasn't trained on any specific
  tasks or instructions beyond the Gemma core data training set. We don't
  recommend using this model without performing some tuning.

- **Instruction-tuned** - This version of the model was trained with human
  language interactions so that it can participate in a conversation, similar
  to a basic chatbot.

- **Mix fine-tuned** - This version of the model is fine-tuned on a mixture
  of academic datasets and accepts natural language prompts.

Lower parameter counts mean lower resource requirements and more deployment
flexibility.

Gemma has been tested using Google's purpose-built TPU v5e hardware and
NVIDIA's L4 (G2 Standard), A100 (A2 Standard), and H100 (A3 High) GPU
hardware.

What's next
-----------

- See the [Gemma documentation](https://ai.google.dev/gemma/docs).