# Use Gemma open models

Gemma is a set of lightweight, generative artificial intelligence (AI) open models. Gemma models are available to run in your applications and on your hardware, mobile devices, or hosted services. You can also customize these models using tuning techniques so that they excel at performing tasks that matter to you and your users. Gemma models are based on [Gemini](/vertex-ai/generative-ai/docs/overview) models and are intended for the AI development community to extend and take further.
Fine-tuning can help improve a model's performance on specific tasks. Because models in the Gemma model family are open weight, you can tune any of them using the AI framework of your choice and the Vertex AI SDK. You can open a notebook example to fine-tune a Gemma model using a link available on the Gemma model card in Model Garden.
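As a minimal sketch of what framework-level tuning can look like, the following uses KerasNLP's LoRA support to fine-tune a Gemma checkpoint. The preset name, training data, and hyperparameters are illustrative assumptions; the Model Garden notebooks cover the complete, supported workflows.

```python
# Minimal LoRA fine-tuning sketch with KerasNLP (illustrative only).
import keras
import keras_nlp

# Load a Gemma preset; the preset name is an assumption.
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")

# Enable LoRA so only low-rank adapter weights are trained.
gemma_lm.backbone.enable_lora(rank=4)

# A toy dataset of formatted prompt/response strings; replace with your own.
train_data = [
    "Instruction: Summarize.\nInput: Gemma is a family of open models.\nOutput: Gemma: open models.",
]

# Limit sequence length to control memory use, then compile and train.
gemma_lm.preprocessor.sequence_length = 256
gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.AdamW(learning_rate=5e-5),
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
gemma_lm.fit(train_data, epochs=1, batch_size=1)
```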
The following Gemma models are available to use with Vertex AI. To learn more about and test the Gemma models, see their Model Garden model cards.

The following are some options for where you can use Gemma:

## Use Gemma with Vertex AI
Vertex AI offers a managed platform for rapidly building and scaling machine learning projects without needing in-house MLOps expertise. You can use Vertex AI as the downstream application that serves the Gemma models. For example, you might port weights from the Keras implementation of Gemma. Next, you can use Vertex AI to serve that version of Gemma to get predictions. We recommend using Vertex AI if you want end-to-end MLOps capabilities, value-added ML features, and a serverless experience for streamlined development.

To get started with Gemma, see the following notebooks:

- [Serve Gemma 3n in Vertex AI](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma3n_deployment_on_vertex.ipynb)
- [Serve Gemma 3 in Vertex AI](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma3_deployment_on_vertex.ipynb)
- [Serve Gemma 2 in Vertex AI](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma2_deployment_on_vertex.ipynb)
- [Serve Gemma in Vertex AI](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma_deployment_on_vertex.ipynb)
- [Fine-tune Gemma 3 using PEFT and then deploy to Vertex AI from Vertex](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma3_finetuning_on_vertex.ipynb)
- [Fine-tune Gemma 2 using PEFT and then deploy to Vertex AI from Vertex](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma2_finetuning_on_vertex.ipynb)
- [Fine-tune Gemma using PEFT and then deploy to Vertex AI from Vertex](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma_finetuning_on_vertex.ipynb)
- [Fine-tune Gemma using PEFT and then deploy to Vertex AI from Hugging Face](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_pytorch_gemma_peft_finetuning_hf.ipynb)
- [Fine-tune Gemma using KerasNLP and then deploy to Vertex AI](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma_kerasnlp_to_vertexai.ipynb)
- [Fine-tune Gemma with Ray on Vertex AI and then deploy to Vertex AI](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma_fine_tuning_batch_deployment_on_rov.ipynb)
- [Run local inference with ShieldGemma 2 with Hugging Face transformers](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_shieldgemma2_local_inference.ipynb)
- [Serve MedGemma in Vertex AI](https://github.com/Google-Health/medgemma/blob/main/notebooks/quick_start_with_model_garden.ipynb)
- [Serve MedSigLIP in Vertex AI](https://github.com/Google-Health/medsiglip/blob/main/notebooks/quick_start_with_model_garden.ipynb)
- [Run local inference with T5Gemma with Hugging Face transformers](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_t5gemma_local_inference.ipynb)
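As a rough sketch of that upload-deploy-predict flow with the Vertex AI SDK for Python, assuming you already have a serving container that hosts your ported Gemma weights (the container image URI, machine shape, and request schema below are assumptions; the notebooks above show supported configurations):

```python
# Illustrative sketch: serve a ported Gemma build on Vertex AI.
from google.cloud import aiplatform

aiplatform.init(project="your-project-id", location="us-central1")

# Upload a model backed by your own serving container (assumed image URI).
model = aiplatform.Model.upload(
    display_name="gemma-custom",
    serving_container_image_uri="us-docker.pkg.dev/your-project/your-repo/gemma-server:latest",
)

# Deploy to a GPU-backed endpoint; the machine shape is an assumption.
endpoint = model.deploy(
    machine_type="g2-standard-12",
    accelerator_type="NVIDIA_L4",
    accelerator_count=1,
)

# Request a prediction; the instance schema depends on your serving container.
response = endpoint.predict(
    instances=[{"prompt": "Why is the sky blue?", "max_tokens": 128}]
)
print(response.predictions)
```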
## Use Gemma in other Google Cloud products

You can use Gemma with other Google Cloud products, such as Google Kubernetes Engine and Dataflow.
### Use Gemma with GKE

Google Kubernetes Engine (GKE) is the Google Cloud solution for managed Kubernetes that provides scalability, security, resilience, and cost effectiveness. We recommend this option if you have existing Kubernetes investments, your organization has in-house MLOps expertise, or if you need granular control over complex AI/ML workloads with unique security, data pipeline, and resource management requirements. To learn more, see the following tutorials in the GKE documentation:

- [Serve Gemma with vLLM](/kubernetes-engine/docs/tutorials/serve-gemma-gpu-vllm)
- [Serve Gemma with TGI](/kubernetes-engine/docs/tutorials/serve-gemma-gpu-tgi)
- [Serve Gemma with Triton and TensorRT-LLM](/kubernetes-engine/docs/tutorials/serve-gemma-gpu-tensortllm)
- [Serve Gemma with JetStream](/kubernetes-engine/docs/tutorials/serve-gemma-tpu-jetstream)
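Once one of these deployments is running, you interact with the model over its HTTP API. As a hypothetical example, assuming you followed the vLLM tutorial and exposed or port-forwarded the server locally, a client request might look like the following (the host, port, and model identifier are assumptions):

```python
# Illustrative client for a Gemma model served with vLLM on GKE.
# vLLM exposes an OpenAI-compatible chat completions endpoint.
import requests

VLLM_URL = "http://localhost:8000/v1/chat/completions"  # assumed endpoint

payload = {
    "model": "google/gemma-3-1b-it",  # assumed model identifier
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "max_tokens": 128,
}

response = requests.post(VLLM_URL, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```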
[[["이해하기 쉬움","easyToUnderstand","thumb-up"],["문제가 해결됨","solvedMyProblem","thumb-up"],["기타","otherUp","thumb-up"]],[["이해하기 어려움","hardToUnderstand","thumb-down"],["잘못된 정보 또는 샘플 코드","incorrectInformationOrSampleCode","thumb-down"],["필요한 정보/샘플이 없음","missingTheInformationSamplesINeed","thumb-down"],["번역 문제","translationIssue","thumb-down"],["기타","otherDown","thumb-down"]],["최종 업데이트: 2025-09-04(UTC)"],[],[],null,["# Use Gemma open models\n\nGemma is a set of lightweight, generative artificial intelligence (AI)\nopen models. Gemma models are available to run in your\napplications and on your hardware, mobile devices, or hosted services. You can\nalso customize these models using tuning techniques so that they excel at\nperforming tasks that matter to you and your users. Gemma models are\nbased on [Gemini](/vertex-ai/generative-ai/docs/overview) models and are intended\nfor the AI development community to extend and take further.\n\nFine-tuning can help improve a model's performance in specific tasks. Because\nmodels in the Gemma model family are open weight, you can tune any of\nthem using the AI framework of your choice and the Vertex AI SDK.\nYou can open a notebook example to fine-tune the Gemma model using\na link available on the Gemma model card in Model Garden.\n\nThe following Gemma models are available to use with Vertex AI.\nTo learn more about and test the Gemma models, see their\nModel Garden model cards.\n\nThe following are some options for where you can use Gemma:\n\nUse Gemma with Vertex AI\n------------------------\n\nVertex AI offers a managed platform for rapidly building and scaling\nmachine learning projects without needing in-house MLOps expertise. You can use\nVertex AI as the downstream application that serves the\nGemma models. For example, you might port weights from the Keras\nimplementation of Gemma. Next, you can use Vertex AI to\nserve that version of Gemma to get predictions. 
## Use Gemma with Colab

You can use Gemma with Colaboratory to create your Gemma solution. In Colab, you can use Gemma with framework options such as PyTorch and JAX. To learn more, see:

- [Get started with Gemma using Keras](https://ai.google.dev/gemma/docs/get_started)
- [Get started with Gemma using PyTorch](https://ai.google.dev/gemma/docs/pytorch_gemma)
- [Basic tuning with Gemma using Keras](https://ai.google.dev/gemma/docs/lora_tuning)
- [Distributed tuning with Gemma using Keras](https://ai.google.dev/gemma/docs/distributed_tuning)
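For example, a minimal Colab cell for text generation with KerasNLP might look like the following; the preset name is an assumption, and the getting-started guides above cover setup details such as Kaggle access and backend selection.

```python
# Minimal text-generation sketch for a Colab runtime (illustrative only).
import keras_nlp

# Load a Gemma preset (assumed name) and generate a short completion.
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")
print(gemma_lm.generate("What is the airspeed velocity of an unladen swallow?", max_length=64))
```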
## Gemma model sizes and capabilities

Gemma models are available in several sizes so you can build generative AI solutions based on your available computing resources, the capabilities you need, and where you want to run them. Each model is available in a tuned and an untuned version:

- **Pretrained** - This version of the model wasn't trained on any specific tasks or instructions beyond the Gemma core data training set. We don't recommend using this model without performing some tuning.

- **Instruction-tuned** - This version of the model was trained with human language interactions so that it can participate in a conversation, similar to a basic chatbot.

- **Mix fine-tuned** - This version of the model is fine-tuned on a mixture of academic datasets and accepts natural language prompts.

Lower parameter sizes mean lower resource requirements and more deployment flexibility.

Gemma has been tested using Google's purpose-built v5e TPU hardware and NVIDIA's L4 (G2 Standard), A100 (A2 Standard), and H100 (A3 High) GPU hardware.

## What's next

- See the [Gemma documentation](https://ai.google.dev/gemma/docs).