Jump to Content
Developers & Practitioners

Hands-on with Gemma 3 on Google Cloud

November 17, 2025
https://storage.googleapis.com/gweb-cloudblog-publish/images/gemma_3_hero.max-2500x2500.png
Olivier Bourgeois

Developer Relations Engineer

The landscape of generative AI is shifting. While proprietary APIs are powerful, there is a growing demand for open models—models where the architecture and weights are publicly available. This shift puts control back in the hands of developers, offering transparency, data privacy, and the ability to fine-tune for specific use cases.

To help you navigate this landscape, we are releasing two new hands-on labs featuring Gemma 3, Google’s latest family of lightweight, state-of-the-art open models.

Why Gemma?

Built from the same research and technology as Gemini, Gemma models are designed for responsible AI development. Gemma 3 is particularly exciting because it offers multimodal capabilities (text and image) and fits efficiently on smaller hardware footprints while delivering massive performance.

But running a model on your laptop is very different from running it in production. You need scale, reliability, and hardware acceleration (GPUs). The question is: Where should you deploy?

We have prepared two different paths for you, depending on your infrastructure needs: Cloud Run or Google Kubernetes Engine (GKE).

Path 1: The Serverless Approach (Cloud Run)

Best for: Developers who want an API up and running instantly without managing infrastructure, scaling to zero when not in use.

If your priority is simplicity and cost-efficiency for stateless workloads, Cloud Run is your answer. It abstracts away the server management entirely. With the recent addition of GPU support on Cloud Run, you can now serve modern LLMs without provisioning a cluster.

Path 2: The Platform Approach (GKE)

Best for: Engineering teams building complex AI platforms, requiring high throughput, custom orchestration, or integration with a broader microservices ecosystem.

When your application graduates from a prototype to a high-traffic production system, you need the control of Kubernetes. GKE Autopilot gives you that power while still handling the heavy lifting of node management. This path creates a seamless journey from local testing to cloud production.

Which Path Will You Choose?

Whether you are looking for the serverless simplicity of Cloud Run or the robust orchestration of GKE, Google Cloud provides the tools to take Gemma 3 from a concept to a deployed application.

Dive into the labs today and start building:

Share your progress and connect with others on the journey using the hashtag #ProductionReadyAI. Happy learning!

Posted in