Google Kubernetes Engine documentation
Deploy, manage, and scale containerized applications on Kubernetes, powered by Google Cloud.
Start your proof of concept with $300 in free credit
- Get access to Gemini 2.0 Flash Thinking
- Free monthly usage of popular products, including AI APIs and BigQuery
- No automatic charges, no commitment
Related videos
Dynamic Workload Scheduler for AI workloads
Discover how Dynamic Workload Scheduler (DWS) simplifies hardware acquisition for AI workloads on Google Cloud. This video explains DWS modes (Calendar and Flex Start) and their integration with products like Compute Engine, Kubernetes Engine, Vertex
Why use GKE for AI/ML workloads?
Learn more → https://goo.gle/AIMLonGKE With Google Kubernetes Engine (GKE), you can implement a robust, production-ready AI/ML platform with all the benefits of managed Kubernetes. This video provides an overview of the AI/ML capabilities of GKE.
New Way Now: ROSHN is building new ways for its customers to find and buy their homes online
Featured in this interview: From ROSHN’s leadership team, Pablo Salcedo, executive director of digital products; Jayesh Maganlal, group chief information and digital officer; and Yazeed Alghmadi, director of emerging technologies. Summary: Google
Google Cloud Essentials
Enroll today → https://goo.gle/3BnKytd Want to learn Google Cloud but don't know where to start? Looking for hands-on experience with the Google Cloud console? Google Cloud Essentials is a series of labs where you learn to: Create and manage virtual
Deploy Gemma 2 with multiple LoRA adapters on GKE
Tutorial: Deploy Gemma 2 with multiple LoRA adapters using TGI on GKE → https://goo.gle/4f5KP1C Video: Train a LoRA adapter with your own dataset → https://goo.gle/4gkBLar Deep dive: A conceptual overview of Low-Rank Adaptation (LoRA) →
LiveX AI achieves over 50% lower TCO with custom AI agents trained and served on GKE and NVIDIA.
Jia Li, Co-Founder and Chief AI Officer at LiveX AI, describes how they built custom AI chat agents to deliver a truly human-like customer experience on Google Kubernetes Engine, NVIDIA NIM, and NVIDIA A100 GPUs. GKE has allowed LiveX AI to ramp up
Cloud migration insights from banking
Christian Gorke, VP and Head of Cyber Center of Excellence at Commerzbank, shares insights from their cloud migration journey, highlighting key lessons learned and practical advice for organizations moving to the cloud. Discover the importance of
Fine-tuning open AI models using Hugging Face TRL
Tutorial: Fine-tune Gemma 2 with Hugging Face TRL on GKE → https://goo.gle/3ZcLQiL Hugging Face Deep Learning containers → https://goo.gle/3Otahn3 Docs: Hugging Face TRL → https://goo.gle/3ZgsyJs Supervised fine-tuning of a pre-trained language model
Choosing between self-hosted GKE and managed Vertex AI to host AI models
Read the blog post → https://goo.gle/3V41A6f Vertex AI or Google Kubernetes Engine? Which platform is the best fit for unleashing the power of LLMs in your applications? Find out in this video. Chapters: 0:00 - Intro 0:39 - Why do AI with
How to autoscale a TGI deployment on GKE
Tutorial: Configure autoscaling for TGI on GKE → https://goo.gle/3Z9a7WK Learn more about observability on GKE → https://goo.gle/4951bWY Hugging Face TGI (Text Generation Inference) → https://goo.gle/4hXScLk Text Generation Inference (TGI) is a
New Way Now: Picterra powers geospatial AI analytics with Google Cloud for a sustainable future
Summary: Martina Löfqvist, head of strategy and partnerships at Picterra, shares how Google Cloud is helping this leading geospatial AI company to build a more sustainable future. With Google Cloud, Picterra is enabling quicker ways to analyze
Deploy HUGS on GKE with Hugging Face
Getting started with HUGS on Google Cloud → https://goo.gle/3OhK8aL HUGS in the Google Cloud Marketplace → https://goo.gle/3ZbVPG6 Blog: Introducing HUGS → https://goo.gle/4eCs5Xl Deploy open models on Google Kubernetes Engine with the power of
GKE Gemma 2 deployment with Hugging Face
Tutorial: Serve Gemma on GKE with TGI → https://goo.gle/4fFKt2Q Learn more about TGI (text generation inference) from Hugging Face → https://goo.gle/4e7qusz Hugging Face Deep Learning containers for Google Cloud → https://goo.gle/3BPaYUM Text
65K node Kubernetes AI Platform - A Reality
The size of generative AI models is constantly increasing, with current models reaching hundreds of billions of parameters and the most advanced ones approaching 2 trillion. Training such large models on modern accelerators necessitates clusters
AI anywhere: How AI models, AI optimized cloud infrastructure, and edge unlock new business cases
Summary: On-premises computing is not new, but extending AI enabled cloud infrastructure from the cloud to on-premises deployments is. According to Omdia research commissioned by Google Cloud, 38% of banking institutions plan to deploy AI to hundreds
What are Hugging Face Deep Learning Containers?
Hugging Face Deep Learning Containers docs → https://goo.gle/4dKMtoU Updates every month (release notes) → https://goo.gle/3YmYzPi Hugging Face Expert Support Program → https://goo.gle/3U9jVP1 The Hugging Face Deep Learning Containers have everything
Experience on-premises gen AI search for finance, telecom, public sector, and regulated industries
Google Distributed Cloud is excited to announce the latest addition to our “AI anywhere” video series! This insightful video dives deep into the challenges of modernizing on-premises high-performance and AI workloads and how Google Distributed Cloud
Customer Voices: Next DLP
Next DLP is a user-centric, flexible, cloud-native, AI/ML-powered solution built for today’s threat landscape. Hear from their Engineer, Chung Poon, as he explains why Google Cloud's Kubernetes Engine was the perfect fit for their SaaS platform
Google Mainframe Modernization - Refactor for Batch
For more information, please visit Google Cloud Mainframe Modernization → https://goo.gle/3SGO7Qy Google Cloud Mainframe Refactor for batch allows customers to modernize mainframe batch applications, written in COBOL and JCL, to Google Cloud, and
Enhancing GKE efficiency with Filestore multishares
A Filestore instance is a fully managed network-attached storage (NAS) system cloud developers can use with their Google Kubernetes Engine instances. Watch along and learn about the seamless integration of Google Cloud’s Filestore as a persistent
Bosch SDS pioneers sustainability-driven initiatives through digitization, AI and IoT
Jürgen Imhoff, Global Leadership Partners & Alliances at Bosch Global Software Technologies, explains how Bosch Software and Digital Solutions will help businesses globally integrate sustainability into operations through AI-powered data and
New Way Now: Canonical unlocks open-source innovation with Google Cloud
Summary: Alex Gallagher, VP of cloud for Canonical, the publisher of Ubuntu, shares how Google Cloud is helping its popular Linux operating system deliver trusted, secure tools and technologies to millions. Google Cloud’s AI-ready infrastructure is
Transform Your Career: Architecting with Google Kubernetes Engine
Enroll today and architect the future of the cloud! ☁️ Architecting with Google Kubernetes Engine: Workloads → https://goo.gle/gke-work Architecting with Google Kubernetes Engine: Production → https://goo.gle/gke-prod Cloud architects are
Which GKE deployment strategy is right for you?
Getting Started with GKE → https://goo.gle/4brsL0c Take the Architecting with GKE: Workloads course → https://goo.gle/gke-work Take the Architecting with GKE: Production course → https://goo.gle/gke-prod This video explores various deployment
What on earth is kubectl?
Getting Started with GKE → https://goo.gle/4brsL0c Take the Architecting with GKE: Workloads course → https://goo.gle/gke-work Take the Architecting with GKE: Production course → https://goo.gle/gke-prod Ever wondered what "kubectl" is and why it's
What is GKE?
Getting Started with GKE → https://goo.gle/4brsL0c Take the Architecting with GKE: Workloads course → https://goo.gle/gke-work Take the Architecting with GKE: Production course → https://goo.gle/gke-prod Tired of maintaining complex Kubernetes
Kubernetes clusters
Getting Started with GKE → https://goo.gle/4brsL0c Take the Architecting with GKE: Workloads course → https://goo.gle/gke-work Take the Architecting with GKE: Production course → https://goo.gle/gke-prod GKE clusters explained! Learn about Kubernetes
Beyond PostgreSQL: Modernize your applications anywhere with AlloyDB Omni
Ditch legacy and embrace freedom with AlloyDB Omni, your hybrid and multicloud enterprise database. Run anywhere, from data centers to the public clouds of your choice, and unlock performance and ease of management. Elevate your apps with HTAP and
Generative AI application development best practices with Cloud SQL for PostgreSQL
Developers choose PostgreSQL for its power, ecosystem, and enterprise-grade features. In this session, unlock best practices for building apps of all kinds with PostgreSQL. We'll cover Google Kubernetes Engine deployments, pgvector for generative AI
New Way Now: Nuro is creating a safe path to a driverless future with Google Cloud
Summary: Andrew Clare, Chief Technology Officer of Nuro, shares how the innovative robotics company is using Google Cloud to develop the Nuro Driver, its cutting-edge driverless autonomous technology. Leveraging Google Cloud, Nuro can utilize all of