Google Kubernetes Engine documentation
Deploy, manage, and scale containerized applications on Kubernetes, powered by Google Cloud.
Start your next project with $300 in free credit
Build and test a proof of concept with the free trial credits and free monthly usage of 20+ products.
Documentation resources
AI/ML on GKE tutorials
Related resources
Related videos
New Way Now: How Apex fuels frictionless investment with Google Cloud
*Summary:* Daniel Gicklhorn, VP of Product at Apex Fintech Solutions, shares how Google Cloud is helping the market-leading fintech provider enable seamless access, frictionless investing, and investor education for all. With Google Cloud, Apex…
How a cloud architect predicts wildfire behavior with Google Cloud
Watch along as Leo Guzman, Cloud Architect at Improving Aviation, deploys the ember spread model across Google Kubernetes Engine, BigQuery, Cloud Storage, and more to provide first responders with real-time data on wildfire spread. #GoogleCloud
How a software engineer predicts wildfire behavior with Google Cloud
Watch along and learn how Alex Rodriguez, a new developer at Improving Aviation, got started quickly with Google Cloud. Alex integrated the cloud with existing code and data to scale Improving Aviation’s ember spread model to help give firefighters…
Cost-efficient AI inference
Learn how to use JAX, Google Kubernetes Engine (GKE), and NVIDIA Triton Inference Server as a winning combination for low-cost AI inference. Speaker: Don McCasland. Products mentioned: AI Infrastructure
The secret to cost-efficient AI inference
See the detailed reference architecture → https://goo.gle/4bKh5aR Learn how to use JAX, Google Kubernetes Engine (GKE), and NVIDIA Triton Inference Server as a winning combination for low-cost AI inference. Subscribe to Google Cloud Tech →
How Improving Aviation uses Gemini and Google Cloud to predict and combat wildfires
Wildfires are a growing threat, but what if we could predict them with near real-time accuracy? In this video, we explore how Improving Aviation is doing just that, using the power of Gemini and Google Cloud's machine learning capabilities. Learn how…
Kubernetes jobs for batch workload
The Job is the fundamental building block for building a batch platform on Kubernetes. In this video, Ali and Mofi go over the basics of Jobs and some simple patterns of Job execution on GKE. Subscribe to Google Cloud Tech →
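As a minimal illustration (not taken from the video), a Kubernetes Job manifest that runs a batch of Pods to completion might look like this; the name, image, and command are placeholders:

```yaml
# Hypothetical minimal Job: runs containers to completion, retrying on failure.
apiVersion: batch/v1
kind: Job
metadata:
  name: sample-batch-job     # placeholder name
spec:
  completions: 3             # require three successful Pods in total
  parallelism: 2             # run at most two Pods at a time
  backoffLimit: 4            # retry failed Pods up to four times
  template:
    spec:
      restartPolicy: Never   # Jobs require Never or OnFailure
      containers:
      - name: worker
        image: busybox       # placeholder image
        command: ["sh", "-c", "echo processing item && sleep 5"]
```

Applied with `kubectl apply -f job.yaml`, the Job controller creates Pods until three complete successfully; `kubectl get jobs` shows progress.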
Batch architecture in Kubernetes
In this video we look at what the architecture of a batch platform looks like on Kubernetes. Subscribe to Google Cloud Tech → https://goo.gle/GoogleCloudTech
New Way Now: Air Force Research Laboratory powers the future of defense with Google Cloud
*Featuring:* Dan Berrigan, lead of Worldwide Research Collaboration and Digital Capabilities Directorate for the Air Force Research Laboratory *Executive summary:* The Air Force Research Laboratory (AFRL), which helps power the innovation arm of the…
Dynamic Workload Scheduler for AI workloads
Discover how Dynamic Workload Scheduler (DWS) simplifies hardware acquisition for AI workloads on Google Cloud. This video explains DWS modes (Calendar and Flex Start) and their integration with products like Compute Engine, Kubernetes Engine, Vertex AI…
Why use GKE for AI/ML workloads?
Learn more → https://goo.gle/AIMLonGKE With Google Kubernetes Engine (GKE), you can implement a robust, production-ready AI/ML platform with all the benefits of managed Kubernetes. This video provides an overview of the AI/ML capabilities of GKE.
New Way Now: ROSHN is building new ways for its customers to find and buy their homes online
*Featured in this interview:* From ROSHN’s leadership team, Pablo Salcedo, executive director of digital products; Jayesh Maganlal, group chief information and digital officer; and Yazeed Alghmadi, director of emerging technologies. *Summary:* Google…
Google Cloud Essentials
Enroll today → https://goo.gle/3BnKytd Want to learn Google Cloud but don't know where to start? Looking for hands-on experience with the Google Cloud console? Google Cloud Essentials is a series of labs where you learn to: Create and manage virtual machines…
Deploy Gemma 2 with multiple LoRA adapters on GKE
Tutorial: Deploy Gemma 2 with multiple LoRA adapters using TGI on GKE → https://goo.gle/4f5KP1C Video: Train a LoRA adapter with your own dataset → https://goo.gle/4gkBLar Deep dive: A conceptual overview of Low-Rank Adaptation (LoRA) →
LiveX AI achieves over 50% lower TCO with custom AI agents trained and served on GKE and NVIDIA.
Jia Li, Co-Founder and Chief AI Officer at LiveX AI, describes how they built custom AI chat agents to deliver a truly human-like customer experience on Google Kubernetes Engine with NVIDIA NIM and NVIDIA A100 GPUs. GKE has allowed LiveX AI to ramp up…
Cloud migration insights from banking
Christian Gorke, VP and Head of Cyber Center of Excellence at Commerzbank, shares insights from their cloud migration journey, highlighting key lessons learned and practical advice for organizations moving to the cloud. Discover the importance of…
Fine-tuning open AI models using Hugging Face TRL
Tutorial: Fine-tune Gemma 2 with Hugging Face TRL on GKE → https://goo.gle/3ZcLQiL Hugging Face Deep Learning containers → https://goo.gle/3Otahn3 Docs: Hugging Face TRL → https://goo.gle/3ZgsyJs Supervised fine-tuning of a pre-trained language model…
Choosing between self-hosted GKE and managed Vertex AI to host AI models
Read the blog post → https://goo.gle/3V41A6f Vertex AI or Google Kubernetes Engine? Which platform is the best fit for unleashing the power of LLMs in your applications? Find out in this video. Chapters: 0:00 - Intro 0:39 - Why do AI with…
How to autoscale a TGI deployment on GKE
Tutorial: Configure autoscaling for TGI on GKE → https://goo.gle/3Z9a7WK Learn more about observability on GKE → https://goo.gle/4951bWY Hugging Face TGI (Text Generation Inference) → https://goo.gle/4hXScLk Text Generation Inference (TGI) is a…
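As an illustrative sketch (not from the tutorial, which covers custom TGI and GPU metrics), a HorizontalPodAutoscaler scaling a TGI Deployment on CPU utilization could look like this; both names are placeholders:

```yaml
# Hypothetical HPA for a TGI Deployment, scaling on average CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: tgi-hpa              # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: tgi-server         # placeholder Deployment name
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale out when average CPU exceeds 70%
```

For LLM serving, the linked tutorial favors workload-specific metrics (such as queue size or batch utilization) over CPU, since GPU-bound inference often leaves CPU usage low.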
New Way Now: Picterra powers geospatial AI analytics with Google Cloud for a sustainable future
*Summary:* Martina Löfqvist, head of strategy and partnerships at Picterra, shares how Google Cloud is helping this leading geospatial AI company to build a more sustainable future. With Google Cloud, Picterra is enabling quicker ways to analyze…
Deploy HUGS on GKE with Hugging Face
Getting started with HUGS on Google Cloud → https://goo.gle/3OhK8aL HUGS in the Google Cloud Marketplace → https://goo.gle/3ZbVPG6 Blog: Introducing HUGS → https://goo.gle/4eCs5Xl Deploy open models on Google Kubernetes Engine with the power of…
GKE Gemma 2 deployment with Hugging Face
Tutorial: Serve Gemma on GKE with TGI → https://goo.gle/4fFKt2Q Learn more about TGI (text generation inference) from Hugging Face → https://goo.gle/4e7qusz Hugging Face Deep Learning containers for Google Cloud → https://goo.gle/3BPaYUM Text…
65K-node Kubernetes AI platform: a reality
The size of generative AI models is constantly increasing, with current models reaching hundreds of billions of parameters and the most advanced ones approaching 2 trillion. Training such large models on modern accelerators necessitates clusters…
AI anywhere: How AI models, AI optimized cloud infrastructure, and edge unlock new business cases
Summary: On-premises computing is not new, but extending AI-enabled cloud infrastructure from the cloud to on-premises deployments is. According to Omdia research commissioned by Google Cloud, 38% of banking institutions plan to deploy AI to hundreds…
What are Hugging Face Deep Learning Containers?
Hugging Face Deep Learning Containers docs → https://goo.gle/4dKMtoU Updates every month (release notes) → https://goo.gle/3YmYzPi Hugging Face Expert Support Program → https://goo.gle/3U9jVP1 The Hugging Face Deep Learning Containers have everything…
Experience on-premises gen AI search for finance, telecom, public sector, and regulated industries
Google Distributed Cloud is excited to announce the latest addition to our “AI anywhere” video series! This insightful video dives deep into the challenges of modernizing on-premises high-performance and AI workloads, and how Google Distributed Cloud…
Customer Voices: Next DLP
Next DLP is a user-centric, flexible, cloud-native, AI/ML-powered solution built for today’s threat landscape. Hear from their engineer, Chung Poon, as he explains why Google Kubernetes Engine was the perfect fit for their SaaS platform…
Google Mainframe Modernization - Refactor for Batch
For more information, please visit Google Cloud Mainframe Modernization → https://goo.gle/3SGO7Qy Google Cloud Mainframe Refactor for batch lets customers modernize mainframe batch applications written in COBOL and JCL on Google Cloud, and…
Enhancing GKE efficiency with Filestore multishares
A Filestore instance is a fully managed network-attached storage (NAS) system that cloud developers can use with their Google Kubernetes Engine clusters. Watch along and learn about the seamless integration of Google Cloud’s Filestore as a persistent…
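As an illustrative sketch (not from the video): with the Filestore CSI driver enabled on a cluster, Pods can request a multishare-backed volume through an ordinary PersistentVolumeClaim. The class name below follows GKE's built-in `enterprise-multishare-rw` StorageClass; verify the exact class and size limits against the current Filestore documentation:

```yaml
# Hypothetical PVC against GKE's Filestore multishare StorageClass.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data                # placeholder name
spec:
  accessModes:
  - ReadWriteMany                  # an NFS share can be mounted by many Pods at once
  storageClassName: enterprise-multishare-rw   # GKE-provided class (verify in docs)
  resources:
    requests:
      storage: 100Gi
```

Multishares let several such claims be packed onto one Filestore instance, rather than provisioning a full instance per volume.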
Bosch SDS pioneers sustainability-driven initiatives through digitization, AI and IoT
Jürgen Imhoff, Global Leadership Partners & Alliances at Bosch Global Software Technologies, explains how Bosch Software and Digital Solutions will help businesses globally integrate sustainability into operations through AI-powered data and…
Try GKE for yourself
Create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.