Run optimized AI/ML workloads with the platform orchestration capabilities of Google Kubernetes Engine (GKE). With GKE, you can build a robust, production-ready AI/ML platform that combines the benefits of managed Kubernetes with these capabilities:

  • Infrastructure orchestration that supports GPUs and TPUs for training and serving workloads at scale.
  • Flexible integration with distributed computing and data processing frameworks.
  • Support for multiple teams on the same infrastructure to maximize resource utilization.
This page provides an overview of the AI/ML capabilities of GKE and how to get started running optimized AI/ML workloads on GKE with GPUs, TPUs, and frameworks like Hugging Face TGI, vLLM, and JetStream.
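As a concrete illustration of infrastructure orchestration for serving workloads, the sketch below shows a minimal Pod manifest that schedules an inference server onto a GPU node. This is a hypothetical example, not an official configuration: the accelerator type, container image, and model name are assumptions, though the `cloud.google.com/gke-accelerator` node selector and the `nvidia.com/gpu` resource limit are the standard ways to target GPU nodes on GKE.

```yaml
# Hypothetical sketch: a Pod that requests one NVIDIA L4 GPU to serve a model
# with vLLM on GKE. Image tag, model, and accelerator type are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: vllm-inference
spec:
  nodeSelector:
    # Standard GKE label for selecting nodes with a given accelerator.
    cloud.google.com/gke-accelerator: nvidia-l4
  containers:
  - name: vllm
    image: vllm/vllm-openai:latest        # illustrative image tag
    args: ["--model", "meta-llama/Llama-3.1-8B-Instruct"]  # illustrative model
    resources:
      limits:
        # Requests one GPU; GKE installs the device drivers on GPU node pools.
        nvidia.com/gpu: "1"
```

In practice you would typically wrap this in a Deployment and expose it behind a Service; the same `nvidia.com/gpu` limit pattern applies to training Jobs as well.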
