Run optimized AI/ML workloads with Google Kubernetes Engine (GKE) platform orchestration capabilities. With GKE, you can implement a robust, production-ready AI/ML platform with all the benefits of managed Kubernetes and these capabilities:

  • Infrastructure orchestration that supports GPUs and TPUs for training and serving workloads at scale.
  • Flexible integration with distributed computing and data processing frameworks.
  • Support for multiple teams on the same infrastructure to maximize utilization of resources.
This page provides an overview of the AI/ML capabilities of GKE and how to get started running optimized AI/ML workloads on GKE with GPUs, TPUs, and frameworks like Hugging Face TGI, vLLM, and JetStream.
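To make the GPU workflow concrete, the sketch below shows a minimal Pod manifest that requests one NVIDIA GPU on a GKE node. The Pod name and container image are placeholders you would replace with your own serving image (for example, a vLLM or TGI container); the `cloud.google.com/gke-accelerator` node selector and the `nvidia.com/gpu` resource name are the standard GKE mechanisms for targeting accelerator nodes, but the specific accelerator type shown here is only an example.

```yaml
# Sketch: a Pod that schedules onto a GKE GPU node pool and
# requests a single NVIDIA GPU. Names and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-inference-pod        # placeholder name
spec:
  nodeSelector:
    # Target nodes with the desired accelerator; T4 is an example type.
    cloud.google.com/gke-accelerator: nvidia-tesla-t4
  containers:
  - name: inference
    image: your-inference-image  # replace with your serving image (e.g., vLLM)
    resources:
      limits:
        nvidia.com/gpu: 1        # request one GPU for this container
```

You would apply this with `kubectl apply -f` against a cluster that has a GPU node pool; TPU workloads follow a similar pattern with TPU-specific node selectors and resource names.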