Run optimized AI/ML workloads with Google Kubernetes Engine (GKE) platform orchestration capabilities. With GKE, you can implement a robust, production-ready AI/ML platform with all the benefits of managed Kubernetes and these capabilities:

  • Infrastructure orchestration that supports GPUs and TPUs for training and serving workloads at scale.
  • Flexible integration with distributed computing and data processing frameworks.
  • Support for multiple teams on the same infrastructure to maximize resource utilization.
This page provides an overview of the AI/ML capabilities of GKE and how to get started running optimized AI/ML workloads on GKE with GPUs, TPUs, and frameworks like Hugging Face TGI, vLLM, and JetStream.
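As a concrete illustration of GPU orchestration, the following is a minimal sketch of a Kubernetes Pod manifest that requests a GPU on GKE. The Pod name, container name, and image are hypothetical placeholders; the `cloud.google.com/gke-accelerator` node selector and the `nvidia.com/gpu` resource limit are the standard mechanisms GKE and Kubernetes use to schedule GPU workloads, and this assumes the cluster already has a GPU node pool.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-inference-pod        # hypothetical Pod name
spec:
  nodeSelector:
    # GKE node label selecting the accelerator type attached to the node pool
    cloud.google.com/gke-accelerator: nvidia-tesla-t4
  containers:
  - name: inference
    # Placeholder image; substitute your serving image (e.g., a vLLM or TGI build)
    image: your-registry/inference-server:latest
    resources:
      limits:
        # Request one GPU; Kubernetes schedules the Pod onto a node with a free GPU
        nvidia.com/gpu: 1
```

A TPU workload follows the same pattern, with a TPU-specific node selector and resource name instead of the GPU ones shown here.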
Explore self-paced training from Google Cloud Skills Boost, use cases, reference architectures, and code samples with examples of how to use and connect Google Cloud services.