Scalable, high-performance, and cost-effective infrastructure for every AI workload.
AI accelerators for every use case, from high-performance training to low-cost inference
Scale faster with GPUs and TPUs on Google Kubernetes Engine or Google Compute Engine
Deployable solutions for Vertex AI, Google Kubernetes Engine, and the Cloud HPC Toolkit
Get the most out of our AI Infrastructure by deploying the AI Hypercomputer architecture
Benefits
Optimize performance and cost at scale
With Google Cloud, you can choose from GPUs, TPUs, or CPUs to support a variety of use cases, including high-performance training, low-cost inference, and large-scale data processing.
Deliver results faster with managed infrastructure
Scale faster and more efficiently with the managed infrastructure provided by Vertex AI. Quickly set up ML environments, automate orchestration, manage large clusters, and deploy low-latency applications.
Develop with software that’s purpose-built for AI
Improve AI development productivity by leveraging GKE to manage large-scale workloads. Train and serve foundation models with support for autoscaling, workload orchestration, and automatic upgrades.
Key features
There is no one-size-fits-all when it comes to AI workloads. That's why, together with our industry hardware partners like NVIDIA, Intel, AMD, Arm, and others, we provide customers with the widest range of AI-optimized compute options across TPUs, GPUs, and CPUs for training and serving the most data-intensive models.
Orchestrating large-scale AI workloads with Cloud TPUs and Cloud GPUs has historically required manual effort to handle failures, logging, monitoring, and other foundational operations. Google Kubernetes Engine (GKE), the most scalable and fully managed Kubernetes service, considerably simplifies the work required to operate TPUs and GPUs. Leveraging GKE to manage large-scale AI workload orchestration on Cloud TPU and Cloud GPU improves AI development productivity.
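To make that concrete, here is a minimal sketch (not from this page) of scheduling a GPU-backed Pod on a GKE cluster with the official Kubernetes Python client. The node-selector value, container image, and project path are illustrative assumptions, not values from this page.

```python
# Minimal sketch: request one GPU for a training Pod on GKE via the
# official Kubernetes Python client. Assumes kubectl is already
# pointed at a GKE cluster with a GPU node pool.
from kubernetes import client, config

config.load_kube_config()  # load cluster credentials from ~/.kube/config

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gpu-training-job"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        # Steer the Pod onto a GPU node pool (label value is a hypothetical example).
        node_selector={"cloud.google.com/gke-accelerator": "nvidia-l4"},
        containers=[
            client.V1Container(
                name="trainer",
                image="us-docker.pkg.dev/my-project/trainers/train:latest",  # placeholder image
                resources=client.V1ResourceRequirements(
                    # GKE schedules the Pod onto a node with a free GPU.
                    limits={"nvidia.com/gpu": "1"}
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

GKE then handles node provisioning, health, and rescheduling on failure, which is the manual toil the paragraph above refers to.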
And for organizations that prefer the simplicity of abstracting away the infrastructure through managed services, Vertex AI now supports training with various frameworks and libraries using Cloud TPU and Cloud GPU.
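As a hedged illustration of what that abstraction looks like in practice, the following sketch launches a custom training job with the Vertex AI Python SDK (google-cloud-aiplatform). The project, container image, and accelerator shape are placeholder assumptions.

```python
# Minimal sketch: a GPU-accelerated custom training job on Vertex AI.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholder project/region

job = aiplatform.CustomContainerTrainingJob(
    display_name="gpu-training",
    container_uri="us-docker.pkg.dev/my-project/trainers/train:latest",  # placeholder image
)

# Vertex AI provisions and tears down the accelerator-backed workers for you.
job.run(
    replica_count=1,
    machine_type="n1-standard-8",
    accelerator_type="NVIDIA_TESLA_T4",  # example accelerator; swap for your hardware
    accelerator_count=1,
)
```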
Our AI-optimized infrastructure is built to deliver the global scale and performance demanded by Google products like YouTube, Gmail, Google Maps, Google Play, and Android, which serve billions of users. Our AI infrastructure solutions are all underpinned by Google Cloud's Jupiter data center network, which delivers best-in-industry scale-out capability for everything from foundational services to high-intensity AI workloads.
For decades, we’ve contributed to critical AI projects like TensorFlow and JAX. We co-founded the PyTorch Foundation and recently announced a new industry consortium, the OpenXLA project. Additionally, Google is the leading CNCF open source contributor and has a 20+ year history of OSS contributions like TFX, MLIR, OpenXLA, Kubeflow, and Kubernetes, as well as sponsorship of OSS projects critical to the data science community, like Project Jupyter and NumFOCUS.
Furthermore, our AI infrastructure services integrate with the most popular AI frameworks, such as TensorFlow, PyTorch, and MXNet, allowing customers to continue using whichever framework they prefer rather than being constrained to a specific framework or hardware architecture.
Customers
As AI opens the door for innovation across industries, companies are choosing Google Cloud to take advantage of our open, flexible, and performant infrastructure.
Documentation
Run optimized AI/ML workloads with Google Kubernetes Engine (GKE) platform orchestration capabilities.
Deep Learning VM Images are optimized for data science and machine learning tasks. They come with key ML frameworks and tools pre-installed and can be used out of the box on GPU-enabled instances.
Deep Learning Containers are performance-optimized, consistent environments to help you prototype and implement workflows quickly on CPUs or GPUs.
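For example, here is a quick sanity check you might run inside a Deep Learning VM or Deep Learning Container to confirm that the pre-installed framework can see an attached GPU (assuming TensorFlow is among the installed frameworks):

```python
# Verify that TensorFlow detects the attached GPU(s).
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
print(f"Visible GPUs: {gpus}")  # an empty list means no GPU is attached or visible
```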
Learn about the computational requirements of machine learning, and how TPUs were purpose-built to handle the task.
TPUs are Google's custom-developed ASICs used to accelerate machine learning workloads. Learn about the underlying system architecture of TPUs from the ground up.
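As a small illustration (assuming a Cloud TPU VM where JAX is installed with TPU support), you can confirm the runtime sees the TPU chips:

```python
# List the accelerator devices visible to JAX on a Cloud TPU VM.
import jax

devices = jax.devices()
print(f"{len(devices)} devices, first platform: {devices[0].platform}")  # expect "tpu"
```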
Use cases
Cloud TPU Multislice training is a full-stack technology that enables fast, easy, and reliable large-scale AI model training across tens of thousands of TPU chips.
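A hedged sketch of the bootstrap step for multi-host JAX training on Cloud TPU follows; it assumes every host runs the same script, and the exact setup for a given Multislice deployment may differ.

```python
# Multi-host bootstrap: each host joins the distributed runtime, after
# which jax.device_count() reports chips across all participating hosts.
import jax

jax.distributed.initialize()  # on Cloud TPU, coordinator details are auto-detected

print(f"process {jax.process_index()} of {jax.process_count()}, "
      f"global devices: {jax.device_count()}")
```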
Google Cloud's open software ecosystem allows you to build applications with the tools and frameworks you are most comfortable with, while taking advantage of the price-performance benefits of the AI Hypercomputer architecture.
Cloud TPU v5e and NVIDIA L4 GPUs enable high-performance and cost-effective inference for a wide range of AI workloads, including the latest LLMs and Gen AI models. Both offer significant price performance improvements over previous models and Google Cloud's AI Hypercomputer architecture enables customers to scale their deployments to industry leading levels.
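As an illustrative (not authoritative) sketch, deploying an already-uploaded model onto L4 GPUs through the Vertex AI Python SDK might look like the following; the project, model resource name, machine shape, and request payload are placeholder assumptions.

```python
# Minimal sketch: serve a model on NVIDIA L4 GPUs via a Vertex AI endpoint.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholder project/region

model = aiplatform.Model("projects/my-project/locations/us-central1/models/123")  # placeholder
endpoint = model.deploy(
    machine_type="g2-standard-8",   # G2 VMs carry NVIDIA L4 GPUs
    accelerator_type="NVIDIA_L4",
    accelerator_count=1,
)

# The request payload shape depends on the model you deployed.
prediction = endpoint.predict(instances=[{"prompt": "Hello"}])
```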
Pricing
| Cloud TPU | Cloud GPU |
| --- | --- |
| For information on TPU pricing for single-device TPU types and TPU Pod types, refer to TPU pricing. | For information about GPU pricing for the different GPU types and regions available, refer to GPU pricing. |
Cloud AI products comply with our SLA policies. They may offer different latency or availability guarantees from other Google Cloud services.
Start building on Google Cloud with $300 in free credits and 20+ always free products.