Train and serve larger models on larger datasets efficiently with Google's most powerful TPU yet.
Announced at Google Cloud Next '25, Ironwood is Google's seventh-generation Tensor Processing Unit (TPU) and the first TPU designed specifically for large-scale AI inference. Building on extensive experience developing TPUs for Google's internal services and Google Cloud customers, Ironwood is engineered to handle the compute and memory demands of Large Language Models (LLMs) and Mixture-of-Experts (MoE) models, as well as advanced reasoning workloads. It supports both training and serving within the Google Cloud AI Hypercomputer architecture.
Optimized for Large Language Models (LLMs): Ironwood is specifically designed to meet the growing computational demands of LLMs and generative AI applications.
Enhanced Interconnect Technology: Benefit from improvements to TPU interconnect technology, enabling faster communication and reduced latency.
High-Performance Computing: Experience significant performance gains for a wide range of inference tasks.
Sustainable AI: Ironwood continues Google Cloud's commitment to sustainability, delivering exceptional performance with optimized energy efficiency.
Ironwood integrates increased compute density, memory capacity, and interconnect bandwidth with significant gains in power efficiency. These features are designed to enable higher throughput and lower latency for demanding AI training and serving workloads, particularly those involving large, complex models. Ironwood TPUs operate within the Google Cloud AI Hypercomputer architecture.
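TPU workloads like those described above are commonly written in a hardware-agnostic framework such as JAX, which XLA-compiles the same program for whichever accelerator backend is available. The sketch below is illustrative only (the `predict` function and array shapes are hypothetical, not part of any Ironwood API), and it runs unchanged on CPU, GPU, or TPU:

```python
# Illustrative sketch: a matmul-heavy op of the kind TPUs accelerate,
# expressed in JAX. The function and shapes here are hypothetical.
import jax
import jax.numpy as jnp

@jax.jit  # XLA-compiles for the available backend (CPU, GPU, or TPU)
def predict(weights, activations):
    # A single dense layer followed by a ReLU nonlinearity.
    return jax.nn.relu(jnp.dot(activations, weights))

w = jax.random.normal(jax.random.PRNGKey(0), (128, 64))
x = jax.random.normal(jax.random.PRNGKey(1), (8, 128))
out = predict(w, x)
print(out.shape)  # (8, 64)
```

On a TPU host the same code is dispatched to the TPU automatically; `jax.devices()` reports which backend JAX detected.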
Register your interest for early access.