On April 9, at its Cloud Next 2025 event, Google lifted the curtain on Ironwood, its seventh-generation Tensor Processing Unit (TPU), built to accelerate the development of artificial intelligence infrastructure.
Purpose-built for inference, Ironwood is expected to transform what’s possible in large-scale AI processing.
Google Plans to Accelerate AI Development
According to Google’s announcement, Ironwood is the company’s first architecture designed to power “thinking models” that not only interpret information but proactively generate insights. The launch will help the tech giant step into the “age of inference”, where AI agents are expected to go beyond reactive tasks and start anticipating user needs.
The tech giant says that the Ironwood architecture is both massively scalable and remarkably energy-efficient. At its highest configuration, 9,216 liquid-cooled chips combine to deliver a staggering 42.5 exaflops of compute, more than 24 times that of the world’s most powerful supercomputer, El Capitan.
Each chip provides 4,614 TFLOPs of peak compute, supported by 192 GB of High Bandwidth Memory and inter-chip communication speeds of 1.2 Tbps.
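Those per-chip and pod-level figures are mutually consistent. As a quick sanity check (assuming the pod total is simply the per-chip peak multiplied by the chip count), the arithmetic works out:

```python
# Sanity-check the published Ironwood figures (assumption: the pod
# total is the per-chip peak multiplied by the chip count).
chips_per_pod = 9_216
tflops_per_chip = 4_614            # peak TFLOPs per chip, per Google's figures

pod_tflops = chips_per_pod * tflops_per_chip
pod_exaflops = pod_tflops / 1e6    # 1 exaflop = 1,000,000 TFLOPs

print(f"Pod compute: {pod_exaflops:.1f} exaflops")         # ~42.5 exaflops
print(f"Implied El Capitan baseline: {pod_exaflops / 24:.2f} exaflops")
```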
Power efficiency remains a core pillar. Ironwood is nearly 30 times more power-efficient than Google’s first Cloud TPU, and it delivers twice the performance per watt of last year’s Trillium, thanks to enhanced cooling systems and memory design that reduce latency and allow consistent operation under demanding AI loads.
With support for Google’s Pathways software stack, Ironwood allows developers to orchestrate hundreds of thousands of chips across pods, paving the way for breakthroughs in areas ranging from advanced language models to scientific simulations.
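Google has not published Pathways code alongside the announcement, but the developer-facing pattern it builds on is familiar from JAX on TPUs. The sketch below is illustrative only: the mesh shape, array sizes, and the matmul are hypothetical stand-ins, and Pathways applies this idea at the scale of entire pods rather than a handful of chips.

```python
# Minimal JAX sketch of sharding one computation across the TPU chips
# visible to a host. Illustrative only: sizes and the model function
# are hypothetical; Pathways orchestrates this pattern across pods.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

devices = np.array(jax.devices())           # all accelerator chips visible here
mesh = Mesh(devices, axis_names=("data",))  # 1-D device mesh

# Shard a batch of activations across the "data" axis of the mesh;
# the weight matrix stays replicated on every chip.
x = jax.device_put(
    jnp.ones((len(devices) * 128, 1024)),
    NamedSharding(mesh, P("data", None)),
)
w = jnp.ones((1024, 1024))

@jax.jit
def forward(x, w):
    # XLA partitions this matmul across the mesh automatically.
    return jnp.dot(x, w)

y = forward(x, w)
print(y.shape, y.sharding)
```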
As the backbone for flagship models like Gemini 2.5 and AlphaFold, Ironwood is not just a chip; it is an infrastructure leap designed for the AI ambitions of tomorrow.