
In its larger configuration, Google's Ironwood pods can deliver a staggering 42.5 exaflops of inference compute.
Each chip has a peak throughput of 4,614 TFLOPS, which Google claims is a substantial improvement over previous chips.
Google has also boosted memory for the new TPUs, with each chip sporting 192GB, six times as much as Google's last-gen Trillium TPU.
The memory bandwidth has also increased to 7.2 TB/s, a 4.5x improvement over Trillium.
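Those headline figures are internally consistent. Google's announcement puts the largest Ironwood pod at 9,216 chips (a detail drawn from the announcement, not stated above), and the per-chip numbers multiply out almost exactly. A quick sketch:

```python
# Sanity check of Google's published Ironwood numbers. The 9,216-chip
# pod size comes from Google's announcement; everything else is above.

TFLOPS_PER_CHIP = 4_614      # peak FP8 TFLOPS per Ironwood chip
CHIPS_PER_POD = 9_216        # largest pod configuration

pod_exaflops = TFLOPS_PER_CHIP * CHIPS_PER_POD / 1e6  # 1 exaflop = 1e6 TFLOPS
print(f"Pod peak: {pod_exaflops:.1f} exaflops")       # ~42.5

# Working backward from the stated multipliers gives Trillium's specs:
print(f"Implied Trillium HBM: {192 / 6:.0f} GB")            # 32 GB
print(f"Implied Trillium bandwidth: {7.2 / 4.5:.1f} TB/s")  # 1.6 TB/s
```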
There are numerous ways to measure AI throughput, which makes direct chip comparisons difficult. Google is using FP8 precision as its benchmark for the new TPU, but it's comparing against some systems, like the El Capitan supercomputer, that don't support FP8 in hardware.
So you should take its claim that Ironwood "pods" are 24 times faster than comparable segments of the world's most powerful supercomputer with a grain of salt.
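For context on where the 24x figure likely comes from: dividing the pod's peak FP8 throughput by El Capitan's reported FP64 Linpack result (roughly 1.74 exaflops on the November 2024 Top500 list, a figure not in this article) lands right at 24x. The division itself is the problem, since it mixes precisions:

```python
# Naive reconstruction of the "24x El Capitan" comparison. El Capitan's
# ~1.74 exaflops is an FP64 Linpack result; Ironwood's 42.5 exaflops is
# peak FP8, so the ratio compares different precisions.
IRONWOOD_POD_EXAFLOPS_FP8 = 42.5
EL_CAPITAN_EXAFLOPS_FP64 = 1.74   # reported Top500 figure, approximate

print(f"Naive ratio: {IRONWOOD_POD_EXAFLOPS_FP8 / EL_CAPITAN_EXAFLOPS_FP64:.0f}x")  # ~24x
```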
Google's TPU v6 hardware is also conspicuously absent from the comparison chart above. The company says Ironwood is twice as powerful per watt compared to that chip, though.
According to a spokesperson, Ironwood is best thought of as a successor to v5p, while TPU v6 (Trillium) was a follow-up to the less powerful TPU v5e.
Google opted not to show the lower-specced chips on this chart, but for the record, Trillium was capable of hitting about 918 TFLOPS at FP8 precision.
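Using the article's own numbers, the like-for-like gap between the two generations works out as follows (the power figure on the last line is an inference from Google's 2x performance-per-watt claim, not something Google has stated):

```python
# Gen-over-gen comparison using the per-chip figures quoted above.
ironwood_tflops = 4_614
trillium_tflops = 918

speedup = ironwood_tflops / trillium_tflops
print(f"Per-chip speedup: {speedup:.1f}x")        # ~5x

# If performance per watt only doubled, the rest of the per-chip gain
# implies roughly 2.5x the power draw (an inference, not a Google spec).
print(f"Implied relative power: ~{speedup / 2:.1f}x")
```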
While the provided benchmarks are a bit odd, Ironwood is clearly a big improvement for Google's AI ecosystem. It's faster and more efficient than previous TPUs by a considerable margin, and Google's existing infrastructure has enabled rapid improvements to LLMs and simulated reasoning.
Google's market-leading Gemini 2.5 model is running on last-gen TPUs right now, and Google says the higher inference speed and efficiency of Ironwood set the stage for more breakthroughs in the coming year.

Updated April 9 with more detail on how Ironwood compares to Trillium (TPUv6).