In its larger incarnation, Google's Ironwood pods can generate a staggering 42.5 exaflops of inference computing. Each chip has a peak throughput of 4,614 TFLOPS, which Google claims is a substantial improvement over previous chips. Google has also boosted memory for the new TPUs, with each chip sporting 192GB, six times more than Google's last-gen Trillium TPU. Memory bandwidth has also increased to 7.2 Tbps, a 4.5x improvement.

There are numerous ways to measure AI throughput, making it difficult to compare chips.
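The quoted totals hang together arithmetically. As a quick sanity check, here is a sketch that assumes the widely reported 9,216-chip count for a full Ironwood pod (that figure comes from Google's launch materials, not from the text above):

```python
# Sanity-check the quoted Ironwood figures.
# Assumption: 9,216 chips in a full pod, per Google's announcement.
chips_per_pod = 9_216
per_chip_tflops = 4_614  # peak throughput per chip, as quoted

# Convert total TFLOPS to exaflops (1 exaflop = 1e6 TFLOPS).
pod_exaflops = chips_per_pod * per_chip_tflops / 1e6
print(f"{pod_exaflops:.1f} exaflops")  # ~42.5, matching the pod figure

# The memory claim: 192GB per chip vs. Trillium's implied 32GB.
print(192 / 32)  # 6.0, the "six times more" multiple
```

Multiplying the per-chip and per-pod numbers back out lands almost exactly on the 42.5-exaflop figure, so the headline number is a straightforward peak-throughput product rather than a measured benchmark.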
Google is using FP8 precision as its benchmark for the new TPU, but it's comparing it to some systems, like the El Capitan supercomputer, that don't support FP8 in hardware. So you should take its claim that Ironwood "pods" are 24 times faster than comparable segments of the world's most powerful supercomputer with a grain of salt.

Google's TPU v6 hardware is also conspicuously absent from the comparison chart above.
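For a sense of where that 24x figure likely comes from, here is a hypothetical reconstruction: dividing the FP8 pod number by El Capitan's publicly reported FP64 Linpack result of roughly 1.742 exaflops (the November 2024 Top500 figure, an assumption not stated in this article), which is exactly the apples-to-oranges precision mismatch noted above:

```python
# Hypothetical reconstruction of the "24 times faster" claim:
# FP8 pod peak divided by El Capitan's FP64 Linpack result.
# These are different precisions, so the ratio is not apples to apples.
ironwood_pod_exaflops = 42.5   # FP8, per Google
el_capitan_exaflops = 1.742    # FP64 Rmax, Nov. 2024 Top500 (assumption)

print(f"{ironwood_pod_exaflops / el_capitan_exaflops:.1f}x")  # ~24.4x
```

The ratio comes out near 24, consistent with Google comparing its FP8 peak against a much stricter FP64 measurement.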
The company says Ironwood is twice as powerful per watt compared to that chip, though. According to a spokesperson, Ironwood is best thought of as a successor to v5p, while TPU v6 (Trillium) was a follow-up to the less powerful v5e.
Google opted not to show the lower-specced chips on this chart, but for the record, Trillium was capable of hitting about 918 TFLOPS at FP8 precision.

While the provided benchmarks are a bit odd, Ironwood is clearly a big improvement for Google's AI ecosystem.
It's faster and more efficient than previous TPUs by a considerable margin, and Google's existing infrastructure has enabled rapid improvements to LLMs and simulated reasoning.
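The Trillium number quoted above makes the generational jump easy to quantify. A quick sketch, treating both peak figures as FP8 as the comparison here does:

```python
# Per-chip generational comparison, using the peak figures quoted above.
ironwood_tflops = 4_614
trillium_tflops = 918

speedup = ironwood_tflops / trillium_tflops
print(f"{speedup:.1f}x per chip")  # ~5.0x raw peak throughput

# Google separately claims 2x performance per watt vs. Trillium.
# If both figures hold, each Ironwood chip would draw roughly
# 2.5x the power of a Trillium chip (an implication, not a stated spec).
implied_power_ratio = speedup / 2
print(f"~{implied_power_ratio:.1f}x power per chip (implied)")
```

That gap between the raw speedup (about 5x) and the efficiency claim (2x per watt) is worth keeping in mind: most of the generational gain appears to come from scaling the chip up, not solely from efficiency.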
Google's market-leading Gemini 2.5 model is running on last-gen TPUs right now, and Google says the higher inference speed and efficiency of Ironwood set the stage for more breakthroughs in the coming year.

Updated April 9 with more detail on how Ironwood compares to Trillium.