
Google launched the seventh generation of its Tensor Processing Unit (TPU), Ironwood, last week. Unveiled at Google Cloud Next 25, it is said to be the company's most powerful and scalable custom artificial intelligence (AI) accelerator. The Mountain View-based tech giant said the chipset was specifically designed for AI inference, the compute used by an AI model to process a query and generate a response. The company will soon make its Ironwood TPUs available to developers via the Google Cloud platform.
Google Introduces Ironwood TPU for AI Inference
In a blog post, the tech giant introduced its seventh-generation AI accelerator chipset. Google stated that Ironwood TPUs will enable the company to move from a response-based AI system to a proactive one, focused on dense large language models (LLMs), mixture-of-experts (MoE) models, and agentic AI systems that "retrieve and generate data to collaboratively deliver insights and answers."
Notably, TPUs are custom-built chipsets aimed at AI and machine learning (ML) workloads. These accelerators offer extremely high parallel processing, especially for deep learning-related tasks, along with high power efficiency.
Google said each Ironwood chip delivers a peak compute of 4,614 teraflops (TFLOPs), a considerably higher throughput than its predecessor Trillium, which was unveiled in May 2024. The tech giant also plans to make these chipsets available as clusters to maximise processing power for higher-end AI workloads.
Ironwood can be scaled up to a cluster of 9,216 liquid-cooled chips linked via an Inter-Chip Interconnect (ICI) network. The chipset is also one of the new components of the Google Cloud AI Hypercomputer architecture. Developers on Google Cloud will be able to access Ironwood in two sizes: a 256-chip configuration and a 9,216-chip configuration.
In its largest cluster configuration, Ironwood chipsets can deliver up to 42.5 exaflops of computing power. Google claimed this throughput is more than 24 times the compute of the world's largest supercomputer, El Capitan, which offers 1.7 exaflops per pod. Ironwood TPUs also come with expanded memory, with each chipset offering 192GB, six times what Trillium was equipped with. Memory bandwidth has also been increased to 7.2Tbps.
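The cluster-level figures follow directly from the per-chip specs. A quick back-of-the-envelope check (a sketch, treating the publicly quoted numbers as exact):

```python
# Sanity-check the reported Ironwood cluster figures from the per-chip spec.
PER_CHIP_TFLOPS = 4_614        # peak compute per Ironwood chip (TFLOPs)
CLUSTER_CHIPS = 9_216          # chips in the largest configuration
EL_CAPITAN_EXAFLOPS = 1.7      # El Capitan's reported per-pod compute

# Total cluster compute, converting TFLOPs to exaflops (1 exaflop = 1e6 TFLOPs).
cluster_exaflops = PER_CHIP_TFLOPS * CLUSTER_CHIPS / 1e6
print(f"Cluster peak: {cluster_exaflops:.1f} exaflops")             # Cluster peak: 42.5 exaflops

# Ratio versus El Capitan, matching the "more than 24x" claim.
print(f"Ratio: {cluster_exaflops / EL_CAPITAN_EXAFLOPS:.0f}x")      # Ratio: 25x
```

The arithmetic works out: 4,614 TFLOPs across 9,216 chips gives roughly 42.5 exaflops, about 25 times El Capitan's 1.7 exaflops, consistent with Google's "more than 24x" claim.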
Notably, Ironwood is not yet available to Google Cloud developers. As with previous chipsets, the tech giant will likely first transition its internal systems, including the company's Gemini models, to the new TPUs before expanding access to developers.
