
Microsoft announced the Maia 200 on Monday, a chip designed to scale AI inference, featuring more than 100 billion transistors for faster speeds and higher efficiency than the 2023 Maia 100.
The Maia 200 delivers more than 10 petaflops of 4-bit (FP4) performance and approximately 5 petaflops at 8-bit (FP8) precision. Microsoft calls it a silicon workhorse engineered specifically for AI inference, the process of running trained AI models to generate outputs, as distinct from the training phase that builds those models. As AI operations expand, inference accounts for a growing share of total computing expenses, driving efforts to streamline it.
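To see why lower numeric precision cuts inference cost, consider a minimal NumPy sketch of symmetric 8-bit weight quantization. This is a generic illustration of the technique, not Microsoft's Maia software stack: trained weights are stored as small integers plus a scale factor, shrinking memory traffic and enabling cheaper arithmetic at a modest accuracy cost.

```python
# Minimal sketch of low-precision inference (generic NumPy illustration,
# not Microsoft's Maia stack): store weights as int8 plus a scale.
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric 8-bit quantization: w ~= scale * q, q in [-127, 127]."""
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)  # trained FP32 weights
x = rng.standard_normal((4,)).astype(np.float32)    # inference-time input

q, scale = quantize_int8(w)
y_full = w @ x                       # full-precision result
y_quant = dequantize(q, scale) @ x   # int8 weights: 4x less memory than FP32

print(np.max(np.abs(y_full - y_quant)))  # small quantization error
```

The same idea extends to 4-bit formats such as FP4, which halve storage again and, on hardware with native low-precision units, roughly double throughput; that ratio is consistent with the Maia 200's FP4 figure being about twice its FP8 figure.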
The company positions the Maia 200 as a way to reduce operational disruptions and power consumption in AI deployments. A single node equipped with the chip can handle the largest current AI models while leaving capacity for substantially bigger ones. Microsoft stated, “In practical terms, one Maia 200 node can effortlessly run today’s largest models, with plenty of headroom for even bigger models in the future.”
This release fits a pattern among major technology firms: developing custom processors to reduce dependence on Nvidia’s graphics processing units, which dominate AI workloads and represent a major hardware expense.
Google offers tensor processing units, or TPUs, not as standalone chips but as cloud-based compute resources. Amazon provides Trainium AI accelerator chips, with the third-generation Trainium 3 released in December. These options allow companies to shift some workloads away from Nvidia hardware, cutting overall costs.
Microsoft claims the Maia 200 outperforms competitors on key metrics. According to the company’s Monday press release, it achieves three times the FP4 performance of Amazon’s third-generation Trainium chips, and its FP8 performance surpasses that of Google’s seventh-generation TPUs.
The chip already supports internal AI efforts: it powers models developed by Microsoft’s Superintelligence team, and operations for the Copilot chatbot also run on Maia 200 hardware.
As of Monday, Microsoft has opened access to external users. Developers, academics, and frontier AI labs can now use the Maia 200 software development kit to integrate the chip into their workloads.