Microsoft today launched MAI-Image-2-Efficient, a lower-cost, higher-speed variant of its flagship text-to-image model that the company says delivers production-ready quality at nearly half the price. The release, available immediately in Microsoft Foundry and MAI Playground with no waitlist, marks the fastest turnaround yet from Microsoft's in-house AI superintelligence team — and the clearest signal that Redmond is serious about building a self-sufficient AI stack that doesn't depend on OpenAI.
The new model is priced at $5 per million text input tokens and $19.50 per million image output tokens. MAI-Image-2 is priced at $5 and $33, respectively, so the text input price is unchanged and the image output price is roughly 41% lower. Microsoft says the model runs 22% faster than its flagship sibling and achieves 4x greater throughput efficiency per GPU, as measured on NVIDIA H100 hardware at 1024×1024 resolution. The company also claims it outpaces competing hyperscaler models (specifically naming Google's Gemini 3.1 Flash, Gemini 3.1 Flash Image, and Gemini 3 Pro Image) by an average of 40% on p50 latency benchmarks.
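The pricing figures can be checked with a few lines of arithmetic. The sketch below uses only the per-million-token prices quoted above. The function names and the example workload (one million input tokens and one million output tokens) are hypothetical, chosen for illustration, and do not reflect any Microsoft billing API or real usage data.

```python
# Per-million-token prices in USD, as quoted in the article.
FLAGSHIP = {"text_input": 5.00, "image_output": 33.00}   # MAI-Image-2
EFFICIENT = {"text_input": 5.00, "image_output": 19.50}  # MAI-Image-2-Efficient


def percent_reduction(old_price: float, new_price: float) -> float:
    """Return the price cut from old_price to new_price as a percentage."""
    return (old_price - new_price) / old_price * 100


def job_cost(prices: dict, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a job, given token counts and per-million-token prices."""
    return (
        input_tokens * prices["text_input"]
        + output_tokens * prices["image_output"]
    ) / 1_000_000


# Price cut on each tier.
image_cut = percent_reduction(FLAGSHIP["image_output"], EFFICIENT["image_output"])
text_cut = percent_reduction(FLAGSHIP["text_input"], EFFICIENT["text_input"])

# Hypothetical workload (an assumption, not from the article):
# 1M text input tokens and 1M image output tokens.
old_total = job_cost(FLAGSHIP, 1_000_000, 1_000_000)
new_total = job_cost(EFFICIENT, 1_000_000, 1_000_000)
blended_cut = percent_reduction(old_total, new_total)

print(f"Image output price cut: {image_cut:.1f}%")
print(f"Text input price cut:   {text_cut:.1f}%")
print(f"Blended cut for the hypothetical workload: {blended_cut:.1f}%")
```

The roughly 41% figure applies to image output tokens only. Because the text input price is the same for both models, the savings on a full bill depend on the mix of input and output tokens, and shrink as the share of input tokens grows.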
The model is also rolling out across Copilot and Bing, Microsoft said, with additional product surfaces to follow.






