MiniMax has launched M3, a flagship AI model with a one-million-token context window and native multimodal capabilities. The model uses the company’s …

The Shanghai-based company says M3 can process data five times faster than its predecessor, while also slashing inference costs.

Chinese start-up MiniMax has launched M3, an AI model with a redesigned architecture that reduces computational needs by up to 95%, enhancing efficiency and response speeds.

Chinese AI company MiniMax has released its new model, M3. It's billed as the first open-weight model to combine top-tier coding performance, a one-million-token context window,…

M3 demonstrates that the next phase of agent development will be driven not just by larger datasets, but also by efficient architectural choices.

MiniMax releases M3 with MSA architecture, 1M-token context, native multimodality, and frontier-level coding and agentic performance.

How Together served MiniMax-M3 efficiently with KV-block-major sparse attention, paged MSA decode, optimized index scoring, and a Rust-based multimodal gateway.

Author(s): Chew Loong Nian - AI ENGINEER Originally published on Towards AI. MiniMax M3 Decodes 1M Tokens 15x Faster — and It Shouldn't Be This Cheap O ...