MiniMax officially released MiniMax M3 on June 1, 2026. The model introduces MSA (MiniMax Sparse Attention), a new sparse attention architecture that gives M3 a 1M-token context window. M3 also supports image and video input and desktop computer operation natively. The API is live now.
MiniMax M3 is available today via MiniMax Code, the MiniMax Token Plan, and the MiniMax API. It is the next model in the M-series line after M2.7. MiniMax positions M3 as an open-weight model combining frontier-level coding performance, a 1M-token context window, and native multimodal input in a single architecture — the first to do so, per MiniMax. The corresponding model weights and technical report are scheduled for release within 10 days of launch.
MSA: MiniMax Sparse Attention
The central architectural change in MiniMax M3 is MSA (MiniMax Sparse Attention). Standard full attention has quadratic computational complexity: as context length grows, compute cost grows as the square of the sequence length. MSA is designed to address this.
Sparse attention mechanisms generally add a pre-filtering stage before computing attention, avoiding full quadratic cost. MiniMax team states that compared to approaches like DSA and MoBA, MSA partitions the KV cache into blocks more precisely, achieving higher effective context coverage.













