TL;DRAI

MiniMax released M3 with 1M-token context, MSA sparse attention, and open weights, scoring 59% on SWE-Bench Pro—matching Opus 4.7. For developer teams, M3's agentic performance and 15× decoding speedup enable self-hosted agents and cost-effective fine-tuning.

MiniMax officially released MiniMax M3 on June 1, 2026. The model introduces MSA (MiniMax Sparse Attention), a new sparse attention architecture that gives M3 a 1M-token context window. M3 also supports image and video input and desktop computer operation natively. The API is live now.

MiniMax M3 is available today via MiniMax Code, the MiniMax Token Plan, and the MiniMax API. It is the next model in the M-series line after M2.7. MiniMax positions M3 as an open-weight model combining frontier-level coding performance, a 1M-token context window, and native multimodal input in a single architecture — the first to do so, per MiniMax. The corresponding model weights and technical report are scheduled for release within 10 days of launch.

MSA: MiniMax Sparse Attention

The central architectural change in MiniMax M3 is MSA (MiniMax Sparse Attention). Standard full attention has quadratic computational complexity: as context length grows, compute cost grows as the square of the sequence length. MSA is designed to address this.

Sparse attention mechanisms generally add a pre-filtering stage before computing attention, avoiding full quadratic cost. MiniMax team states that compared to approaches like DSA and MoBA, MSA partitions the KV cache into blocks more precisely, achieving higher effective context coverage.

marktechpost.com

MiniMax Releases MiniMax M3 with MSA Architecture Supporting 1M-Token Context, Native Multimodality, and Agentic Coding

MiniMax releases M3 with MSA architecture, 1M-token context, native multimodality, and frontier-level coding and agentic performance.

lunedì 1 giugno 2026 New tab

TL;DRAI

1,880 words~9 min read

MSA: MiniMax Sparse Attention

MiniMax Releases MiniMax M3 with MSA Architecture Supporting 1M-Token Context, Native Multimodality, and Agentic Coding

MiniMax Releases MiniMax M3 with MSA Architecture Supporting 1M-Token Context, Native Multimodality, and Agentic Coding

Other newsrooms on this story

Related reading

MiniMax launches M3 model

MiniMax M3 Explained: The Sparse Attention Breakthrough

MiniMax M3: Open-weight model with a million-token context challenges…

Serving MiniMax-M3 for efficient inference: Unlocking 1M-Token Context and…

MiniMax teases M3 model with 15.6x faster decoding speed boost

MiniMax teases M3 model with new sparse attention mechanism, 15.6X long-context…

Other newsrooms on this story

Related reading

MiniMax launches M3 model

MiniMax M3 Explained: The Sparse Attention Breakthrough

MiniMax M3: Open-weight model with a million-token context challenges…

Serving MiniMax-M3 for efficient inference: Unlocking 1M-Token Context and…

MiniMax teases M3 model with 15.6x faster decoding speed boost

MiniMax teases M3 model with new sparse attention mechanism, 15.6X long-context…