This article was originally published on runaihome.com

TL;DR: Qwen3-Coder-Next is an 80B-parameter Mixture-of-Experts model that activates only 3 billion parameters per token and scores 71.3% on SWE-bench Verified, competitive with closed-source frontier models. The catch is raw memory: the Q4_K_M GGUF weighs 48.7 GB, so you need dual 24 GB cards, a Mac Studio with 64 GB+ unified memory, or a single RTX 5090 with much of the model offloaded to system RAM. A solo RTX 4090 can technically run it at IQ2 quantization, but that is effectively a different model from the one the benchmarks describe.
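
To make the memory arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the file size and parameter count quoted above. The `OVERHEAD_GB` allowance for KV cache and runtime buffers is a hypothetical placeholder, not a measured value, and the function names are illustrative. The memory figures are the nominal capacities of each setup, not what a runtime can actually allocate.

```python
# Figures quoted in the TL;DR.
MODEL_FILE_GB = 48.7    # Q4_K_M GGUF size
TOTAL_PARAMS_B = 80.0   # total parameters, in billions

# Hypothetical allowance for KV cache and runtime buffers (an assumption,
# not a measurement); adjust it for your context length.
OVERHEAD_GB = 4.0

# Nominal fast memory per setup, in GB. The Mac figure is total unified
# memory; the OS reserves part of it, so treat it as an upper bound.
SETUPS = {
    "Dual RTX 3090": 2 * 24,
    "Mac Studio M4 Max 64 GB": 64,
    "RTX 5090": 32,
    "RTX 4090": 24,
}


def bits_per_weight(file_gb: float, params_b: float) -> float:
    """Average bits stored per parameter, from file size and parameter count."""
    return file_gb * 8 / params_b


def spill_gb(fast_memory_gb: float, file_gb: float = MODEL_FILE_GB,
             overhead_gb: float = OVERHEAD_GB) -> float:
    """GB that do not fit in fast memory and would have to sit in system RAM."""
    return max(0.0, file_gb + overhead_gb - fast_memory_gb)


if __name__ == "__main__":
    bpw = bits_per_weight(MODEL_FILE_GB, TOTAL_PARAMS_B)
    print(f"Effective size: {bpw:.2f} bits per weight")
    for name, mem in SETUPS.items():
        print(f"{name}: about {spill_gb(mem):.1f} GB to offload")
```

The sketch treats every figure as the same unit. File sizes are often reported in decimal GB while VRAM is specified in GiB, so real margins can differ by a few percent; treat the output as a rough estimate and measure on actual hardware.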

Dual RTX 3090

Mac Studio M4 Max 64 GB

RTX 5090 + 128 GB DDR5