While recent breakthroughs in AI reasoning have largely been driven by massive scale, pouring in billions of parameters to cross complex cognitive thresholds—VibeThinker-3B is charting a completely different path.

Created by researchers from Sina Weibo Inc (China), this 3-billion-parameter model proves that efficiency can punch far above its weight class. Released under an open-source MIT license, VibeThinker-3B matches the performance of models hundreds of times its size on verifiable tasks like mathematics, coding, and STEM disciplines.

What is VibeThinker-3B

VibeThinker-3B is a compact dense model built on the Qwen2.5-Coder-3B base. It is post-trained, not pretrained from scratch. The research team applies supervised fine-tuning, reinforcement learning, and self-distillation on top.

The training framework continues the Spectrum-to-Signal Principle (SSP) from the earlier VibeThinker-1.5B. SFT (Supervised Fine-Tuning) builds a broad space of valid reasoning paths, the ‘Spectrum.’ RL then amplifies the correct paths, the ‘Signal.’