Sina's open model VibeThinker-3B aims to show reasoning compresses well but factual knowledge doesn't

Sina Weibo's VibeThinker-3B has just three billion parameters but matches models like DeepSeek V3.2 and Kimi K2.5 on math and coding benchmarks. Those models are up to 333 times larger. The secret isn't size but multi-stage post-training. The researchers propose a hypothesis based on their findings: logical reasoning compresses well into small models, but broad world knowledge does not.

Sunday, 28 June 2026

A Chinese language model with just three billion parameters sometimes matches models hundreds of times its size on math and coding tasks. The researchers behind it have developed a hypothesis about how AI capabilities are structured.

Weibo's parent company Sina has released a small language model that competes with today's top models on hard math and coding tasks. According to a technical report, VibeThinker-3B performs on par with DeepSeek V3.2 and Kimi K2.5 on competitive benchmarks like AIME26. Both of those models have 200 to 333 times more parameters.
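To put those ratios in concrete terms, here is a minimal sketch of the arithmetic, using only the figures quoted above (a three-billion-parameter model and size ratios of 200 to 333). The variable and function names are illustrative assumptions, and the results are the sizes implied by the ratios, not officially reported parameter counts for either model.

```python
# Parameter count stated in the article for VibeThinker-3B.
SMALL_MODEL_PARAMS = 3e9

# Size ratios quoted in the article (200x to 333x).
RATIOS = (200, 333)


def implied_params(small_params: float, ratio: float) -> float:
    """Return the parameter count implied by a given size ratio."""
    return small_params * ratio


# Map each quoted ratio to the model size it implies.
implied = {ratio: implied_params(SMALL_MODEL_PARAMS, ratio) for ratio in RATIOS}

for ratio, params in implied.items():
    print(f"{ratio}x -> about {params / 1e9:.0f}B parameters")
```

In other words, the comparison models would sit somewhere between roughly 600 billion and one trillion parameters if the quoted ratios are taken at face value.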

Sina positions the model as an experiment in figuring out how much compute a model actually needs to compete at the top. Its predecessor, VibeThinker-1.5B, launched in November 2025. The new version pushes further, asking whether a small model can hit genuine top-tier performance, not just be "good for its size."

Across six math and coding benchmarks, the 3B model (orange) falls within the performance range of five current top models including Gemini 3 Pro, GLM-5, and Claude Opus 4.5. | Image: Sina Weibo

Logic scales down, factual knowledge doesn't
