Sina Weibo's VibeThinker-3B matches flagship AI models with just 3 billion parameters

A language model with 3 billion parameters just matched the reasoning performance of systems more than 200 times its size. The team behind it doesn’t work at OpenAI, Google DeepMind, or Anthropic. They work at a microblogging company.

Sina Weibo, the Chinese social media platform most people associate with viral posts rather than frontier AI research, published a 14-page technical report on arXiv detailing VibeThinker-3B. The model scored 94.3 on AIME 2026, one of the most demanding standardized math competitions in the world, placing it alongside DeepSeek V3.2 and its 671 billion parameters.

Small model, big numbers

The benchmark results tell the story. On AIME 2026, VibeThinker-3B scored 94.3, rising to 97.1 with claim-level test-time scaling. On LiveCodeBench v6, a coding benchmark, it posted a Pass@1 score of 80.2. The model also performed strongly on out-of-distribution problems from recent LeetCode contests, often matching or beating much larger systems.
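For readers unfamiliar with the metric, Pass@1 is the share of problems a model solves on a single attempt. The sketch below uses the widely cited unbiased pass@k estimator from code-generation evaluation, which is an assumption here: the article does not say how LiveCodeBench v6 or the VibeThinker team computed the figure. The function names and the sample counts are hypothetical, chosen only for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of samples generated for the problem
    c: number of those samples that passed every test
    k: number of attempts the metric allows
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # With fewer than k failing samples, any draw of k samples
    # must include at least one passing sample.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, reported as a percentage.

    results holds one (n, c) pair per problem.
    """
    if not results:
        raise ValueError("results must not be empty")
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical data: three problems, eight samples each.
    hypothetical_results = [(8, 8), (8, 6), (8, 0)]
    print(benchmark_pass_at_k(hypothetical_results, k=1))
```

With k set to 1 the estimator reduces to the fraction of passing samples per problem, so a benchmark-wide Pass@1 of 80.2 can be read as roughly four in five single attempts succeeding on average.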

The model is built on Qwen2.5-Coder-3B. The Sina Weibo team of nine researchers, including Sen Xu, Shixi Liu, and Wei Wang, improved its performance through a combination of curriculum-based supervised fine-tuning, multi-domain reinforcement learning, and offline self-distillation. The paper also introduces the Parametric Compression-Coverage Hypothesis, a theoretical framework for why smaller models can punch above their weight in structured reasoning tasks.

