Weibo VibeThinker-3B Scores 94.3 on AIME 2026
Key insights
- VibeThinker-3B scored 94.3 on AIME 2026 with 3 billion parameters, matching DeepSeek V3.2 at 671 billion parameters.
- Predecessor VibeThinker-1.5B cost roughly $7,800 to train, versus tens or hundreds of millions at OpenAI or Google.
- Full model weights and code are MIT-licensed on Hugging Face and GitHub, enabling independent benchmark verification.
Why this matters
A 3-billion-parameter model matching 671-billion-parameter systems on AIME 2026 under an MIT license is the most testable small-model efficiency claim published this year, because open weights let any practitioner independently reproduce the benchmarks. For founders building inference pipelines, VibeThinker-3B confirms that the gap between parameter count and reasoning performance is closing fast enough to make edge deployment and decentralized inference commercially viable today. The $7,800 training cost of its predecessor signals that competitive reasoning models are no longer gated behind nine-figure compute budgets, restructuring who can build and who can compete in applied AI.
Summary
Sina Weibo's nine-person research team published VibeThinker-3B under the MIT license, reporting that a 3-billion-parameter model scored 94.3 on AIME 2026, matching DeepSeek V3.2 at 671 billion parameters.
The model is built on Qwen2.5-Coder-3B and trained using curriculum-based supervised fine-tuning, multi-domain reinforcement learning, and offline self-distillation. On LiveCodeBench v6, it posted a Pass@1 of 80.2. The team also introduces the Parametric Compression-Coverage Hypothesis as a theoretical framework for why compact models can punch above their weight on structured reasoning.
Essentially: (Sina Weibo) built a publicly verifiable compact model that benchmarks alongside frontier systems costing hundreds of millions to train.
- Predecessor VibeThinker-1.5B launched in November 2025 and cost roughly $7,800 to train.
- Full weights and code are MIT-licensed on Hugging Face and GitHub.
- Claim-level test-time scaling pushes the AIME 2026 score to 97.1.
The MIT license means anyone can download the weights and verify the benchmark claims themselves.
Potential risks and opportunities
Risks
- If AIME and LiveCodeBench scores do not transfer to general deployment, enterprises that build production pipelines on VibeThinker-3B's 94.3 AIME result could face costly rollbacks.
- Frontier AI labs including OpenAI, Google DeepMind, and Anthropic face compounding commoditization pressure as open MIT-licensed models narrow the reasoning gap, threatening subscription-based model-access revenue.
- Decentralized inference networks that rush to market VibeThinker-3B integrations before independent validation of benchmark reproducibility risk overselling performance to token holders.
Opportunities
- Decentralized compute and inference networks can immediately integrate VibeThinker-3B under the MIT license, since a 3-billion-parameter model is far more practical to host on distributed hardware than 671-billion-parameter alternatives.
- Startups and open-source developers gain a frontier-level math and coding reasoning backbone at near-zero licensing cost, enabling competition with well-capitalized incumbents on structured reasoning tasks.
- Research groups can build on the Parametric Compression-Coverage Hypothesis and the freely available Sina Weibo codebase to push the next wave of sub-10-billion-parameter reasoning models.
What we don't know yet
- Training cost for VibeThinker-3B itself is not disclosed; only the predecessor's $7,800 figure appears in the paper.
- Whether the Parametric Compression-Coverage Hypothesis has received independent peer review beyond the Sina Weibo team's own technical report.
- How VibeThinker-3B performs on unstructured, real-world reasoning tasks outside math and coding benchmarks is not reported.
Originally reported by cryptobriefing.com
Read the original article →Original headline: Weibo's VibeThinker-3B Claims Frontier-Level Reasoning at 3 Billion Parameters, Scores 94.3 on AIME 2026 Under MIT License