Weibo's VibeThinker: A Small but Mighty AI Model Defying Industry Norms

Deep News · 11-28

In the competitive AI arena dominated by tech giants, an unexpected challenger has emerged—Weibo, a platform traditionally seen as a lightweight in AI technology. Based in Beijing's Zhongguancun, often dubbed the "Silicon Valley of China," Weibo has unveiled its first open-source model, VibeThinker, which has achieved remarkable results in advanced mathematical reasoning despite its modest size.

VibeThinker, developed over just three months from September to November, boasts a mere 1.5 billion parameters and was trained at an exceptionally low cost of $7,800. Yet, it scored impressively on prestigious international math tests, challenging the industry's belief that only massive models with hundreds of billions of parameters can handle complex tasks.

Zhang Junlin, Weibo's Chief Scientist, described the achievement as a breakthrough. "No one believed small models could solve such problems, but VibeThinker proved it’s possible," he said. The model’s performance could redefine the rules of AI development, emphasizing efficiency over sheer scale.

**Benchmarking Success**

VibeThinker was evaluated on three high-difficulty math test sets: AIME2024, AIME2025, and HMMT2025. These benchmarks, drawn from elite high school math competitions, serve as rigorous assessments of AI reasoning capability.

VibeThinker scored 80.4, 74.4, and 50.5 on these tests, respectively. While these scores don’t top the charts—where models like GPT-5 and Gemini 3.0 Pro dominate with 90+ scores—they stand out given VibeThinker’s tiny parameter count. For context, DeepSeek-R1, a model with 685 billion parameters, scored 70 on AIME2025, while VibeThinker achieved 74.4 with just 1.5 billion parameters.

Zhang categorized AI models into three tiers based on math performance:

1. **Top Tier**: Models like GLM-4.6 (355B parameters), GPT-5, and Gemini 3.0 Pro, scoring above 90.
2. **Mid Tier**: Models like Gemini 2.5 Pro and OpenAI's o4 series, averaging around 88.
3. **Small but Smart Tier**: VibeThinker, at 74.4, outperforming much larger models on efficiency.

**Training Breakthroughs**

The key to VibeThinker's success lies in its training approach. Starting from a base model derived from Alibaba's Qwen, Zhang's team applied a modified GRPO (Group Relative Policy Optimization) reinforcement-learning algorithm, a cost-effective alternative to traditional RLHF, to raise the model's benchmark scores from 4 to over 50.
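Weibo has not published the details of its modified algorithm, but standard GRPO's cost advantage comes from scoring each sampled answer relative to its own group of samples, which removes the separate value (critic) model that PPO-style RLHF requires. A minimal sketch of that group-relative advantage step, under those standard-GRPO assumptions:

```python
def grpo_advantages(rewards):
    """Group-relative advantages for one prompt's sampled answers.

    Each reward is normalized against the mean and spread of its own
    group, so no learned value/critic network is needed -- the main
    cost saving of GRPO over PPO-based RLHF.
    """
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5 or 1.0  # guard: all-equal rewards give zero spread
    return [(r - mean) / std for r in rewards]

# Example: 4 sampled solutions to one math problem, reward 1 if correct.
print(grpo_advantages([1, 0, 0, 1]))  # -> [1.0, -1.0, -1.0, 1.0]
```

Answers that beat their group average get a positive advantage and are reinforced; the others are pushed down, all without training a second model.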

Further gains came from the "Spectrum-to-Signal Principle" (SSP), a method that rethinks the relationship between supervised fine-tuning (SFT) and reinforcement learning (RL). Instead of optimizing only for single-answer accuracy (Pass@1), SSP prioritizes diverse problem-solving (Pass@K), so the model explores a broader spectrum of reasoning paths for RL to reinforce.
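Pass@K asks whether at least one of K sampled answers is correct, which rewards diversity rather than a single best guess. The standard unbiased estimator for it (a general evaluation metric, not something specific to VibeThinker) can be sketched as:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimate: probability that at least one of k
    answers drawn from n samples (of which c are correct) is correct.
    pass@1 reduces to plain accuracy c/n."""
    if n - c < k:
        return 1.0  # fewer wrong samples than k: every draw contains a hit
    return 1.0 - comb(n - c, k) / comb(n, k)

# A model that solves a problem in 20 of 100 sampled attempts:
print(pass_at_k(100, 20, 1))   # ~0.2, the single-shot accuracy
print(pass_at_k(100, 20, 10))  # far higher: diversity pays off at k > 1
```

A model tuned only for Pass@1 may collapse onto one answer style; optimizing the earlier training stage for Pass@K keeps many plausible solution paths alive for RL to select among.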

**Practical Applications**

VibeThinker's development was driven by a real-world need: powering Weibo's "Comment Robert," an AI bot that generates millions of daily replies to engage users. The previous model, while effective, was costly to run because of its size. VibeThinker's efficiency slashes operating expenses while maintaining performance.

Looking ahead, Zhang aims to refine VibeThinker further and expand its use cases. "From Robert to Roberts," he quipped, emphasizing the goal of making AI accessible, affordable, and practical for businesses and users alike.

Weibo’s Q3 2025 report highlights the success of its AI initiatives, with its smart search tool reaching 70 million MAUs and comment engagement boosting platform activity. VibeThinker’s open-source release also invites external developers to build upon its advancements, potentially democratizing high-efficiency AI solutions.

In Zhang’s words, "AI should be useful, cheap, and reliable—that’s what truly matters."

