As the Lunar New Year approached, the AI battlefield was heating up even before the New Year's Eve fireworks lit up the sky. The 2026 Spring Festival season saw fierce competition among Chinese tech giants, with Baidu offering 5 billion yuan in red packets, Tencent's Yuanbao distributing 10 billion yuan, and Alibaba launching a 30 billion yuan free-purchase campaign. Beneath the surface, this period marked an unprecedented collective display of strength from China's major AI model developers.
Starting around the twentieth day of the twelfth lunar month, ByteDance, Zhipu, MiniMax, and Kimi unleashed their trump cards. DeepSeek quietly completed crucial iterations, while Alibaba's Qwen3.5 was poised for release. Baidu kept its "Project O" under wraps, and Tencent hinted at future developments through a technical blog post by key figure Yao Shunyu.
This was not merely a round of routine incremental updates but a confrontation over the direction of large language models in their next phase. Each company is betting on what kind of model users and developers will truly find indispensable over the coming two years.
Among the major players, ByteDance currently leads the pack. Its Seedance 2.0 has been the only true breakout hit of this Spring Festival season. On February 7, without any formal announcement or press release, ByteDance casually dropped a "Kill the game" message in a Feishu document. What followed exceeded all expectations: Black Myth: Wukong producer Feng Ji called it "the strongest video generation model on Earth, bar none"; popular tech reviewer Tim from Film Hurricane used the word "terrifying" six times; media and entertainment stocks surged in the secondary market; and overseas users on X scrambled to find Chinese phone numbers so they could try Seedance 2.0.
The breakthrough lies in Seedance 2.0's transformation of video generation from a novelty into a practical tool. It accepts input in four modalities (text, image, audio, and video) and generates coherent multi-shot sequences. Most impressively, it demonstrates an understanding of the physical world: from a single frontal photo of a building, it can generate a camera movement that reveals the back of the structure with remarkable accuracy.
ByteDance's move demonstrates two key points: first, video generation is not Sora's exclusive domain, as Chinese companies can not only keep pace but surpass it; second, following DeepSeek, ByteDance has become the second Chinese player to give Silicon Valley "technology gap anxiety."
However, concerns emerged just two days after launch, when the platform restricted realistic image-to-video generation over potential misuse. On February 12, China's cyberspace administration announced it had dealt with 13,421 accounts and removed over 543,000 pieces of illegal content, vowing strict management of unlabeled, AI-generated false information.
Alibaba appears to be quietly preparing its next move. On February 9, code merge requests for Qwen3.5 appeared on Hugging Face, revealing a new hybrid attention mechanism and possible native visual language model capabilities, with plans to open-source both a 2B dense version and a 35B-A3B MoE version. This represents a strategic shift for Alibaba toward building models that truly "understand the world" visually.
Baidu took a different approach, focusing on its existing strengths rather than new model releases. While being the first to launch red packet campaigns and serving as Beijing TV's Spring Festival Gala AI partner, Baidu remained quiet about new models. The circulating "Project O" appears related to Baidu App, suggesting the company's strategy centers on fortifying its existing user base of 200 million monthly active users as an AI super-entry point.
Tencent, though it released no new foundation models, may have made the most profound strategic move. The company's key development was AI expert Yao Shunyu's first research publication since joining Tencent. The CL-bench study revealed that top language models achieve an average success rate of only 17.2% when learning new knowledge from context. This suggests Tencent is betting that the next phase of AI competition will be defined by context provision rather than pure model training.
Among startups, differentiation became increasingly clear. Zhipu's GLM-5, with 744B parameters and a SWE-bench score of 77.8%, positions itself as a "system architect" rather than a conversational assistant, capable of decomposing requirements and delivering deployable products. MiniMax's M2.5 focuses on cost efficiency, theoretically supporting four Agents working continuously for a year at a cost of $10,000. Kimi's K2.5 features a native multimodal architecture and Agent clusters that can work in teams. DeepSeek, while making no official announcements, showed significant updates, including a context window expanded to 1M tokens.
The intense activity over these twenty-plus days reveals a clear trend: the AI industry has moved beyond the fantasy of universal models. Each company is pursuing specialized excellence: ByteDance in video generation, Zhipu in Agent engineering, MiniMax in cost efficiency, Kimi in multimodal applications, DeepSeek in long-context reasoning, Alibaba in visual foundation models, Baidu in platform integration, and Tencent in context learning.
This specialization signals industry maturity. Qwen3.5 remains unreleased, DeepSeek V4 is still in development, Baidu's Project O stays mysterious, and Tencent's context-learning revolution exists only on paper. Yet one thing is certain: mere conversational ability no longer guarantees a seat at the table in 2026. The real winners will be those who can genuinely integrate into workflows, embed into production lines, and reconstruct cost structures.