Tesla and xAI founder Elon Musk has recently kept a close eye on China's artificial intelligence sector, repeatedly praising Chinese-developed models on social media and in public appearances. Moonshot AI's Kimi, ByteDance's Seedance 2.0, and Alibaba's Qwen3.5 have all received his endorsement, and the official responses from these teams have turned the praise into a trans-Pacific technical exchange.
On March 16, Moonshot AI's Kimi team released a technical report titled "Attention Residuals," which rethinks the residual connection mechanism in large models. According to the report, the new "attention residual" method improves both training efficiency and core performance: on a 48B-parameter model it delivers a 1.25x gain in training efficiency, along with a 7.5% increase in scientific reasoning scores and a 3.6% improvement in mathematical performance. Industry commentators have hailed the work as a significant signal of "Deep Learning 2.0."
Musk promptly shared the report on X, commenting, "Impressive work from Kimi." The Kimi team replied in good humor: "Your rockets are pretty good too!"
Before that, Alibaba's Qwen3.5 series had already caught Musk's attention with its "exceptional intelligence density." On March 2, Alibaba's Qwen team open-sourced four small models (Qwen3.5-0.8B, 2B, 4B, and 9B) designed to cover the full range of deployment scenarios, from edge devices to lightweight servers. The models are reported to perform well beyond what their parameter counts would suggest, challenging the assumption that small models are inherently weak: the 9B version is said to perform comparably to models with hundreds of billions of parameters, while the 0.8B and 2B versions run smoothly on mobile and IoT edge devices. Musk commented directly under the official Qwen post on X: "Impressive intelligence density."
In addition, ByteDance's next-generation video generation model, Seedance 2.0, entered internal testing on February 12. Built on a unified multimodal architecture that generates audio and video jointly, it accepts input in four modalities: text, image, audio, and video. The model targets long-standing problems in AI video generation, such as low usability and drift in character details, and can produce up to 60 seconds of 2K broadcast-quality video. Musk later reposted a related post on X, remarking, "It's happening fast."
Beyond endorsing specific models, Musk has also made clear predictions at the Davos forum and in podcast interviews, saying that China's AI computing power will far surpass that of every other region. He cited stable, low-cost electricity, large-scale infrastructure, and efficient engineering teams as the core advantages of Chinese AI.
Industry analysts suggest that Musk's repeated praise for Chinese models reflects broad breakthroughs across China's AI landscape, from underlying architectures and multimodal applications to open-source ecosystems.