Baidu Intelligent Cloud Upgrades to New Full-Stack AI Cloud, Serving Over 1,000 AI Hardware Firms

Deep News · 12:21

At the Create 2026 Baidu AI Developer Conference, Shen Dou, Executive Vice President of Baidu Group and President of Baidu Intelligent Cloud Business Group, announced that Baidu Intelligent Cloud will be comprehensively upgraded into a new full-stack AI cloud built for large-scale intelligent-agent applications. The upgrade draws on Baidu's accumulated work across Kunlun chips, AI cloud services, the ERNIE large language model, and intelligent agents, with the goal of building Agent infrastructure that delivers the highest level of intelligence per token, and AI infrastructure that delivers greater performance per watt at better cost-effectiveness.

Baidu Intelligent Cloud reportedly serves more than 1,000 AI hardware companies, providing them with core computing power, model services, and Agent Harness capabilities. Its clients include nine of the top ten global smartphone manufacturers, nine of China's top ten AI smart-glasses makers, the top five floor-cleaning robot brands, the top five AI toy manufacturers, and more than 30 leading embodied-AI companies. Its market share exceeds that of the second- and third-ranked vendors combined.

The Baidu Kunlun P800 chip has now completed large-scale validation, with multiple ten-thousand-card clusters delivered since 2025. ERNIE 5.1 was successfully trained on a fully domestically produced Kunlun chip cluster: the cluster's effective training rate reached 97%, and linear scaling efficiency at the ten-thousand-card scale exceeded 85%. This satisfies the requirements for large-scale training of frontier large language models in terms of computational precision, operator stability, framework compatibility, and long-duration stable operation.
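As a rough aid to reading these figures, the sketch below shows how "effective training rate" and "linear scaling efficiency" are commonly defined for large training clusters. The article does not disclose Baidu's exact methodology, and every function name and number in the example is a hypothetical placeholder rather than a reported measurement.

```python
# Illustrative sketch only. The article reports a 97% effective training rate and
# >85% linear scalability for the ten-thousand-card Kunlun cluster but does not
# say how those figures were computed; the definitions below are the ones
# commonly used for training clusters, and all inputs are made-up placeholders.

def effective_training_rate(productive_hours: float, wall_clock_hours: float) -> float:
    """Share of wall-clock time spent making training progress
    (i.e., excluding failures, restarts, and checkpoint-recovery stalls)."""
    return productive_hours / wall_clock_hours

def linear_scaling_efficiency(throughput_large: float, cards_large: int,
                              throughput_base: float, cards_base: int) -> float:
    """Measured speedup relative to the ideal linear speedup when scaling
    from a baseline cluster size to a larger one."""
    ideal_speedup = cards_large / cards_base
    measured_speedup = throughput_large / throughput_base
    return measured_speedup / ideal_speedup

if __name__ == "__main__":
    # Hypothetical example: a 10,000-card run compared against a 1,000-card baseline.
    print(f"effective training rate: {effective_training_rate(6984, 7200):.1%}")        # ~97%
    print(f"linear scaling efficiency: "
          f"{linear_scaling_efficiency(8.6e6, 10_000, 1.0e6, 1_000):.1%}")               # ~86%
```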

In addition, the Tianchi 256-card super node built on Kunlun chips was powered on last month and is scheduled for official launch in June. Compared with the previous generation, it delivers 25% higher throughput and has completed adaptation for mainstream models including ERNIE, DeepSeek, GLM, and MiniMax, improving inference efficiency by 50%. The network architecture has been upgraded to HPN5.0, cutting end-to-end latency by 50% and supporting on-demand construction of ultra-large clusters ranging from hundreds of thousands to millions of cards.
