Large Language Models Undergo Rapid Upgrades: Opportunities in Model Providers, AI Applications, and Infrastructure

Stock News · 04-01 09:17

CITIC SECURITIES has released a research report stating that since 2026, domestic large language model developers have focused on upgrading Agent and coding capabilities, competing to launch new models. The upcoming next-generation DeepSeek model is expected to continue the strategy of offering high-value open-source models, featuring enhanced memory functions and ultra-long context processing; while refining its code and Agent capabilities, it is also expected to address its multimodal shortcomings. The firm sees new investment opportunities in three key areas and recommends focusing on them: model providers, AI applications, and AI infrastructure.

1) Model providers: The new DeepSeek model, alongside other domestic models, is expected to accelerate Chinese AI's move onto the global stage. Concurrently, further reductions in model training and inference costs should make tokens cheaper and drive an overall increase in global API calls to large models. 2) AI applications: Broader model accessibility helps ease market anxiety over the conflicting narratives about models versus applications, facilitating the deployment of AI Agents across industries and benefiting AI application companies with strong competitive moats. 3) AI infrastructure: Cost reductions that drive higher usage volumes will benefit AI infrastructure providers; domestic AI infrastructure is developing in tandem with domestic models.

CITIC SECURITIES' main viewpoints are as follows:

The global upgrade direction for large models centers on code, Agents, and native multimodality. In AI programming, upgraded training frameworks, the use of complete code repositories and engineering trajectories as training data, and the introduction of deeper chains of thought with multi-step execution and self-correction have evolved AI coding from a code-completion tool into a project-level autonomous agent. The rise of the "Harness Engineer" is expected to shift technical personnel from writing code themselves to managing Agents in order to maximize AI efficiency. In multi-Agent systems, the phenomenal product OpenClaw has fully demonstrated the potential of such systems; domestic companies such as KNOWLEDGE ATLAS, MINIMAX-WP, Tencent, and Kimi have all launched similar products, boosting the productivity of "digital employees." In native multimodality, native multimodal architectures have become the mainstream direction, with rapid breakthroughs in hybrid embedding encoding; however, domestic models still need breakthroughs in key areas such as real-time audio-video interaction and cross-modal continuous reasoning.

Domestic large models are undergoing intensive iteration and upgrades, with capabilities continuously breaking through.

1) MINIMAX-WP: Coding capabilities have been further upgraded. The M2.7 model scored 56.22% on the SWE-Pro test, surpassing Gemini 3.1 Pro. On the VIBE-Pro test, which simulates end-to-end project delivery scenarios, it scored 55.6%, comparable to Claude Opus 4.6, indicating a stronger understanding of how software systems operate. In addition, the M2 series models themselves participated in M2.7's training process, for example in RL, achieving self-iteration.

2) KNOWLEDGE ATLAS: The GLM-5 model introduces DSA and the self-developed "Slime" architecture, enabling it to autonomously complete system-engineering tasks such as Agentic long-range planning and execution, backend refactoring, and deep debugging with minimal human intervention. Its tool use and multi-step task execution (MCP-Atlas 67.8%) and web search with information comprehension (BrowseComp 89.7%) are close to, or even exceed, those of leading overseas models.

3) Kimi: Kimi 2.5 introduces visual capabilities that automatically deconstruct interaction logic and reproduce it as code. It has also launched an Agent cluster mode, scoring on par with GPT-5.2, Claude 4.5 Opus, and Gemini 3 Pro on Agent application test sets such as HLE-Full, BrowseComp, and DeepSearchQA. Moonshot has adopted a price-cut strategy, with API prices more than 30% below K2 Turbo pricing.

4) XIAOMI-W: The Xiaomi MiMo-V2-Pro model performs close to, or even ahead of, some top overseas models on test sets measuring Agent capabilities, such as ClawEval and t2-bench. An early internal test version, under the anonymous codename Hunter Alpha, was launched on OpenRouter and topped the daily usage chart for multiple days. The firm is optimistic that the large-model foundation will empower XIAOMI-W's "Person-Car-Home" ecosystem, delivering a leap in AI capabilities.

DeepSeek outlook: continuing the high-value route, refining long-text, code, Agent, and multimodal capabilities. DeepSeek V3.2, released in January '26, uses a DeepSeek Sparse Attention (DSA) + Mixture of Experts (MoE) architecture, improving training and inference efficiency while reducing costs: input/output token prices were cut by 60%/75% respectively, while benchmark scores for code and multi-Agent capabilities improved significantly. Based on DeepSeek's evolution to date and the Engram module paper co-authored by Liang Wenfeng, the firm believes next-generation models such as DeepSeek V4.0 may integrate Engram into the mature DSA+MoE architecture. By hierarchically storing key, frequently used information, this could dramatically reduce attention-layer computation within the Transformer architecture, enabling ultra-long context processing. This would enhance model efficiency while refining code and Agent capabilities and addressing multimodal weaknesses.

Risk factors include slower-than-expected development of core AI technologies and application expansion, slower-than-expected reduction in computing costs, severe social impacts from misuse of AI, data security risks, information security risks, and intensifying industry competition.

Disclaimer: Investing carries risk. This is not financial advice. The above content should not be regarded as an offer, recommendation, or solicitation to acquire or dispose of any financial product, and any associated discussions, comments, or posts by the author or other users should not be considered as such either. It is provided for general informational purposes only and does not take into account your investment objectives, financial situation, or needs. TTM assumes no responsibility or warranty for the accuracy and completeness of the information; investors should do their own research and may seek professional advice before investing.
