DeepSeek's New Model Approaches Release

Deep News | 19:18

DeepSeek is advancing the gray-release (canary) testing of a new model version, which may be the final staged rollout before the official debut of V4. On February 11, some users were prompted to update upon opening the DeepSeek App; after updating to version 1.7.4, they can try DeepSeek's latest model. The upgrade expands the model's context length from 128K to 1M tokens, a roughly eightfold increase. The knowledge cutoff has been updated to May 2025, and several core capabilities have improved substantially. In the author's testing, DeepSeek itself indicated that the current version is likely not V4, but rather the final evolution of the V3 series, or the last canary build before V4's official launch.

Nomura Securities issued a report on February 10 stating that the DeepSeek V4 model, expected to launch in mid-February 2026, is unlikely to recreate the global panic over AI computing power demand triggered by last year's V3 release. The firm believes V4's core value lies in advancing the commercialization of AI applications through innovation in the underlying architecture, rather than in disrupting the existing AI value chain. According to evaluations, the new version's ability to handle complex tasks is now on par with leading closed-source models such as Gemini 3 Pro and K2.5. Nomura further indicated that V4 is expected to introduce two innovative technologies, mHC and Engram, which aim to break through compute-chip and memory bottlenecks at the algorithmic and engineering levels. Preliminary internal tests show that V4's performance on programming tasks already surpasses contemporary models from Anthropic's Claude and OpenAI's GPT series. The key significance of this release lies in further compressing training and inference costs, offering global large language model and AI application companies a viable path to relieve capital expenditure pressure.

**Innovative Architecture Optimized for Hardware Constraints** The Nomura report points out that compute-chip performance and HBM memory bottlenecks remain hard constraints that the domestic large-model industry cannot sidestep. The upcoming DeepSeek V4 introduces the mHC and Engram architectures, which systematically optimize for these weaknesses along both the training and inference dimensions.

**mHC:** Short for "Manifold-Constrained Hyper-Connections." It aims to solve the information-flow bottlenecks and training instability that arise in Transformer models as the layer count grows very deep. Simply put, it lets neural network layers hold a richer and more flexible "dialogue" with one another, while strict mathematical "guardrails" prevent the signal from being amplified or corrupted (a toy sketch follows below). Experiments show that models using mHC perform better on tasks such as mathematical reasoning.
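
To make the idea concrete, here is a minimal toy sketch in Python/NumPy. The report does not specify mHC's actual mathematics, so this assumes one plausible reading: "hyper-connections" as several parallel residual streams mixed by a learned matrix, and the "manifold constraint" as a projection onto doubly stochastic matrices via Sinkhorn normalization, which caps how much the mixing can amplify activations. All names and choices here are illustrative, not DeepSeek's implementation.

```python
import numpy as np

def sinkhorn(logits, iters=20):
    """Project a matrix onto (approximately) the doubly stochastic
    manifold: every row and column sums to 1, so mixing the residual
    streams cannot amplify their overall magnitude."""
    m = np.exp(logits - logits.max())          # positive, numerically safe
    for _ in range(iters):
        m /= m.sum(axis=1, keepdims=True)      # normalize rows
        m /= m.sum(axis=0, keepdims=True)      # normalize columns
    return m

class HyperConnectedLayer:
    """Toy layer with n parallel residual streams (a hypothetical reading
    of 'hyper-connections'): the layer reads a learned mixture of the
    streams, computes, and writes its output back through a constrained
    mixing matrix -- the 'mathematical guardrail'."""
    def __init__(self, n_streams, width, rng):
        self.read = rng.normal(size=n_streams) / n_streams   # stream -> input
        self.mix_logits = rng.normal(size=(n_streams, n_streams))
        self.w = rng.normal(size=(width, width)) / np.sqrt(width)

    def __call__(self, streams):               # streams: (n_streams, width)
        x = self.read @ streams                # mixed input to the layer
        y = np.tanh(x @ self.w)                # stand-in for attention/FFN
        mix = sinkhorn(self.mix_logits)        # constrained mixer
        n = len(streams)
        return mix @ streams + np.outer(np.full(n, 1.0 / n), y)

rng = np.random.default_rng(0)
streams = rng.normal(size=(4, 16))             # 4 streams, width 16
layer = HyperConnectedLayer(4, 16, rng)
for _ in range(200):                           # a very deep stack
    streams = layer(streams)
print(f"norm after 200 layers: {np.linalg.norm(streams):.2f}")  # no blow-up
```

An unconstrained random mixing matrix in the same loop typically has spectral radius above 1 and makes the streams explode exponentially within a few dozen layers; the point of the constraint, per the report's description, is precisely to forbid that failure mode while still allowing flexible cross-layer information flow.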

**Engram:** A "conditional memory" module whose design philosophy decouples "memory" from "computation." Static knowledge inside the model is stored in a dedicated sparse memory table that can reside in cheaper DRAM; during inference, the model performs fast lookups only when needed (see the sketch below). This frees expensive GPU memory (HBM) to focus on dynamic computation.
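
The report gives no implementation details, so the following is only a minimal sketch of the general decoupling pattern it describes: a large static key-value table held in host DRAM, queried sparsely so that only a handful of rows ever travel to accelerator memory per inference step. The class and parameter names are invented for illustration.

```python
import numpy as np

class ConditionalMemory:
    """Illustrative sketch: a big static knowledge table that lives in
    cheap host DRAM; only the top-k matching rows are ever read, so
    scarce accelerator memory (HBM) is not spent on static knowledge."""
    def __init__(self, n_entries, d_key, d_value, rng):
        # These large arrays would sit in DRAM, not on the GPU.
        self.keys = rng.normal(size=(n_entries, d_key)).astype(np.float32)
        self.values = rng.normal(size=(n_entries, d_value)).astype(np.float32)

    def lookup(self, query, k=4):
        """Sparse read: score all keys, fetch only the k best rows."""
        scores = self.keys @ query                   # (n_entries,)
        top = np.argpartition(scores, -k)[-k:]       # indices of k largest
        w = np.exp(scores[top] - scores[top].max())
        w /= w.sum()                                 # softmax over the k hits
        return w @ self.values[top]                  # tiny (k, d_value) read

rng = np.random.default_rng(0)
mem = ConditionalMemory(n_entries=100_000, d_key=64, d_value=128, rng=rng)
query = rng.normal(size=64).astype(np.float32)
readout = mem.lookup(query)                          # (128,) blended memory
print(readout.shape)
```

A production system would also make the scoring pass itself cheap (hashing, quantized indexes, or a learned router) so the accelerator only ever touches the k retrieved rows; the sketch keeps a brute-force scan for clarity.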

By improving training stability and convergence efficiency, mHC helps offset, to a degree, domestic chips' generational gap in interconnect bandwidth and compute density. The Engram architecture, meanwhile, restructures the memory-scheduling mechanism, aiming to break through VRAM capacity and bandwidth constraints with more efficient access strategies under limited HBM supply. Nomura believes these two innovations together form an adaptation scheme tailored to the domestic hardware ecosystem, with clear engineering value for real-world deployment.

The report further states that the most direct commercial impact of the V4 release will be a substantive reduction in training and inference costs. Optimization on the cost side will effectively stimulate downstream application demand, thereby catalyzing a new cycle of AI infrastructure construction. In this process, Chinese AI hardware manufacturers are expected to benefit from a dual boost driven by demand expansion and front-loaded investment.

**Market Shifts from Monopoly to Multipolar Competition** The Nomura report reviewed how the market landscape has changed in the year since the DeepSeek V3/R1 release. By the end of 2024, DeepSeek's two models accounted for over half of open-source-model token usage on OpenRouter. By the second half of 2025, however, as more players entered the field, its share had declined significantly, and the market evolved from a single dominant player into multipolar competition. The competitive environment V4 faces is far more complex than a year ago. Still, DeepSeek's combination of compute efficiency and performance gains has accelerated the development of Chinese large language models and applications, altered the global competitive landscape, and drawn greater attention to open-source models.

**Software Companies Face a Value Enhancement Opportunity** Nomura believes the world's major cloud providers are still racing toward Artificial General Intelligence, and the capital expenditure contest is far from over; V4 is therefore not expected to shock the global AI infrastructure market the way last year's release did. However, global large-model and application developers are bearing ever heavier capital expenditure burdens. If V4 can sharply cut training and inference costs while maintaining high performance, it will help these companies convert technology into revenue faster and ease profit pressure.

On the application side, a more powerful and efficient V4 will foster more capable AI agents. The report observes that applications such as Alibaba's Tongyi Qianwen can already execute multi-step tasks with greater autonomy; AI agents are transitioning from "conversational tools" to "AI assistants" that handle complex tasks. These multi-tasking agents interact with the underlying models far more frequently, consuming more tokens and thus driving computing power demand upward. Improvements in model efficiency will therefore not "kill software"; they create value for leading software companies. Nomura stresses watching for software companies that can be first to harness the new generation of large models to build disruptive AI-native applications or agents; their growth potential may be raised significantly once again by the leap in model capability.

Disclaimer: Investing carries risk. This is not financial advice. The above content should not be regarded as an offer, recommendation, or solicitation to acquire or dispose of any financial products; nor should any associated discussions, comments, or posts by the author or other users be considered as such. It is for general information purposes only and does not consider your own investment objectives, financial situation, or needs. TTM assumes no responsibility or warranty for the accuracy and completeness of the information; investors should do their own research and may seek professional advice before investing.
