China Post Securities: NVIDIA's (NVDA.US) Vera Rubin Reconfigures AI Storage Hierarchy; NAND Poised to Become Inflationary Commodity

Stock News · 01-08

At the CES 2026 exhibition, Jensen Huang officially announced that the company's new-generation AI supercomputing platform, Vera Rubin, has entered full-scale production. The Rubin platform reconfigures storage into a three-tier HBM-DRAM-NAND architecture. The Rubin GPU integrates the new-generation high-bandwidth memory HBM4, whose unit price has risen significantly relative to HBM3e and is expected to substantially boost memory manufacturers' gross margins. The Vera Rubin platform deploys BlueField-4 processors within the rack, specifically designed to manage the KV cache. On pricing, driven by growing demand from cloud service providers and AI applications, the industry anticipates double-digit percentage price increases for NAND throughout 2026.

The main viewpoints of China Post Securities are as follows:

NVIDIA's Vera Rubin enters full production, reconfiguring the storage architecture to alleviate the "memory wall" bottleneck. According to data released by NVIDIA, the Rubin GPU is equipped with a third-generation Transformer engine, delivering NVFP4 inference/training computing power of 50/35 PFLOPS, 5/3.5 times that of the previous Blackwell generation; HBM4 bandwidth reaches 22TB/s, 2.8 times that of the previous generation; and the transistor count is 336 billion, 1.6 times that of Blackwell.

To resolve the context-storage bottleneck, the Rubin platform reconfigures the storage pyramid into a three-tier HBM-DRAM-NAND architecture. In the era of agentic AI, intelligent agents need to remember extensive conversation histories and complex contexts, which generates a massive KV cache. The traditional solution crams this data into expensive HBM, but HBM capacity is limited and costly.
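As a quick sanity check on the multiples quoted above, the implied previous-generation (Blackwell) figures can be backed out from the article's own numbers. This is simple division on the cited specs; small rounding in the stated multiples explains any drift from NVIDIA's published Blackwell datasheet values.

```python
# Back out implied Blackwell-generation figures from the Rubin specs
# and the generation-over-generation multiples quoted in the article.
rubin = {"nvfp4_pflops": 50, "hbm_tbps": 22, "transistors_b": 336}
multiple_vs_blackwell = {"nvfp4_pflops": 5.0, "hbm_tbps": 2.8, "transistors_b": 1.6}

implied_blackwell = {
    k: round(rubin[k] / multiple_vs_blackwell[k], 1)
    for k in rubin
}
print(implied_blackwell)
# {'nvfp4_pflops': 10.0, 'hbm_tbps': 7.9, 'transistors_b': 210.0}
```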
NVIDIA has designed a completely new storage architecture for this purpose, introducing a third-tier inference-context memory storage platform powered by BlueField-4, which increases the number of tokens processed per second by up to 5 times.

HBM: The Rubin GPU upgrades to HBM4, the "computing core" tightly bound to the GPU. The Rubin GPU integrates the new-generation high-bandwidth memory HBM4, whose interface width is doubled compared with HBM3e. Through a new memory controller, deep co-design with the memory ecosystem, and tighter compute-memory integration, the Rubin GPU's memory bandwidth reaches almost three times that of Blackwell. Quantitatively, each Rubin GPU's HBM4 offers 288GB of capacity and 22TB/s of bandwidth; it is no longer merely a "high-speed cache" near the GPU but a hard constraint on overall system throughput. On unit price, HBM4 is significantly more expensive than HBM3e, which is expected to noticeably boost memory manufacturers' gross margins.

DRAM: The Vera CPU upgrades to LPDDR5X, responsible for storing warm data (the KV cache). Vera pairs the SCF with an LPDDR5X memory subsystem of up to 1.5TB (versus 480GB of LPDDR5X on Grace), providing up to 1.2TB/s of bandwidth (versus 512GB/s on Grace) at low power. In application, LPDDR5X and HBM4 can be treated as a single coherent memory pool, reducing data-movement overhead and supporting techniques such as KV-cache offloading and efficient multi-model execution. On pricing, high-end server DRAM prices and margins have risen significantly, while consumer DRAM is absorbing cost pressure and passed-through price increases under a passive squeeze, forming a new structural price-increase cycle characterized as "AI-first."

NAND: The BlueField-4-powered inference-context memory storage platform positions NAND to become an inflationary commodity that scales linearly with GPU count.
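The tiering idea described above — hot KV-cache entries in HBM, warm entries demoted to DRAM, cold entries spilled to NAND — can be sketched as a cascading LRU store. This is an illustrative toy model, not NVIDIA software; the tier names, slot counts, and promotion policy are placeholder assumptions.

```python
# Toy sketch of a three-tier KV-cache store mirroring the
# HBM -> DRAM -> NAND hierarchy described in the article.
# Capacities are placeholder slot counts, not real hardware sizes.
from collections import OrderedDict

class TieredKVCache:
    """Hot entries live in 'hbm'; on overflow, least-recently-used
    entries cascade down to 'dram', then to unbounded 'nand'."""
    def __init__(self, hbm_slots=2, dram_slots=4):
        self.order = ["hbm", "dram", "nand"]
        self.tiers = {t: OrderedDict() for t in self.order}
        self.caps = {"hbm": hbm_slots, "dram": dram_slots, "nand": float("inf")}

    def put(self, key, value):
        # New/updated entries always enter the hottest tier.
        self.tiers["hbm"][key] = value
        self.tiers["hbm"].move_to_end(key)
        self._demote()

    def _demote(self):
        # Cascade LRU evictions down the hierarchy.
        for upper, lower in zip(self.order, self.order[1:]):
            while len(self.tiers[upper]) > self.caps[upper]:
                k, v = self.tiers[upper].popitem(last=False)  # evict LRU
                self.tiers[lower][k] = v

    def get(self, key):
        # A hit in any tier promotes the entry back to HBM.
        for tier in self.order:
            if key in self.tiers[tier]:
                v = self.tiers[tier].pop(key)
                self.put(key, v)
                return v
        return None
```

For example, after five `put` calls with two HBM and two DRAM slots, the oldest context ends up in the NAND tier, and reading it promotes it back to HBM — the access pattern the BlueField-4 context platform is described as accelerating.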
The Vera Rubin platform deploys BlueField-4 processors within the rack, specifically designed to manage the KV cache. BlueField-4 integrates a 64-core Grace CPU, high-bandwidth LPDDR5X memory, and ConnectX-9 networking, delivering ultra-low-latency Ethernet or InfiniBand connections of up to 800Gb/s. On capacity, on top of the original 1TB of memory per GPU, the BlueField-4 DPU memory storage platform adds an extra 16TB per GPU, raising the total for an NVL72 rack by 1152TB. On unit price, fueled by demand growth from cloud service providers and AI applications, the industry forecasts double-digit percentage price increases for NAND throughout 2026.

Investment recommendations favor the narrative-upgrade logic within the storage industry chain. Suggested focus: 1) Overseas leaders: SK Hynix, Samsung, Micron Technology (MU.US), SanDisk (SNDK.US), Kioxia (KIOX.US), etc.; 2) Domestic targets: Shannon Xinchuang (300475.SZ), Demingli (001309.SZ), GigaDevice (603986.SH), Puram (688766.SH), Tongyou Technology (00302.SZ), etc.

Risk warnings: supply-and-demand rhythms falling short of expectations, intensifying industry competition, and technological iteration lagging behind expectations.
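The rack-level capacity figure in the body above follows from simple multiplication — an NVL72 rack houses 72 GPUs, each gaining 16TB via the BlueField-4 platform per the article's numbers:

```python
# Check the rack-level capacity figure: 72 GPUs per NVL72 rack,
# each gaining 16TB via the BlueField-4 memory storage platform.
GPUS_PER_NVL72 = 72
EXTRA_TB_PER_GPU = 16

extra_tb_per_rack = GPUS_PER_NVL72 * EXTRA_TB_PER_GPU
print(extra_tb_per_rack)  # 1152
```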

Disclaimer: Investing carries risk. This is not financial advice. The above content should not be regarded as an offer, recommendation, or solicitation to acquire or dispose of any financial products; any associated discussions, comments, or posts by the author or other users should not be considered as such either. It is provided for general informational purposes only and does not take into account your investment objectives, financial situation, or needs. TTM assumes no responsibility or warranty for the accuracy or completeness of the information; investors should do their own research and may seek professional advice before investing.
