NVIDIA Unveils Rubin Chip Platform with Fivefold Compute Power Surge, Targets Trillion-Dollar Market

Deep News · 06:32

On March 16, NVIDIA founder Jensen Huang officially launched the Vera Rubin AI acceleration platform at the GTC 2026 conference, held at the SAP Center in San Jose. The chip is built on TSMC's 3-nanometer process and integrates 336 billion transistors, more than 60% above the 208 billion of the previous-generation Blackwell. The platform is named after the late American astronomer Vera Rubin, renowned for providing key evidence for the existence of dark matter, a nod to the scale of NVIDIA's ambitions for this era.

During his keynote, Huang announced that the Vera Rubin platform has entered full production and put concrete figures on the scale of AI computing demand: combined purchase orders for the Blackwell and Rubin architectures are projected to reach $1 trillion by 2027, double NVIDIA's own forecast from last year.

The Rubin platform is not a single chip but a six-chip system designed to work as a whole. The Vera Rubin superchip combines one Vera CPU and two Rubin GPUs on a single module. The remaining four chips, the NVLink 6 switch, the ConnectX-9 SuperNIC, the BlueField-4 data processor, and the Spectrum-6 Ethernet switch, together form a complete AI factory infrastructure.

The core performance figures are striking: the Rubin GPU pairs its 336 billion transistors with 288GB of HBM4 memory delivering 22TB/s of bandwidth. FP4 inference throughput reaches 50 PFLOPS, five times Blackwell's, while training performance hits 35 PFLOPS, 3.5 times Blackwell's. A full Vera Rubin NVL72 rack offers 260TB/s of NVLink 6 bandwidth, which NVIDIA claims exceeds the total bandwidth of the entire internet.
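The stated multiples imply a baseline for the previous generation. A quick back-of-envelope check, using only the figures quoted above (these are the article's numbers, not an official NVIDIA spec sheet):

```python
# Derive the implied Blackwell per-GPU figures from the Rubin numbers
# and speedup multiples quoted in the article.

rubin_fp4_inference_pflops = 50.0   # stated FP4 inference throughput
rubin_training_pflops = 35.0        # stated training throughput
inference_speedup = 5.0             # "five times that of Blackwell"
training_speedup = 3.5              # "3.5 times Blackwell's"

implied_blackwell_inference = rubin_fp4_inference_pflops / inference_speedup
implied_blackwell_training = rubin_training_pflops / training_speedup

print(f"Implied Blackwell FP4 inference: {implied_blackwell_inference:.0f} PFLOPS")  # 10
print(f"Implied Blackwell training:      {implied_blackwell_training:.0f} PFLOPS")   # 10
```

Both ratios point back to roughly 10 PFLOPS per Blackwell GPU, so the two speedup claims are at least internally consistent.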

Efficiency is another highlight. NVIDIA claims the Vera Rubin platform cuts the cost per inference token to one-tenth of Blackwell's and reduces the number of GPUs required to train mixture-of-experts (MoE) models by 75%. Huang described it as a "computing factory revolution": Vera Rubin delivers ten times the performance per watt of Grace Blackwell.
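To make those multiples concrete, here is a minimal sketch applying them to a hypothetical baseline cluster (the round baseline numbers are illustrative assumptions, not from the article):

```python
# Apply the article's efficiency claims to a hypothetical Blackwell-era setup.

blackwell_tokens_per_dollar = 1.0  # normalized baseline (assumption)
# "cuts the cost per inference token to one-tenth" = 10x more tokens per dollar
rubin_tokens_per_dollar = blackwell_tokens_per_dollar * 10

moe_training_gpus_blackwell = 1000  # hypothetical MoE training cluster size
# "reduces the number of GPUs required ... by 75%"
moe_training_gpus_rubin = int(moe_training_gpus_blackwell * (1 - 0.75))

print(rubin_tokens_per_dollar)   # 10.0
print(moe_training_gpus_rubin)   # 250
```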

The hardware design has also been overhauled. The new NVL72 rack is 100% liquid-cooled and uses a cable-free modular tray design, cutting installation time from roughly two hours in the Blackwell era to about five minutes.

Huang showcased the internal structure of the Rubin Ultra system on stage, officially previewing the next-generation product slated for 2027. Rubin Ultra will adopt the new Kyber rack architecture, arranging 144 GPUs vertically instead of horizontally to increase density and reduce latency.

The disclosed specifications are equally striking: the Rubin Ultra NVL576 configuration will pack 576 GPUs into a single rack, lifting FP4 inference performance to 15 ExaFLOPS, four times that of the Rubin NVL144. Memory will be upgraded to HBM4e, with total rack capacity reaching 365TB, and power consumption per rack is expected to reach the 600-kilowatt range. NVIDIA anticipates Rubin Ultra will enter mass production and begin deliveries in the second half of 2027.
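The rack-level figures above can be turned into rough per-GPU implications. Note that how NVIDIA counts "GPUs" (dies versus packages) can shift these numbers, so treat the following as rough arithmetic on the article's figures only:

```python
# Quick arithmetic on the Rubin Ultra NVL576 rack figures quoted above.

nvl576_fp4_exaflops = 15.0   # stated FP4 inference per rack
nvl576_gpu_count = 576       # stated GPUs per rack
speedup_vs_nvl144 = 4.0      # "four times that of the Rubin NVL144"
rack_memory_tb = 365.0       # stated total HBM4e capacity

implied_nvl144_exaflops = nvl576_fp4_exaflops / speedup_vs_nvl144   # 3.75 EF
per_gpu_pflops = nvl576_fp4_exaflops * 1000 / nvl576_gpu_count      # ~26 PF
per_gpu_memory_gb = rack_memory_tb * 1000 / nvl576_gpu_count        # ~634 GB

print(f"Implied Rubin NVL144 FP4: {implied_nvl144_exaflops:.2f} ExaFLOPS")
print(f"Per GPU: {per_gpu_pflops:.1f} PFLOPS, {per_gpu_memory_gb:.0f} GB memory")
```

The implied 3.75 ExaFLOPS for the Rubin NVL144 and roughly 26 PFLOPS per GPU sit close to the 50 PFLOPS quoted earlier for a Rubin GPU package if each package contains two dies, which is consistent with NVIDIA's die-counting convention for these rack names.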

This confirms NVIDIA's strict annual iteration cadence: Blackwell (2024), Blackwell Ultra (2025), Rubin (2026), Rubin Ultra (2027), and Feynman (2028).

While Rubin has entered mass production, formal deliveries are scheduled for the second half of 2026. Initial deployment partners have been confirmed: AWS, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure (OCI), along with NVIDIA cloud partners such as CoreWeave, Lambda, Nebius, and Nscale, will be among the first customers.

Microsoft has committed to deploying Vera Rubin NVL72 rack systems for its next-generation AI data centers, including the future Fairwater AI superfactory. CoreWeave will integrate Rubin systems into its AI cloud platform starting in the second half of 2026. On the manufacturing side, Taiwan’s Quanta has confirmed that initial units could be delivered to customers as early as August 2026.

A central theme of GTC was the shift of AI from tools to "agents." Huang spoke at length about OpenClaw, an AI agent framework developed by Austrian developer Peter Steinberger and open-sourced by OpenAI, comparing its significance to that of Windows for personal computers. NVIDIA also introduced NemoClaw, an open-source project integrated with OpenClaw that it positions as the "operating system for agent computers."

NVIDIA's expansion also reaches into space. Huang announced progress on the Vera Rubin Space-1 initiative, which aims to build orbital data centers with computing power equivalent to 25 H100 GPUs. The company is repositioning itself from a chip supplier into a foundational infrastructure builder for the AI era.

Another notable detail from the keynote was the unveiling of the NVIDIA Groq 3 Language Processing Unit (LPU). In December of last year, NVIDIA completed a $20 billion acquisition of the assets of AI chip startup Groq, its largest acquisition to date. The inference-focused chip is expected to begin shipping in the third quarter of this year and is seen as NVIDIA's new weapon against AMD in the inference market.
