North American Cloud Giants Ramp Up NVIDIA Rack-Scale AI Server Purchases, Projecting 122% Surge in Inference Compute by 2026

Stock News 05-20 15:23

According to TrendForce's latest AI industry research, the five major North American cloud service providers (CSPs) plan to significantly increase their procurement of rack-scale AI servers in 2026 to expand AI training and inference deployments. These purchases are expected to account for over 60% of global demand for NVIDIA's GB/VR systems. They are also expected to lift the five providers' total AI training compute capacity by more than 56% year-over-year and their total AI inference compute capacity by approximately 122%.

TrendForce estimates that AI server shipments will increase by over 28% in 2026, with high-end AI training models remaining the mainstay at about 55% of the total. In the medium to long term, however, AI inference models are set to become dominant, primarily because CSPs are actively promoting AI applications to accelerate the commercialization of AI cloud services. Companies such as NVIDIA are also expanding their AI inference solutions and use cases, emphasizing that their flagship GB/VR AI server systems for 2026 are specifically designed to support AI inference workloads in addition to AI training.

TrendForce estimates that the combined capital expenditure of Google, Amazon.com, Microsoft, Meta Platforms, Inc., and Oracle will exceed $770 billion in 2026, a year-on-year increase of nearly 87%.

TrendForce also analyzed the compute capacity the top five North American CSPs obtain from purchasing NVIDIA's GB/VR series. For AI training, based on FP16/BF16 estimates, the five providers' total compute capacity already surpassed 9 ExaFLOPS in 2025 and is projected to grow by over 56% in 2026. For AI inference, based on FP4/NVFP4 performance estimates, their total compute capacity exceeded 37 ExaFLOPS in 2025 and is expected to grow by nearly 122% in 2026, significantly faster than training capacity. This reflects NVIDIA's specific focus on optimizing AI inference performance in its latest software and hardware tuning, implemented in the new-generation GB300 and VR200 rack-scale solutions.

In addition to GPU solutions, CSPs are advancing their self-developed ASIC rack products, with Google being the most proactive in its deployment plans. TrendForce estimates that Google's demand for its own TPU chips will increase by nearly 80% year-on-year in 2026, with a gradual upgrade from the v7 generation to the v8 generation starting in the second half of the year. Amazon's efforts in self-developed ASICs are second only to Google's; its Trainium series is expected to comprise over 40% of its own AI servers in 2026.

TrendForce notes that the new-generation cabinets from NVIDIA, AMD, and CSP self-developed ASICs have all integrated liquid cooling systems. This helps reduce the rack-unit (U) height of AI servers and increases the number of accelerators a single cabinet can accommodate. Because the thermal design power (TDP) of individual AI GPUs and ASICs is rising at the same time, the system power consumption of AI servers is being structurally amplified. According to TrendForce estimates, the combined annual increase in server power consumption for the top five North American CSPs will jump from 2.8 GW in 2023 to 18 GW in 2026, with a year-on-year growth rate of 116% from 2025 to 2026. The primary reason is the intensifying AI competition, with platforms such as NVIDIA's GB300, AMD's Helios, and CSP self-developed ASICs ramping up volume production simultaneously.
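
For readers who want to see what the quoted growth rates imply, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the inputs are the article's rounded bounds ("over", "nearly", "exceeded"), the helper names `project` and `implied_base` are hypothetical, and the sketch assumes the 116% growth rate applies to the 18 GW figure for 2026. None of the results are TrendForce estimates.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the article.
# All inputs are rounded bounds from the text, so results are approximate.
# Helper names are illustrative and do not come from the source.


def project(base: float, growth_pct: float) -> float:
    """Return the value after one year of growth_pct percent growth."""
    return base * (1 + growth_pct / 100)


def implied_base(current: float, growth_pct: float) -> float:
    """Return the prior-year value implied by a current value and its growth rate."""
    return current / (1 + growth_pct / 100)


# 2025 compute capacity (ExaFLOPS) projected forward with the stated 2026 growth.
training_2026 = project(9, 56)      # FP16/BF16 basis
inference_2026 = project(37, 122)   # FP4/NVFP4 basis

# 2026 figures worked backward to the 2025 values they imply.
capex_2025 = implied_base(770, 87)  # USD billions
# Assumption: the 116% year-on-year growth refers to the 18 GW figure.
power_2025 = implied_base(18, 116)  # GW

print(f"Implied 2026 training compute:  ~{training_2026:.1f} ExaFLOPS")
print(f"Implied 2026 inference compute: ~{inference_2026:.1f} ExaFLOPS")
print(f"Implied 2025 combined capex:    ~${capex_2025:.1f} billion")
print(f"Implied 2025 power increase:    ~{power_2025:.1f} GW")
```

Under these assumptions the figures work out to roughly 14 ExaFLOPS of training compute and 82 ExaFLOPS of inference compute in 2026, about $412 billion of combined capital expenditure in 2025, and an implied power increase of about 8.3 GW in 2025.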

Disclaimer: Investing carries risk. This is not financial advice. The above content should not be regarded as an offer, recommendation, or solicitation on acquiring or disposing of any financial products, any associated discussions, comments, or posts by author or other users should not be considered as such either. It is solely for general information purpose only, which does not consider your own investment objectives, financial situations or needs. TTM assumes no responsibility or warranty for the accuracy and completeness of the information, investors should do their own research and may seek professional advice before investing.
