CITIC SEC released a research report stating that hyper-node solutions are expected to scale rapidly as the fundamental computing units of future AI infrastructure. Hyper-node scale-up domains offer high-efficiency communication bandwidth and native memory semantics, aligning naturally with today's mainstream Mixture of Experts (MoE) model architectures. System-level "reverse decoupling" enhances overall system value, though the design faces challenges such as multi-chip power consumption, cooling, and rack-level reliability. CITIC SEC believes hyper-nodes can elevate the value of the integration segment through greater technological differentiation. The firm is optimistic about the future of hyper-node server integration and recommends focusing on related industry chain companies.
Key insights from CITIC SEC include:
1. **MoE Architecture Drives New Hardware Demands, Hyper-Nodes Emerge**: Under the scaling-law trend, mainstream AI models increasingly adopt MoE architectures to reach larger parameter counts at higher efficiency. Because each token activates only a subset of experts, MoE eases the compute and memory bottlenecks, but dispatching tokens to experts spread across devices creates heavy all-to-all traffic; this communication challenge is what drives hyper-node solutions built on scale-up networks (see the sketch below). Compared to traditional eight-GPU servers, hyper-nodes face systemic challenges such as thermal management, stability of hybrid optical-copper interconnects, and long-term reliability. Meeting them requires deep collaboration between server manufacturers and upstream suppliers, raising the integration segment's influence in the industry chain.
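For intuition on why MoE shifts pressure from compute to communication, consider a minimal top-k routing sketch (plain NumPy, toy sizes, purely illustrative and not any vendor's implementation):

```python
import numpy as np

# Illustrative top-K MoE routing. Each token activates only K of E experts,
# so compute and memory per token shrink, but tokens must be shipped to
# whichever device hosts their chosen experts -- the all-to-all exchange
# that a hyper-node's scale-up fabric is built to carry.
rng = np.random.default_rng(0)

T, D = 8, 16          # tokens, hidden dimension (toy sizes)
E, K = 4, 2           # total experts, experts activated per token
tokens = rng.normal(size=(T, D))
gate_w = rng.normal(size=(D, E))

logits = tokens @ gate_w                   # router score per (token, expert)
topk = np.argsort(logits, axis=1)[:, -K:]  # indices of the K best experts

# Group tokens by destination expert: under expert parallelism each group
# goes to a (possibly remote) accelerator, and results are gathered back.
for e in range(E):
    members = np.where((topk == e).any(axis=1))[0]
    print(f"expert {e}: receives tokens {members.tolist()}")
```

The dispatch-and-gather step is latency- and bandwidth-sensitive, which is why high-efficiency interconnects and native memory semantics inside a scale-up domain matter more for MoE than for dense models.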
2. **Global Hyper-Node Innovations, Domestic Solutions Excel in Some Areas**: Overseas hyper-nodes include NVIDIA's NVL72 and Google's Ironwood rack built on TPUv7 chips (supporting up to 9,216 chips). Domestic solutions such as Huawei's CloudMatrix384, Alibaba's Panjiu, and Sugon's ScaleX640 have also emerged. CITIC SEC views this as an early-stage development phase and expects hyper-node solutions to eventually converge toward a limited set of directions. Key considerations:
- **Compute Density**: Larger scale-up domains may improve training/inference performance, but the cost and reliability trade-offs remain unresolved.
- **Network Topology**: Fat-tree and 3D-Torus architectures each have pros and cons; fat-tree may dominate in the short term due to its versatility, while tech giants may explore 3D-Torus (see the hop-count sketch after this list).
- **Physical Connectivity**: Backplane-free orthogonal designs could become mainstream for their simplicity and compactness.
- **Cooling**: Liquid cooling (e.g., immersion) may gain traction as power density rises, provided stability issues are addressed.
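To make the topology trade-off concrete, here is a back-of-envelope hop-count sketch; the parameters are assumptions for illustration, not any vendor's specification:

```python
# Rough comparison: a fat-tree gives every node pair a uniform, short
# switched path, while a 3D torus trades longer average hop counts for
# switchless neighbor links that favor nearest-neighbor traffic.

def torus_avg_hops(k: int) -> float:
    """Average shortest-path hops in a k x k x k 3D torus.

    The mean ring distance per dimension is k/4 for even k, and the
    three dimensions contribute independently.
    """
    ring = [min(d, k - d) for d in range(k)]
    return 3 * sum(ring) / k

for k in (4, 8, 16):
    print(f"3D torus {k}x{k}x{k}: {k**3:5d} nodes, "
          f"avg hops ~{torus_avg_hops(k):.2f}")

# A 3-tier fat-tree keeps any-to-any paths at a few switch hops regardless
# of scale, which suits arbitrary traffic patterns; the torus's locality
# only pays off when traffic is mostly between neighbors.
```

The uniform any-to-any latency of the fat-tree is what makes it the versatile short-term default, while the torus rewards workloads whose communication pattern can be mapped onto the grid.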
3. **Reverse Decoupling Boosts System Value, Technical Differentiation Rises**: Unlike modular eight-GPU servers, hyper-nodes demand holistic system integration that addresses multi-chip power management, thermal constraints, and rack-level reliability together (the rough power arithmetic below illustrates the scale of the thermal problem). Server manufacturers evolve from assemblers into "system integrators" responsible for cross-component optimization. This raises technical barriers and strengthens the integration segment's role as a hub of performance and innovation.
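To see why thermal design becomes a rack-level integration problem, here is a rough power estimate with loudly assumed figures (for scale only, not a product specification):

```python
# Back-of-envelope rack power for a dense hyper-node. All figures below
# are assumptions for illustration, not vendor data.
GPUS_PER_RACK = 72        # assumed accelerator count in one hyper-node rack
WATTS_PER_GPU = 1_200     # assumed per-accelerator draw incl. HBM
OVERHEAD = 1.35           # assumed CPUs, switches, fans, conversion losses

rack_kw = GPUS_PER_RACK * WATTS_PER_GPU * OVERHEAD / 1_000
AIR_LIMIT_KW = 20         # rough ceiling for comfortable air cooling

print(f"estimated rack power: ~{rack_kw:.0f} kW "
      f"(vs ~{AIR_LIMIT_KW} kW air-cooled ceiling)")
```

At roughly an order of magnitude beyond what air cooling comfortably handles, power delivery and heat removal must be engineered for the rack as a whole rather than server by server, which is precisely the integration work that raises the segment's value.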
**Risks**: Supply chain disruptions, chip shortages, slower-than-expected tech giant capex, policy delays, AI adoption hurdles, GPU iteration lags, and intensifying domestic competition.
**Investment Strategy**: Hyper-node technology is still nascent, and MoE is likely to become the dominant model architecture. Scale-up hyper-nodes, leveraging high-efficiency networking and native memory semantics, could underpin future AI infrastructure. Although topologies and protocols still vary, continued advances in compute density, cooling, and reliability are highly certain. Server manufacturers with customization capabilities and supply chain expertise stand to benefit. Focus on related industry chain players.