FuriosaAI Partners with Broadcom to Develop Next-Gen AI Inference Accelerator, Focusing on Memory and Bandwidth


South Korean AI chip startup FuriosaAI has announced a strategic partnership with Broadcom (AVGO.US) to co-develop its third-generation AI inference accelerator, with samples targeted for the first half of 2028. The collaboration goes beyond the traditional ASIC model. The new chip will integrate a compute die built on an advanced 2nm process, a separate I/O die, and HBM4(E) memory stacks. It will use Broadcom's scale-up Ethernet technology to provide all-to-all connectivity between chips within a rack, and it will be delivered as a rack-scale system.

This partnership builds on the commercial maturity of RNGD, FuriosaAI's AI chip for data center inference, which is now in mass production on TSMC's 5nm process. RNGD is a 180-watt, PCIe-based AI accelerator. According to FuriosaAI, Samsung SDS and LG AI Research have deployed RNGD for large language model and agentic AI workloads in standard air-cooled data centers.

FuriosaAI states that its Tensor Contraction Processor architecture is optimized for the mathematical core of AI computation. The chip prioritizes memory access, focusing on high-bandwidth data transfer and large-scale tensor operations rather than managing thousands of fine-grained threads.

Charlie Kawwas, President of Broadcom's Semiconductor Solutions Group, commented, "Inference performance is no longer just about raw compute power; it's increasingly about data reuse and communication efficiency between servers and racks. By combining FuriosaAI's TCP architecture with Broadcom's market-leading XPU technology, IP platform, scale-up Ethernet, and fabric, we are building a platform to address key bottlenecks in large-scale agentic AI."

The collaboration with Broadcom extends FuriosaAI's broader strategy of building vertically integrated inference infrastructure. Earlier this year, the company positioned its RNGD launch as part of a larger effort to reduce dependence on NVIDIA's (NVDA.US) software ecosystem. CEO June Paik stated at the time, "Our challenge is to replace the CUDA engine with our own software stack." The Broadcom partnership expands this strategy from single-server optimization to rack-level networking and cluster architecture.

The companies position this as more than a chip-level collaboration. The rack-scale inference platform combines FuriosaAI's inference architecture with Broadcom's Ethernet architecture, PCIe technology, advanced packaging capabilities, and AI infrastructure IP, with the goal of scaling inference clusters across thousands of nodes.

This partnership reflects a broader shift within AI infrastructure, where inference workloads are diverging from the training systems that fueled NVIDIA's rise. While large-scale model training still heavily relies on tightly coupled GPU clusters and proprietary interconnects like NVLink, operators face different constraints when deploying inference infrastructure at production scale, including power density, network efficiency, memory bandwidth, latency, and token throughput.

Ron Westfall, Vice President and Practice Lead for Networks and Infrastructure at HyperFrame Research, noted that large-scale inference is shifting AI infrastructure priorities away from the needs that shaped GPU-intensive training clusters. "Large-scale inference shifts the bottleneck to optimizing total cost of ownership, memory bandwidth, and power consumption per token," he said.
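To make the "power consumption per token" metric concrete, here is a minimal sketch of the arithmetic, not a benchmark. Energy per token is simply power divided by throughput. The 180-watt figure is the RNGD rating cited above; the throughput and electricity price are hypothetical placeholders chosen only for illustration, and the function names are our own.

```python
def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy per generated token: power (J/s) divided by throughput (tokens/s)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return power_watts / tokens_per_second


def energy_cost_per_million_tokens(
    power_watts: float, tokens_per_second: float, usd_per_kwh: float
) -> float:
    """Electricity cost (USD) of generating one million tokens."""
    joules = joules_per_token(power_watts, tokens_per_second) * 1_000_000
    kwh = joules / 3_600_000  # 1 kWh = 3.6 million joules
    return kwh * usd_per_kwh


CARD_POWER_W = 180.0            # RNGD power rating cited in the article
ASSUMED_TOKENS_PER_S = 1_000.0  # hypothetical throughput, not a measured figure
ASSUMED_USD_PER_KWH = 0.10      # hypothetical electricity price

if __name__ == "__main__":
    jpt = joules_per_token(CARD_POWER_W, ASSUMED_TOKENS_PER_S)
    cost = energy_cost_per_million_tokens(
        CARD_POWER_W, ASSUMED_TOKENS_PER_S, ASSUMED_USD_PER_KWH
    )
    print(f"{jpt:.3f} J/token, ${cost:.4f} per million tokens")
```

This counts accelerator power only. A real total-cost-of-ownership estimate would also include host servers, networking, cooling, and hardware depreciation, none of which are modeled here.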

Regarding the Broadcom-FuriosaAI partnership, Westfall stated it reflects the industry's growing focus on network efficiency as AI deployment scales beyond tightly coupled training systems. "Optimizing network efficiency and rack-level interconnectivity is now as critical to inference economics as raw chip performance," he added.

Broadcom's Ethernet and PCIe technologies are intended to provide the high-bandwidth, rack-level connectivity required to scale large inference clusters. The architecture also signals growing industry momentum for Ethernet-based AI infrastructure, as suppliers seek alternatives to proprietary GPU interconnects. Broadcom is increasingly positioning itself as a core supplier of the networking, switching, and interconnect infrastructure underpinning large AI clusters, particularly as hyperscale cloud providers pursue custom accelerators and heterogeneous computing environments.

