Broadcom’s differentiation is what allows it to grow without competing directly with NVIDIA. Would Broadcom’s success inspire NVIDIA to broaden its own customization services?
This article aims to help investors understand the market dynamics between Nvidia and Broadcom. (For context: I have worked in the tech industry for several years and am a certified AWS Solutions Architect.)
More specifically, Tesla won’t be buying Google TPUs (co-designed with Broadcom) to train its models. Enterprise clients typically operate within a locked-in ecosystem tailored to their specific needs. For example, Google uses TPUs internally to train and power Waymo’s models. While TPUs excel at tensor computations, they are less versatile for general-purpose applications.
To meet their unique scaling requirements, companies often focus on reducing costs by designing workload-specific chips. However, the architecture of these chips is heavily influenced by the design of their networks and systems. Tesla, for instance, approaches its hardware and software needs differently than Google does.
On the cloud provider side, Nvidia hardware offers flexibility to customers who want to run standard models and move those models across platforms (PyTorch and CUDA) without being locked into a specific cloud provider.
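As a minimal sketch of this portability argument (my own illustration, not from the article): PyTorch code written against CUDA is device-agnostic by convention, so the same model runs unchanged on a laptop CPU, an on-prem GPU box, or any cloud provider that rents Nvidia GPUs.

```python
import torch
import torch.nn as nn

# A trivial model; the identical code runs on CPU or any CUDA GPU.
model = nn.Linear(8, 2)

# Device-agnostic idiom: use CUDA if present, otherwise fall back to CPU.
# Nothing here is tied to a particular cloud provider.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

x = torch.randn(4, 8, device=device)
out = model(x)
print(out.shape)  # torch.Size([4, 2]) on either device
```

By contrast, a model written against a vendor-specific accelerator stack generally needs its own compiler toolchain and runtime, which is the lock-in the article describes.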
By contrast, using Google’s TPUs ties clients to the Google ecosystem, making it difficult to move models to other platforms or adapt them for alternative applications (IoT, graphics, or automotive systems). This is one reason TPU adoption has been slow: unless a company is already a Google client and finds value in adopting Google’s framework, most large enterprises avoid this approach.
The same can be said of Apple’s Metal framework, which launched 10 years ago and has yet to gain the traction of Nvidia’s CUDA. Big Tech developing its own custom chips isn’t new at all; Google’s first TPU was created 9 years ago.
Another consideration is that a specific model or framework can become obsolete, especially if it fails to gain traction. It is not uncommon for internal projects at Google to be scrapped. By locking themselves into Google’s framework, companies risk falling behind if broader, more flexible technologies emerge. For now, Nvidia remains the standard for AI development.
The combination of both allows Big Tech to scale up and grow AI applications even more efficiently in 2025. By focusing on different layers of the AI stack, NVIDIA and Broadcom complement rather than compete with each other: NVIDIA provides the accelerated computing, while Broadcom scales the AI network.
Why won't demand for Blackwell be affected?
NVIDIA's Blackwell GPU architecture highlights the increasing complexity and sophistication of chip design.
As NVIDIA's first true multi-die GPU, the Blackwell series integrates two dies functioning as a unified CUDA GPU, connected by ultra-fast interconnects delivering 10 TB/s of bandwidth. This design enables the GPU to operate as a single unit, overcoming past technical barriers such as reticle limits and memory locality issues. Blackwell's innovations include a second-generation transformer engine, enhanced security via NVIDIA Confidential Computing, and advanced interconnect systems like NVLink for scaling across large GPU clusters. These features have set a new benchmark, leaving competitors like AMD, Intel, and Broadcom grappling to catch up in the AI and data center segments.
What's Broadcom's role if it hasn't surpassed Blackwell?
Broadcom chips are often used for specialized workloads, such as data movement, AI/ML inference integration with networking fabrics, and large-scale interconnect solutions. Examples include AI-driven switches, data processing units (DPUs), and custom silicon for hyperscalers.
Its AI-related chips, such as the Jericho and Tomahawk series, are primarily designed for high-speed networking, data center connectivity, and AI model training acceleration at the network layer. Through collaborations with major tech companies like Google and Meta, Broadcom has been instrumental in developing custom ASIC chips, including Google’s Tensor Processing Units (TPUs) and Meta’s third-generation AI training chips (MTIA 3).
These partnerships are part of a broader strategy to cater to the increasing demand for high-performance AI infrastructure. Broadcom designs chips for specialized use cases, often as fixed-function ASICs or application-specific accelerators for AI-related networking needs.
Overall, it is very common for a tech stack to contain both generic and specialized technologies. This is not a case of the GPU making the CPU obsolete: ASICs aren't going to replace GPUs, because these specialized chips don't scale across more general-purpose use cases. But they don't need to scale that way; they aren't built to be generic, and that is precisely what makes them unique.
Modified on 2024-12-18 22:52
Disclaimer: Investing carries risk. This is not financial advice. The above content should not be regarded as an offer, recommendation, or solicitation on acquiring or disposing of any financial products, any associated discussions, comments, or posts by author or other users should not be considered as such either. It is solely for general information purpose only, which does not consider your own investment objectives, financial situations or needs. TTM assumes no responsibility or warranty for the accuracy and completeness of the information, investors should do their own research and may seek professional advice before investing.