Amazon's AI Accelerator Gains Traction as NVIDIA's Dominance Faces Challenges

Stock News 05-20

Amazon's (AMZN) Trainium AI accelerator is reportedly gaining favor among AI developers who have historically relied on NVIDIA (NVDA) products. NVIDIA's GPUs are widely considered dominant in the AI accelerator market, with supply constrained by robust demand from hyperscale data centers, cutting-edge AI model labs, and other buyers. While alternatives exist, including offerings from AMD, Google, and other custom Application-Specific Integrated Circuits (ASICs), reports indicate a growing number of developers are recognizing the appeal of Amazon's Trainium.

The Information cited interviews with six individuals who use or work with the chip. Daniel Svonava, CEO of Superlinked, stated, "We have always viewed insufficient software support as a barrier. But that has changed in the last few months; that barrier has been removed." Another developer, Bojan Jakimovski, head of machine learning at Loka, also noted increased interest in Trainium over recent months, partly due to tight supply of NVIDIA GPUs. He added that one client switched its inference workloads to Trainium's second-generation chip after testing showed it could reduce costs by up to 35% compared to NVIDIA's H100 series chips. However, Jakimovski still recommends using NVIDIA products for large language model training.

Amazon CEO Andy Jassy recently stated that the company's chip business, if operated independently, could generate $50 billion in annual revenue. In a recent letter to shareholders, Jassy wrote, "Our custom chip business is now one of the world's top three data center chip businesses."

Why are developers moving from "no choice" to "active embrace"?

NVIDIA's GPUs are widely regarded as the leader in the AI accelerator market, and the CUDA software ecosystem gives the company a formidable moat against competitors. That same dominance, however, has produced a prolonged supply squeeze: insatiable demand from hyperscale cloud providers, AI labs, and other buyers has created a structural shortage of NVIDIA GPUs. The imbalance has created a persistent need for alternatives. While options such as AMD's accelerators, Google's TPU, and other custom ASICs exist, Trainium is gaining practical adoption among developers faster than the market expected.

Software Ecosystem: A Qualitative Shift from "Barrier" to "Removed"

Daniel Svonava's comment to The Information succinctly captures this turning point: "We have always viewed insufficient software support as a barrier. But that has changed in the last few months; that barrier has been removed." The statement carries weight because, in AI chip competition, hardware specifications often set a product's floor while the software ecosystem sets its ceiling. The shift in Trainium's software support from "barrier" to "removed" signals that the chip is no longer just an alternative for limited testing but a production tool ready for scaled commercial deployment.

Cost Advantage: A New-Generation Tool for "Cost Reduction and Efficiency"

Bojan Jakimovski likewise observed that Trainium's appeal has risen significantly, and the economics support it. Difficulty in procuring NVIDIA GPUs is a direct reason for some clients' shift, but the more critical factor is cost. After testing showed that Trainium's second-generation chip could lower costs by up to 35% compared to NVIDIA's H100 series, one client switched its inference workloads to Trainium. With inference consuming a growing majority of computing power (currently about two-thirds of all AI computation), a 35% cost advantage could mean annual savings of millions to tens of millions of dollars in computing expenses for a medium-sized AI company. That is not a marginal gain but a structural advantage large enough to alter procurement decisions.
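The savings arithmetic can be made concrete with a minimal sketch. The $30 million annual compute budget is a hypothetical figure chosen for illustration, not a number from the reporting. The sketch also assumes that the industry-wide two-thirds inference share applies to this one company, and it treats the "up to 35%" reduction as fully realized, so the result is an upper bound rather than an estimate.

```python
def annual_savings(annual_compute_spend: float,
                   inference_share: float,
                   cost_reduction: float) -> float:
    """Savings when only the inference portion of spend becomes cheaper."""
    inference_spend = annual_compute_spend * inference_share
    return inference_spend * cost_reduction


# Hypothetical mid-sized AI company (assumed budget, not from the article).
spend = 30_000_000          # annual compute budget in dollars (assumption)
share = 2 / 3               # inference share of compute, per the article
reduction = 0.35            # "up to 35%" lower cost, treated as fully realized

savings = annual_savings(spend, share, reduction)
print(f"Upper-bound annual savings: ${savings:,.0f}")
```

Because the saving scales linearly with the budget, a company spending ten times as much would see ten times the figure under the same assumptions, which is how the range stretches from millions to tens of millions of dollars.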

Architectural First-Mover Advantage: A Unique Moat in MoE Inference

Gavin Baker's assessment is particularly sharp on the technical side. He points out that today's leading-edge AI models predominantly use a Mixture of Experts (MoE) architecture, and that running inference for such models requires switched scale-up network infrastructure. According to Baker, only two companies in the world currently operate switched scale-up networks: one supports NVIDIA's GPU clusters and the other powers Amazon's Trainium. In the fast-growing arena of MoE inference, that makes Trainium not a follower but a first mover protected by a distinct technical barrier. Baker further argues that Google's TPU does not have equivalent capability in this domain, adding that although Google created the MLPerf benchmark, it has never submitted TPU test results. The remark has strengthened the market's reassessment of Trainium's technical uniqueness. Baker predicts that once Trainium3 enters mass production in the second half of this year, Trainium's market position in 2026 will be equivalent to that of TPU in 2025.

Customer Ecosystem: A Critical Transition from "Tens of Thousands" to "Hundreds of Thousands"

Trainium's breakthrough is evident not only in technology but also in the scale of its customer base. According to Amazon's disclosure when it deepened its strategic cooperation with Anthropic in April, Trainium and Graviton each have over 100,000 customers, and the majority of Amazon Bedrock's current inference tasks run on Trainium. The figure of 100,000 customers marks a qualitative leap in Trainium's customer base since the second half of 2025: it is no longer a niche product tested in a few labs but a systematic alternative with large-scale commercial validation.

Anthropic and OpenAI: The Ultimate "Proof of Quality"

Among its largest customers, Trainium has secured deep commitments from two of the world's most important AI model companies. On April 20, Amazon and Anthropic announced a deepened strategic partnership: Amazon committed an additional investment of up to $250 billion in Anthropic, while Anthropic pledged to invest over $100 billion in AWS-related technologies over the next decade and to purchase up to 5 gigawatts of computing power from AWS's current and future generations of Trainium chips. Anthropic's flagship Claude model runs on over 1 million Trainium2 chips.

OpenAI's involvement is equally significant. In February, OpenAI and Amazon established a multi-year strategic partnership, with Amazon investing $500 billion and providing OpenAI with 2 gigawatts of Trainium computing capacity. OpenAI committed to using Trainium3 and the next-generation Trainium4 chips to support its broad range of advanced AI workloads. For chip products, the quality of the customer list often carries more signaling value than its length. When the world's most technically discerning frontier AI labs choose to run core workloads on Trainium, it is the strongest available endorsement of the chip's performance and ecosystem maturity.

From "Renting Computing Power" to "Direct Chip Sales": A Blueprint for a $50 Billion Empire

More notable is the strategic elevation of Trainium's business model. In April, Amazon CEO Andy Jassy disclosed in a letter to shareholders that the company is considering shifting from its previous internal-use-only strategy to selling its self-developed chips and full server racks directly to third parties. If this division were operated independently and fully opened to the market, its annualized revenue could reach $50 billion. Jassy further noted that this figure already surpasses the levels of AMD and Intel for the same period, stating plainly, "Our custom chip business is now one of the world's top three data center chip businesses."

This is not just theoretical. As of the disclosure, Amazon has secured $2.25 trillion in revenue commitments for Trainium chips, covering strategic clients like Anthropic and OpenAI. Trainium2 already offers about 30% better price-performance than comparable GPU products and is essentially sold out. Trainium3, which began shipping in 2026, offers a 30% to 40% price-performance improvement over Trainium2 and is almost entirely pre-booked. Even Trainium4, which is about 18 months away from mass production, has had the majority of its capacity locked in. Two generations sold out and a third largely reserved before mass production is an extremely rare demand signal in the history of the semiconductor industry. It indicates that Trainium's appeal is not short-term hype but a long-term strategic commitment that clients made after thorough evaluation.

The ASIC Structural Inflection Point

Trainium's rise is reshaping one of the most deeply rooted relationships in the AI chip field: the long-standing supplier-customer dynamic between NVIDIA and Amazon. The relationship was once clear. NVIDIA designed the most powerful AI chips, and Amazon, as one of the largest cloud service providers, purchased them at scale. Once Amazon began designing and deploying its own AI accelerators, however, the roles shifted subtly. The latest data shows that Amazon currently deploys more Trainium servers than NVIDIA servers, and the company estimates that its self-developed chips save tens of billions of dollars in capital expenditures compared to purchasing external GPUs.

Yet this relationship is not a simple substitution. Amazon has neither abandoned purchasing NVIDIA chips—recently signed procurement commitments are still expanding—nor stopped heavily investing in Trainium. The two currently present a complex "competitive coexistence" landscape: Trainium is rapidly expanding its share in inference workloads, while NVIDIA GPUs still dominate in training large-scale foundation models.

From a broader industry perspective, custom ASICs are at a structural inflection point. Data shows that in 2026, custom AI chips from Google, Microsoft, Amazon, and Meta are expanding at a 44.6% compound annual growth rate, while general-purpose GPUs are growing at only 16.1%. The growth of custom ASICs primarily targets the inference market, which currently accounts for about two-thirds of all AI computation. Although NVIDIA still holds over 90% of the AI accelerator market, analysts predict that its share of the inference segment could drop from over 90% to 20-30% by 2028. Trainium is one of the most important variables in this wave of custom ASICs. As industry reports assert, 2026 marks the moment when "custom ASICs are no longer just experimental projects but have become productivity-scale alternatives to NVIDIA's GPU monopoly."
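To see how far apart the two growth rates pull over time, the sketch below compounds each from a common index of 1.0. It assumes both rates hold constant for three years, which the cited data does not state, and it tracks relative growth only: the two segments start from very different absolute sizes, so a faster-growing index does not imply a larger market.

```python
def grow(start: float, cagr: float, years: int) -> float:
    """Value after compounding `start` at annual rate `cagr` for `years` years."""
    return start * (1.0 + cagr) ** years


ASIC_CAGR = 0.446   # custom AI chips, per the article
GPU_CAGR = 0.161    # general-purpose GPUs, per the article

# Both segments are indexed to 1.0 in year 0 (an illustrative normalization).
for year in range(4):
    asic = grow(1.0, ASIC_CAGR, year)
    gpu = grow(1.0, GPU_CAGR, year)
    print(f"year {year}: custom ASIC index {asic:.2f}, GPU index {gpu:.2f}")
```

Under these assumptions the custom-chip index roughly triples over three years, while the GPU index grows by a little over half.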

Realistic Boundaries: How Far Is Trainium from "Complete Replacement"?

Despite Trainium's significant user growth and performance upgrades, an objective and sober assessment of its market positioning is essential. A crucial point to clarify is that for most frontier AI labs, Trainium is currently more suitable for inference than training. While Bojan Jakimovski confirmed Trainium's cost advantage in inference, he still stated he would advise clients to continue using NVIDIA products for large language model training. This reflects the reality that NVIDIA's CUDA ecosystem maintains a significant advantage in flexibility for large-scale model training, completeness of operator ecosystems, and depth of community support.

Furthermore, there is a disconnect between the strong demand for Trainium and Amazon's recent stock performance. Although Trainium is attracting growing developer interest, Amazon's stock has recently underperformed other tech giants. The market is repricing valuations across the board in response to intensifying competition in AI chips, with NVIDIA, AMD, Google TPU, Microsoft Maia, and Meta MTIA all in the race. While Gavin Baker holds a positive stance on Trainium, he also emphasizes, "I would never short Google, nor would I short Broadcom," indicating that this is a market with multiple winners, not a zero-sum game.

Additionally, all mainstream AI chips—whether custom ASICs or NVIDIA GPUs—are manufactured using TSMC's 3nm process. This means Google, Microsoft, Amazon, Meta, and NVIDIA are all competing for the limited capacity of the same foundry. Capacity constraints apply equally to all players; any chip designer's rapid expansion may encounter the physical limits of delivery capacity.

