AI models are undergoing their "largest capability leap in nearly two years," causing severe supply-chain shortages and pushing U.S. public discontent toward a boiling point, according to Dylan Patel, founder and CEO of the well-known semiconductor and AI industry research firm SemiAnalysis. In an in-depth interview on April 23, Patel provided a detailed breakdown of the AI industry's current supply-demand dynamics, generational technological breakthroughs, and potential broader societal impacts.
SemiAnalysis is an independent AI infrastructure research firm closely watched by Wall Street and tech investors, with clients including top hedge funds and technology companies. Using his own firm's AI usage data as a starting point, Patel offered a deep analysis of the entire AI industry's supply and demand landscape. He revealed that SemiAnalysis's AI expenditure last year was only tens of thousands of dollars. However, as employees, including non-technical staff, began heavily using tools like Claude for coding and data analysis, the annualized token spending has skyrocketed to $7 million this year.
Patel indicated that new models from leading organizations like Anthropic have achieved the "largest capability leap in nearly two years," advancing from Level 4 to Level 6 engineer proficiency in just two months and making "execution extremely cheap." This surge has also produced "extreme shortages" in the hardware supply chain: DRAM prices could double or triple, and Taiwan Semiconductor Manufacturing Company (TSMC) capital expenditure could reach $100 billion by 2028. Meanwhile, as the technology penetrates the economic system, anxiety over job losses is intensifying. Patel expects large-scale protests against AI in the U.S. within three months, with public opposition becoming a significant macroeconomic variable.
**Model's "Largest Capability Leap in Two Years"** The core driver behind this surge in token consumption is the explosive growth in the capabilities of frontier models. During the interview, Patel specifically mentioned Anthropic's unreleased new model and the latest version of Opus. He noted that as scaling laws continue to hold, model iteration speeds have compressed dramatically from six months to just two months. The new model likely represents the most significant leap in model capability over the past two years. Patel emphasized that while Anthropic's initial goal was to reach Level 4 software engineer proficiency by the end of 2025, which they largely achieved, the new model's benchmarks indicate it has reached Level 6 proficiency. This leap from L4 to L6 took merely two months.
This capability shift is fundamentally altering business logic. The old market rule was that "ideas are cheap, execution is extremely hard." With the new generation of models, this logic is being rewritten: "Now ideas are cheap and abundant, but execution has become very easy. Therefore, only truly good ideas justify spending on extremely cheap execution." The emergence of the new model also validates a key proposition: scaling laws remain effective. Patel stated clearly that the new model is a materially larger model, with a training scale equivalent to 100,000 Blackwell chips, and all signs indicate that the trend of models improving with more compute investment continues.
**Supply Chain "Extreme Shortages"** Extreme model compute demands are directly straining the entire physical supply chain. Patel pointed out that despite cloud providers and chip manufacturers racing to expand production, the supply side faces "extreme shortages" at almost every node, and profit margins across the industry chain are expanding irreversibly. "As demand soars, the price of everything on the supply side is increasing," he said. He also challenged the market's perception of short GPU lifespans, arguing that the idea that GPUs last less than five years is nonsense: three-to-four-year-old Hopper clusters are renewing contracts for another three to four years, suggesting effective lifespans of seven to eight years.
Shortages are also staggering in the broader semiconductor supply chain. Patel highlighted astonishing expectations for memory chips and foundry services. For memory chips, he stated that "true incremental supply won't arrive until 2028. DRAM prices will double or triple from current levels because they must use higher pricing to destroy some demand; capitalist economies don't use rationing." Regarding foundry and equipment, he said people are overlooking TSMC's future capital expenditure, which could reach $100 billion by 2028, a possibility many find crazy but entirely real.
Additionally, Patel emphasized two often-overlooked bottlenecks:

1. **CPUs**: Severely underestimated, for two reasons: reinforcement learning training environments run entirely on CPUs, not GPUs, and AI-generated code and content is ultimately deployed on CPU-based servers. "CPUs are completely sold out; demand is exploding."
2. **Niche upstream materials**: Components such as PCB copper foil, fiberglass, and lasers are also in extreme shortage, with advance payments now common. Even where gross margins are not surging, return on invested capital (ROIC) is rising markedly.
Patel's overall assessment is: "The value created by the best economic-value models is growing faster than our ability to actually supply tokens. This gap will continue to widen, model lab profit margins will continue to expand, until people in the hardware supply chain also start saying, 'Wait, why aren't we raising our margins too?'"
**U.S. Public Resentment is Accumulating** Beyond the technological surge and capital frenzy, Patel issued a stern warning about the social sentiment AI is triggering in the United States. He indicated that as companies use AI to significantly boost efficiency, potentially leading to layoffs, public hostility towards AI among ordinary Americans is rapidly approaching a tipping point. "I think within three months, there will be large-scale protests targeting large model companies," Patel stated bluntly. He cited recent extreme events, noting that comments on news articles cheered such actions, suggesting this is just the beginning and that AI is now less popular than politicians.
He partly attributed this public anger to poor PR strategies by U.S. AI company executives. He criticized leaders like Sam Altman and Dario Amodei for lacking charisma in public appearances and being overly enthusiastic about discussing how AI will "change the whole world" and "automate all jobs." "They need to stop constantly talking about how future capabilities will change the world because ordinary people can't relate; it just makes them fear those capabilities. They must start showcasing exciting real-world use cases AI can deliver today."
**Interview Excerpts** The discussion highlighted how the paradigm has shifted: execution is now very easy, making truly good ideas paramount. Patel shared the dramatic story of his own team's token-usage explosion this year, which began in earnest after the release of a new model version in late December. Non-technical staff, led by the company president, began using AI to write code; spending inflected upward in January and has surged relentlessly since. The firm now spends $7 million annually on one AI coding tool alone, more than 25% of its approximately $25 million payroll. Projecting this trajectory, spending on the tool could exceed total payroll by year-end, a prospect Patel finds somewhat frightening.
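As a back-of-the-envelope check on the figures above, here is a minimal sketch. The dollar amounts come from the interview; the smooth monthly compounding and the eleven-month window to "year-end" are illustrative assumptions, not numbers Patel gave.

```python
# Back-of-the-envelope check of the interview's figures.
# Assumptions (not from the interview): growth compounds smoothly
# month over month, and "year-end" is 11 months after the January
# inflection point mentioned in the text.

annualized_spend = 7_000_000   # current AI coding-tool spend ($/yr)
payroll = 25_000_000           # approximate annual payroll ($/yr)

# Share of payroll now going to the tool (~28%, i.e. "over 25%").
share = annualized_spend / payroll

# Monthly compound growth rate required for spend to reach payroll
# within `months` months: solve spend * (1 + g)**months = payroll.
months = 11
required_monthly_growth = (payroll / annualized_spend) ** (1 / months) - 1

print(f"current share of payroll: {share:.0%}")
print(f"monthly growth needed to match payroll in {months} months: "
      f"{required_monthly_growth:.1%}")
```

Under these assumed parameters, spend would need to grow roughly 12% month over month to overtake payroll within the year, which gives a sense of how steep the trajectory Patel describes would have to be.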
Patel clarified that he does not currently face a choice between people and AI, because rapid company growth allows him to spend more on AI rather than hire faster. He recognizes, however, that others will confront the reality that if one person using AI tools can do the work of five to fifteen people, layoffs may become necessary. Use cases within SemiAnalysis are vast: one employee used a few thousand dollars' worth of tokens to build an application that automates complex chip-material analysis, a task that previously required an entire team, and another single-handedly built sophisticated economic-analysis tools that would traditionally have required a 200-person economist team working for a year.
From a business owner's perspective, Patel views the rising AI expenditure as essential for staying competitive. He believes AI will commoditize services, including his own information business. The key to survival is moving quickly, continuously improving offerings, and providing superior service to retain customers. Failure to adopt AI aggressively will lead to being overtaken by those who do. He illustrated this with another example where an energy analyst built a comprehensive U.S. power grid mapping and analysis dashboard in weeks—a project that took a competitor with 100 people a decade, demonstrating the disruptive potential.
Addressing whether clients might build similar capabilities in-house, Patel argued that specialized firms like his can move faster and be more agile. While investment funds have internal data teams, they still purchase external data because buying and building upon it is often cheaper and more efficient than starting from scratch, and specialist firms possess unique focus and speed.
Reflecting on token supply and demand dynamics, Patel noted the astronomical revenue growth of companies like Anthropic far outpaces their compute growth, implying massive margin expansion. The core issue is that the economic value created by the best models grows faster than the ability to supply the tokens, creating a persistent gap and allowing model providers to maintain and expand high margins. Demand is driven not just by cost reduction but by the explosion of new use cases enabled by increasingly capable models. Even as costs for achieving a given capability level plummet, the demand shifts to the new frontier models that enable even greater value creation.
Discussing the physical world and robotics, Patel suggested the current "software singularity" is a prelude. Once software execution becomes extremely cheap, it will accelerate breakthroughs in robotics, likely within 6-18 months, enabling few-shot learning where pre-trained robot models can learn new tasks from just a few examples. This will lead to an explosion of specialized physical applications, further fueling token demand, which Patel does not believe will slow down.
The new model's success confirms that scaling laws remain valid—it is a materially larger model, and throwing more compute at models continues to make them better. Simultaneously, research compute spending drives efficiency gains, drastically reducing the cost for a given capability level over short periods.
On the supply side, as demand soars, prices for everything are rising, and asset lifespans like those of GPUs are extending significantly. The entire supply chain, from cloud layers to hardware manufacturers to memory and foundries, is seeing expanding margins or rising ROIC due to prepayments, even if gross margins are stable. Key bottlenecks include memory, where significant new supply isn't expected until 2028, necessitating large price increases to destroy demand, and logic chips, where TSMC's capital expenditure could reach $100 billion by 2028. Other critical bottlenecks are CPUs, driven by RL training and deployment needs, and niche upstream materials.
Patel identified the greatest challenge as understanding the "tokenomics"—the economic value generated by token usage and its diffusion through the economy, which is incredibly difficult to quantify despite clear subjective indicators of massive value creation not fully captured by traditional metrics like GDP.
Looking ahead three months, Patel anticipates large-scale protests against AI companies, fueled by growing public fear and blame directed at AI for various problems. He advises AI industry leaders to stop giving interviews that exacerbate fear, cease focusing on futuristic world-changing capabilities, and instead showcase exciting, tangible present-day use cases to build positive connections with the public.