Dongfang Harbor's Huang Haiping, 2025 Annual Report and Outlook: The Essence of Evolution! AI Application Compute Demand Is Vast Enough for GPUs and TPUs to Co-Govern the World


Harbor View | The Essence of Evolution: Dongfang Harbor 2025 Annual Report and Outlook

Source: Dongfang Harbor Investment Management

▷2025 Annual Report Harbor View Investment Newsletter◁

Author: Huang Haiping, Dongfang Harbor

I. Competition: The Root of Bubbles and the Source of Vitality

By the end of 2025, capital markets continued to be filled with discussions of an AI bubble. Yet, in the real world over the past two months, the most significant evolution in model capabilities since 2025 has occurred. The protagonist this time is Gemini. Gemini 3 significantly surpassed ChatGPT across various benchmarks, with its most notable improvement being in "interactive multimodal" capabilities. "Multimodal" signifies providing answers not just through text, but by integrating images, videos, audio, and mini-programs for cross-modal expression, essentially generating a webpage or program in real time to answer queries. "Interactive" implies not just static display but dynamic interaction with multi-layered presentation effects: for instance, visualizing plasma flow within a tokamak, transforming a recipe into a clickable interactive mini-program, or using interactive animations to explain the working principles of RNA polymerase. This represents an achievement in pre-training upgrades using mixed multimodal data, natively incorporating more modalities, and it serves as a strong rebuttal to skepticism that pre-training has plateaued.

This has triggered a butterfly effect within the AI industry. OpenAI sounded a red alert and hastily released the less-than-stellar GPT-5.2. Following the deployment of its latest B200 hundred-thousand-card cluster, OpenAI is intensively re-running model pre-training (the move from GPT-5 to 5.2 was not a successful pre-training outcome), hoping to launch GPT-5.3 in January 2026 to reclaim the SOTA throne. Meanwhile, Meta is regrouping; industry research suggests it is using a 100-trillion-token dataset (Gemini's training dataset is estimated below 50 trillion tokens) to accelerate training of its next-generation model, codenamed "Avocado," aiming to recover the ground lost in 2025.
On another front, xAI, leveraging one of the industry's earliest 100k-card clusters, is tackling pre-training for an even larger model, Grok 5, planned for release in Q1 2026. On the hardware side, facing competitive threats from Google's TPU, NVIDIA invested heavily, spending $20 billion to acquire a chip team with projected revenue of only $500 million this year (core members of which hail from the TPU team) to complete its puzzle in the dedicated inference chip domain and maintain its competitiveness in the inference market.

It is difficult to conclusively predict which model or chip will ultimately prevail; the war is far from over and has merely passed its prologue. Today's Western tech industry is no longer the "harmony brings wealth" internet era in which businesses stayed in their own lanes. The market remains preoccupied with OpenAI's strategic alliances and its frequent use of "circular supply chain financing" financial engineering, overlooking that today every company is essentially undermining the others. OpenAI, with its 800 million users, is advancing into advertising and e-commerce, competing for Google Search's traffic market. Google is introducing an "AI Mode" within its search engine to intercept traffic traditionally going to third-party webpages, integrating decision assistants to complete commercial loops, forcing Amazon to experimentally block all AI search requests in September. GPU chips are encroaching on the CPU server market, while TPUs are quietly seeding growth beyond the Gemini model, gradually taking root in inference scenarios that are NVIDIA strongholds, such as Claude, ChatGPT, and Meta AI. Anthropic has sounded the war horn for an assault that undermines the foundations of all traditional software paradigms, while ChatGPT has introduced an application SDK into its platform, deconstructing traditional software into "capabilities plus data." Meta released its second-generation AI glasses, selling nearly 4 million units in 2025 (comparable to early iPhone sales), hoping to replace our "handheld life" with an entirely new mode of interaction and intelligence. And increasingly, AI application revenue is being captured by startups: Menlo data shows that the share of AI revenue captured by startups rose from 36% last year to 63% this year, while established large enterprises, despite starting early and having abundant resources, began losing significant ground in 2025.

The AI storm has stirred up the Western internet industry, leaving players restless and feeling compelled to act or risk obsolescence. The massive AI capital expenditure largely stems from the FOMO (fear of missing out) underlying this intense competition, from which few can escape. This is the root of the bubble, but also the source of its vitality.

II. The Economic Value of Evolution

Driven by this competitive vitality, the evolution of AI capabilities and application proliferation in 2025 made significant strides in four key areas.

First, reasoning became commonplace. In early 2025, ChatGPT had just developed logical reasoning abilities, learning to deliberate. By late 2025, reasoning and prolonged thinking had become standard features of all large language models, with thinking times extending from one minute to half an hour, and even several hours for certain tasks. Cost-wise, although the rate of reduction did not match 2024's pace, the unit price of a million input tokens for leading models decreased by 50% while capabilities improved. For repetitive discussion content, the industry widely adopted cheaper input "caching," whose cost plummeted by 90%, from $1.25 per million tokens to $0.125. Driven by declining costs for prolonged thinking and reasoning, monthly inference token consumption surged dramatically across all models. This is the first and most compelling evidence of AI technology's widespread application.

Second, long-term memory emerged. In early 2025, all large models existed within a single conversation window: ephemeral entities whose intelligence, constrained by a 120k-token context window, was inadequate for 99% of human jobs. By late 2025, most large models possessed nascent "long-term memory," enabling them to recall user topics from a year ago and remember optimal execution strategies for specific past tasks. This makes it possible for AI to further extend task reasoning durations and handle more task types; it is a prerequisite for future "personal super assistant" applications and a lifeline allowing otherwise homogeneous AI applications to build moats.

Third, artisan intelligence began to take shape. In early 2025, the focus was on models' general intelligence, such as knowledge and reasoning; people waited tens of seconds anticipating an "accurate answer" on screen. By late 2025, emphasis shifted to models' "task experience" or "artisan intelligence" across different domains; people became willing to spend hours waiting for AI to deliver a "satisfactory result": a PPT, an Excel financial model, or a web front-end prototype. This is no longer about using AI to simulate probability distributions of world correlations, but about replicating humans' best-practice strategies in various work domains, and even discovering superior solutions, akin to AlphaFold in pharmaceuticals. Practical strategies are not about right or wrong, but about constant optimization. Concurrently, AI learned to use human tools. Whether through MCP, Skills, or SDKs, tools created by humans are being deconstructed from fixed forms into sets of capabilities and databases. AI, originally an enhancement module or function button within software, is turning the tables; software is becoming AI's senses, tools, and actuators.

Fourth, AI moved beyond text. In early 2025, large models were fundamentally "language"-based; understanding and generating other modalities like images required conversion and simulation through linguistic space, as difficult as describing a painting for someone else to visualize, leading to superficial multimodal performance. By late 2025, Gemini compressed speech, images, videos, code, and text into the same vector space, allowing models to genuinely see and hear, and to accurately express thoughts using images, videos, and interactive programs. Human-AI interaction is transcending text, while the efficiency of generating images, videos, and programs has significantly improved. Today's social media is increasingly filled with AI images and videos, sometimes indistinguishable from reality at a glance, but people are gradually adapting to and even enjoying this AI-generated content.

We observe that human adaptability to change is remarkably strong, especially for capital market investors accustomed to volatility, who easily relegate past monumental shifts and achievements to history, perpetually anxious about the future. Today, the anxiety centers on whether technological upheaval can generate substantial revenue, profit, and cash flow to justify the ever-expanding capital expenditures. Over the past year, we have repeatedly explained the actual economic value created by AI from various angles. Yet, with every stock price fluctuation, controversy reignites, and we still receive numerous anxious inquiries from investors—this reflexivity is deeply ingrained. Investing is like the blind men touching the elephant. As the year ends, we once again attempt to understand the economics of this AI progress from a more macro, simplified perspective.

We must first construct a mental model. As illustrated, AI, as a factor of production, is produced by Data Centers (AIDC). Approximately 60% of the world's AIDC capacity is built by a handful of "cloud service providers," with the remainder built internally by certain large enterprises (e.g., Meta, Tesla) or funded by sovereign governments. The AI "raw material" produced by AIDC, whether basic compute or model services, can be measured in Tokens. These AI Tokens are either sold/distributed by cloud providers to myriad industries or used internally by large enterprises and governments. Industries, after producing or procuring AI Token raw materials, use them either to boost their own business revenue growth or reduce costs, or further process them into intelligent products sold to B2B or B2C consumers, such as chatbots, image/video generators, enterprise BI decision intelligence, ambient programming, presentation document creation, spreadsheet creation, etc. The cost of building these "factories" is extremely high. Gartner predicts global AIDC investment will approach $500 billion in 2025. This money is spent in two areas: expensive chips and networking equipment (shorter lifespan, depreciation ~5-7 years) and facilities, power, and land (longer lifespan, depreciation up to 10-40 years). Combining these asset types, the average depreciation period for large enterprises is approximately 10-13 years. This means the AIDC investment in 2025 alone will generate annual depreciation expenses of at least around $50 billion. Furthermore, with future investment scale expansion, annual new costs are projected to surge by about 50% each year. Despite the massive investment, quantifying the revenue and profit generated by AI is challenging. 
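The depreciation arithmetic above can be sketched as a short calculation. This is only a back-of-envelope illustration: the ~$500 billion capex figure is Gartner's estimate from the text, the blended 10-year life and ~50% annual capex growth are the report's round numbers, and the year-by-year layering logic is our own simplification.

```python
# Back-of-envelope sketch of the AIDC depreciation math described above.
# Assumptions (from the text): ~$500B of 2025 capex, a blended ~10-year
# straight-line depreciation life, and capex growing ~50% per year.
capex_2025 = 500e9
blended_life_years = 10
capex_growth = 0.50

layers = []  # each year's capex adds a new straight-line depreciation layer
for offset in range(3):  # 2025, 2026, 2027
    yearly_capex = capex_2025 * (1 + capex_growth) ** offset
    layers.append(yearly_capex / blended_life_years)
    print(f"{2025 + offset}: new annual depreciation "
          f"${layers[-1] / 1e9:.1f}B, stacked total ${sum(layers) / 1e9:.1f}B/yr")
```

The first layer reproduces the text's "at least around $50 billion" of annual depreciation from 2025 investment alone; the stacking of subsequent, larger layers is why the report stresses that annual new costs compound as investment scales.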
AI's economic value is dispersed across three areas: EBITDA profit earned by cloud providers distributing AI Token raw materials; incremental EBITDA profit from revenue growth or cost savings achieved by enterprises applying self-built or procured AI Tokens to their traditional businesses; and profit earned by other enterprises that purchase AI Token raw materials from cloud providers and develop them into AI products. When measuring AI's economic impact, the market often focuses intently on the third area (revenue and profit from reprocessed AI applications) and easily concludes that AI investment yields insufficient returns. However, the profits created by cloud businesses, which bear most of the investment risk, and the profit gains from AI-driven improvements in traditional business operations are frequently overlooked. These two parts of economic value are indeed difficult to measure accurately, and are rarely itemized separately even in the financial reports of cloud providers and large internet companies. This is because a significant portion of the cloud business's incremental profit still comes from non-AI revenue (e.g., storage, data analytics), albeit with growth significantly boosted by AI revenue (e.g., surveys indicate $1 of Gemini API revenue drives $2 of traditional cloud revenue). Simultaneously, the incremental profit from revenue growth or cost savings resulting from AI transformation of a company's own business, whether through self-built or procured AI Tokens, is hard to disentangle from the endogenous growth of the original industry or company. Perhaps for these reasons, they are often entirely disregarded when considering AI investment returns.
Therefore, we can simplify the observation of these two AI economic values: check whether the incremental EBITDA generated by major cloud providers selling AI Tokens externally, and by major enterprises using self-produced, self-consumed AI Tokens for internal business transformation, exceeds the incremental depreciation caused by their own AIDC investments. This roughly indicates whether producing AI Token raw materials is profitable. In 2025, the three major North American cloud providers reported revenue of approximately $270 billion, a year-on-year increase of about $60 billion over 2024. Assuming an AIDC EBITDA margin of 60%, this corresponds to roughly $36 billion in incremental EBITDA. Concurrently, for the four major consumer internet firms (Alphabet, Meta, Microsoft, and Amazon), non-cloud business revenue in 2025 was about $1 trillion, a year-on-year increase of roughly $100 billion. Assuming an average EBITDA margin of 50%, this yields about $50 billion in incremental non-cloud EBITDA. Capital expenditure for these four companies in 2025 was approximately $380 billion. Using an 8-year depreciation period, this corresponds to roughly $47.5 billion in incremental depreciation expense. Thus, $47.5 billion in incremental depreciation drove $86 billion in incremental EBITDA profit. While this profit includes non-AI-driven organic growth, the equation still appears manageable. For smaller cloud providers and other enterprises adopting private deployment, economic outcomes may vary, certainly giving rise to localized bubble risks. Meanwhile, as future capital expenditure surges, it demands higher growth rates for total future enterprise revenue and profit, necessitating ongoing observation of this balance. Regarding the third area of AI economic value, Menlo Ventures' latest survey of 495 US enterprise AI decision-makers estimated that North American enterprise AI spending in 2025 reached $37 billion, a rapid 3.2-fold increase from 2024.
Over half of this spending, exceeding $19 billion, was on AI software, already accounting for 6% of the $300 billion SaaS market. Compared to 53% in 2024, 76% of enterprises in 2025 opted to purchase AI applications externally rather than building them in-house. This indicates that secondary-processed AI applications, though starting from a low base, are growing rapidly and gaining acceptance in corporate procurement.
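The balance check described above can be restated as a short calculation. All inputs are the approximate figures quoted in the text, and the two EBITDA margins are the text's stated assumptions, not reported numbers.

```python
# Simplified 2025 check: does incremental EBITDA from cloud and
# non-cloud businesses cover incremental AIDC depreciation?
# All inputs are the approximate figures quoted in the text.
cloud_rev_growth = 60e9        # three major NA cloud providers, YoY
cloud_ebitda_margin = 0.60     # assumed AIDC EBITDA margin
noncloud_rev_growth = 100e9    # Alphabet, Meta, Microsoft, Amazon, YoY
noncloud_ebitda_margin = 0.50  # assumed average EBITDA margin

incremental_ebitda = (cloud_rev_growth * cloud_ebitda_margin
                      + noncloud_rev_growth * noncloud_ebitda_margin)
incremental_depreciation = 380e9 / 8  # 2025 capex over an 8-year life

print(f"incremental EBITDA:       ${incremental_ebitda / 1e9:.1f}B")
print(f"incremental depreciation: ${incremental_depreciation / 1e9:.1f}B")
```

The $86 billion of incremental EBITDA comfortably exceeding $47.5 billion of incremental depreciation is the sense in which the text judges the equation "still manageable," with the caveat that some of that EBITDA growth is organic rather than AI-driven.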

Globally, B2B and B2C revenue from purchasing and reprocessing AI tokens is difficult to tally, but we can indirectly monitor it via cloud provider revenue growth: if downstream procurement and processing of AI tokens (by model vendors or AI applications) are profitable, cloud revenue growth rates should not decelerate significantly. Based on Q3 data, the growth rates of the three major North American cloud providers and China's Alibaba Cloud continue to trend upwards. This at least confirms the persistence of the third area of AI investment's economic value.

III. The Shape of the Future

The future is difficult to predict because it often disguises itself as the past. The earliest smartphones still had keyboards, the earliest cars were seen as "horses that don't need feeding" requiring flag-waving guides to precede them, and today's ChatGPT dialog box still resembles the Google search bar. To deduce the future of AI applications, one must escape the gravitational pull of the "past," starting from first principles to identify potential paradigm shifts, not just incremental improvements in speed, quality, or automation. Thus, our starting point is the five fundamental essences of current AI industry development, providing a solid foundation for discerning trends in 2026 and beyond.

1) The essence of human or AI foundational intelligence is a probability-distribution prediction machine for correlations among text, images, video, code, speech, and even entities in the physical world, expressed via the Transformer architecture. It has long surpassed the "large language" domain and jumped out of the "dialog box."

2) The essence of reasoning and Agent tasks is an "exploration and evaluation" engineering process for optimal execution strategies, utilizing multiple foundational intelligence models to "role-play" different functions within that engineering team. Individual cognitive mechanisms haven't qualitatively changed, but new team-based capabilities have emerged.

3) The essence of AI capability evolution and application proliferation is the continuous compression of existing human data (pre-training), coupled with learning optimal execution strategies in different task domains (mid- and post-training). The learning data for mid- and post-training comes from human user feedback, demonstrations and evaluations by domain experts, and data generated through model self-play.
4) The primary inherent obstacles currently limiting AI deployment include the speed of acquiring reinforcement learning data, the inability of model architectures to learn continuously (requiring periodic retraining and updates), and short context windows.

5) The essence of the economic value generated by AI is the low-cost extension of human intellectual resources, not simply the replacement of existing human work outputs.

Based on these five points, we can infer at least six areas of transformation driven by AI technology.

First, model performance will see major leaps: significant improvements in context window length and processing efficiency. The context length of mainstream models did not change significantly in 2025, although accuracy improved. Sebastian, Gemini's pre-training lead, recently revealed that Gemini will make major upgrades to the Transformer's attention mechanism in 2026, addressing a major bottleneck in AI application proliferation. Increased context window length and processing efficiency will substantially enhance task execution duration and complexity, allow models to discover more innovative solutions without performance degradation over extended work periods, enable flexible use of multiple tools within a single task for more personalized responses, and, crucially, help solve the "quadratic complexity" problem, dramatically reducing the costs of model inference and content generation. This will undoubtedly accelerate the unlocking, evolution, and proliferation of all AI applications.

Second, reinforcement learning will unlock increasingly numerous "task capabilities" for models, placing traditional "software programs" in unprecedented peril.
Large models are intensifying reinforcement learning across various domains, accumulating experience through user feedback, expert input, and self-play, and gradually beginning to distribute various "task capabilities" to users. Existing examples include AI programming, intelligent shopping guidance, AI image generation, AI video production, and PPT creation; potential future capabilities include AI travel planning, AI recipe creation and ordering, AI fitness coaching, and AI investment advice. In the internet era, information was stored, manipulated, retrieved, and reproduced in "fixed forms": whether cloud storage, online notes, video sites, online games, OTA programs, or Photoshop, the essence of these applications was similar. In the AI era, information is "vectorized," operations are "instructionalized," and retrieval and reproduction become "generative." Pre-made informational "dishes" (traditional software applications) will largely fade into history. Correspondingly, the "operating system" might cease to exist as we know it, since human "operation" becomes less necessary; it may evolve into an "instruction system": issuing commands via the most natural methods (voice and vision), waiting for the large model to invoke various capabilities and data, and iterating with human assistance until the task is complete. AI search and AI programming are just the first sparks of this revolution; a cohort of native AI application startups will emerge in 2026, capturing market share from traditional software. In 2025, 63% of the enterprise AI application market was already taken by startups, while the model of traditional software adding AI function buttons is faltering. Concurrently, whether through ChatGPT's application SDK, Claude's Skills, or Gemini 3's real-time generation of interactive front-ends, major model vendors are also attempting to deconstruct the traditional software world into "databases" and "task capabilities."
With advances in reinforcement learning, and growth in context windows and efficiency, the traditional software market may face an unprecedented crisis in 2026.

Third, "Growth Without Hiring" will occur across more traditional businesses, and company structures will change as a consequence. As emphasized earlier, we over-focus on application revenue from the secondary processing of AI Tokens, significantly underestimating the business growth and cost savings achieved by using self-built or procured AI Tokens to transform traditional operations, which represents AI's most substantial real-world impact. The essence of AI's economic value is extending human intelligence, not simply replacing its outputs. In other words, AI's TAM (total addressable market) is an extension of the $50 trillion human labor market (via lower-cost substitution, extended working capacity, or efficiency gains), not merely a re-slicing of the existing $300 billion SaaS or $800 billion e-commerce markets. Therefore, to find AI investment's economic value, we need to look within traditional industries for accelerated revenue growth and simultaneous margin expansion. Over the past two decades of the internet era, besides revenue growth, the S&P 500's overall profit margin more than doubled; this trend is likely to continue in the AI era. Regarding revenue growth, the market might view AI search as merely redistributing the existing advertising pie. However, the total potential space for advertising under AI search could be significantly larger than the current market. AI search advertising revenue can be broken down into "number of searches × monetization rate × ad load rate × ad click-through rate × ad price." All five factors are improving simultaneously due to AI, and their multiplicative effect could substantially expand the total search advertising market.
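The multiplicative effect of the five-factor decomposition can be made concrete with a hypothetical example. The uniform 10% improvement per factor below is an illustrative assumption of ours, not a figure from the text; the point is only that simultaneous gains compound rather than add.

```python
from math import prod

# AI search ad revenue = searches x monetization rate x ad load rate
#                        x ad click-through rate x ad price.
# A hypothetical uniform 10% gain on each factor (illustrative only):
factor_gains = {
    "searches": 1.10,
    "monetization_rate": 1.10,
    "ad_load_rate": 1.10,
    "click_through_rate": 1.10,
    "ad_price": 1.10,
}

combined = prod(factor_gains.values())  # gains compound multiplicatively
print(f"combined market expansion: {combined:.2f}x")
```

Five 10% gains compound to roughly a 61% expansion, versus the 50% a purely additive reading would suggest; larger per-factor gains widen that wedge further.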
A similar dynamic is occurring in feed-based advertising, where user time spent, ad precision, and ad prices are all significantly boosting corporate revenue and profit growth rates. Regarding profit margins, consider traditional transportation: post-pandemic freight rate declines, trade-war-induced volume shrinkage, and oversupply created a triple pressure leading to widespread revenue declines, low capacity utilization, and severe margin compression. However, one listed company, C.H. Robinson, under its "Lean AI Initiative," used intelligent Agents to replace labor costs at scale. For example, a Quoting Agent invokes a dynamic pricing engine that weighs market density, historical data, and real-time traffic to instantly provide competitive quotes; an Order Processing Agent reads unstructured customer emails, extracts key information (origin, destination, cargo type), and automatically creates orders, greatly freeing up the workforce. Freight volume per employee increased by 40%. Consequently, despite an 11% revenue decline, the company's operating profit rose 22% against the trend, margins surpassed the previous cycle's peak to reach 31%, and EPS surged 67.5% aided by share buybacks, driving significant stock price appreciation over the past half-year despite a weak sector. Revenue and profit improvements of this kind ("Growth Without Hiring"), seen in sectors like search advertising and logistics, will occur across more traditional businesses. Eventually, traditional corporate structures and job roles may be dismantled. The firm's essence is a container that reduces coordination costs; job roles exist because task abilities are scarce and reusable; products represent one-time bets on stable future demand.
From AI's fundamental nature, all these will be reconfigured: the firm still bears responsibility for direction setting, resource allocation, and risk ownership, but information flow need not be hierarchical, task abilities can be outsourced/called upon, processes can be generated temporarily, organizational boundaries can be reconfigured per project, and products resemble non-fixed-form fulfillment capabilities or experiences. Production relations always evolve around productive forces; this is the inevitable trend.

Fourth, native multimodality takes center stage, upgrading the content industry into an "experience industry." Video-native multimodal pre-training, starting with Gemini 3, will become standard for models. Concurrently, if context window length and processing efficiency see major improvements, coupled with widespread deployment of the Blackwell chip series, the controllability, element consistency, length, and generation cost of video will see substantial gains in 2026. A recent phenomenon: on platforms like Douyin and WeChat, "comic-dramas" are succeeding short dramas as new traffic magnets. The market isn't short of excellent online literature, but traditional comics, anime, and short/long dramas are time-consuming, labor-intensive, slow to produce, and costly. AI video is gradually solving this industry bottleneck. Traditional animation costs can reach ¥15,000 per minute, while AI comic-drama costs have plunged to around ¥600 per minute, with production cycles shortened by 80-90%. "One-person production crews" are becoming reality. Furthermore, multimodal outputs will interconnect, potentially enabling "writing a book creates a series," with humans responsible only for textual creativity, leaving the rest to AI. In 2026, we might see the first AI-generated movie in theaters—a potential historical turning point. Under this trend, video content will explode. This is an industry where supply creates demand; its value ceiling has no cap, making estimations based on current market size potentially inappropriate. Beyond consumer entertainment, video will become a primary mode of information expression: AI search answering queries with generated videos, print ads increasingly shifting to video formats, synesthetic text-video teaching becoming standard in education, AI application interfaces potentially featuring real-time interactive digital humans, etc. 
Speculatively, in a few years, industries like video games might be redefined, upgraded into experience industries: previously, video game content was pre-made, passively consumed, distributed via recommendation algorithms. Future content will be generated on-demand, even in real-time, with storylines, visuals, and pacing adapting to users. Content ceases to be an object or product, but becomes individual "experiences."

Fifth, domains driven by self-play or possessing abundant user feedback will evolve faster. Currently, reinforcement learning faces data-acquisition efficiency problems. Task experience varies greatly across industries, and unlike humans, models cannot generalize their way into multiple professional skills at once, so AI models require separate reinforcement learning in each domain. In many domains, if data is collected via user feedback, AI may face insufficient user scale; if via expert annotation, cost and scale are challenges; if via self-play, tasks may have limitations such as asymmetric game dynamics or unclear win/loss rules. Thus, the entire AI industry yearns for a paradigm shift toward "continuous learning," but expecting a breakthrough in 2026 seems overly optimistic. Therefore, the pace of task evolution across domains primarily depends on the "trial-and-error cost" and "feedback efficiency" of reinforcement learning. The first tier, the fast lane, likely exists in self-play-driven domains, where simulated environments can generate and validate data efficiently, with low trial cost and immediate feedback. This includes, but is not limited to, mathematics and programming, chip design, new materials and small-molecule drug discovery, and gaming. The second tier, the medium-speed lane, exists in scenarios with abundant user feedback. For example, intelligent ad recommendation systems receive massive user feedback after each recommendation; once reinforcement learning is introduced, a data feedback loop accelerates development. The third tier, the slow lane, has high trial costs and slow feedback. Examples include embodied robots, which urgently need new data collection methods. Ningzhi Capital predicts the potential rise of remote-controlled robot services as a new form of labor transfer for less developed countries and a new data-collection paradigm before full automation.

Sixth, the black hole of compute demand is bottomless, ample for GPUs and TPUs to co-govern the world. In 2026, with volume deployment of Blackwell chips, inference costs will continue declining. Coupled with progress in model pre-training and reinforcement training, improved context window efficiency, and the maturation of multimodal approaches, AI Token consumption will continue its visible month-over-month surge. Analyzed from the perspective of per-capita disposable compute, potential demand is millions of times current supply (Jensen Huang has mentioned billions), making the continuous growth of total compute demand the primary investment question, while the GPU-versus-ASIC debate is secondary. Google's TPU is shaking NVIDIA's GPU dominance in the inference market. From its growing share of Gemini workloads, to Anthropic adapting its models and placing a $10 billion TPU order, to OpenAI, Meta AI, and others following suit with trials, TPU is making inroads into the model inference market. However, this does not imply that other ASICs possess similar strength. TPU development began at Google in 2013, three years before NVIDIA's first data center chip, the P100, arrived in 2016. In today's (2025) rapidly evolving model and application landscape, for a company lacking chip design experience and cutting-edge model development capability, betting on a specific five-year model trend in silicon while keeping pace with GPUs' multi-fold annual performance growth presents a vastly different competitive challenge. At the same time, specialized TPUs aiming to co-govern the world with GPUs face four layers of challenges: capacity, adaptation, competition, and business model. Regarding capacity, the chip battle involves supply chain management, competing for foundry packaging capacity and memory chip allocation; Google still lacks the maturity in the semiconductor industry to firmly establish and expand its market share.
Regarding adaptation, for other models to run inference on TPU, they must either retrain using the JAX framework (subject to Google's restrictions) or make significant compromises: converting model precision, operators, and data dimensions to fixed formats, and translating CUDA code into a form the XLA compiler can consume. This is like standardizing a master chef's ingredients and techniques for automated central-kitchen cooking: efficiency increases, but some of the original flavor is lost. Regarding competition, clear rivalry exists between Gemini and other models; entrusting underlying compute to your largest competitor requires careful commercial consideration. Moreover, NVIDIA will certainly not remain passive about ASICs and will address its weaknesses to retain customers. Regarding business model, the largest buyers of data centers are cloud providers. Operating cloud services effectively requires diverse customers, diverse use cases, and diverse usage timeframes to maximize hardware utilization. Therefore, cloud providers, including Google Cloud, must rely on general-purpose GPUs for their PaaS and IaaS businesses to achieve optimal returns.

In summary, the potential demand space for AI application compute power is vast enough for GPUs and TPUs to co-govern the world. Eventually, when model architectures stabilize, other ASICs might also find their niche.

The above constitutes our observations on the AI industry in 2025 and our projections for the future.

THE END

Disclaimer: The market carries risks; investment requires caution. Under any circumstances, the information in this article is for readers' reference only. Companies are mentioned solely to illustrate industry logic; nothing here constitutes an investment recommendation. Readers should not rely on this article as a substitute for their own independent judgment. Dongfang Harbor bears no responsibility for investment losses, risks, or disputes arising from the use, citation, or reference of this article's content.


