The escalating cost of AI tokens is fundamentally altering the economic logic of enterprise artificial intelligence, and a recent internal decision by Microsoft has brought this transformation into sharp focus.
As previously reported, Microsoft is considering integrating a fine-tuned version of the open-source model DeepSeek V4 into its enterprise AI tool Copilot Cowork. This move aims to provide a lower-cost alternative to models from providers like OpenAI and Anthropic, with a final decision expected in the coming weeks.
Concurrently, Microsoft has announced it will transition Copilot Cowork from an unlimited usage model to a consumption-based pricing structure. These actions collectively send a clear signal: even a tech giant like Microsoft finds the unchecked cost of model calls unsustainable.
This development has resonated widely across the enterprise AI market. The Silicon Data Token index, which tracks AI token prices, has fallen in 12 of the last 13 trading sessions, approaching recent lows. Cost pressure is evolving from an individual company concern to an industry-wide challenge. The question of "which model to use" is being superseded by "how to afford to use models."
When affordability overtakes raw capability as the primary business priority, "model routing" (dynamically matching the most cost-effective model to each task's complexity) shifts from a technical consideration to a core requirement that determines whether an AI project is financially viable.
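To make the idea of model routing concrete, here is a minimal sketch in Python. It assumes each task has already been given a complexity score from 1 to 10 and that each model tier declares the highest complexity it can handle. The tier names, complexity ceilings, and prices are hypothetical placeholders, not real products or published rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    max_complexity: int  # highest task complexity (1-10) this tier can handle
    usd_per_million_output_tokens: float


# Hypothetical tiers and prices, for illustration only.
TIERS = [
    ModelTier("small-open-model", 3, 0.50),
    ModelTier("mid-tier-model", 7, 5.00),
    ModelTier("frontier-model", 10, 30.00),
]


def route(complexity: int, tiers: list[ModelTier] = TIERS) -> ModelTier:
    """Return the cheapest tier whose capability covers the task."""
    capable = [t for t in tiers if t.max_complexity >= complexity]
    if not capable:
        raise ValueError(f"no tier can handle complexity {complexity}")
    return min(capable, key=lambda t: t.usd_per_million_output_tokens)


if __name__ == "__main__":
    for score in (2, 5, 9):
        print(score, "->", route(score).name)
```

In practice the hard part is producing the complexity score; the sketch takes it as given and only shows the selection step.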
Microsoft's Cost Challenge: The End of Unlimited Access
Copilot Cowork previously offered enterprise users unlimited usage, but this model has proven untenable.
Charles Lamanna, a Microsoft executive vice president overseeing Copilot, stated plainly: "Some users complete hundreds of tasks per week with high efficiency—but the cost can skyrocket."
Consequently, Microsoft is shifting to usage-based billing and simultaneously exploring the integration of a fine-tuned DeepSeek V4 or other open-source models to drastically reduce inference costs. The underlying logic is straightforward: significant pricing gaps exist for input/output tokens between different model providers, and the cost advantage of open-source models can no longer be ignored.
This decision reflects a shared dilemma across the enterprise AI sector. While frontier models grow more capable, their per-call costs rise in tandem. For instance, the output token cost for a model like Fable 5 can be approximately 180% higher than for Opus 4.8 on comparable tasks. Greater intelligence is producing bills that are increasingly hard to absorb.
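A quick worked example shows what "180% higher" means for a bill. Read literally, a price that is 180% higher is 2.8 times the baseline, not 1.8 times. The sketch below applies that multiplier to a hypothetical baseline price and a hypothetical monthly output volume; neither number comes from any vendor's price list.

```python
def multiplier_from_percent_higher(percent_higher: float) -> float:
    """Convert 'X% higher' into a price multiplier (180% higher -> about 2.8x)."""
    return 1.0 + percent_higher / 100.0


def output_cost_usd(output_tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of a given number of output tokens at a per-million-token price."""
    return output_tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical inputs, for illustration only.
BASELINE_PRICE = 10.0        # USD per million output tokens
MONTHLY_TOKENS = 50_000_000  # output tokens generated per month

pricier_price = BASELINE_PRICE * multiplier_from_percent_higher(180)
baseline_bill = output_cost_usd(MONTHLY_TOKENS, BASELINE_PRICE)
pricier_bill = output_cost_usd(MONTHLY_TOKENS, pricier_price)

if __name__ == "__main__":
    print(f"baseline bill: ${baseline_bill:,.2f}")
    print(f"pricier bill:  ${pricier_bill:,.2f}")
```

Under these assumed inputs, the same workload costs about $500 a month on the baseline model and about $1,400 on the pricier one.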
The Dominant Theme for the Coming Year: Token Economics
Cost pressure is now permeating every stage of enterprise AI procurement.
Mason Daugherty noted on social media that in every client conversation over the past two months, organization-wide token spend was raised as a pressing concern. He predicts "Token Economics" will be the dominant theme in discussions about AI procurement and usage over the next six to twelve months.
He pointed out that as annual enterprise contracts with major vendors come up for renewal, management is beginning to question whether they can justify renewing at the same or higher price points. Simultaneously, the cost gap between frontier API models and self-hosted open-source alternatives continues to widen, directly driving accelerated procurement of open-source options.
The sustained decline of the Silicon Data Token index suggests the trend is showing up at the market level: competitive pressure on token pricing is now visible in the data.
Architecture as the Moat: Model Routing Emerges as a Core Competency
Under cost pressure, the competitive focus of enterprise AI is undergoing a fundamental shift.
Arvind Jain of the enterprise AI platform Glean argues that the biggest bottleneck is no longer model intelligence itself, but "Token Output Efficiency"—how much productive work a system generates per token consumed. He emphasizes that most enterprise AI costs lie not in the prompt, but in the surrounding systems: retrieval, tool calling, memory management, and multi-step reasoning. An eleven-word request can balloon into thousands or tens of thousands of tokens once the system begins gathering context and processing the task step-by-step.
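The amplification Jain describes can be illustrated with simple token accounting. The sketch below is one possible way to measure it, not a definition taken from Glean: it treats "Token Output Efficiency" as completed tasks per million tokens consumed, and every token count in the example is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskTrace:
    """Token usage for one task, split by where the tokens were spent."""
    prompt_tokens: int
    retrieval_tokens: int
    tool_call_tokens: int
    memory_tokens: int
    reasoning_tokens: int
    completed: bool

    @property
    def total_tokens(self) -> int:
        return (self.prompt_tokens + self.retrieval_tokens
                + self.tool_call_tokens + self.memory_tokens
                + self.reasoning_tokens)

    @property
    def amplification(self) -> float:
        """How many tokens the system consumed per token of user prompt."""
        return self.total_tokens / self.prompt_tokens


def completed_tasks_per_million_tokens(traces: list[TaskTrace]) -> float:
    """Assumed efficiency metric: finished tasks per million tokens spent."""
    total = sum(t.total_tokens for t in traces)
    done = sum(1 for t in traces if t.completed)
    return done / total * 1_000_000 if total else 0.0


# Hypothetical trace: a short request that fans out into system work.
trace = TaskTrace(prompt_tokens=15, retrieval_tokens=6_000,
                  tool_call_tokens=2_500, memory_tokens=1_500,
                  reasoning_tokens=4_985, completed=True)

if __name__ == "__main__":
    print(trace.total_tokens, trace.amplification)
    print(completed_tasks_per_million_tokens([trace]))
```

In this made-up trace the prompt accounts for 15 of 15,000 tokens, so nearly all of the cost sits in retrieval, tool calls, memory, and reasoning rather than in the request itself.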
Jain believes true competitive advantage comes not from aggressively using the most powerful model, but from an AI architecture capable of matching the right model and reasoning level to the right task—a system with robust routing capabilities, spend controls, and governance. "Frontier intelligence is becoming abundant; efficient execution is not."
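Spend controls can be as simple as a budget check that runs before each model call. The sketch below is a minimal illustration under assumed names: SpendGuard and try_charge are hypothetical and do not correspond to any vendor's API. A production system would also need per-team budgets, persistence, and audit logging.

```python
class SpendGuard:
    """Tracks spend against a fixed budget and rejects calls that would exceed it."""

    def __init__(self, monthly_budget_usd: float) -> None:
        if monthly_budget_usd < 0:
            raise ValueError("budget must be non-negative")
        self.monthly_budget_usd = monthly_budget_usd
        self.spent_usd = 0.0

    def remaining_usd(self) -> float:
        return self.monthly_budget_usd - self.spent_usd

    def try_charge(self, estimated_cost_usd: float) -> bool:
        """Record the cost and return True if it fits; otherwise return False."""
        if estimated_cost_usd < 0:
            raise ValueError("cost must be non-negative")
        if estimated_cost_usd > self.remaining_usd():
            return False  # caller can downgrade the model, queue, or refuse
        self.spent_usd += estimated_cost_usd
        return True


if __name__ == "__main__":
    guard = SpendGuard(monthly_budget_usd=10.0)
    print(guard.try_charge(6.0))   # fits within the budget
    print(guard.try_charge(5.0))   # would exceed the budget
    print(guard.remaining_usd())
```

A rejected charge does not have to mean a failed request: the caller can retry with a cheaper model, which is where a guard like this meets routing.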
This assessment aligns closely with Microsoft's practical moves: introducing low-cost model alternatives is essentially about building a model routing mechanism, not simply "swapping to a cheaper model."
A Strategic Framework: Owning the Learning Loop
Microsoft CEO Satya Nadella recently provided a broader strategic framework that contextualizes these trends.
Nadella stated that every company must build what he calls "Token Capital" and "Human Capital." The former refers to a company's own AI capabilities and systems, while the latter encompasses employee knowledge, relationships, and judgment. He defines both as core assets for thriving in an AI economy, stressing that human capital's value does not diminish as token capital grows: "Without human direction, you're just spinning compute cycles."
He said the real opportunity lies not in choosing the strongest model but in building a continuous learning loop on top of models, so that human and AI capabilities compound together. A key test is whether a company can switch its underlying foundation model without losing its accumulated proprietary knowledge and capabilities. "This is the core test for control and sovereignty in the age ahead."
Nadella also warned that if all value ultimately concentrates in a handful of dominant models, the result could replay the way globalization hollowed out industrial economies. He stated, "There is no social license for an AI future that hollows out entire industries." The remark is notable because it coincides with Microsoft's own consideration of open-source alternatives, which would reduce its dependence on a few leading vendors.