The evolution of artificial intelligence is fundamentally reshaping economic models, with the concept of the token emerging as a new core metric for measuring, settling, and analyzing AI services.
Currently, the token economy is experiencing explosive growth, with new application scenarios constantly emerging. However, significant challenges persist: traffic is high but quality is low, consumption goes without effective evaluation, and token consumption is hard to link to the true value of AI services. Chaotic pricing mechanisms and a lack of effective incentives for high-quality supply are becoming increasingly prominent.
In a recent move, the National Data Administration held a symposium on the token economy, signaling its formal inclusion into the national work framework. This action strongly indicates a strategic shift for the industry from pure scale expansion towards high-quality development. Experts and representatives from leading enterprises, including those from the China Economic Times, participated in the discussions.
This edition of the Intelligence Monthly focuses on the theme of enhancing token quality to drive the high-quality development of the token economy. It features insights from four participating experts who delve into the core issues of healthy growth for the token economy. They offer forward-thinking and practical recommendations from policy, technological, economic, and governance perspectives.
Key Insights from Mu Fei, Dean of Alibaba Cloud Research Institute
Tokens are becoming a foundational element driving the intelligent economy. The development of AI has progressed through three stages: pre-training, inference, and agentic action. The core variables of the production function are evolving from capital and labor to tokens and intelligent agents.
Model capability determines token quality. Inference service platforms enable large-scale, low-cost production. Tool ecosystems enhance the productivity of intelligent agents. Agent-native clouds provide the environment for value realization. The token economy enters an irreversible growth trajectory when the economic value created by tokens surpasses their cost.
Token Consumption as the 'Thermometer' of the Intelligent Economy
In March 2026, the National Data Administration released new data showing that China's average daily token consumption exceeded 140 trillion, up more than 40% from the end of the previous year. Just two years earlier, at the beginning of 2024, the figure stood at 100 billion. A token is the smallest unit of information processed by a large model and the basic unit for billing AI services. This increase of more than a thousand-fold shows how many businesses and individuals are integrating token consumption into their daily workflows, from code generation and data analysis to content creation, customer service, scientific research, and business decision-making.
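The growth multiple can be checked with simple arithmetic. The sketch below uses only the two figures quoted above; the variable names are illustrative.

```python
# Daily token consumption figures quoted in the text (tokens per day).
daily_tokens_early_2024 = 100 * 10**9    # 100 billion
daily_tokens_march_2026 = 140 * 10**12   # 140 trillion

# Integer division is exact here because the later figure is a whole multiple.
growth_multiple = daily_tokens_march_2026 // daily_tokens_early_2024
print(growth_multiple)  # 1400
```

The ratio is 1,400, which is consistent with describing the increase as more than a thousand-fold.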
The significance of this data lies not just in the growth rate but in the underlying structural shift. Tokens are no longer merely a technical unit of measurement; they are becoming an economic hub connecting data, computing power, and commercial value. A new economic logic is taking shape around the production, circulation, pricing, and consumption of tokens.
Retrospective: The Three-Year AI Journey - Acquiring Knowledge, Solving Problems, Taking Action
The past three years of AI development can be summarized in three phases, each profoundly altering the economic meaning of tokens.
The first phase, 'Acquiring Knowledge,' was the pre-training stage. Large models accumulated knowledge through massive text training, compiling millennia of human literature, code, and scientific achievements into model parameters. Before 2024, computing power was primarily consumed in this phase. Models were akin to erudite scholars—knowledgeable but not yet adept at solving highly complex problems or completing high-quality tasks.
The 'Solving Problems' phase marked the inference stage. In 2024, with the advent of deep reasoning models, AI gained the capacity for long-chain thinking when facing complex problems. Tokens were no longer just for 'seeing' information but for 'thinking through' problems: unfolding multi-step thought chains, validating hypotheses, and eliminating incorrect paths. Each deep reasoning process consumed a large number of tokens, and tokens began to deliver explicit, billable value.
The current phase, 'Taking Action,' is the era of agentic AI, starting from 2025. AI has moved from conversation to action. Intelligent agents no longer just answer questions; they autonomously plan paths, call tools, operate environments, and iterate execution until a task is complete. The value output of tokens has leaped from a text response to a completed work task—code written and tested, data analyzed and reports generated, system optimization executed and validated.
This transition from intellect to action points to a clear industry trend: AI is evolving from a conversational tool to an acting entity. The value density of the same one million tokens increases stepwise. In the inference stage, it might represent the 'electricity bill' for a deep analysis; in the action stage, it could correspond to an engineering delivery worth thousands. Tokens are no longer just a unit for API metering; they are the foundational element driving the intelligent economy.
Transformation of the Production Function: From Capital and Labor to Tokens and Agents
One of the most fundamental formulas in economics is the production function Q = f(K, L). In the intelligent era, this function is undergoing a significant transformation, with its core variables evolving from capital (K) and labor (L) to tokens and intelligent agents.
Tokens represent the new capital element. The essence of capital is that it is accumulable, priceable, transferable, and can be invested in production to generate incremental returns. Tokens perfectly fit this definition—data is transformed via computing power and models into tokens, forming billable, deliverable intelligence services. Through API interfaces, tokens can be delivered globally in an instant. Crucially, tokens are not homogeneous commodities; those produced by different models possess varying intelligence densities. Tokens from more powerful models can solve more complex problems and drive higher-quality actions, commanding a significant premium.
Intelligent agents represent the new labor element. They can understand goals, plan autonomously, call tools, and execute tasks. They scale on demand, work in parallel, and operate around the clock without the training cycles and management costs associated with traditional human labor. Tokens are the 'fuel' for agents: every perception, reasoning step, and action consumes tokens, much as human labor is paid in wages.
These two forces interlock and amplify each other in a virtuous cycle. The unit cost of tokens continues to decline with iterations in model architecture and inference technology, lowering barriers to use. At the same time, the wider the adoption of agents, the more feedback data they generate. This data flywheel continuously enhances model capabilities, yielding stronger and cheaper models, which in turn lower barriers and expand usage further. Thus, the intelligent economy exhibits increasing returns to scale, a phenomenon that classical factor economies display only rarely and at specific stages.
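Increasing returns to scale can be made concrete with a toy production function. The sketch below assumes a Cobb-Douglas form Q = T^alpha * A^beta with exponents that sum to more than 1. Both the functional form and the exponent values are illustrative assumptions, not something the text specifies.

```python
def output(tokens: float, agents: float,
           alpha: float = 0.7, beta: float = 0.6) -> float:
    """Hypothetical Cobb-Douglas output Q = T^alpha * A^beta.

    alpha + beta = 1.3 > 1 is an assumed choice that produces
    increasing returns to scale.
    """
    return tokens ** alpha * agents ** beta


base = output(100.0, 10.0)
doubled = output(200.0, 20.0)    # double both inputs

# Under this form the ratio equals 2 ** (alpha + beta), which exceeds 2.
scale_factor = doubled / base
print(scale_factor > 2)  # True
```

With exponents summing to exactly 1, doubling both inputs would exactly double output. Any sum above 1 gives more than double, which is what "increasing returns" means here.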
Four Key Components Enabling the Healthy Operation of Tokens and Agents
If tokens are capital and agents are labor, then the infrastructure enabling this new production function to operate efficiently constitutes the industrial system of the intelligent economy. From industry practice, this system comprises four key components, each precisely corresponding to a critical link in the production function: the quality of capital, the quantity of capital, the tools for labor, and the workplace for labor.
Model Capability: The Quality of Capital
In classical economics, the same amount of capital investment yields vastly different outputs depending on whether it's a precision machine tool or general-purpose equipment—this is the difference in capital quality. The same applies to tokens. Consuming one million tokens, a top-tier model excelling in tool invocation, coding, and long-horizon tasks can produce economic value several orders of magnitude greater than a model capable only of simple dialogue. Model capability directly determines the intelligence density of tokens.
The agent era imposes new requirements on model capabilities. The first is precision in tool invocation: can the model collaborate seamlessly with external services via standard protocols like MCP and dynamically adjust its decisions based on real feedback? The second is depth in coding ability: evolving from code completion into the core engine for the entire software development lifecycle, capable of working independently in terminal interactions, web development, and multi-language environments. The third is endurance in executing long-horizon tasks, the fundamental capability that distinguishes agents from chatbots. Take Alibaba Cloud's recently released Qwen3.7-Max as an example: starting from a blank workspace with no prior knowledge, it operated continuously for over 35 hours, performed thousands of tool calls and hundreds of solution evaluations, and ultimately produced deployment-ready low-level kernel code with a 10x speedup.
From the perspective of the production function, enhancing model capability is equivalent to improving the quality of capital. When each unit of token can carry more intelligence and drive more complex actions, the output boundary of the entire production function expands outward.
Inference Service Platform: The Quantity of Capital
Transforming models into continuously productive capacity relies on inference service platforms. This is essentially a problem of large-scale token production: how to integrate models of different sizes and specializations, dynamically schedule them based on task difficulty, and stably produce tokens with the highest throughput and lowest unit cost.
If training determines how smart a model can be, the inference platform determines at what scale and price this intelligence can be invoked. Whether the unit cost of tokens can continue to decline and whether capacity can elastically expand with demand are decided at this layer.
An inference service platform requires four core capabilities. First, high performance—maintaining low latency and high throughput under long-chain tasks and sudden traffic surges. Second, cost controllability—through engineering methods like context caching and reuse, elastic resource pool scheduling, and batch inference, making token consumption predictable and optimizable. This is akin to lean manufacturing in factories—increasing effective output per unit input by reducing waste. Third, security and reliability—from multi-tenant isolation to confidential computing, enabling core business processes to confidently adopt agent workflows. Security is not the opposite of efficiency but the prerequisite for sustainable efficiency, much like factory safety protocols, which may seem to slow operations but prevent massive losses from shutdowns. Fourth, continuous optimization of effectiveness—using agent-oriented reinforcement learning to allow models to continuously evolve in specific business scenarios, enabling smaller models to achieve results comparable to large models in vertical domains. Alibaba Cloud's Bailian is such an inference platform, integrating Qwen and numerous ecosystem models to provide high-performance, cost-effective inference services, consistently lowering the production cost of tokens. This layer determines the quantity of capital.
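Two of the ideas in this section, scheduling by task difficulty and cost control through context caching, can be sketched in a few lines. Everything below is hypothetical: the tier names, the 1 to 10 difficulty scale, the prices, and the cache discount are assumptions made for illustration and do not describe any real platform's API or pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    max_difficulty: int      # hardest task level handled (assumed 1-10 scale)
    cost_per_million: float  # assumed price per million tokens


# Hypothetical tiers, ordered from cheapest to most capable.
TIERS = [
    ModelTier("small", 3, 0.5),
    ModelTier("medium", 6, 2.0),
    ModelTier("large", 10, 10.0),
]


def route(difficulty: int) -> ModelTier:
    """Pick the cheapest tier whose capability covers the task difficulty."""
    capable = [t for t in TIERS if t.max_difficulty >= difficulty]
    if not capable:
        raise ValueError(f"no tier can handle difficulty {difficulty}")
    return min(capable, key=lambda t: t.cost_per_million)


def billed_cost(tier: ModelTier, prompt_tokens: int, cached_tokens: int,
                output_tokens: int, cache_discount: float = 0.1) -> float:
    """Cost of one call when cached prompt tokens are billed at a discount.

    cache_discount is the assumed fraction of the full price charged for
    prompt tokens served from the context cache.
    """
    fresh_tokens = prompt_tokens - cached_tokens
    effective = fresh_tokens + cached_tokens * cache_discount + output_tokens
    return effective / 1_000_000 * tier.cost_per_million


tier = route(5)
print(tier.name)  # medium
```

In this sketch a task of difficulty 5 goes to the medium tier rather than the large one, and a call whose prompt is mostly served from cache is billed for far fewer effective tokens than its raw length. Together these are the two levers behind a lower unit cost.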
Agent-Oriented Tool Ecosystem: The Tools for Labor
The capability of an agent depends highly on both the underlying model and the toolchain engineering—the entire scaffolding built around the model, including task planning, tool invocation loops, context management, memory management, and error recovery. No matter how powerful the model, it requires efficient toolchains to complete tasks reliably.
In classical production, a worker's output depends on their machine tools, fixtures, and measuring instruments. In the intelligent economy, the efficient, high-quality output of an agent similarly depends on the tool ecosystem it can invoke. The richer the ecosystem of tools and skills, the more and better work an agent can perform—the richness of tools also influences the capability ceiling of this new form of labor.
Fostering a thriving tool ecosystem requires the continuous emergence of agent-specific tools and skills, encapsulating domain best practices into reusable skill templates, enabling ordinary agents to perform expert-level work. Observations indicate that cloud services, as critical infrastructure for the intelligent economy, are also evolving for agents. Cloud capabilities are being 'skillified' and encapsulated via protocols like MCP into forms directly callable by agents, transforming past human-designed cloud services into agent products. The richer the tools and the more mature the toolchains, the stronger the productivity of this new labor force—the intelligent agent.
Agent-Native Cloud: The Workplace for Labor
With high-quality tokens and capable agents, a workplace for their efficient combination is needed—this is the agent-native cloud. It is where tokens are transformed into action via agents, determining whether capital and labor can combine efficiently.
The workload of agents differs significantly from that of traditional applications and presents six new challenges. First, short task lifecycles: an agent task may exist for only seconds to minutes, requiring a runtime that is lightweight, disposable, and capable of millisecond-level startup. Second, unpredictable burst loads: neither the moment at which agents initiate tasks nor their concurrency levels can be predicted, so resources must be able to absorb sudden, large-scale loads. Third, dynamic environmental dependencies: each task requires different tools, dependencies, and execution environments that change in real time, necessitating a runtime that can assemble and inject resources on demand. Fourth, complex data modalities and storage forms: agents place high demands on data quality, semantics, and real-time retrieval, requiring a unified data plane that supports multi-level memory storage, multi-modal data ingestion, and session state storage. Fifth, large-scale dynamic orchestration: coordinating thousands of agents in parallel requires an orchestration layer that automatically decomposes tasks, routes them dynamically, and maintains state consistency and instruction integrity across long chains. Sixth, task-level security controls: every action of every agent must be governed by identity, permissions, and audit trails, requiring runtime-level isolation, data protection, and confidential computing, with governance and security embedded in every call.
Alibaba Cloud has comprehensively upgraded its cloud services to provide robust infrastructure for agent workloads. Taken together, the four components map fully onto the structure of the new production function. Model capability determines the intelligence density of tokens. The inference service platform drives the large-scale, low-cost production of tokens. The agent-oriented tool ecosystem and toolchain engineering enhance agent capability. The agent-native cloud provides the venue for value realization. Together, these four components form the production system that turns the new production function Q = f(Token, Agent) from a formula into reality. This is also why full-stack capability is becoming a core competitive moat for leading enterprises: not because one company needs to do everything, but because the key components of this production function must work in end-to-end synergy to fully unleash the productivity of tokens and agents.
Five Initiatives to Promote the Healthy and Sustainable Development of the Token Economy
The token economy is in a rapid growth phase. Five initiatives are proposed to promote its healthy and sustainable development.
The first initiative is to redefine the planning logic for computing power infrastructure. The core function of future intelligent data centers will be to produce tokens. Key metrics should shift from pure computing power indicators to effective token production capacity per unit of energy consumption. Inference loads will become the primary workload, so infrastructure planning should be designed around elastic inference scheduling and heterogeneous chip adaptation. The coordinated planning of computing power and power supply needs to be prioritized.
The second initiative is to establish a 'good-faith acquisition' fault-tolerance mechanism for data use. For activities using public data to promote AI applications, the rights of users should be fully protected, and compliance costs reduced. Pilot programs for applying high-quality datasets should be launched in key industries. Without data flow, tokens are like water without a source.
The third initiative is to expand AI application penetration on the demand side. The growth flywheel of the token economy lies in demand. Support should be provided for small and medium-sized enterprises to consume tokens via public cloud at low thresholds and for large enterprises to build enterprise-level intelligent agents. The goal is to make 'using tokens' and 'using agents' infrastructure-level capabilities, akin to 'using electricity.'
The fourth initiative is to promote the standardization of token quality evaluation. Current token pricing relies on simple schemes that consider model version and consumption volume. Establishing a multi-dimensional evaluation framework covering accuracy, completeness, efficiency, and robustness will lay the foundation for moving tokens from 'pay-per-use' to 'pay-for-quality,' and ultimately towards a healthy 'pay-for-output' model.
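One way to picture such a framework is as a weighted score over the four named dimensions that then scales the price. The sketch below is a minimal illustration only. The weights, the 0 to 1 scoring scale, and the price floor are assumptions, not a proposed standard.

```python
# Assumed weights over the four dimensions named in the text; they sum to 1.
WEIGHTS = {
    "accuracy": 0.4,
    "completeness": 0.3,
    "efficiency": 0.15,
    "robustness": 0.15,
}


def quality_score(scores: dict[str, float]) -> float:
    """Weighted average of per-dimension scores, each assumed to lie in [0, 1]."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)


def quality_adjusted_price(base_price: float, scores: dict[str, float],
                           floor: float = 0.5) -> float:
    """'Pay-for-quality' price: the base price scaled by the quality score.

    floor is an assumed minimum multiplier so that a low-scoring response
    is discounted rather than free.
    """
    return base_price * max(floor, quality_score(scores))


example = {"accuracy": 0.9, "completeness": 0.8,
           "efficiency": 0.6, "robustness": 0.7}
print(quality_adjusted_price(10.0, example) < 10.0)  # True
```

Under 'pay-per-use', two responses of equal length cost the same. Under this sketch, the one that scores higher on the four dimensions commands the higher price.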
The fifth initiative is to advance the construction of AI-native organizations. The realization of benefits from every general-purpose technology revolution depends on the synchronous upgrade of organizational forms. No matter how fast technology spreads, if the organization cannot keep up, productivity gains may remain 'missing' in statistics. Enterprises are encouraged to place organizational transformation on par with technological investment, reshaping divisions of labor, processes, and governance boundaries, and cultivating composite talent proficient in both business and agentic AI.
Conclusion
The ancient saying speaks of reading ten thousand books and traveling ten thousand miles. AI has traversed this path in less than three years: from reading (pre-training to accumulate knowledge) to solving (reasoning to overcome problems) to acting (agents executing tasks). The continuously growing value of tokens marks the milestones on this journey, recording each transformation from knowledge to intelligence to productivity.
From the steam engine to electricity to the internet, each rewriting of the production function has spawned new economic forms and social divisions of labor. The token economy may well be our generation's 'electricity moment.' The difference is that this time, what flows through the grid is not energy but intelligence; power plants have become token factories; and household appliances have become intelligent agents.
When the production cost of tokens continues to decline while the value of the actions they drive continues to rise, a critical threshold is crossed: the economic value created by tokens exceeds their own cost. Beyond this point, the growth of the token economy possesses an endogenous, self-reinforcing logic. It is not a narrative driven by concept but an irreversible transformation of production methods propelled by the maturity of technology and market demand.
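The threshold argument can be illustrated with a small simulation in which cost per token falls and value per token rises at fixed rates each period. The starting values and rates below are arbitrary assumptions chosen only to show the crossover. They are not estimates.

```python
from typing import Optional


def first_breakeven_period(cost: float, value: float,
                           cost_decline: float, value_growth: float,
                           max_periods: int = 100) -> Optional[int]:
    """Return the first period in which value per token exceeds cost per token.

    cost_decline and value_growth are assumed constant per-period rates.
    Returns None if no crossover occurs within max_periods.
    """
    for period in range(max_periods + 1):
        if value > cost:
            return period
        cost *= 1 - cost_decline
        value *= 1 + value_growth
    return None


# Assumed start: cost 10, value 2; cost falls 30% and value rises 20% per period.
print(first_breakeven_period(10.0, 2.0, 0.3, 0.2))  # 3
```

In this toy run the two curves cross in period 3. After that point each additional token returns more than it costs, which is the self-reinforcing logic the paragraph describes.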