The practice of reselling AI model access tokens has evolved into a profitable venture, with a surge of domestic intermediaries flooding the market.
Promotional posts like "Pure Claude, primary account pool, 0.1 rate, 50 million tokens upon registration" have proliferated across technical forums and platforms such as Xiaohongshu, Douyin, and Xianyu over the past two weeks. The AI relay station business, touted as one of the most lucrative opportunities this year, is gaining significant attention.
These AI relay stations operate as "proxy purchasers." Due to regional restrictions on major overseas AI models, domestic users send their requests to these stations. The stations then utilize their own channels to access the models and return the results, charging users based on token consumption.
This system functions like an underground power grid in the AI world. When official access is expensive or unavailable, unofficial channels emerge to provide and distribute the service.
The demand is substantial. One operator who entered the market early this year now manages three separate stations: one specializing in Claude, another in ChatGPT, and a third dedicated to enterprise privatization. Clients range from individual developers to domestic research institutions and AI-powered content creation companies.
The industry is rapidly expanding and segmenting. Many operators report that the most frequent inquiries now come from individuals seeking to become agents, obtaining token quotas from larger stations to handle downstream distribution. "The market is now saturated with multi-layered resellers, along with various 'public welfare' stations offering unrealistically low prices," remarked one operator wryly.
A price war has erupted. Many relay stations now offer tokens at 10% to 30% of the official price, with the cheapest even as low as 1%. A highly profitable, dual-revenue tactic is an open secret in the industry: using fake models to deceive users and charge for tokens, then selling user data for additional profit.
A growing number of operators, sensing the deteriorating ecosystem, are considering an exit. However, new sellers and buyers continue to enter the AI relay station market.
"The survival of these relay stations hinges on two factors: demand and the significant information gap within the AI industry," noted a former procurement manager at a prominent AI startup. This gap exists not only among general users but also within companies in the sector. Many firms developing AI products are unaware of the legitimate channels available for accessing overseas models.
Like many operators, V was initially a customer. When the operator of a station he had used for six months disappeared at the end of last year, he had an idea: why not build his own?
For domestic developers, the challenges of using overseas models persist. Since 2024, companies like OpenAI and Anthropic have explicitly tightened access and sales restrictions for mainland China. Stable use requires navigating issues like overseas phone numbers, foreign currency credit cards, and network environments, while also facing token bills dozens of times higher than those for domestic models.
Initially, some service providers helped top up account memberships, which gradually evolved into the AI relay station model: websites offering access to over a dozen major models, akin to a model supermarket, with token "wholesale prices" slightly below official rates.
With a decade of experience in the internet industry and active participation in developer communities, V quickly identified the three core components of a relay station:
First, the "account pool"—subscription accounts for major models are the foundational resource. Currently, the most sought-after are Claude Code MAX plan accounts, costing $200 monthly with dialogue limits roughly a hundred times that of free accounts. Building a pool requires at least 10-20 such accounts.
Second, "reverse engineering"—most stations do not use official APIs but instead repackage web chat or client interfaces for shared use.
Third, the "rate," essentially a discount; a 0.1 rate means 10% of the official price. Stations track token consumption during request forwarding and charge based on their set rate.
While it sounds complex, the most challenging part is sourcing and connecting the account pool. The rest can be streamlined. GitHub hosts numerous open-source projects, such as the popular New API, which packages protocol conversion, channel management, billing, and user backend management. Deployment can be achieved with a few commands, and the project has been pulled over a million times via Docker.
However, at that time, AI relay stations remained a niche business solving problems for developers within the circle. The real explosion occurred in March of this year.
Starting in March, with the emergence of products like Lobster OpenClaw, a wave of non-traditional programmers began experimenting with coding, product development, and solo entrepreneurship. Tokens became an economic factor. Data from China's National Data Bureau in March showed the daily average token calls by domestic users rapidly exceeded 140 trillion, a 40% increase from the beginning of the year.
Lan Wei entered the market during this period, representing a third-generation operator profile: lacking an account pool and technical expertise, he leveraged platforms like Xiaohongshu, Douyin, and Xianyu to attract a broad audience, purchasing token quotas from upstream suppliers.
"I'm essentially a middleman who happened to catch this wave of traffic," Lan Wei admitted frankly. "I don't inquire about the methods used by the upstream stations. First, I don't have the time; second, I lack the capability. I even had to look up what 'vibe coding' meant."
Before starting the relay station, Lan Wei worked in a factory workshop in central China. Following a tutorial, he used AI to build a website, integrated the New API framework, and surprisingly secured his first enterprise client. When the site was unstable or conversations lagged, he asked AI for solutions. Customer service responses were also handled by AI, with only unresolved售后 issues escalated to the upstream supplier.
Lan Wei now operates three relay stations: one focused on ChatGPT, another on Claude, and a third for enterprise privatization. A single station can generate daily recharge revenues ranging from thousands to over ten thousand yuan. His role involves purchasing token quotas from upstream, marking them up by 50%, representing pure incremental income.
Many envious individuals have followed suit. Last week, Lan Wei posted on Xiaohongshu intending to attract clients by educating them about AI relay stations. Instead, he received the most inquiries from "peers"—four university students consulting about becoming his sub-agents with a 30% commission split.
"The market is now dominated by multi-layered resellers; even second-hand stations are becoming rare," another operator remarked, unsurprised. He noted that even his station, intended for a small circle of acquaintances, now sees three or four new "mini-stations" attempting to connect daily. These mini-stations further distribute access, creating fifth or sixth-level resellers.
With the influx of sellers, a price war has intensified.
Taking the currently热门 Claude-Opus-4-6 as an example, the official API output price is approximately 170 yuan per million tokens. A price monitoring platform for relay stations shows that a leading domestic station has pushed prices down to 78 yuan per million tokens, nearly a 50% discount. More small and medium-sized stations普遍 offer 20% to 30% discounts, with the cheapest甚至 selling for 2 yuan per million tokens—almost free.
Is it truly possible for buyers to get quality access cheaply while sellers still profit? The reality is more complex.
Roland, operator of the model verification platform Modelknow, explained that relay stations use a complex billing formula. A top-up of 10 units for 1元 and 100 units for 1元 might result in a higher final bill for the latter due to differences in purchasing power and consumption speed.
"Most users struggle to understand token billing, which is normal. Simply put, relay stations lack significant technical barriers and standardized pricing. Operators find profit margins within this formula," Roland said.
Cheap tokens typically come from two methods: reducing costs or increasing revenue margins. Higher profits often involve "grayer" tactics.
An early method involved free refunds: a Claude MAX account subscription was $200 monthly, but many discovered that if the account was banned by Claude, the company issued a full refund regardless of token消耗. Operators could intentionally send sensitive or illegal content to trigger a ban, effectively zeroing out their costs.
As Claude's refund policies changed, this zero-cost method diminished, but stations found other cheap channels for批量 registering accounts.
A major current source is IDE reverse proxy. IDEs are development tools for debugging code. In recent years, common software like Cursor, Kiro, and Windsurf have built-in access to models like Claude. Relay stations extract the model调用窗口 from these products and reverse-proxy them into APIs for resale.
Reverse proxy is relatively stable but offers slightly inferior performance. "It's the genuine model, but not the original version," Roland意味深长地说.
"Any reverse engineering through proxy involves 'system prompt' contamination," explained former Manus engineer Xu Changpeng. "The station requests AI responses through another product, which carries its own product logic. This means, although invisible to the user, the product's system prompt is added to each request before it's sent. The最终 returned result inevitably differs from the original."
This is akin to comparing an Apple iPhone from an official store, a contract manufacturer, or a refurbished channel—versions differ, but few can accurately discern the区别, and there's no official verification mark.
Regarding the effectiveness of relay stations, the waters are much murkier than the prices suggest.
Many assume a relay station is merely a courier forwarding requests. In reality, it更像 a messenger who reads the letter first, then sends it out under their own name. The API address and endpoint are replaced with the station's domain, allowing the operator to theoretically view, modify, or replace the entire request-response process.
The potential for暗箱操作 is vast. The industry's most暴利 practice is termed "killing two birds with one stone": mixing in fake models to骗 users' token fees, then selling user data for another profit.
Lan Wei has witnessed numerous substitution cases: ChatGPT冒充 Claude, DeepSeek冒充 GPT, and even Doubao伪装成 DeepSeek. The cost of one掺假 might be less than 1/10 of the official price. "There are plenty of half-genuine, half-fake operations, exploiting whoever they can," he said.
Even academia might unknowingly use substituted models for research. In March, the CISPA信息安全研究中心 published the first academic paper auditing relay station security, titled "Real Money, Fake Models: Deceptive Model Claims in Shadow APIs." The paper tracked 17 stations cited in formal academic papers and found nearly half failed model identity verification.
Wasting money isn't the worst outcome; the greater danger is data resale.
Operators of various scales reported being approached about selling user data. Buyers claimed to be data companies外包 by model manufacturers, specifically requesting user logs from Claude 4-6 or above versions, with multiple tool calls and over 30 dialogue rounds. Some offered 0.1~0.2 yuan per conversation, others打包 10 yuan for 1M of data.
"Several major companies in the industry buy data for model training; it's common knowledge," Xu Changpeng confirmed. The SOTA model requests and responses retained by relay stations naturally serve as training data needed by model companies.
Even Lan Wei, who自称完全不懂技术, felt the巨大利益诱惑和法律风险. A buyer once persuaded him with payment screenshots from other operators. He wavered momentarily, as exporting data seemed简单 and the赚钱的诱惑 was substantial.
However, Lan Wei ultimately refused, fearing he might unwittingly become an accomplice to illegal fraud groups—a thought that still gives him chills.
"Selling is out of the question; leaking data would be disastrous," another operator warned peers in a tech community. He noted many small operators lack data cleaning capabilities, and if user logs contain private keys or passwords, the泄露风险 is severe.
Lan Wei is considering an exit. He's not alone; many seasoned operators sense a shift: the industry生态 is souring, model manufacturers are tightening policies, and risks are visibly approaching.
Starting April 2026, Claude initiated what's called its strictest real-name verification yet, randomly triggering facial recognition with passports or driver's licenses. Domestic e-commerce platforms are also cracking down on ChatGPT resales. "We can't even type '5.5' (ChatGPT's latest version number). Previously, we used暗示 like 'DeepSeek鼻祖' for ChatGPT, but lately, posts with titles like 'DeepSeek祖宗' get blocked too."
Yet, demand persists. Enterprise clients have signed contracts with unused quotas. Lan Wei feels pulled forward by orders even as he contemplates withdrawal.
"This is a market born from demand. If someone wants an Apple product but can't buy it domestically, they will inevitably seek各种渠道," Roland said.
But how much of this is genuine need, and how much is anxiety about falling behind? How long will the current火热 of relay stations last?
During interviews, nearly every operator's phone notifications were constant. V自嘲 he's engaged in "AI literacy," earning辛苦钱 as a 24/7客服: "There are hordes of novice developers with极低 reading comprehension and hands-on ability, yet they want immediate access to the most顶级模型."
Lan Wei's user base is more diverse. He now has six enterprise clients, the largest being an外包公司 handling demands from domestic research institutions. The other five are small companies with registered capital around one million yuan, mostly involved in AI-powered content creation for漫剧.
With graduation season approaching, numerous university students on Xianyu and Douyin are placing orders to use relay stations for论文修改. "A quick online search yields countless tutorials on using GPT for论文修改. Many influencers specifically emphasize foreign models, claiming the latest GPT 5.5 can reduce AI detection rates to certain levels," Lan Wei said. However, in practice, most users lack even basic key调用 skills.
"If it genuinely boosted productivity, that would be one thing. But do these ordinary university students or office workers really need to upgrade to the latest, most expensive models?" Lan Wei expressed a矛盾的情绪.
More clients are good for business, yet Lan Wei sometimes finds himself silently advising them: there's no need to follow the trend blindly.
In Xu Changpeng's view, relay stations are fundamentally a business built on information asymmetry. This gap exists not only among普通用户 but also extensively within the AI industry itself.
During the startup phase of his previous company, Xu explored various cloud vendor and relay station合作 paths. He discovered that besides individual developers, relay station clients included many中小公司. They believed relay stations were their only option for purchasing Claude or ChatGPT access, unaware that正规申请 was often possible.
"Companies that have secured funding基本上 get主动联系 by official cloud代理厂商 like AWS and Google Cloud Platform," he said. "Especially AI companies, most already have overseas entities. Using that entity to negotiate cooperation usually allows正规接入, and商务关系 can secure discounts. But wanting to buy tokens at below 30% discount本身 isn't a reasonable demand."
Xu Changpeng doesn't deny that民间 relay stations have addressed the痛点 of regional restrictions to some extent. However, he believes the current近乎失控的形态 is unsustainable.
For future development, a more合规的方向 is official procurement: platforms aggregating larger token调用量, negotiating discounts directly with model manufacturers, and then distributing quotas to smaller clients—wholesale first, then retail. This is what "正规军" like OpenRouter are doing in the U.S.
As for a more fundamental shift, it depends on the development pace of domestic models.
"I strongly oppose the sentiment of一边骂 Claude, 一边离不开 Claude; 一边说不喜欢国外公司, 一边拼命给它送钱," Xu Changpeng stated. In his view, large models are an industry reliant on real usage feedback. Only when more people are willing to use本土 models can manufacturers establish a positive循环 of data, feedback, and product iteration. The market must vote with its long-term choices.
(V, Lan Wei, and Roland are pseudonyms.)
Comments