Cursor's Self-Developed Model Claims to Outperform Opus 4.6 with a 90% Price Cut, Faces Backlash Over Kimi K2.5 Base Allegations, and Draws Confirmation From Musk

Deep News · 03-21 13:38

AI programming tool Cursor has publicly launched its self-developed model, Composer 2, claiming it surpasses the performance of Claude Opus 4.6 while significantly reducing costs. However, within three hours of the announcement, developers revealed that the model is actually built on Kimi K2.5, an open-source model from the Chinese company Moonshot AI.

The controversy surrounding this "self-developed" model quickly spread across the AI community. Elon Musk personally weighed in to confirm the findings, and the situation concluded with a public apology from Cursor's co-founder and a congratulatory message from Kimi's official account.

On March 21, according to Hard AI reports, Cursor co-founder Aman Sanger acknowledged the oversight after the incident gained traction, stating, "It was our omission not to mention the Kimi base model in the initial blog post. We will correct this in the next model." Moonshot AI's official account responded promptly: "Congratulations to Cursor on the launch of Composer 2. We are proud to see Kimi K2.5 serve as the base model. This is the open-source ecosystem we advocate." Moonshot AI clarified that Cursor accessed Kimi K2.5 through the reinforcement learning and inference platform hosted by Fireworks AI, confirming it as an authorized commercial partnership.

**Outperforming Opus 4.6 with a "Massive" Price Cut** Cursor officially released Composer 2 this past Friday, announcing in its blog that the model achieved significant improvements across all benchmark tests it measured, including Terminal-Bench 2.0 and SWE-bench Multilingual.

On Terminal-Bench 2.0, which evaluates an agent's terminal operation capabilities, Composer 2's performance ranked between GPT-5.4 and Claude Opus 4.6. On the CursorBench metric for cost-effectiveness, it significantly outperformed both competing models.

Pricing was a central selling point of Cursor's release. The standard version of Composer 2 is priced at $0.5 per million input tokens and $2.5 per million output tokens, representing what the company described as a "massive" reduction compared to Claude Opus 4.6.

Cursor also introduced a faster variant, Composer 2 Fast, priced at $1.5 per million input tokens and $7.5 per million output tokens, maintaining a price advantage while emphasizing improved response speed.
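At these per-token rates, comparing the two variants on a given workload is simple arithmetic. The sketch below illustrates it; only the per-million-token prices come from the article, while the workload sizes are invented for illustration.

```python
# Cost comparison at the published Composer 2 price points.
# Prices (USD per million tokens) are from the announcement; the
# example workload below is made up, not Cursor data.

PRICES = {
    "composer-2":      {"input": 0.5, "output": 2.5},
    "composer-2-fast": {"input": 1.5, "output": 7.5},
}

def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one job at the listed per-million-token rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a session consuming 2M input tokens and emitting 200k output tokens.
print(job_cost("composer-2", 2_000_000, 200_000))       # 1.5
print(job_cost("composer-2-fast", 2_000_000, 200_000))  # 4.5
```

On this hypothetical workload the fast variant costs three times as much, matching the 3x ratio of its published rates.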

Cursor attributed this breakthrough in cost-effectiveness to a new reinforcement learning method, emphasizing it is "a genuinely trained capability, not an inference trick."

**Base Model Exposed Within Three Hours of Launch** Composer 2's moment in the spotlight was short-lived. Less than three hours after its release, X platform user @fynnso discovered that the model's internal ID was `kimi-k2p5-rl-0317-s515-fast`, leading him to conclude: "Composer 2 is essentially a reinforcement-learned version of Kimi K2.5."

This discovery rapidly spread across technical communities like X and Hacker News, generating both memes and serious discussion. Elon Musk directly replied to @fynnso's post with "Yeah, it's Kimi 2.5," amplifying the topic's visibility.

Discussions on the r/singularity subreddit were equally intense. One user commented:

"The funniest part is everyone praising Composer 2 as a huge leap, when they were using someone else's model the entire time. It makes you wonder how many so-called 'proprietary models' are just fine-tuned open-source versions with a new logo."

Another perspective suggested that Cursor's true competitive advantage lies in the task-solving data accumulated from extensive developer usage, not in pre-training itself. "Every investor knows they aren't building their own base model," the comment read. "They should have been transparent about it from the start."

**Cursor Apologizes, Kimi Confirms Authorized Partnership** Facing public pressure, the Cursor team responded directly. Aman Sanger publicly confirmed that the team had evaluated several base models using perplexity tests and found Kimi K2.5 to be "demonstrably the strongest." He explained that they then applied continued pre-training followed by a high-compute reinforcement learning process at four times the scale, deploying it via Fireworks AI's inference and RL sampler.
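Perplexity-based comparison of the kind Sanger describes reduces to one formula: the exponential of the average negative log-probability a model assigns to held-out text, where a lower score means the model predicts the text better. A minimal illustration follows; the log-probability values are invented, not from any real evaluation.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(-mean log-probability) over a token sequence."""
    n = len(token_logprobs)
    return math.exp(-sum(token_logprobs) / n)

# Toy per-token log-probs for the same held-out text under two
# hypothetical base models: the model assigning higher probability
# (less negative log-probs) yields lower perplexity and wins.
model_a = [-1.2, -0.8, -2.0, -0.5]
model_b = [-2.5, -1.9, -3.1, -1.4]
print(perplexity(model_a) < perplexity(model_b))  # True
```

In practice the log-probs come from running each candidate model over a shared evaluation corpus, but the ranking logic is exactly this comparison.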

Cursor's VP of Developer Education, Lee Robinson, added further technical detail: approximately one-quarter of the final model's compute footprint derives from the base model, with the remaining three-quarters coming from Cursor's own training. Robinson also indicated that while Composer 2 is based on an open-source model, future iterations would involve full pre-training from scratch.

Moonshot AI subsequently issued a clear statement, emphasizing that the cooperation complies with licensing requirements and constitutes an authorized commercial partnership, while also congratulating Cursor on the Composer 2 launch.

While this clarified the legal and authorization aspects of the controversy, Cursor's initial decision to omit information about the base model during the launch continued to generate discussion within developer communities.

**"Taking Notes" Reinforcement Learning: Cursor's Technical Explanation** Despite the controversy over the base model's origin, Cursor's technical work possesses independent merit. Cursor's blog detailed its core methodology, a reinforcement learning mechanism called "self-summary," designed to address the problem of AI coding assistants losing focus during ultra-long, complex tasks due to limited context windows.

Specifically, during task execution, the model pauses upon reaching a fixed token-length trigger point, generates a "stage summary," and then continues the task based on the compressed context. This summarization capability is integrated into the reinforcement learning reward mechanism: summaries that lead to higher subsequent task success rates earn greater rewards, while failures are penalized.
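The pause-summarize-continue loop described above can be sketched roughly as follows. This is a guess at the control flow only: the token threshold, the `generate_step` and `summarize` callables, and the whitespace tokenizer are all stand-ins, not Cursor's actual implementation.

```python
# Hypothetical sketch of a "self-summary" agent loop.
MAX_CONTEXT_TOKENS = 1000  # trigger point; Cursor's real threshold is not public

def count_tokens(text: str) -> int:
    # Crude stand-in tokenizer: whitespace split.
    return len(text.split())

def run_task(task: str, generate_step, summarize, max_rounds: int = 50) -> str:
    context = task
    for _ in range(max_rounds):
        step = generate_step(context)  # model produces the next action/output
        context += "\n" + step
        if count_tokens(context) > MAX_CONTEXT_TOKENS:
            # Pause and compress: replace the transcript with a stage summary,
            # keeping the original task so the goal is never lost.
            context = task + "\n[summary] " + summarize(context)
        if step == "DONE":
            break
    return context

# Demo with stubbed model calls:
result = run_task("fix the bug", lambda ctx: "DONE", lambda ctx: "brief summary")
print(result.endswith("DONE"))  # True
```

The key design point mirrored from the blog's description is that the summary replaces the transcript rather than being appended to it, which is what keeps the context bounded over arbitrarily long tasks.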

Cursor's disclosed internal test data indicated that compared to traditional summarization methods, this approach uses only one-fifth the tokens, while compression-related errors are reduced by approximately 50%. Cursor provided the example of the highly complex task of "running the Doom game on a MIPS architecture": Composer reportedly completed it after 170 rounds of interaction, compressing a context of over 100,000 tokens down to about 1,000.

**Debate on Open-Source Ecosystems and Transparency** The incident sparked a deeper discussion about trust between the AI application layer and the open-source ecosystem. Hugging Face co-founder and CEO Clement Delangue highlighted the value of open source, noting that Chinese open-source models have become a major force shaping the global AI technology stack.

Competitor Windsurf quickly capitalized on the situation, announcing it would offer free access to Kimi K2.5 for all users for the upcoming week, aiming to attract Cursor's user base.

Analysis suggests this controversy adds unexpected public relations pressure for Cursor at a critical juncture for fundraising. Reports indicate Cursor is currently seeking a new funding round at a valuation of $50 billion.

Cursor co-founder Aman Sanger has previously described the company as a new type of entity that is "neither a pure application developer nor a model provider." This event underscores that as the performance of open-source base models approaches that of top-tier proprietary models, downstream application companies will face an unavoidable industry-wide challenge: balancing commercial packaging with technical transparency.

