Cursor Composer 2: Benchmarks, Pricing & Full Review (2026)
Cursor just released a coding model that beats Claude Opus 4.6 on Terminal-Bench 2.0 while costing 10 times less. That is not a typo. Composer 2 launched on March 19, 2026, and I have been going through every piece of data Cursor published to give you the most complete picture of what this model actually does, what it costs, and whether it should change how your team uses AI for coding.
The short version: Composer 2 scores 61.3 on CursorBench, 61.7 on Terminal-Bench 2.0, and 73.7 on SWE-bench Multilingual. Cursor's prior model, Composer 1.5, scored 44.2, 47.9, and 65.9 respectively. That is not a small jump. It is also worth knowing that Composer 2 is built on Kimi K2.5, an open-source model from Moonshot AI, with Cursor's own continued pretraining and reinforcement learning layered on top. The provenance detail matters, and I will get into why.
I am going to break down the architecture, the benchmark data, the pricing comparison against GPT-5.4 and Claude Opus 4.6, and what the Kimi K2.5 base actually means for people who care about model transparency.
What Is Cursor Composer 2?
Cursor Composer 2 is Cursor's third-generation proprietary coding model, released on March 19, 2026, and available directly inside the Cursor IDE. It is positioned as a frontier-level agentic coding model that can handle complex, multi-step coding tasks requiring hundreds of sequential actions.
Cursor, the AI code editor built by San Francisco startup Anysphere (currently valued at $29.3 billion), first introduced its in-house Composer model series in October 2025 alongside the Cursor 2.0 platform redesign. Composer 1.5 followed in February 2026. Composer 2 is the biggest leap so far.
The model ships with a 200,000-token context window and comes in two variants: a standard version priced at $0.50 per million input tokens, and a fast version at $1.50 per million input tokens. The fast variant is now the default option inside Cursor.
What makes Composer 2 different from simply plugging in a third-party model like Claude or GPT-5.4 is deep IDE integration. Composer 2 has direct access to search, terminals, version control, and isolated worktrees inside Cursor, which reduces the friction of multi-file, multi-step coding tasks compared to chat-based alternatives.
How Composer 2 Was Built: Architecture and Training
Composer 2 uses a Mixture-of-Experts (MoE) architecture built on Kimi K2.5, the open-source model from Moonshot AI, enhanced with Cursor's own continued pretraining and reinforcement learning. Cursor confirmed the Kimi K2.5 base on March 20, 2026, after a user discovered it in API request headers. Lee Robinson, VP of Developer Education at Cursor, acknowledged that roughly 25% of the compute behind the final model comes from the Kimi K2.5 base, with the rest coming from Cursor's own training.
Here is what changed compared to Composer 1.5. Prior Composer models were built by applying reinforcement learning directly on top of a frozen base model. Think of it like teaching advanced skills on a foundation that was never specifically prepared for them. Composer 2 flips this: Cursor first ran continued pretraining to update the foundational model weights using coding-specific data, then applied RL on top of that stronger base.
The RL training itself focuses on long-horizon coding tasks. Cursor's approach, which they call compaction-in-the-loop reinforcement learning, builds context summarization directly into the training process. When a generation sequence hits a token-length threshold, the model compresses its own context to approximately 1,000 tokens from 5,000 or more. According to Cursor's March 2026 research documentation, this approach reduces compaction error by 50% compared to prior methods and enables the agent to work through hundreds of sequential actions on project-scale refactors without losing its goal.
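To make the compaction idea concrete, here is a minimal sketch of the inference-time mechanics only: an agent loop that collapses its history into a short summary once the context crosses a threshold. The threshold, the word-based token counter, and the truncating summarizer are all my own placeholders. In the scheme Cursor describes, the model writes its own summary and that behavior is trained inside the RL loop, which this sketch does not attempt to show.

```python
from typing import Callable, List

# Assumed numbers: the article only says context shrinks from 5,000+ tokens to about 1,000.
COMPACTION_THRESHOLD = 5_000  # compact once the context reaches this many tokens
SUMMARY_BUDGET = 1_000        # target size of the compacted context


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def truncate_summarizer(history: List[str], budget: int) -> str:
    """Placeholder summarizer that keeps only the most recent `budget` words.

    A real system would have the model write the summary itself.
    """
    words = " ".join(history).split()
    return " ".join(words[-budget:])


def run_agent(
    step_outputs: List[str],
    summarize: Callable[[List[str], int], str] = truncate_summarizer,
) -> List[str]:
    """Append each step's output to the context, compacting when it grows too large."""
    context: List[str] = []
    for output in step_outputs:
        context.append(output)
        total = sum(count_tokens(chunk) for chunk in context)
        if total >= COMPACTION_THRESHOLD:
            # Replace the whole history with a single compact summary and continue.
            context = [summarize(context, SUMMARY_BUDGET)]
    return context


# Twenty hypothetical steps of 600 tokens each.
final_context = run_agent(["word " * 600 for _ in range(20)])
final_size = sum(count_tokens(chunk) for chunk in final_context)
print(f"context size after 20 steps: {final_size} tokens")
```

The point of the design is that the context never grows without bound: however many steps the agent takes, the working context stays under the threshold, so a long refactor does not run out of window.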
The MoE architecture means only a subset of model parameters activates for any given input, which keeps inference fast while maintaining a large total parameter count. Cursor has not published the exact total parameter count.
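A toy version of top-k expert routing shows why only part of the model runs per input. Everything here is illustrative: the four scalar "experts", the router scores, and top_k=2 are assumptions of mine, not details Cursor or Moonshot AI have published for Composer 2. In a real MoE layer the router is a learned network that scores experts per token.

```python
import math
from typing import Callable, List, Sequence, Tuple

Expert = Callable[[float], float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(
    x: float,
    router_scores: Sequence[float],
    experts: Sequence[Expert],
    top_k: int = 2,
) -> Tuple[float, List[int]]:
    """Run only the top_k highest-scoring experts and mix their outputs."""
    ranked = sorted(range(len(experts)), key=lambda i: router_scores[i], reverse=True)
    active = ranked[:top_k]
    # Renormalize the router weights over the selected experts only.
    weights = softmax([router_scores[i] for i in active])
    output = sum(w * experts[i](x) for w, i in zip(weights, active))
    return output, active


# Four toy experts; only two of them are evaluated for this input.
toy_experts: List[Expert] = [
    lambda x: x,
    lambda x: 2 * x,
    lambda x: 3 * x,
    lambda x: 4 * x,
]
result, used = moe_forward(2.0, [0.0, 5.0, 5.0, -1.0], toy_experts, top_k=2)
print(f"output={result}, experts used={sorted(used)}")
```

The unselected experts are never called, which is the source of the inference savings: total parameter count can grow with the number of experts while per-input compute stays tied to top_k.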
Benchmark Results: Composer 2 vs Opus 4.6 vs GPT-5.4
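Composer 2 outperforms Claude Opus 4.6 on Terminal-Bench 2.0, scoring 61.7 against Opus 4.6's 58.0, while GPT-5.4 still leads the field at 75.1 on the same benchmark. Here are the scores cited in this review:
• Composer 2: 61.3 on CursorBench, 61.7 on Terminal-Bench 2.0, 73.7 on SWE-bench Multilingual
• Composer 1.5: 44.2 on CursorBench, 47.9 on Terminal-Bench 2.0, 65.9 on SWE-bench Multilingual
• Composer 1: 38.0 on CursorBench, 40.0 on Terminal-Bench 2.0
• Claude Opus 4.6: 58.0 on Terminal-Bench 2.0
• GPT-5.4: 75.1 on Terminal-Bench 2.0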
A few things I want to flag about these numbers. CursorBench is Cursor's own proprietary evaluation suite, which means the scores there are self-reported and not independently verified yet. Terminal-Bench 2.0 is maintained by the Laude Institute and uses the Harbor evaluation framework, which gives it more credibility as a third-party standard. SWE-bench Multilingual is a well-established benchmark for multi-language software engineering tasks.
The gain from Composer 1.5 to Composer 2 is genuinely large: a relative improvement of about 39% on CursorBench (44.2 to 61.3) and 29% on Terminal-Bench 2.0 (47.9 to 61.7). The jump is also bigger than the one from Composer 1 to 1.5, which makes sense given the change in training recipe from RL-only scaling to continued pretraining plus RL.
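If you want to check the percentages yourself, the arithmetic is a relative gain over the old score, not a difference in points. A quick sketch using only the scores quoted in this review (the function and variable names are mine):

```python
def relative_gain(old: float, new: float) -> float:
    """Percentage improvement of `new` over `old`, relative to `old`."""
    return (new - old) / old * 100


# (Composer 1.5 score, Composer 2 score), as reported in this article.
scores = {
    "CursorBench": (44.2, 61.3),
    "Terminal-Bench 2.0": (47.9, 61.7),
    "SWE-bench Multilingual": (65.9, 73.7),
}

gains = {name: relative_gain(old, new) for name, (old, new) in scores.items()}

for name, (old, new) in scores.items():
    print(f"{name}: {old} to {new} (+{new - old:.1f} points, +{gains[name]:.1f}% relative)")
```

This works out to roughly 38.7% on CursorBench, 28.8% on Terminal-Bench 2.0, and 11.8% on SWE-bench Multilingual, so the headline percentages in the text are rounded.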
Cursor is not claiming the top spot overall. GPT-5.4 still leads Terminal-Bench 2.0 at 75.1, and Cursor's messaging is deliberately pragmatic: Composer 2 offers a strong cost-to-intelligence ratio for everyday coding inside the Cursor IDE, not universal benchmark dominance. That honesty, I think, is the right move.
Pricing: How Much Does Composer 2 Cost?
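Composer 2 Standard costs $0.50 per million input tokens and $2.50 per million output tokens, which is approximately 86% cheaper than Composer 1.5's previous pricing of $3.50 and $17.50 respectively. Here is how the pricing stacks up, in dollars per million tokens:
• Composer 2 Standard: $0.50 input, $2.50 output
• Composer 2 Fast: $1.50 input, $7.50 output
• Composer 1.5: $3.50 input, $17.50 output
• GPT-5.4: about 5 times Composer 2 Standard on input and 6 times on output
• Claude Opus 4.6: about 10 times Composer 2 Standard on both input and output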
The price drop is significant: the 86% reduction applies to input and output tokens alike, and even Composer 2 Fast at $1.50/$7.50 is 57% cheaper than Composer 1.5.
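The discount figures follow directly from the list prices. A short check, using only the per-million-token prices quoted in this section (the dictionary layout and function name are my own):

```python
def percent_cheaper(old_price: float, new_price: float) -> float:
    """How much cheaper `new_price` is than `old_price`, as a percentage."""
    return (1 - new_price / old_price) * 100


# Dollars per million tokens, as quoted in this article.
COMPOSER_1_5 = {"input": 3.50, "output": 17.50}
COMPOSER_2 = {
    "standard": {"input": 0.50, "output": 2.50},
    "fast": {"input": 1.50, "output": 7.50},
}

savings = {
    tier: {kind: percent_cheaper(COMPOSER_1_5[kind], price) for kind, price in prices.items()}
    for tier, prices in COMPOSER_2.items()
}

for tier, by_kind in savings.items():
    print(f"{tier}: input {by_kind['input']:.1f}% cheaper, output {by_kind['output']:.1f}% cheaper")
```

Because both Composer 2 tiers keep the same 1:5 input-to-output price ratio as Composer 1.5, the discount is identical for input and output within each tier: about 85.7% for Standard and 57.1% for Fast.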
On individual Cursor plans, Composer model usage falls within a separate usage pool with a generous base allocation. When you use Cursor's Auto mode (letting it pick the best model per request), Composer usage is unlimited on paid plans with no credit deduction. Third-party models like GPT-5.4 and Opus 4.6 draw from your monthly credit pool instead.
Cache reads are billed at a lower rate than fresh input: $0.20 per million tokens for Composer 2 Standard and $0.35 per million for Composer 2 Fast. Composer 1.5 also charged $0.35 per million, so only the Standard tier is cheaper than before on cache reads.
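Cache reads matter most in agentic sessions, where a large share of each request is context the model has already seen. Here is a rough per-request estimator using the rates above. The token counts are a made-up workload, and I am assuming cached input tokens are billed at the cache-read rate in place of the normal input rate, which is how cache-read pricing usually works but is not spelled out in the figures quoted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Prices in dollars per million tokens."""
    input: float
    cache_read: float
    output: float


# Rates as quoted in this article.
STANDARD = Rates(input=0.50, cache_read=0.20, output=2.50)
FAST = Rates(input=1.50, cache_read=0.35, output=7.50)


def request_cost(rates: Rates, fresh_input: int, cached_input: int, output: int) -> float:
    """Dollar cost of one request, assuming cached tokens bill at the cache-read rate."""
    total = (
        fresh_input * rates.input
        + cached_input * rates.cache_read
        + output * rates.output
    )
    return total / 1_000_000


# Hypothetical agent turn: 20k new tokens, 150k cached context, 4k generated.
standard_cost = request_cost(STANDARD, 20_000, 150_000, 4_000)
fast_cost = request_cost(FAST, 20_000, 150_000, 4_000)
print(f"standard: ${standard_cost:.4f}  fast: ${fast_cost:.4f}")
```

For this workload the turn costs about $0.05 on Standard and about $0.11 on Fast, and on Standard the cached context accounts for more than half of the bill even at the discounted rate.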
Cursor Composer 2 vs Claude Code: Which One Should You Use?
Cursor Composer 2 and Claude Code serve different workflows and are more complementary than competitive. According to a 2026 developer survey cited by DataCamp, Claude Code now leads as the most-loved AI coding tool among professionals, with 46% naming it the tool they love most. Cursor came in second at 19%.
The practical difference comes down to where you work. Claude Code is Anthropic's terminal-based coding agent. It excels at complex, autonomous tasks that benefit from deep reasoning, like long-term system maintenance and multi-step architectural decisions. Many developers use Cursor for everyday IDE editing and switch to Claude Code for more demanding autonomous tasks.
Composer 2's advantage is its tight integration with Cursor's IDE environment. It has direct access to your codebase's search, terminal, file system, and version control without requiring external tooling. That makes it faster and less friction-heavy for routine coding, multi-file edits, and iterative development cycles.
My take: if you are already a Cursor user, Composer 2 should be your default model for day-to-day coding. It is unlimited on paid plans when used through Auto mode, and the benchmark data shows it is now legitimately competitive with the frontier. For complex reasoning tasks or system-level operations, Claude Code still has an edge; one analyst noted that Composer lacks the reasoning depth of Opus 4.6 for non-coding tasks. But for writing, editing, and testing code inside an IDE, Composer 2 makes a strong case.
GitHub Copilot, for comparison, still has the widest adoption at over 20 million all-time users, but many developers report that Cursor's multi-file editing capabilities go deeper than Copilot's Agent mode. Roughly 70% of developers now use two to four AI tools simultaneously, so picking one tool as your exclusive option is increasingly a minority approach.
The Kimi K2.5 Controversy: What It Means for You
Cursor did not disclose at launch that Composer 2 is built on Kimi K2.5, an open-source model developed by Moonshot AI in China. The disclosure came one day after launch, after a user discovered the base model identity in API request headers.
Lee Robinson, Cursor's VP of Developer Education, confirmed the Kimi K2.5 foundation and clarified that Cursor's continued pretraining and RL account for about 75% of the compute behind Composer 2, with the Kimi K2.5 base making up the remaining 25% or so. Robinson stated the performance is now very different from the base Kimi K2.5 model.
I think the lack of upfront disclosure was a mistake, not a scandal. Open-source model bases are common in the industry. The more relevant question for most teams is: does the model work well, and is it priced appropriately? On both counts, the data suggests yes.
For teams with strict data sovereignty requirements or supply chain policies around Chinese-origin technology, the Kimi K2.5 foundation is a real consideration that should factor into your procurement process. Cursor does enforce sandbox execution and commit signing, and provides audit trails for enterprise governance. But the underlying model origin is a legitimate question for compliance-sensitive environments.
How to Use Composer 2 in Cursor
Composer 2 is available now inside Cursor and in the early alpha of Cursor's new interface called Glass. Here is how to access it:
• Open Cursor and navigate to the model selector in the Composer panel.
• Select Composer 2 or Composer 2 Fast from the model list. Fast is now the default option.
• Alternatively, use Auto mode and Cursor will route appropriate requests to Composer 2 automatically, with unlimited usage on paid plans.
• API access is available via Cursor's model API at $0.50/$2.50 per million tokens for Standard and $1.50/$7.50 for Fast.
Cursor's individual plan includes Composer 2 usage in a standalone pool separate from third-party model credits. If you are currently spending credits on Opus 4.6 or GPT-5.4 for routine coding tasks, switching to Composer 2 through Auto mode is likely to reduce your credit burn without a meaningful quality drop for most use cases.
Is Cursor Composer 2 Worth It? My Honest Take
The benchmark improvements are real and substantial. A roughly 39% relative jump on CursorBench and a score of 73.7 on SWE-bench Multilingual put Composer 2 firmly in the competitive tier of coding models, not a budget option that makes you feel the tradeoff.
The pricing story is even more interesting. At $0.50/$2.50 per million tokens, Cursor has priced Composer 2 more aggressively than any comparable frontier coding model. Claude Opus 4.6 costs 10 times as much on both input and output tokens. GPT-5.4 costs 5 times as much on input and 6 times as much on output. For teams running high token volumes, the economics shift significantly.
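To see what those multiples mean at volume, here is a back-of-the-envelope monthly bill. The competitor prices are derived from the multiples quoted above rather than taken from a price sheet, and the workload of 2 billion input and 200 million output tokens per month is a hypothetical I picked for illustration.

```python
# Composer 2 Standard list price from this article, in dollars per million tokens.
BASE_INPUT, BASE_OUTPUT = 0.50, 2.50

# (input multiple, output multiple) relative to Composer 2 Standard, as quoted above.
PRICE_MULTIPLES = {
    "Composer 2 Standard": (1, 1),
    "GPT-5.4": (5, 6),
    "Claude Opus 4.6": (10, 10),
}


def monthly_cost(model: str, input_millions: float, output_millions: float) -> float:
    """Monthly spend in dollars for a workload measured in millions of tokens."""
    input_multiple, output_multiple = PRICE_MULTIPLES[model]
    return (
        input_millions * BASE_INPUT * input_multiple
        + output_millions * BASE_OUTPUT * output_multiple
    )


# Hypothetical team workload: 2,000M input tokens and 200M output tokens per month.
bill = {model: monthly_cost(model, 2_000, 200) for model in PRICE_MULTIPLES}

for model, dollars in bill.items():
    print(f"{model}: ${dollars:,.0f} per month")
```

On that workload the bill comes to $1,500 for Composer 2 Standard, $8,000 at the GPT-5.4 multiples, and $15,000 at the Opus 4.6 multiples, before any cache-read discounts or plan allowances.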
The contrarian point I will make: strong benchmark results do not always translate to daily-use satisfaction. Composer 2 does not match GPT-5.4's Terminal-Bench 2.0 score of 75.1, and Opus 4.6 still has stronger general reasoning capabilities outside pure coding tasks. If your workflows require the model to do significant reasoning about system design or long-term planning beyond just writing code, Composer 2 may not fully replace a frontier reasoning model.
But for what most developers actually use Cursor for (editing files, refactoring functions, generating boilerplate, fixing bugs, writing tests), Composer 2 at this price point is hard to argue against.
Frequently Asked Questions
What is Composer 2 in Cursor?
Composer 2 is Cursor's third-generation proprietary AI coding model, released on March 19, 2026. It is built on Kimi K2.5 from Moonshot AI with additional continued pretraining and reinforcement learning. The model scores 61.3 on CursorBench, 61.7 on Terminal-Bench 2.0, and 73.7 on SWE-bench Multilingual.
How much does Cursor Composer 2 cost?
Composer 2 Standard is priced at $0.50 per million input tokens and $2.50 per million output tokens. The fast variant, which is now the default, costs $1.50 per million input tokens and $7.50 per million output tokens. Standard is roughly 86% cheaper than Composer 1.5, and Fast is roughly 57% cheaper.
Is Composer 2 free on Cursor?
Composer 2 usage is included in a standalone usage pool on Cursor's individual paid plans. When using Auto mode, Composer model usage is unlimited on paid plans with no credit deduction. Direct access to third-party frontier models like GPT-5.4 and Opus 4.6 still draws from your monthly credit pool.
How does Composer 2 compare to Claude Code?
Cursor Composer 2 and Claude Code serve different use cases. A 2026 developer survey found that 46% of professionals named Claude Code as their most-loved AI coding tool versus 19% for Cursor. Composer 2 excels at in-IDE coding tasks with tight integration into Cursor's file system, terminal, and version control. Claude Code is preferred for more complex, autonomous, reasoning-heavy tasks outside the IDE context.
What benchmarks does Composer 2 score on?
Composer 2 scores 61.3 on CursorBench (up from 44.2 for Composer 1.5), 61.7 on Terminal-Bench 2.0 (up from 47.9), and 73.7 on SWE-bench Multilingual (up from 65.9). It outperforms Claude Opus 4.6's Terminal-Bench 2.0 score of 58.0 but trails GPT-5.4 at 75.1 on the same benchmark.
Is Cursor Composer 2 built on Kimi K2.5?
Yes. Cursor confirmed on March 20, 2026, that Composer 2 is built on Kimi K2.5, an open-source model developed by Moonshot AI. Cursor applied continued pretraining and reinforcement learning on top of the Kimi K2.5 base. Lee Robinson, VP of Developer Education at Cursor, stated that roughly 75% of the compute behind Composer 2 comes from Cursor's additional training.
What is the context window for Cursor Composer 2?
Cursor Composer 2 ships with a 200,000-token context window, which is sufficient for large codebase operations and project-scale refactoring tasks.
Cursor Composer 2 vs Composer 1: What changed?
Composer 2 represents the largest generational jump in the Composer series. The main change is in the training recipe: Cursor now runs continued pretraining on the base model before applying reinforcement learning. Composer 1 scored 38.0 on CursorBench and 40.0 on Terminal-Bench 2.0. Composer 2 scores 61.3 and 61.7 on the same benchmarks, a relative gain of over 50% on Terminal-Bench 2.0.


