Claude Mythos 5 Review: Anthropic's 10-Trillion Parameter Model (2026)

Anthropic revealed its most powerful AI model ever. Not through a press release. Through a misconfigured server.

On March 26, 2026, security researcher Roy Paz discovered that Anthropic's content management system had left roughly 3,000 internal documents publicly accessible, including a draft blog post announcing a model called Claude Mythos. That document described it as 'by far the most powerful AI model we've ever developed,' flagged unprecedented cybersecurity risks, and introduced a new model tier above Opus named Capybara. Anthropic locked the files down within hours and confirmed: yes, it exists, yes it's in testing, yes it represents 'a step change.'

That was five weeks ago. Since then, Anthropic has launched Project Glasswing, briefed senior US government officials, given preview access to Apple, Google, Microsoft, Nvidia, CrowdStrike, and around 50 other organizations, and watched the model find a 27-year-old bug in OpenBSD. Here is everything we know.

1. What Is Claude Mythos 5? (The Story Behind the Leak)

Claude Mythos 5 is Anthropic's unreleased frontier AI model, confirmed by the company on March 26, 2026, after an internal draft announcement was accidentally left in a publicly searchable data store. Anthropic described the model as 'the most capable we have ever built' and 'a step change' in AI performance.

The leak happened because of a misconfiguration in Anthropic's CMS. Around 3,000 assets, including draft blog posts, internal PDFs, and details of an upcoming CEO summit, were accessible without authentication. The files were discovered and reported by Roy Paz, a senior AI security researcher. Anthropic locked everything down and confirmed the model's existence in the same news cycle.

Two names appear in the leaked materials. Claude Mythos is the public-facing product name. Capybara is the internal codename for the new model tier it introduces. The draft post described Mythos as 'larger and smarter than our Opus models, which were until now our most powerful.' Given that Claude Opus 4.6 was Anthropic's flagship as recently as February 2026, that framing is not subtle.

My take: I find it telling that the leak happened at Anthropic, of all companies. The firm built its brand on safety and careful rollout. The irony of the world's most capable, potentially most dangerous AI model being exposed by a database misconfiguration is worth noting, and Anthropic's candid acknowledgment of that irony in their response was the right move.

2. The Capybara Tier: How It Changes Anthropic's Model Lineup

Claude Mythos 5 does not slot into Anthropic's existing three-tier hierarchy. It creates a fourth.

As of early 2026, Anthropic's lineup ran from Haiku (smallest, fastest, cheapest) through Sonnet (mid-range) to Opus (most capable, most expensive). The leaked documents describe Capybara as a tier 'even larger and more capable than Opus, but also more expensive.' Claude Mythos is the first model in that Capybara tier.

The Capybara Tier- How It Changes Anthropic's Model Lineup

The introduction of a new tier is architecturally significant. It signals that Anthropic is not treating Mythos as a cleaned-up Opus 5 upgrade. This is a structural break, driven by raw compute scale and what appears to be a genuinely different capability profile, particularly in cybersecurity.

Anthropic explicitly states the Capybara tier models are 'very expensive to serve.' That phrasing is deliberate. It signals inference costs that put Mythos well outside consumer pricing, targeting instead deeply capitalized enterprise accounts, government contracts, and specialized API use cases where accuracy and capability matter more than cost per token.

3. What Claude Mythos 5 Can Actually Do

The leaked draft identifies three areas where Mythos achieves 'dramatically superior scores' compared to Claude Opus 4.6: software coding, academic reasoning, and cybersecurity. Anthropic has not released official benchmarks. What follows is based on confirmed capabilities from Project Glasswing testing and corroborated leak details.

Software Coding

Mythos reportedly handles multi-file, multi-step coding tasks at a level meaningfully above Claude Opus 4.6, which was already competitive with GPT-5.4 on SWE-Bench Verified. Early enterprise testers describe it as capable of understanding and modifying large codebases without losing context across complex dependency chains. Agentic coding sessions that would require multiple Opus calls can be completed in fewer Mythos turns.

Academic Reasoning

On the Artificial Analysis Intelligence Index, which aggregates performance across MMLU-Pro, GPQA Diamond, Humanity's Last Exam, LiveCodeBench, SciCode, AIME, and MATH-500, Claude 4 Sonnet in Thinking Mode scored 61. Claude Opus 4.6 scored higher. The leaked materials describe Mythos as pushing 'significantly further' beyond Opus on these same tests. Until Anthropic publishes official numbers, the exact benchmarks remain unconfirmed.

Cybersecurity: The Part That Has Washington Worried

This is where Mythos becomes genuinely unprecedented. In the weeks since Project Glasswing launched, the model has identified thousands of zero-day vulnerabilities across major operating systems and browsers. The oldest was a 27-year-old bug in OpenBSD, an operating system historically celebrated for its security record. It had evaded detection by professional security researchers for nearly three decades. Mythos found it.

Anthropic's own leaked draft warned: 'Mythos is currently far ahead of any other AI model in cyber capabilities and presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.'

That is not marketing language. That is a safety warning from the company that built the model. When Anthropic says they are concerned, the cybersecurity industry should take it literally. CrowdStrike, Palo Alto Networks, Zscaler, SentinelOne, Okta, Netskope, and Tenable shares fell between 5% and 11% following Fortune's initial report, reflecting how seriously investors read the implications.

4. Project Glasswing: The Controlled Release Strategy

Because Mythos is too powerful and too risky to release publicly, Anthropic built a containment strategy. Project Glasswing gives preview access to a curated list of organizations for defensive cybersecurity work only.

Named participants as of April 2026:

Amazon Web Services
Apple
Broadcom
Cisco
CrowdStrike
Google
JPMorgan Chase
Microsoft
Nvidia
Palo Alto Networks
Linux Foundation
Approximately 40 additional critical infrastructure organizations

Each organization can use Mythos Preview to find bugs in their own software, test whether known hacking techniques work against their products, and share learnings with the broader industry. The access is scoped to defense. Anthropic has briefed 'senior US officials across the US government' on the model's full offensive and defensive capabilities and made itself available to support government testing.

Why this matters: Unlike most AI labs, which release models and retroactively patch misuse, Anthropic is attempting pre-emptive defense distribution. The idea is to get defenders access to the same offensive capabilities before attackers do. Whether a few dozen organizations can outpace a global threat landscape is a real open question, but the attempt is meaningful.

5. Claude Mythos 5 vs GPT-5.4 vs Gemini 3.1 Ultra

Comparing Mythos to its peers is difficult because Anthropic has not published official benchmarks. Based on confirmed data points across GPT-5.4, Gemini 3.1 Ultra, and the leaked Mythos materials, here is the most accurate side-by-side available as of April 8, 2026.

Claude Mythos vs GPT-5.4

GPT-5.4 is the strongest publicly available coding model as of April 2026, scoring 71.7% on SWE-Bench Verified and 75.0% on the OSWorld desktop task benchmark. Its 'Thinking' variant adds test-time compute for complex reasoning. But GPT-5.4 does not claim any cybersecurity advantage near what Mythos is demonstrating. For coding in general, the gap between GPT-5.4 and Opus 4.6 is measurable but not dramatic. If Mythos is genuinely 'dramatically superior' to Opus 4.6, the gap over GPT-5.4 could be significant.

Claude Mythos vs Gemini 3.1 Ultra

Gemini 3.1 Ultra has the largest publicly available context window at 2 million tokens, native multimodal reasoning across text, image, audio, and video, and a GPQA Diamond score of 94.3%. It is genuinely excellent at long-context research and cross-modal reasoning. Where Mythos has an advantage is in raw compute scale and specialized cybersecurity capability. Gemini 3.1 Ultra is the model to use today for long-document analysis. Mythos may be the model for agentic security and deep reasoning tasks once it becomes accessible.

6. What Will Claude Mythos 5 Cost?

Anthropic has confirmed one thing about Mythos pricing: it will be expensive. The exact phrase used in the leaked materials is 'very expensive to serve,' and Anthropic has repeated this framing in official statements.

For context, Claude Opus 4.6 is priced at $15 per million input tokens and $75 per million output tokens via the API. Mythos, sitting in a tier above Opus, will almost certainly exceed those numbers. Based on the architecture (MoE with an estimated 10T total parameters) and Anthropic's stated efficiency concerns, I expect initial pricing in the range of $50-$100+ per million input tokens for API access, possibly higher for agentic or extended compute use cases.

Consumer availability via Claude.ai is plausible eventually, but unlikely at launch. The compute requirements make a flat-rate subscription implausible until Anthropic achieves the efficiency improvements they are explicitly working toward. Watch the Claude API changelog and the enterprise pricing page closely.

7. When Will Claude Mythos Be Publicly Available?

No official date has been announced. Anthropic's language throughout every public statement is conditional: they will release when safeguards are in place, when efficiency improves, when safety evaluations are complete.

Reading between the lines: the cybersecurity risk is the real bottleneck. Anthropic has briefed the US government. They are running a structured defense-first preview. The steps they are taking look like an organization trying to establish a track record of responsible deployment before they open the floodgates. That process takes months, not weeks.

A realistic public API timeline, based on Anthropic's historical cadence and the complexity of the safety evaluation here, is Q3 or Q4 2026. That estimate could slip to 2027 if offensive capability testing reveals risks that require architectural changes. If you need frontier capability today, Claude Opus 4.6 remains the best publicly available Claude model by a significant margin.

My honest prediction: Anthropic will release Mythos in a limited enterprise API tier, with heavy rate limits and strict use-case restrictions, before offering a broader release. Expect a waitlist, not a launch day free-for-all.

8. What This Means for AI Developers and Builders in 2026

If you build software, start auditing now

A model that found a 27-year-old OpenBSD bug is not a curiosity. It is a signal that entire classes of vulnerabilities that professional security teams have missed for decades are now findable at scale. If you maintain production software, the right move today is not waiting for Mythos access. It is running your codebase through Claude Opus 4.6 with a security-focused system prompt, then through the best available tools like Claude Code Security, and treating that as a floor, not a ceiling.

If you build on the Claude API, watch for the Capybara tier

When Anthropic adds Capybara to the API, the model IDs and pricing tier will appear in their API documentation. Set up monitoring on Anthropic's changelog, their developer Discord, and their status page. The first developers with production-ready Capybara integrations will have a competitive window before the ecosystem catches up.

The arms race just changed permanently

Anthropic's leaked draft warned that Mythos 'presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.' That sentence is not about Mythos specifically. It is about the trajectory. Every frontier lab is building in this direction. The era of AI-native offensive security capabilities is not coming. Based on what Mythos Preview has already found in testing, it is here. The defenders who adapt fastest will win.

Frequently Asked Questions

What is Claude Mythos 5?

Claude Mythos 5 is Anthropic's most powerful AI model to date, accidentally revealed on March 26, 2026, when an internal draft blog post was left in a publicly accessible data store. The model carries an estimated 10 trillion parameters, introduces a new model tier called Capybara that sits above Opus in Anthropic's lineup, and dramatically outperforms Claude Opus 4.6 on benchmarks for coding, academic reasoning, and cybersecurity. Anthropic confirmed the model exists and called it 'a step change' in performance.

Is Claude Mythos 5 available to use?

Claude Mythos 5 is not publicly available as of April 2026. Anthropic is running a controlled preview called Project Glasswing, giving access only to selected companies including AWS, Apple, Cisco, Google, JPMorgan Chase, Microsoft, Nvidia, CrowdStrike, and approximately 40 other critical infrastructure organizations. The model is restricted to defensive cybersecurity work. Anthropic has not announced a public release date, citing ongoing safety evaluations and the need for efficiency improvements before scaling.

How many parameters does Claude Mythos have?

Claude Mythos is estimated at approximately 10 trillion parameters, based on corroborated reports and leaked internal documents. Anthropic has not officially confirmed this figure. For context, GPT-4 was reported to have around 1.7 trillion parameters, and Claude Opus 4.6 is estimated in the hundreds of billions. Claude Mythos uses a Mixture of Experts (MoE) architecture, meaning not all 10 trillion parameters are active during each inference call.

What is Project Glasswing?

Project Glasswing is Anthropic's controlled access program for Claude Mythos Preview, named companies include AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, Microsoft, Nvidia, the Linux Foundation, and Palo Alto Networks, along with roughly 40 additional organizations responsible for critical software infrastructure. Participating companies can use Mythos Preview to find vulnerabilities in their own software and test security techniques on their products. In early testing, Mythos Preview identified thousands of zero-day vulnerabilities, including a 27-year-old bug in OpenBSD.

Is Claude Mythos better than GPT-5.4?

Based on leaked information confirmed by Anthropic, Claude Mythos outperforms GPT-5.4 in cybersecurity benchmarks and deep academic reasoning tasks. GPT-5.4 has advantages in computer use and desktop task automation, scoring 75% on OSWorld-Verified, and offers a 1M token context window with a live public API. Claude Mythos has no publicly confirmed context window or benchmark scores beyond descriptions of 'dramatically superior' results versus Claude Opus 4.6. A definitive comparison requires official benchmarks, which Anthropic has not yet released.

What is the Capybara tier in Claude?

Capybara is the internal codename for a new model tier Anthropic is adding above Opus in its model hierarchy. Where Anthropic currently offers Haiku, Sonnet, and Opus, Capybara sits above all of them. The Capybara tier is described in leaked documents as 'larger and more intelligent than our Opus models, which were until now our most powerful.' Claude Mythos is the first model in the Capybara tier. Anthropic explicitly notes this tier will be 'very expensive to serve' and targets enterprise and specialized deployments.

When will Claude Mythos be released publicly?

Anthropic has not announced a public release date for Claude Mythos. The company has stated it wants to safely deploy Mythos-class models at scale 'when new safeguards are in place' and after efficiency improvements reduce inference costs. The current phase is a limited cybersecurity-focused preview via Project Glasswing. Historically, Anthropic has launched major models within 6-12 months of internal testing, but the cybersecurity risk concerns around Mythos may extend that timeline significantly.

Why did Anthropic leak Claude Mythos early?

Anthropic did not intentionally leak Claude Mythos. On March 26, 2026, a configuration error in Anthropic's content management system left approximately 3,000 internal assets, including a draft blog post about Claude Mythos, in a publicly accessible data store. Security researcher Roy Paz discovered the exposed files. Anthropic quickly locked down the assets, confirmed the leak was due to human error in marking items as private, and released an official statement acknowledging the model's existence and development status.

References

Fortune - Anthropic says testing Mythos powerful new AI model after data leak reveals its existence'
Fortune - Anthropic giving some firms early access to Claude Mythos for cybersecurity
CNN Business - Anthropic will share its new AI model for cybersecurity defence
CoinDesk - Anthropic Claude Mythos leak reveals new AI model that could be a cybersecurity nightmare
SmartChunks - Anthropic 10T Parameter Model Puts Pressure On OpenAI, Google'
Google DeepMind - Gemini 3.1 Pro Model Card