buildfastwithaibuildfastwithai
GenAI LaunchpadAI WorkshopsAll blogs
Download Unrot App
Free AI Workshop
Share
Back to blogs

Project Glasswing: Claude Mythos Found 23,019 Bugs

May 25, 2026
16 min read
Share:
Project Glasswing: Claude Mythos Found 23,019 Bugs
Share:

Project Glasswing: How Claude Mythos Found 23,019 Vulnerabilities - and Why That Is the Smaller Story

On May 22, 2026, Anthropic published the first month of results from Project Glasswing — and the headline numbers are doing exactly what they were designed to do: dominate every cybersecurity feed in the world. A restricted preview model called Claude Mythos scanned over 1,000 open-source projects and flagged 23,019 vulnerabilities. Of those, 6,202 were estimated high or critical severity. Independent security firms checked a 1,752-finding sample and confirmed 90.6% were real bugs.

Mozilla patched 271 of them in Firefox 150 — a single release. Cloudflare found 2,000 across its critical infrastructure. A banking partner stopped a $1.5 million fraudulent wire transfer mid-execution. One wolfSSL vulnerability, CVE-2026-5194, would have allowed attackers to forge TLS certificates across billions of IoT and industrial devices.

That is the lead. Here is the real story, the one that should bother you more: less than 1% of the vulnerabilities Mythos found have actually been patched. The bottleneck in security has officially moved. We can find bugs faster than humans can ever read about them. We cannot fix them at anything close to that pace. And the same model that finds them defensively will, eventually, be in the wrong hands.

1. What is Project Glasswing?

Project Glasswing is Anthropic's restricted cybersecurity initiative, announced on April 7, 2026 and reporting first results on May 22, 2026. It is the deployment vehicle for a model called Claude Mythos Preview — Anthropic's most powerful and most restricted model — into the hands of roughly 50 vetted partner organizations for defensive security work only.

The partner list reads like a CISA address book. AWS, Apple, Cisco, Google, JPMorgan Chase, Microsoft, NVIDIA, CrowdStrike, Cloudflare, and Mozilla are among the confirmed participants. The UK AI Security Institute is involved on the evaluation side. The US government has been briefed at CISA and the Commerce Department. Anthropic has committed $100 million in model usage credits specifically for defensive cybersecurity work under the program.

The rules are strict by design. Partners may only use Mythos for finding and fixing vulnerabilities in their own software, or in open-source projects, and only as part of coordinated defensive efforts. Offensive use — even authorized penetration testing — sits outside the program.

If you want the full backstory of how Mythos was accidentally leaked in March 2026, formally announced two weeks later, and then deliberately locked behind a 50-org wall, our Claude Mythos release-date and access deep-dive covers that timeline in detail. This post is the post-Month-1 update.

2. What is Claude Mythos?

Claude Mythos is Anthropic's frontier AI model that sits one tier above Claude Opus in the company's published lineup. In the leaked internal documents from March 2026, this tier carried the codename Capybara — described as 'even larger and more capable than Opus, but also more expensive.' Mythos is the first model in that tier to ship in any form.

Anthropic's own framing is precise and worth quoting exactly: Mythos is 'a new general-purpose language model that performs strongly across the board, but is strikingly capable at computer security tasks.' Three words there are doing all the work. 'General-purpose' — this is not a specialized security tool. 'Strikingly capable' — Anthropic chose that adjective deliberately. 'Strongly across the board' — the security capability is a side effect of overall reasoning quality, not a fine-tuned vertical.

On the CyberGym benchmark (vulnerability reproduction), Mythos Preview scores 83.1%. For context, Claude Opus 4.7 scores 73.1% and GPT-5.4 scores 66.3%. That 10-point gap from Opus to Mythos is the difference between 'good at finding bugs' and 'genuinely changes the offense-defense balance.' Anthropic deliberately trained Opus 4.7 to be less capable than Mythos at cybersecurity tasks — the public model ships with automatic detection and blocking of prohibited security uses.

The UK AI Security Institute's evaluation went further. Mythos became the first model to solve both of the institute's full cyber ranges end to end. It autonomously completed a 32-step corporate network attack simulation called 'The Last Ones' in three out of ten attempts. It chains multiple small vulnerabilities into single devastating attacks, reconstructs source code from deployed binaries, and builds custom lateral-movement tooling on the fly. That is the capability Anthropic is choosing not to release.

3. The Month 1 numbers, in context

Here are the open-source scan results, which are the most clearly attributable numbers in the report:

Here are the open-source scan results, which are the most clearly attributable numbers in the report:

The 23,019 total findings come from Anthropic's own scan of over 1,000 open-source projects over the past few months. The 6,202 number is the estimated high- or critical-severity subset. The 90.6% true-positive rate is the result of independent security firms manually validating 1,752 of the high/critical findings — and 62.4% of that sample were confirmed at the original severity level.

Beyond Anthropic's internal scans, partners deploying Mythos found additional vulnerabilities at unprecedented rates:

Beyond Anthropic's internal scans, partners deploying Mythos found additional vulnerabilities at unprecedented rates:

Two numbers worth highlighting. Mozilla's 271 Firefox vulnerabilities is more than 10× what the same team found in Firefox 148 using Claude Opus 4.6 in the prior cycle. Firefox CTO Bobby Holley described the defects as 'finite' and said defenders can 'finally find them all.' Cloudflare's report notes that Mythos's false-positive rate beat human penetration testers on their internal systems — a structural change, not a marginal one.

Across all 50 partners combined, the broader bug-finding rate has 'increased by more than a factor of ten,' according to Anthropic's report. Most partners individually report 'hundreds' of high- or critical-severity findings inside their own software. Microsoft's recent statement that future Patch Tuesday releases will 'continue trending larger for some time' is, per Engadget's reporting, directly attributable to Mythos findings.

4. Inside the wolfSSL, Firefox, and bank-fraud catches

Three findings out of the 23,019 are worth understanding in depth, because they illustrate what makes Mythos categorically different from the static-analysis tools it is replacing.

CVE-2026-5194 — the wolfSSL certificate-forgery flaw

wolfSSL is a TLS library embedded in roughly five billion devices — IoT sensors, automotive systems, industrial control gear, even satellite firmware. It is one of the most-deployed pieces of cryptography software in the world, and it is maintained by a small commercial team. Mythos identified a flaw in its certificate validation logic that would have let an attacker forge digital certificates and stand up convincing fake versions of bank, email, or any TLS-protected website. The patched CVE-2026-5194 carries a CVSS score in the 9.1+ range — multiple sources have reported it as effectively CVSS 10.0 in the worst-case configuration.

The exploit chain Mythos built was not a one-line bug. It was a multi-step composition that defeated the library's defense-in-depth. Static analyzers had not flagged it. Fuzzers had not surfaced it. Human researchers had reviewed the code for years. Mythos read it once.

271 Firefox 150 fixes

Mozilla shipped Firefox 150 on a Monday with a single security advisory page citing 271 vulnerabilities — every one of them surfaced by Mythos Preview during a single audit cycle. The prior cycle, using Claude Opus 4.6 on Firefox 148, surfaced 22. The 12× increase is not a marginal improvement on the same workflow; it is a different category of capability applied to the same codebase.

The disclosure pace forced Mozilla to compress its normal security release cadence. Some of the 271 had been sitting undisturbed in the codebase for years — invisible to static analysis, fuzzing, and expert manual review. They are not invisible to a frontier model that can hold the entire browser engine in its context window and reason about cross-component data flows the way a security researcher does.

The $1.5M bank fraud catch

Mythos's defensive use cases are not limited to source code. An unnamed banking partner used Mythos to detect and prevent a fraudulent $1.5 million wire transfer in progress. The attack pattern: a customer's email account was compromised, the attacker placed spoofed phone calls to relationship managers, and the wire was authorized through normal channels. Mythos identified the fraud pattern in the cross-channel signal — email metadata, voice transcripts, transaction request — before the wire executed.

This is the part of the report that does not get enough attention. Mythos as a vulnerability hunter is one application. Mythos as a real-time fraud-pattern analyzer across structured and unstructured data is a different one, and arguably a bigger commercial opportunity than the source-code work.

5. The patching bottleneck nobody planned for

Here is the line buried in the Picus Security analysis of the Glasswing report, citing Anthropic's own data: fewer than 1% of the vulnerabilities Mythos found have actually been patched. Read that twice.

The vulnerability discovery problem is, for the first time in computing history, solved at scale. The patching problem is not. Some open-source maintainers have reportedly asked Anthropic to slow the pace of disclosures because they need more time to develop fixes — a request that would have been unthinkable two years ago. Enterprise deployments are moving faster: internal teams patched more than 2,100 vulnerabilities within three weeks, because they control their own repositories and can ship fixes without coordinating with volunteer maintainers.

The asymmetry between enterprise patching speed and open-source maintainer capacity is the real strategic problem. Anthropic committed $4 million in donations to open-source security groups alongside the May 22 update — directionally correct, structurally insufficient. For the broader context on how Anthropic's enterprise security stack is rolling out, our Claude Security review covers the public-tier version of these capabilities that is shipping today via Claude Opus 4.7 — the model trained specifically to be less capable than Mythos.

The uncomfortable arithmetic: if a private-preview model with restricted access can find 23,019 vulnerabilities in 30 days, what does the patch queue look like when the same capability ships to every Fortune 500 security team in 2027? And when the same capability — through model exfiltration, an open-weight competitor, or just the next generation of GPT — reaches attackers?

6. Why Mythos is being held back from public release

Anthropic's official position is unambiguous: Mythos-class capability is too dangerous to release broadly. The company explicitly does not plan to make Mythos Preview generally available. The roadmap is to develop stronger safeguards first, then make Mythos-class systems available at scale once those safeguards are validated.

Three pieces of evidence make that position credible rather than performative. First: Claude Opus 4.7, the public flagship released April 16, 2026, was deliberately trained to have lower cybersecurity capabilities than Mythos. Opus 4.7 ships with automatic detection and blocking of prohibited cyber use cases. Anthropic invented an entire intermediate testing tier to validate Mythos-tier safeguards on a less dangerous model first.

Second: the UK AI Security Institute's finding that Mythos can autonomously execute multi-stage network attacks is documented in the institute's own evaluation report, not Anthropic marketing material. Third: posts on X from independent researchers including Nicholas Carlini have publicly verified specific exploit chains Mythos produced — including the wolfSSL CVE-2026-5194 root cause analysis.

For the speculative roadmap — including the rumored 'Mythos 1.1 cybersecure' variant that has appeared in research threads, and Anthropic's apparent strategy of training safeguards on Opus before relaxing them on Mythos — our deep-dive on Mythos in Google Cloud and the missing 'Preview' label tracks the public signals Anthropic is sending about how this release schedule actually plays out.

My honest read: independent analysts estimate limited enterprise access no earlier than late 2026, with broader API availability in 2027 or beyond. That timeline assumes safeguards work. If they don't, this could stretch to 2028.

7. What Project Glasswing means for the rest of us

If you are a security team without Glasswing access — which is almost everyone — three things change starting now.

One: your threat model needs to assume the attacker has Mythos-equivalent capability within 18 months. Maybe through an exfiltrated Mythos, maybe through a comparable model from a competing lab, maybe through open-weight progress. Plan defenses accordingly. The bugs an autonomous AI finds in your codebase will not stay private forever, even if your AI vendor swears it is keeping them safe.

Two: the supply chain is now the front line. Every open-source dependency in your stack just had its threat surface re-evaluated by a model that finds things human reviewers missed for decades. The 23,019 number is a count of how much undiscovered debt your software has been carrying. Pin versions. Subscribe to upstream security mailing lists. Have a patch-application SLA. If the upstream maintainer is one person on a hobby project, build a contingency.

Three: the public-tier tools available today are the rational starting point. Claude Opus 4.7 with the Claude Security product is shipping commercially, scans production codebases by reasoning about data flows the way a human researcher does, and is integrated with CrowdStrike, Microsoft Security, Palo Alto Networks, SentinelOne, Trend Micro, and Wiz. For a deeper look at how this layer compares to traditional SAST tools, our Claude Security vs Snyk breakdown walks through the practical setup and what it actually finds in production code.

If you are building defensive tooling on top of Anthropic's stack and want production-ready patterns for vulnerability discovery agents, RAG pipelines for security knowledge bases, and multi-agent code review setups, the 130+ open-source GenAI cookbooks at Build Fast with AI cover the LangChain, CrewAI, and agent-orchestration patterns that compose into real security workflows on the publicly available Claude tier.

The era of human-paced vulnerability discovery is over. We are now in the era where the model finds the bug faster than the maintainer can drink coffee. Project Glasswing is the first month of that era. The next year is going to be loud.

8. Frequently Asked Questions

What is Project Glasswing?

Project Glasswing is Anthropic's restricted cybersecurity initiative, announced on April 7, 2026 and reporting first-month results on May 22, 2026. It gives approximately 50 vetted partner organizations — including AWS, Apple, Cisco, Google, JPMorgan Chase, Microsoft, NVIDIA, CrowdStrike, Cloudflare, and Mozilla — controlled access to Claude Mythos Preview for defensive cybersecurity work only. Anthropic has committed $100 million in model credits to the program.

What is Claude Mythos?

Claude Mythos is Anthropic's frontier AI model in a new compute tier above Claude Opus. The internal codename for the tier was Capybara. Anthropic describes Mythos as 'a general-purpose language model that performs strongly across the board, but is strikingly capable at computer security tasks.' On the CyberGym vulnerability reproduction benchmark, Mythos Preview scores 83.1%, compared to 73.1% for the publicly available Claude Opus 4.7.

How many vulnerabilities did Claude Mythos find in Project Glasswing's first month?

Anthropic's internal scan of more than 1,000 open-source projects flagged 23,019 total vulnerabilities, including approximately 6,202 estimated high- or critical-severity findings. Independent security firms validated a 1,752-finding sample and confirmed a 90.6% true-positive rate. Across all 50 partner organizations combined, more than 10,000 high- or critical-severity vulnerabilities were identified in the first 30 days.

Why is Claude Mythos not publicly released?

Anthropic has determined that Mythos-class capability poses too high a risk for general release. The model can autonomously chain vulnerabilities into multi-stage exploit chains, reconstruct source code from binaries, and complete full 32-step corporate network attack simulations. The UK AI Security Institute independently confirmed these capabilities. Anthropic's stated plan is to develop stronger safeguards on less capable models (Claude Opus 4.7 was the first such test) before considering broader Mythos-class deployment. Independent analysts estimate limited enterprise access no earlier than late 2026.

Which companies are Project Glasswing partners?

Confirmed partners include AWS, Apple, Cisco, Google, JPMorgan Chase, Microsoft, NVIDIA, CrowdStrike, Cloudflare, and Mozilla. The UK AI Security Institute is involved on the evaluation side. Anthropic has briefed senior officials at CISA, the US Commerce Department, and other federal agencies. The total partner count is approximately 50 organizations, described by Anthropic as 'the most systemically important cyber defenders.'

What is CVE-2026-5194?

CVE-2026-5194 is a critical vulnerability in wolfSSL — an open-source TLS cryptography library embedded in roughly five billion IoT, automotive, and industrial-control devices — discovered by Claude Mythos Preview. The flaw would have allowed attackers to forge digital certificates and create indistinguishable fake versions of banking, email, and other TLS-protected websites. The CVE was assigned a CVSS score in the 9.1+ range, with some reports citing CVSS 10.0 in worst-case configurations. The vulnerability has been patched.

Can I get access to Claude Mythos?

No. Claude Mythos Preview is not publicly available and cannot be purchased through the standard Anthropic API. Access is restricted to the approximately 50 Project Glasswing partner organizations, selected based on systemic importance and cyber-defense maturity. The publicly available Anthropic model is Claude Opus 4.7, which was deliberately trained to have lower cybersecurity capabilities than Mythos. For enterprise security teams, Anthropic offers Claude Security (built on Opus 4.7) via claude.ai/security for Enterprise customers.

Is Mythos used for offensive or defensive purposes only?

Strictly defensive. Project Glasswing rules require partners to use Mythos exclusively for finding and fixing vulnerabilities in their own software or in open-source projects, as part of coordinated defensive efforts. Offensive use — including authorized penetration testing — sits outside the program. The Frontier Red Team at Anthropic uses Mythos for adversarial evaluation purposes only, with findings disclosed through standard coordinated disclosure channels.

Recommended Blogs

  • Claude Mythos: Release Date, Access, and What Comes Next (2026)
  • Claude Mythos 5 Review: Anthropic's 10-Trillion Parameter Model
  • Claude Mythos in Google Cloud: What the Missing 'Preview' Label Means
  • Claude Security: How It Works, What It Finds, vs Snyk (2026)
  • Claude Opus 4.7: Full Review, Benchmarks & Features (2026)
  • Claude Code Source Code Leak: The Full Story 2026
  • AI News Today - May 21, 2026: 13 Biggest Stories

References

  • Anthropic — Project Glasswing official page
  • Engadget — Anthropic says Mythos has already found more than 10,000 vulnerabilities
  • The Next Web — Claude Mythos found 10,000 critical vulnerabilities in one month
  • The Next Web — Mozilla fixes 271 Firefox vulnerabilities found by Claude Mythos
  • Interesting Engineering — Project Glasswing 10,000 critical software flaws
  • HotHardware — Anthropic Project Glasswing targets safer AI agents
  • Investing.com — Anthropic finds over 10,000 software flaws in first month
  • Picus Security — Project Glasswing analysis and patching paradox
  • Forrester — Project Glasswing: The 10 Consequences Nobody's Writing About Yet

Cloud Security Alliance — Claude Mythos vulnerability discovery whitepaper

Enjoyed this article? Share it →
Share: