securityweek.com via Reddit

Anthropic Mythos: 23,000 Bugs Found, Only 75 Patched

8 sources tracking this story
anthropic cybersecurity ai-security cybersecurity

Key insights

  • Mythos Preview scores 83.1% on CyberGym vulnerability reproduction versus Opus 4.6's 66.6%, a performance gap that justifies the controlled-access structure.
  • Already-patched discoveries include a 27-year-old OpenBSD vulnerability and a 16-year-old FFmpeg flaw, placing the patching deficit in documented historical terms.
  • ENISA's entry as the first non-US/UK partner required US government approval and European Commission travel to San Francisco, signaling the program's geopolitical weight.

Why this matters

Anthropic's first-party benchmarks confirm Mythos Preview scores 83.1% on CyberGym vulnerability reproduction versus Opus 4.6's 66.6%, and the program has already surfaced a 27-year-old OpenBSD flaw and a 16-year-old FFmpeg flaw before public disclosure. The institutional footprint has expanded to NATO, ENISA, Okta, Samsung, SK Hynix, and SK Telecom across 14 countries, with ENISA's entry requiring US government approval and European Commission travel to San Francisco, placing the program in the diplomatic register as much as the technical one. Practitioner commentary frames the patching pipeline as the critical failure point: AI-generated vulnerability advisories are already outpacing open-source maintainer capacity, driving Chainguard's $50M hard-fork proposal for unified disclosure infrastructure. Daniela Amodei's public characterization of Mythos as 'very good at cyber warfare' is the first direct acknowledgment of offensive capability from Anthropic leadership, arriving as the company files its IPO S-1.

Summary

Anthropic is adding roughly 150 organizations to Project Glasswing, raising Claude Mythos Preview's partner count to around 200 across more than 15 countries. New participants include Okta, Samsung, NATO, and ENISA (per Financial Times reporting), joining earlier partners like Mozilla, Palo Alto Networks, and Cloudflare. The selection threshold: systems whose breach could affect more than 100 million people, with national and global security implications. Essentially: (Anthropic, NATO, Samsung, ENISA) running an AI-powered security sweep of critical global infrastructure. - Mythos has flagged more than 23,000 potential vulnerabilities since the program launched with around 50 partners in early April. - More than 6,000 of those are expected to be confirmed as severe flaws. - Only 75 critical and high-severity issues have been patched so far. Anthropicis now working to substantially scale up reviewing and patching of open-source vulnerabilities, but detection has clearly outpaced remediation.

Potential risks and opportunities

Risks

  • The more than 6,000 expected severe vulnerabilities in widely used codebases represent an active exposure window if threat actors independently discover the same flaws before patches are ready.
  • Okta and Samsung face reputational and customer-trust exposure if vulnerabilities found by Mythos become public before patches are deployed, given their prominence in the named partner list.
  • Anthropic's expanding access to codebases for NATO and ENISA partners makes the Mythos platform itself a high-value target for nation-state espionage or interference.

Opportunities

  • Vulnerability triage and management vendors (Snyk, Veracode, Tenable) are positioned to capture budget from organizations overwhelmed by the volume of findings Mythos is generating.
  • Critical infrastructure vendors with demonstrably fast patch cycles can use Project Glasswing participation as a differentiator when bidding on government and defense contracts.
  • Anthropic gains proprietary insight into real-world vulnerability patterns across critical infrastructure codebases, strengthening Claude Mythos as a commercial security product beyond the current program.

What we don't know yet

  • No timeline or methodology disclosed for how Anthropic and partners plan to triage and close the gap between more than 23,000 flagged vulnerabilities and only 75 patches.
  • Whether NATO and ENISA's participation grants Anthropic access to sensitive or classified infrastructure codebases, and how that data is governed and retained.
  • Which specific open-source projects scanned by Mythos carry the most critical unpatched flaws, and whether affected maintainers have been formally notified ahead of any disclosure.

What others are reporting

Coverage cluster as of 24h after publish

  1. Anthropic Read →

    First-party source with Mythos Preview's 83.1% CyberGym benchmark, $100M usage credits, $4M donations, 90-day disclosure timeline, and named zero-days already patched before public release.

    AI capabilities have crossed a threshold that fundamentally changes the urgency required to protect critical infrastructure.
  2. TechCrunch Read →

    Names the new entrants (Okta, Samsung, SK Hynix, SK Telecom, NATO, ENISA) and identifies the 14 countries, grounding the program's scope in named actors rather than counts.

    A successful attack on their codebase could be catastrophic; for most partners, a major attack could affect over 100 million people.
  3. Cybersecurity Dive Read →

    Focuses on sector breadth across power, water, healthcare, and telecom, and flags the AI-generated vulnerability report flooding problem facing open-source maintainers.

  4. Help Net Security Read →

    Centers the patching-pipeline bottleneck through expert commentary warning of 'giant backlogs' as AI advisory volume outpaces human remediation capacity.

    Within 6 to 12 months, we expect that many other AI companies will have Mythos-class models, and they could release them without safeguards.
  5. Forbes Australia Read →

    First extended public comments from Amodei on Mythos capabilities, framing the expansion against Anthropic's IPO timeline and the competitive pressure from OpenAI's expected listing.

    The model is 'very good at cyber warfare.'
  6. DevOps.com Read →

    Chainguard CEO Dan Lorenc treats Mythos as 'real' and proposes a hard fork with a trusted fork registry, addressing the maintainer bandwidth gap as an infrastructure problem.

    We need new trust infrastructure for open-source consumption.
  7. Business-press framing of the expansion, connecting Glasswing's patching gap to Anthropic's IPO filing and broader AI market context.