Palo Alto Networks AI Audit Finds 85 Hidden Bugs
Key insights
- Palo Alto Networks found 85 previously unknown production vulnerabilities using Anthropic Mythos and OpenAI GPT-5.5 within weeks.
- CTO Lee Klarich gave a concrete window of three to five months before AI-driven cyberattacks become routine.
- This marks the first public disclosure of a major security vendor conducting a frontier-AI audit against its own live production codebase.
Why this matters
The 85-bug find in a security vendor's hardened production code is a concrete data point that frontier AI can outpace traditional static analysis and red-team cycles, which resets assumptions about how frequently enterprises need to run AI-assisted audits. Klarich's three-to-five-month warning is actionable in a way that generic AI threat forecasts are not, giving security and engineering leaders a deadline against which to measure current tooling and staffing. The fact that two competing frontier models were used in parallel suggests that multi-model auditing will become a standard practice, shifting procurement and vendor relationships for every company running critical infrastructure code.
Summary
Palo Alto Networks ran Anthropic's Mythos and OpenAI's GPT-5.5 against its own production codebase and surfaced 85 previously unknown vulnerabilities in a matter of weeks, making it the first major security vendor to publicly disclose a frontier-AI self-audit at production scale.
The company's chief technology officer, Lee Klarich, put a precise window on the defensive opportunity: three to five months before AI-driven attacks become routine. That framing turns this from a research curiosity into a near-term operational deadline for every enterprise security team.
Essentially, Palo Alto Networks, working with models from Anthropic and OpenAI, has demonstrated that frontier models can find what years of internal security review missed.
- 85 unknown vulnerabilities surfaced within weeks, not months, using two separate frontier models running against live production code.
- Klarich's three-to-five-month window is a specific, time-bounded claim, distinct from vague warnings about AI-enabled threats.
- This disclosure is separate from earlier Palo Alto research on pentest-speed compression, meaning production-scale bug-finding is now a distinct, documented capability.
If attackers gain access to the same frontier models that just found 85 bugs in a hardened security vendor's own code, the asymmetry between offense and defense narrows faster than most enterprise roadmaps currently assume.
Potential risks and opportunities
Risks
- Enterprises that delay AI-assisted code audits past Klarich's three-to-five-month window face adversaries who will use the same frontier models offensively before defenders have patched equivalent vulnerability classes.
- Palo Alto Networks customers could question the security posture of products already shipped if any of the 85 bugs were in customer-facing production features, creating potential liability and churn ahead of renewal cycles.
- Smaller security vendors that cannot afford frontier model APIs (Mythos, GPT-5.5) face a widening capability gap in their own product security, making them softer targets and less credible in enterprise procurement conversations within the next six months.
Opportunities
- AI-assisted code audit vendors (Semgrep, Socket, Snyk) can use this disclosure to accelerate enterprise deals by benchmarking their tooling against the Palo Alto 85-bug result as a public reference case.
- Anthropic and OpenAI gain a referenceable enterprise security customer in Palo Alto Networks, strengthening the case for frontier model API contracts with other Fortune 500 security and infrastructure companies.
- Managed security service providers (MSSPs) offering AI-augmented red-team services can move immediately to productize multi-model audits similar to the Palo Alto approach, targeting the window Klarich identified before demand peaks.
What we don't know yet
- Severity breakdown of the 85 bugs is undisclosed: how many were critical or remotely exploitable versus low-severity logic errors?
- Whether Palo Alto Networks has shared the audit methodology or vulnerability classes with CISA or industry partners ahead of the three-to-five-month window Klarich named.
- Which version or capability tier of Anthropic's Mythos was used, given that public availability and API access terms for Mythos remain unclear as of May 2026.
Originally reported by axios.com
Original headline: Palo Alto Networks Used Mythos and GPT-5.5 to Find 85 Unknown Bugs in Its Own Production Code — Warns AI-Driven Attacks Will Be 'New Norm' in Three to Five Months