Pwn2Own Berlin Adds AI Hacking Category, Sells Out
Key insights
- Pwn2Own's first AI category saw Cursor, OpenAI Codex, and LiteLLM all compromised within two days at OffensiveCon 2026.
- Researchers earned $908,750 from 39 unique zero-days in two days, with the event selling out for the first time in 19 years.
- Researchers turned away due to oversubscription publicly released zero-days against Claude Code, Copilot, Ollama, and LM Studio.
Why this matters
AI coding agents have crossed from 'theoretically exploitable' to 'reliably exploitable at scale'. The oversubscription suggests researchers had more working exploits than slots, so the known vulnerability surface is almost certainly larger than what was demonstrated. Vendors shipping AI coding tools (Cursor, GitHub, Anthropic, OpenAI) now face the same coordinated disclosure timelines and public patch-or-shame pressure that browser makers have operated under for years, compressing their security response windows significantly. The public release of zero-days targeting Claude Code, Copilot, Ollama, and LM Studio by researchers who were turned away means unpatched, researcher-confirmed exploits are already publicly available, creating immediate risk for enterprises running these tools in production.
Summary
Pwn2Own Berlin 2026 made AI coding agents a formal attack surface, with Cursor, OpenAI Codex, and LiteLLM all successfully exploited across the first two days of competition at OffensiveCon.
Researchers collected $908,750 across 39 unique zero-days in just two days, including a $200,000 Microsoft Exchange RCE chain, a $30,000 Cursor exploit, and a $20,000 OpenAI Codex zero-day. The event sold out for the first time in the competition's 19-year history, with organizers attributing the oversubscription directly to a surge in AI exploit submissions.
Essentially, AI tools (Cursor, OpenAI Codex, LiteLLM) are now in the same vulnerability disclosure ecosystem as Windows and Exchange.
- Researchers who were turned away because the event was oversubscribed publicly released zero-days targeting Claude Code, Copilot, Ollama, and LM Studio.
- The AI category's debut signals that exploit researchers have enough reliable, reproducible attack paths against production AI tools to justify the structured payout model.
- $908,750 in two days is a pace that rivals historically dominant categories like browsers and operating systems.
AI coding agents moving into Pwn2Own's formal bracket means their vulnerability lifecycles will now follow the same coordinated disclosure and patch pressure that browser vendors have navigated for a decade.
Potential risks and opportunities
Risks
- Enterprises running Cursor or OpenAI Codex in developer workflows are exposed to unpatched zero-days with no confirmed patch timeline, creating a window for supply-chain-style code injection attacks targeting source repositories.
- Claude Code and Copilot users face active risk from the exploits that turned-away researchers released publicly. These appear to have been disclosed outside the vendor coordination that Pwn2Own's standard process enforces, so there was no patch-first window.
- LiteLLM, widely used as an API proxy layer in enterprise AI infrastructure, being compromised at Pwn2Own could trigger security audits and procurement pauses across organizations that route multiple model providers through it.
Opportunities
- AI application security vendors (Protect AI, Lakera, Invariant Labs) gain a credible sales narrative: Pwn2Own now provides third-party proof that production AI agent stacks require dedicated security tooling.
- Bug bounty platforms (HackerOne, Bugcrowd) and Pwn2Own organizer ZDI can expand AI-specific researcher programs with higher payouts, capturing researcher interest that oversubscribed Berlin 2026 could not accommodate.
- Cyber insurers (Coalition, At-Bay, Resilience) can reprice AI coding tool coverage upward and introduce AI-agent-specific policy riders, with Pwn2Own results now providing actuarial grounding for elevated risk models.
What we don't know yet
- Technical details of the Cursor and OpenAI Codex exploits have not been publicly disclosed, so it is unclear whether the vulnerabilities are in the LLM inference layer, the agent orchestration layer, or the IDE integration surface.
- Whether Anthropic, GitHub, Ollama, and LM Studio received advance notice of the publicly released zero-days before researchers posted them, and what patch timelines those vendors have committed to.
- The prize payout structure for the new AI category relative to legacy categories is unconfirmed, including whether AI targets carry lower caps that would disincentivize top researchers in future years.
Originally reported by bleepingcomputer.com
Original headline: Pwn2Own Berlin 2026 Debuts AI Category for First Time — Cursor, OpenAI Codex, and LiteLLM All Fall on Days One and Two as Competition Sells Out in 19-Year First