The pattern this week is uncomfortable to name out loud: the same systems being built to defend critical infrastructure are being stolen, leaked, or weaponised against it within days of release. Supply chains are the soft underbelly — third-party AI tools, vendor portals, unmaintained sandboxes — and breach windows have collapsed from hours to seconds. Meanwhile the regulatory machinery is fragmenting, with state legislatures, federal agencies, and foreign governments all trying to draw the line in different places, often after the line has already moved.

Get more from AI Weekly

More signal, less noise — pick your channels.

You're reading the weekly brief. Below are the other ways to follow the story — every channel free, easy to leave.

  • → Explore 16 deep dives
    Weekly topic-specific newsletters: Generative AI, Machine Learning, AI in Business, Robotics, Frontier Research, Geopolitics, Healthcare, and more.
    Browse all 16 deep dives →
  • → Breaking AI alerts
    When something major breaks (a $60B acquisition, a regulator's emergency meeting, a frontier model leak), alert subscribers know within hours. Typically 0-2 emails per day.
    Get breaking alerts →
  • → AI News Today (live)
    Live dashboard updated as the scanner finds news: scored stories from the last 48 hours, weekly entity movers, and quarterly trend lines across 113 AI companies, people, and topics.
    Open AI News Today →

The Big Story

Hackers breach Anthropic's restricted Claude Mythos via vendor credentials · April 22-23 · [Fortune] · [Euronews]
→ A Discord-linked group accessed Claude Mythos Preview — a model Anthropic restricted under Project Glasswing because it can autonomously discover and exploit zero-days — by guessing endpoint URLs and pivoting through a third-party contractor portal. The implication is brutal: containment of a "too dangerous to release" model failed not at the model layer but at the procurement layer, and the same week Microsoft confirmed it had embedded Mythos into its Security Development Lifecycle, Japan stood up a first-of-its-kind financial taskforce to deal with the vulnerabilities Mythos has been finding. Defenders and attackers are now drinking from the same well.


Also This Week

Cohere AI Terrarium sandbox: root code execution and container escape (CVE-2026-5752) · April 23 · [The Hacker News]
Critical sandbox escape (CVSS 9.3) in Cohere's Terrarium Python sandbox via JavaScript prototype-chain traversal — code runs as root, escapes the container, reads sensitive files. The project is unmaintained, so a patch is unlikely. Treat AI-vendor sandboxes as untrusted by default.

Checkmarx supply chain attack hits Bitwarden CLI, KICS Docker, VS Code extensions · April 23 · [The Hacker News]
→ A 90-minute injection window was enough to push credential-stealing payloads into images with 5M+ pulls; if you ran a CI/CD job on April 22, rotate everything.

Florida AG opens criminal probe into OpenAI over FSU shooter chat logs · April 22 · [ClickOrlando]
→ Prosecutors say a human giving the same answers about weapon selection and lethality would face murder charges — the first serious test of whether model output is treated as speech, product, or accomplice.

DOJ joins xAI's lawsuit to block Colorado's algorithmic discrimination law · April 24 · [Axios]
→ First federal intervention against a state AI rule, six days before SB24-205 takes effect; if the Equal Protection argument lands, the patchwork of state AI laws collapses overnight.

Minnesota becomes first US state to ban AI nudification apps · April 24 · [PetaPixel]
→ HF 1606 carries $500K civil penalties and a private right of action — narrower than Colorado, but exactly the kind of harm-specific statute that survives federal preemption fights.

Meta installs keystroke and screenshot recorders on US employees to train agents · April 21-22 · [TechCrunch] · [CNBC]
→ The Model Capability Initiative is the cleanest articulation yet of the bargain inside frontier labs: workforce surveillance is the training corpus for the agents that replace the workforce.


From the Lab

Forcepoint X-Labs documents 10 in-the-wild indirect prompt injections targeting AI agents · [Forcepoint X-Labs]
Ten verified indirect-prompt-injection payloads found on live websites: financial fraud, API key exfiltration, and attempts to coerce shell-enabled agents into destructive commands (e.g. sudo rm -rf). The class moves from theoretical to enumerated.

Anthropic MCP design flaw exposes 200K+ servers; Anthropic declines to patch · [The Register]
→ OX Security found arbitrary code execution paths across the official MCP SDKs in Python, TypeScript, Java, and Rust — 150M+ downloads, 10+ downstream CVEs in Cursor, VS Code, and Claude Code. Anthropic's response: "expected behaviour." Treat MCP servers like you'd treat an unauthenticated RCE primitive, because that's what they are.


Worth Reading


If your threat model still assumes humans are in the loop on either side, you don't have a threat model — you have a memory.

— Alexis