The Register web signal

Fable 5 Export Ban Triggered by 'Fix This Code' Prompt

Key insights

  • The alleged Fable 5 jailbreak was the prompt 'fix this code,' issued after the model refused a direct vulnerability review request.
  • Moussouris, the only outside expert to read the triggering research paper, calls this standard defensive find-fix-test security work.
  • The controls conflict with Wassenaar Arrangement cybersecurity exemptions Moussouris helped negotiate across 42 nations from 2013 to 2017.

Why this matters

The line between a guardrail bypass and legitimate defensive security work is now a live regulatory question with legal weight: the US government restricted two Anthropic models based on a prompt sequence any security engineer would recognize as routine. For AI teams building or deploying security tooling, this signals that vulnerability-analysis capability carries export-control risk even when the use case is purely defensive. The Wassenaar Arrangement precedent, a 42-nation consensus on cybersecurity exemptions, is now in direct tension with unilateral US restrictions, which could fragment how frontier models are deployed globally.

Summary

Anthropic's Fable 5 and Mythos 5 were disabled for all customers after the US government imposed export controls over a supposed jailbreak that was simply the phrase 'fix this code.' Katie Moussouris, CEO of Luta Security and the only outside expert to read the research paper behind the ban, says this is standard defensive security work. Researchers fed the models code with known vulnerabilities and asked them to review it for security issues, which Fable 5 refused. They then asked it to 'fix this code,' and it complied. Follow-up prompts had it generate test scripts. She argues this isn't a guardrail bypass; it's the find-fix-test loop defenders run every day. In essence, Anthropic and the US government differ on what counts as dangerous AI capability.

  • Over 100 cybersecurity leaders signed an open letter opposing the controls.
  • The restrictions conflict with Wassenaar Arrangement cybersecurity exemptions Moussouris helped negotiate between 2013 and 2017.
  • Restricting these models leaves defenders worse off while open-weight Chinese models close the gap.

Potential risks and opportunities

Risks

  • Security teams relying on Fable 5 and Mythos 5 for vulnerability patching workflows have lost access with no confirmed reinstatement timeline, which immediately degrades their patch verification capacity.
  • If the 'fix this code' standard is applied broadly, other frontier models face export restrictions that cripple defensive security tooling across the industry.
  • Moussouris warns these restrictions make models 'worse at finding bugs and verifying patches,' handing a structural advantage to attackers using unrestricted open-weight alternatives.

Opportunities

  • Open-weight model providers whose code-analysis capabilities remain unrestricted gain immediate positioning with security teams displaced from Anthropic's flagged models.
  • Cybersecurity policy experts with Wassenaar Arrangement negotiating experience, a rare credential Moussouris holds, become critical advisors as companies navigate export-control compliance.
  • AI vendors that explicitly document their tooling as operating within Wassenaar cybersecurity exemptions can capture customers seeking a compliant alternative to Fable 5.

What we don't know yet

  • Whether the third-party research paper behind the export restrictions will be shared with independent experts beyond Moussouris or will remain classified.
  • How Anthropic intends to distinguish permissible defensive security workflows from restricted dual-use prompts in future model versions, given 'fix this code' is the trigger.
  • Whether open-weight Chinese models approaching Fable 5 capability will face equivalent Wassenaar scrutiny, and whether that changes the US policy calculus on restricting domestic models.

Shared on Bluesky by 1 AI expert

  • Fran Litterio @fpl9000.bsky.social amplified

    @theregister.com

    Feds freaked over Fable 5 after simple 'fix this code' prompt, not jailbreak, says researcher
