Anthropic Redeploys Fable 5 With Cross-Lab Jailbreak Rubric
TL;DR
- Fable 5 returns to Claude Platform, Claude.ai, Claude Code, and Claude Cowork on July 1 after US export controls were lifted June 30.
- Pro, Max, Team, and select Enterprise plans get Fable 5 for up to 50% of weekly usage limits through July 7.
- Anthropic is drafting a four-criterion jailbreak severity framework with Amazon, Microsoft, Google, and other Glasswing partners, and opening a HackerOne program.
Anthropic's redeployment note for Fable 5 does double duty. On the surface it is a return to service: the model comes back to Claude Platform, Claude.ai, Claude Code, and Claude Cowork on July 1, after the US government lifted export controls that had been in place since June 12. Underneath it is the closest thing the frontier labs have offered so far to a shared vocabulary for jailbreaks.
The trigger was a report from Amazon researchers, who found a way to bypass Fable 5's safeguards by prompting it into identifying a number of software vulnerabilities. Anthropic's own testing then found that less capable models, including Claude Opus 4.8, GPT-5.5, and Kimi K2.7, could identify the same vulnerabilities. That is the interesting part: the capability the jailbreak exposed was not unique to Fable 5. The company says a new classifier blocks the specific technique described in the Amazon report in over 99% of cases. Pro, Max, Team, and select Enterprise plans will get Fable 5 for up to 50% of weekly usage limits through July 7.
The more strategic move is the joint work with Amazon, Microsoft, Google, and other Glasswing partners on what Anthropic calls a consensus framework for assessing the severity of AI jailbreaks. Four criteria are on the table: capability gain, breadth of capability gain, ease of weaponization, and discoverability. Alongside the rubric, Anthropic is launching a HackerOne program where security researchers can submit potential cyber jailbreaks they have discovered in Fable 5.
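To make the four criteria concrete, here is a minimal sketch of how a single assessment might be recorded and how two assessors' ratings of the same jailbreak could be compared. Everything beyond the four criterion names is an assumption for illustration: the three-step `Level` scale, the `JailbreakAssessment` type, the `disagreements` helper, and the sample ratings are all hypothetical, and the post does not describe any scale or any way of combining the criteria into a single score.

```python
from dataclasses import dataclass, fields
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical three-step ordinal scale; the post does not define one.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class JailbreakAssessment:
    # The four criteria named in the post; the field names are illustrative.
    capability_gain: Level
    breadth_of_capability_gain: Level
    ease_of_weaponization: Level
    discoverability: Level


def disagreements(a: JailbreakAssessment, b: JailbreakAssessment) -> dict:
    """Return the criteria on which two assessments differ, with both ratings."""
    return {
        f.name: (getattr(a, f.name), getattr(b, f.name))
        for f in fields(a)
        if getattr(a, f.name) != getattr(b, f.name)
    }


# Invented ratings from two hypothetical assessors scoring the same jailbreak.
lab_a = JailbreakAssessment(Level.LOW, Level.HIGH, Level.MEDIUM, Level.HIGH)
lab_b = JailbreakAssessment(Level.LOW, Level.MEDIUM, Level.MEDIUM, Level.HIGH)

print(disagreements(lab_a, lab_b))
```

The sketch deliberately keeps the criteria as separate ordinal ratings instead of collapsing them into one number, since any weighting would be a guess about a rubric that has not been published.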
The honest caveat is that this is Anthropic's own post. The specifics (the 99% figure for the classifier, the eventual shape of the severity rubric, the extent to which Microsoft and Google actually adopt it in their own products) are the company's claims, not independently verified facts. The post does not name a lead author for the framework, does not set a timeline for when the rubric becomes public, and does not explain how the participating labs plan to resolve disagreements when they score the same jailbreak differently.
What is notable is the direction. A shared severity scale, if it holds together, would let researchers and buyers discuss jailbreaks the way they already use CVSS scores to discuss conventional software vulnerabilities. For enterprise buyers weighing model risk, that would be a real upgrade over the current signal, which is usually just the raw headline that a model got jailbroken.
Originally reported by anthropic.com
Original headline: Anthropic Redeploys Fable 5 July 1 With July 7 Usage Credits — Also Unveils Joint Jailbreak Severity Standard With Amazon, Microsoft, and Google