Redeploying Claude Fable 5: The Complete Story of the Export Control Crisis

Name: Redeploying Claude Fable 5: The Complete Story of the Export Control Crisis
Uploaded: 2026-07-01T08:31:20+00:00
Channel: Sanwal Zia
Description: On June 12, 2026, the US government banned Claude Fable 5 and Mythos 5 globally. On June 30 it lifted the ban. This is the complete story: what triggered it, what the Amazon jailbreak was, how Anthropic responded, and what changed.

Quick Summary

On June 9, 2026, Anthropic launched Claude Fable 5 and Mythos 5. Three days later, the US Commerce Department issued an emergency export control order, forcing a global shutdown of both models. The trigger was a jailbreak technique found by Amazon researchers. On June 30, the export controls were lifted after Anthropic trained a new safety classifier. Fable 5 returned to global users on July 1. Mythos 5 was partially restored to approved US organizations on June 26. This is the complete account of those 19 days.

On the evening of Friday, June 12, 2026, Anthropic’s users around the world opened their Claude interfaces and found two model names missing. Claude Fable 5 and Claude Mythos 5, launched just three days earlier to significant industry interest, had vanished.

The cause was not a technical failure. The US government had issued an emergency export control order, invoking national security authorities to force Anthropic to shut off access to both models for any foreign national anywhere in the world, including Anthropic’s own non-citizen employees. Because Anthropic had no reliable way to verify user nationality in real time across its global platform, it did the only thing it could: it took the models offline for everyone.

Eighteen days later, on June 30, 2026, the controls were lifted. Fable 5 returned to global users the following day. This is the complete account of what happened, why it happened, what changed, and what it means for the future of frontier AI governance.

Complete Timeline of Events

Date	Event
April 7, 2026	Project Glasswing launches. Anthropic introduces Claude Mythos Preview to approximately 50 partner organizations for defensive cybersecurity use. Partners include AWS, Google, Microsoft, Cisco, JPMorganChase, and government agencies. Anthropic commits up to $100 million in usage credits and $4 million in direct donations to open-source security.
May 5, 2026	CAISI announces pre-deployment testing agreements with Google DeepMind, Microsoft, and xAI, expanding its evaluation program to five major AI labs. Anthropic and OpenAI are already partners.
June 2, 2026	President Trump signs the Executive Order Promoting Advanced Artificial Intelligence Innovation and Security. The order establishes a voluntary framework for pre-release government access to frontier models and directs agencies to build a classified benchmark to designate covered frontier models. Agencies face a 60-day implementation deadline.
June 2-3, 2026	Anthropic expands Project Glasswing to approximately 200 organizations across more than 15 countries, adding partners in power, water, healthcare, and communications sectors. NATO and the EU’s ENISA cybersecurity agency are among the additions.
June 4, 2026	Anthropic publishes ‘When AI builds itself,’ describing AI’s role in Anthropic’s own research and calling for globally coordinated consideration of frontier AI risks.
June 9, 2026	Anthropic launches Claude Fable 5 and Claude Mythos 5. Fable 5 is the first publicly available Mythos-class model, with safety classifiers blocking high-risk cybersecurity, biology, chemistry, and model distillation tasks. Mythos 5 is released only to existing Project Glasswing partners. Both are priced at $10 per million input tokens and $50 per million output tokens.
June 10, 2026	Security researcher ‘Pliny the Liberator’ publicly claims to have jailbroken Fable 5 using multi-agent decomposition, Unicode character substitution, and narrative framing. He also publishes Fable 5’s approximately 120,000-character system prompt to GitHub.
June 11, 2026	Anthropic tells Wired: ‘We made the wrong tradeoff and we apologize for not getting the balance right,’ referring to Fable 5’s initially silent fallback behavior when safety classifiers triggered. Anthropic announces it will make refusals visible to users.
June 12, 2026, 5:21 PM ET	The US Department of Commerce issues an emergency export control directive ordering Anthropic to suspend access to both Fable 5 and Mythos 5 for any foreign national, inside or outside the United States. The order takes effect immediately. Anthropic shuts both models down for all users globally.
June 12-13, 2026	White House AI adviser David Sacks publicly claims Anthropic refused to fix the jailbreak issue before the shutdown. Anthropic disputes the characterization of both the jailbreak’s severity and the account of prior discussions. AI policy expert Dean Ball calls the action ‘simply cartoonish.’ Anthropic rushes representatives to Washington to negotiate.
June 13, 2026 onward	Anthropic and government partners, including Amazon, begin detailed technical review of the Amazon report. Anthropic’s testing confirms that models including Claude Opus 4.8, GPT-5.5, and Kimi K2.7 can replicate the same behaviors using the same technique. Work begins on a new safety classifier.
June 26, 2026	The US government approves restoring Mythos 5 access for a set of approved US organizations through Project Glasswing.
June 30, 2026	The US Department of Commerce lifts export controls on Fable 5 and Mythos 5. Anthropic publishes its full account of the 19-day period, including details of the Amazon report, the new classifier, a proposed industry jailbreak severity framework, and expanded government collaboration commitments. Commerce Secretary Howard Lutnick confirms on social media that the controls have been lifted.
July 1, 2026	Fable 5 returns to global users on Claude.ai, Claude Platform, Claude Code, and Claude Cowork. Pro, Max, Team, and select Enterprise plans receive up to 50% of weekly usage limits through July 7, after which usage credits apply. Re-enablement on AWS, Google Cloud, and Microsoft Foundry is pending.

Background: Understanding the Models at the Center of the Crisis

What Is the Mythos-Class Tier?

To understand what happened in June 2026, it helps to understand what Mythos represents in Anthropic’s model hierarchy. The Mythos class sits above the Opus class, which was previously Anthropic’s most capable publicly available tier. Mythos-class models represent capabilities that Anthropic itself determined were too powerful to release without exceptional safeguards.

The distinction that matters most for this story is in cybersecurity. Claude Mythos-class models can identify and exploit software vulnerabilities more effectively than all but the most skilled human security experts. Anthropic’s own red-team testing found that Mythos Preview could identify zero-day vulnerabilities across every major operating system and every major web browser. It found and demonstrated exploitation of a 27-year-old flaw in OpenBSD and wrote a remote code execution exploit against FreeBSD’s NFS implementation.

That capability is valuable for defenders but dangerous in the wrong hands. For that reason, Anthropic initially restricted Mythos Preview to vetted Project Glasswing partners before attempting a broader release.

What Is Claude Fable 5?

Claude Fable 5 is the publicly available version of the Mythos-class model. Both Fable 5 and Mythos 5 share the same underlying model weights. What distinguishes them is a layer of safety classifiers applied to Fable 5 that route certain categories of requests to Claude Opus 4.8 as a fallback. These classifiers cover cybersecurity, biology, chemistry, and model distillation tasks.

The name Fable comes from the Latin fabula, meaning ‘that which is told,’ which is etymologically related to the Greek mythos. Anthropic chose these names deliberately to signal that the two models are versions of the same technology, distinguished only by their safety surface.

Fable 5 launched on June 9, 2026, as the first Mythos-class model available to the general public. According to Anthropic, it is state of the art on nearly all tested AI benchmarks, with particular strength in software engineering, knowledge work, vision, and long-horizon agentic tasks. Independent testing by Artificial Analysis scored Fable 5 at 1,932 on its GDPval-AA benchmark, placing it first among all evaluated models. Anthropic also reported a SWE-Bench Pro score of 80.3%; independent confirmation by Epoch AI was pending at the time of writing.

What Is Claude Mythos 5?

Mythos 5 is the same underlying model as Fable 5 but with the safety classifiers either lifted or significantly reduced in specific domains. It is not available for general purchase. At launch, it was deployed only to approved Project Glasswing partners for defensive cybersecurity work. Mythos 5 has the strongest cybersecurity capabilities of any AI model publicly acknowledged at the time of its launch.

Project Glasswing: The Controlled Deployment That Preceded the Crisis

Project Glasswing launched on April 7, 2026. It was Anthropic’s attempt to put Mythos-class capabilities to productive use before the broader public release that Fable 5 would eventually represent.

The program gave selected organizations access to Claude Mythos Preview to find and fix vulnerabilities in their critical software infrastructure. Anthropic described it as an urgent attempt to put powerful capabilities to work for defensive purposes before they inevitably proliferated more broadly. The program launched with approximately 50 partner organizations including AWS, Google, Microsoft, Cisco, JPMorganChase, and government agencies in the US and internationally.

By May 22, 2026, Project Glasswing partners had collectively identified more than 10,000 high-severity or critical vulnerabilities across what Anthropic called the most systemically important software in the world. Anthropic manually reviewed 1,752 of the highest-rated findings and determined that 90.6% genuinely qualified as high or critical severity.

The program expanded significantly on June 2-3, 2026. By that point, approximately 200 organizations across more than 15 countries had received access. New additions included organizations in the power, water, healthcare, and communications sectors. The Financial Times reported that NATO and the EU’s ENISA cybersecurity agency also received access. Intercontinental Exchange (the parent of NYSE) and Rubrik joined to self-assess their exchanges and clearinghouses.

Project Glasswing had both a security purpose and a reputational one. It allowed Anthropic to demonstrate that Mythos-class capabilities were producing measurable defensive benefits before the more contentious public release. It also gave Anthropic a structured relationship with the US government around a model that both sides understood to be strategically significant.

What Triggered the Export Controls?

The Amazon Research Report

The immediate trigger for the export control order was a report produced by Amazon researchers. While Anthropic has not published the full contents of the report, the company’s own account describes its findings in detail.

Amazon researchers found a technique for bypassing Fable 5’s safety classifiers. Specifically, they discovered that prompting Fable 5 in a particular way caused it to identify a number of software vulnerabilities. In one case, the model produced code demonstrating how one of the relevant vulnerabilities could be exploited.

The Amazon researchers were Project Glasswing partners. Given their access to Mythos 5 and their existing role in hardening their own critical infrastructure, they were precisely the kind of partner that Project Glasswing was designed to attract. Their research function, identifying vulnerabilities and methods to exploit them, was exactly what the program existed to do defensively. The question was whether what they found constituted a dangerous jailbreak or a routine finding of borderline defensive work.

How the Government Interpreted the Report

The US government, after learning of the Amazon report, concluded that the finding warranted emergency action. According to reporting by Axios, Commerce Secretary Howard Lutnick sent a letter to Anthropic CEO Dario Amodei directing the company to suspend all access to Fable 5 and Mythos 5 for any foreign national anywhere in the world.

The administration’s position, as articulated by White House AI adviser David Sacks, was that Anthropic had refused to fix the jailbreak issue before the government acted. Sacks had previously been critical of Anthropic’s approach, having called the company ‘woke’ and ‘leftist’ and accused it of regulatory capture through safety-focused messaging.

The government’s interpretation of the Amazon finding reflected a concern that if Fable 5’s classifiers could be bypassed using this technique, a consumer-facing AI product effectively became an unrestricted access point to the Mythos-level cybersecurity capabilities underneath it.

Anthropic’s Counter-Argument

Anthropic disputed both the severity of the finding and the government’s characterization of prior discussions. The company’s public position was that the reported technique allowed access to routine defensive cybersecurity work that was within the safety margin, not to the core harmful behaviors the classifiers were designed to prevent.

More significantly, Anthropic’s own testing showed that the same behaviors could be produced by models that were not subject to export controls: Claude Opus 4.8, OpenAI’s GPT-5.5, and China’s Kimi K2.7 all produced the same vulnerability identifications using the same technique. In the specific case of the exploit demonstration, every model Anthropic tested could produce the same output: Claude Haiku 4.5, Sonnet 4.6, and Opus 4.6, 4.7, and 4.8; GPT-5.4 and GPT-5.5; and Kimi K2.7.

Anthropic’s written statement at the time of shutdown read: ‘We disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people.’ The company argued that treating Fable 5 as uniquely dangerous when the same behaviors were available through widely accessible models without similar controls was both disproportionate and potentially counterproductive to US competitive interests in AI.

The Legal Mechanism: Deemed Export Controls

The legal instrument the government used was significant and novel in its application. Export control laws, specifically the Export Administration Regulations administered by the Commerce Department’s Bureau of Industry and Security, restrict the sharing of controlled technologies with foreign nationals. The ‘deemed export’ doctrine treats providing a foreign national with access to a controlled technology through a hosted endpoint as legally equivalent to physically shipping the technology to their home country.

This doctrine had previously been applied primarily to physical goods, software, and hardware. Applying it to a commercially deployed AI model served via cloud API was a significant extension of the doctrine’s scope. It meant that any foreign national with a Claude account, anywhere in the world, including within the United States, was legally in the same position as someone attempting to receive a physically exported controlled item.

The precedent established by this action extends beyond Anthropic. As the tech policy site TechPolicy.Press noted in its coverage, the novel development was not the jailbreak or the model, but the legal instrument. Any future frontier model with similar cybersecurity capabilities could theoretically be subject to the same mechanism.

Why Were All Users Locked Out? The Nationality Verification Problem

The export control order required Anthropic to restrict access specifically to foreign nationals. In theory, US citizens and permanent residents could have continued to use the models. In practice, Anthropic had no way to implement that restriction.

Claude’s platform serves users globally across a web interface, mobile applications, desktop applications, and third-party integrations through AWS, Google Cloud, and Microsoft Foundry. Users authenticate with email addresses and payment credentials, not with nationality documentation. Real-time nationality verification at the scale of hundreds of millions of users across dozens of global deployment surfaces was not operationally feasible, particularly on a Friday evening with an order that took effect immediately.

As a result, Anthropic concluded that the only compliant course of action was to disable both models for all users. The shutdown went into effect at 5:21 PM ET on June 12, 2026. AWS executed a simultaneous global takedown, rerouting Fable 5 and Mythos 5 requests to fallback models. The suspension affected every commercial platform simultaneously: AWS Bedrock, Google Cloud Vertex AI, Microsoft Foundry, and the direct Claude APIs.
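The reasoning behind the global shutdown can be sketched as a simple access rule. This is an illustrative sketch only: the `Account` type, the `verified_nationality` field, and both function names are hypothetical and do not describe Anthropic's actual systems. It assumes that an account may only be served when it has been positively verified as belonging to a US person.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Nationality(Enum):
    US_PERSON = "us_person"        # citizen or permanent resident
    FOREIGN_NATIONAL = "foreign"


@dataclass
class Account:
    email: str
    # Hypothetical field: None means no verified nationality is on file,
    # which was the situation for accounts authenticated by email and payment.
    verified_nationality: Optional[Nationality] = None


def may_serve_controlled_model(account: Account) -> bool:
    """Serve only when the account is positively verified as a US person."""
    return account.verified_nationality is Nationality.US_PERSON


def compliant_rollout(accounts: list[Account]) -> str:
    """If no account can be verified, the only compliant option is a full shutdown."""
    if not any(may_serve_controlled_model(a) for a in accounts):
        return "disable_for_everyone"
    return "serve_verified_only"


unverified = [Account("a@example.com"), Account("b@example.com")]
print(compliant_rollout(unverified))  # disable_for_everyone
```

Under this rule an unverified account is treated the same as a foreign national, so a platform with no nationality data on file ends up blocking everyone.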

This outcome revealed a structural vulnerability in AI deployment at scale. Enterprise contracts, data processing agreements, and service level agreements had not anticipated an instant regulatory kill-switch applied to a commercially deployed model. Most standard force majeure and compliance-with-law contract clauses lacked specific procedures for vendor-mandated model standdowns, rapid failover requirements, or downstream nonperformance provisions.

Understanding Claude’s Cybersecurity Safety Architecture

Defense in Depth

Fable 5 launched with what Anthropic described as the strongest safeguards ever applied to a Mythos-class model. In the month before launch, Anthropic transferred staff from across the organization to double the number of researchers and engineers working on safety mechanisms.

The safety architecture follows a principle called defense in depth: rather than relying on any single protection, multiple overlapping safeguards make the model progressively harder to misuse. Each individual mechanism has weaknesses, but the combination creates a system where defeating one layer does not defeat the others.

Safeguards include training the model itself to decline certain requests, retroactive analysis of misuse patterns, and the most important mechanism for this story: the classifiers.

What Classifiers Are and How They Work

A classifier, in this context, is a smaller automated AI system that operates alongside the main model. When a user makes a request, the classifier evaluates both the request and the model’s planned response for signs of potentially harmful cybersecurity activity. If the classifier determines that a request falls into a category it is designed to block, it prevents the model from responding and routes the query to Claude Opus 4.8 instead. Users are notified of the fallback.

This design deliberately avoids binary refusal in favor of graceful degradation. A user who asks a cybersecurity question that trips a classifier gets a response from Opus 4.8 rather than an error. In most cases this produces a useful result for legitimate purposes while preventing the Mythos-level cybersecurity capabilities from being accessed.
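The routing behavior described above can be sketched in a few lines. This is a minimal illustration, not Anthropic's implementation: the model labels, the `route` function, and the keyword-based `toy_classifier` are all assumptions made so the example runs. A real classifier is a trained model, not a keyword match.

```python
from dataclasses import dataclass
from typing import Callable

PRIMARY_MODEL = "fable-5"     # illustrative labels, not official API model IDs
FALLBACK_MODEL = "opus-4.8"


@dataclass
class RoutedResponse:
    model: str             # which model produced the text
    text: str
    fallback_notice: bool  # True when the user is told a fallback occurred


def route(
    prompt: str,
    classifier_blocks: Callable[[str, str], bool],
    generate: Callable[[str, str], str],
) -> RoutedResponse:
    """Check the request and the planned response; fall back instead of refusing."""
    planned = generate(PRIMARY_MODEL, prompt)
    if classifier_blocks(prompt, planned):
        return RoutedResponse(FALLBACK_MODEL, generate(FALLBACK_MODEL, prompt), True)
    return RoutedResponse(PRIMARY_MODEL, planned, False)


# Toy stand-ins so the sketch runs end to end.
def toy_classifier(prompt: str, planned: str) -> bool:
    return "exploit" in prompt.lower() or "exploit" in planned.lower()


def toy_generate(model: str, prompt: str) -> str:
    return f"[{model}] answer to: {prompt}"


benign = route("Explain TCP handshakes", toy_classifier, toy_generate)
flagged = route("Write an exploit for this bug", toy_classifier, toy_generate)
print(benign.model, benign.fallback_notice)    # fable-5 False
print(flagged.model, flagged.fallback_notice)  # opus-4.8 True
```

The flagged request still receives an answer, only from the fallback model and with a visible notice, which is the graceful degradation the design aims for.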

The Safety Margin and False Positives

A key architectural decision in Fable 5’s classifier design is the safety margin. Rather than setting the classifier to trigger only on requests that are clearly harmful, Anthropic deliberately set it to also trigger on requests that are probably benign but have some small chance of being harmful. This wider boundary is the safety margin.

The practical consequence is that users occasionally experience the classifier blocking a request that is actually harmless. A developer asking about security vulnerability patterns for defensive research, for example, might find their query routed to Opus 4.8 even though their intent is entirely legitimate. These are false positives: legitimate requests that the classifier incorrectly treats as potentially harmful.

For Fable 5, Anthropic made the safety margin significantly wider than in any previous model launch. This was a deliberate tradeoff: a larger safety margin produces more false positives and user frustration, but it means that a jailbreak attempting to edge into harmful territory has a longer journey before it can reach the genuinely dangerous behaviors at the classifier’s core. Anthropic judged this tradeoff worthwhile given the capabilities of the underlying model.

Source: Anthropic Official Blog (anthropic.com/news/redeploying-fable-5)

The Four Categories of Behavior

Clearly benign: Allowed.

Requests that are clearly not related to cybersecurity harm fall outside the classifier’s scope entirely. A user asking Claude to help debug their Python code or explain a networking concept is in this category.

Safety margin: Blocked as a precaution.

Requests that look probably benign but carry some non-trivial probability of harmful intent fall into the safety margin. These are requests the classifier blocks even though they are likely legitimate. The Amazon researchers’ technique accessed behavior in this zone, which is why Anthropic characterizes the finding as a borderline case rather than a discovery of a deep vulnerability.

Ambiguous: Blocked.

Requests that are clearly cybersecurity-related and could plausibly serve either defensive or offensive purposes fall into the ambiguous category. Finding complex security vulnerabilities in specific production systems falls here.

Clearly harmful: Blocked.

Requests that clearly seek dangerous capabilities, such as building a chain of software exploits designed for offensive deployment, are in the harmful category. These are the core behaviors the classifier is designed to prevent at all costs.
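One way to picture the four categories above is as bands on a single risk score. The sketch below is an assumption-laden illustration: the article does not say the classifier outputs a numeric score, and the threshold values, `zone_for`, and `is_blocked` are hypothetical. It shows why widening the safety margin (lowering the point where blocking starts) produces more false positives on benign traffic.

```python
from enum import Enum


class Zone(Enum):
    CLEARLY_BENIGN = "clearly benign"    # allowed
    SAFETY_MARGIN = "safety margin"      # blocked as a precaution
    AMBIGUOUS = "ambiguous"              # blocked
    CLEARLY_HARMFUL = "clearly harmful"  # blocked


# Hypothetical thresholds on a risk score in [0, 1].
MARGIN_START = 0.2
AMBIGUOUS_START = 0.5
HARMFUL_START = 0.8


def zone_for(risk: float, margin_start: float = MARGIN_START) -> Zone:
    """Map a risk score to one of the four zones."""
    if risk < margin_start:
        return Zone.CLEARLY_BENIGN
    if risk < AMBIGUOUS_START:
        return Zone.SAFETY_MARGIN
    if risk < HARMFUL_START:
        return Zone.AMBIGUOUS
    return Zone.CLEARLY_HARMFUL


def is_blocked(risk: float, margin_start: float = MARGIN_START) -> bool:
    """Everything outside the clearly benign zone is blocked."""
    return zone_for(risk, margin_start) is not Zone.CLEARLY_BENIGN


# Made-up risk scores for four requests that are all actually benign.
benign_scores = [0.05, 0.15, 0.25, 0.35]
narrow_margin_fp = sum(is_blocked(s, margin_start=0.3) for s in benign_scores)
wide_margin_fp = sum(is_blocked(s, margin_start=0.2) for s in benign_scores)
print(narrow_margin_fp, wide_margin_fp)  # 1 2
```

With the narrower margin only one benign request is blocked; with the wider margin two are. The wider margin also means a bypass has further to travel before it reaches the ambiguous or clearly harmful bands.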

Understanding AI Jailbreaks: A Taxonomy

The term jailbreak, as used in AI safety, refers to a technique that causes an AI model to bypass its own safety guidelines and produce outputs it was designed to prevent. Anthropic uses the term interchangeably with ‘bypass.’ Understanding the different categories of jailbreak is essential to understanding what the Amazon report described and why Anthropic and the government assessed the severity so differently.

Minor Jailbreaks

A minor jailbreak edges past the classifier’s boundary but remains within the safety margin. The behavior it unlocks is still probably benign or involves only ambiguous harm potential. Because the safety margin extends beyond the classifier boundary specifically to catch these cases, a minor jailbreak that lands in the safety margin does not necessarily expose any genuinely dangerous capability.

Anthropic’s assessment of the reported Amazon technique was that it falls into this category. The technique accessed behavior that was within the safety margin: routine defensive cybersecurity work that was blocked as a precaution, not because it was inherently dangerous.

Narrow Harmful Jailbreaks

A narrow harmful jailbreak breaks through the classifier and unlocks a specific harmful behavior. Unlike a minor jailbreak, it crosses the line between the safety margin and genuinely harmful territory. However, because it is narrow, it only works for a specific type of request or against a specific target. An attacker using a narrow harmful jailbreak gains a limited capability advantage.

Narrow harmful jailbreaks are of low to moderate severity according to Anthropic’s proposed framework. The narrowness limits the attacker’s ability to scale the technique or use it across multiple offensive objectives.

Universal Jailbreaks: The Most Serious Category

A universal jailbreak unlocks a wide range of harmful behaviors using a single technique. Where a narrow jailbreak opens one specific door, a universal jailbreak effectively removes the door from its hinges across an entire class of harmful capabilities. This is the category that concerns safety researchers most.

As of the time Anthropic published the redeployment announcement on June 30, 2026, no universal jailbreak had been discovered for Fable 5. The company noted that expert safety researchers continue to red-team the model and that finding one remains a possibility. Anthropic’s stated goal is to identify major jailbreaks before malicious actors do.

Anthropic acknowledges that it is probably impossible to make any AI model fully impervious to jailbreaks. It draws an analogy to software vulnerabilities: no software system is immune, though software vulnerabilities are generally more straightforward to discover and patch than LLM jailbreaks.
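The three categories above can be summarized as a rough decision rule. The sketch below is an interpretation, not Anthropic's published criteria: the function name, the idea of counting unlocked harmful task categories, and the cutoff `UNIVERSAL_MIN_CATEGORIES` are all hypothetical. The article only says that a universal jailbreak unlocks "a wide range" of harmful behaviors, without giving a number.

```python
from enum import Enum


class JailbreakClass(Enum):
    MINOR = "minor"                    # stays within the safety margin
    NARROW_HARMFUL = "narrow harmful"  # unlocks a specific harmful behavior
    UNIVERSAL = "universal"            # unlocks a wide range of harmful behaviors


# Hypothetical cutoff for what counts as "a wide range".
UNIVERSAL_MIN_CATEGORIES = 3


def classify_jailbreak(harmful_categories_unlocked: set[str]) -> JailbreakClass:
    """Classify a bypass by the harmful task categories it unlocks, if any."""
    if not harmful_categories_unlocked:
        # The bypass only reached behavior blocked as a precaution.
        return JailbreakClass.MINOR
    if len(harmful_categories_unlocked) >= UNIVERSAL_MIN_CATEGORIES:
        return JailbreakClass.UNIVERSAL
    return JailbreakClass.NARROW_HARMFUL


print(classify_jailbreak(set()).value)              # minor
print(classify_jailbreak({"exploit chain"}).value)  # narrow harmful
```

On Anthropic's reading, the Amazon technique corresponds to the first branch: it reached blocked behavior, but none of it was in genuinely harmful territory.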

The Pliny Jailbreak Claims

Separate from the Amazon report that triggered the government’s action, a security researcher using the pseudonym Pliny the Liberator publicly claimed on June 10, 2026, to have jailbroken Fable 5 using a combination of techniques: multi-agent decomposition, Unicode and homoglyph character substitution to evade keyword classifiers, long-context reference tracking to distribute harmful intent across a large conversation, and fictional framing to mask offensive intent as creative content.

Pliny also published what he claimed was Fable 5’s approximately 120,000-character system prompt to GitHub. Anthropic did not publicly address the Pliny claims in its June 30 announcement, which focused specifically on the Amazon report that had triggered government action. Whether the Pliny techniques constituted genuine harmful jailbreaks under Anthropic’s taxonomy was not assessed in public materials available at the time of writing.

How Anthropic Responded: The New Safety Classifier

While the export controls were in effect, Anthropic worked closely with the US government and Amazon to review the report and its evidence. Anthropic confirmed the finding, assessed its severity, and developed a response.

The response was a new safety classifier trained specifically to target and block the technique described in the Amazon report. This classifier was developed in parallel with the government negotiations and was ready for deployment by the time export controls were lifted.

The new classifier blocks the specific technique from the Amazon report in more than 99% of cases. In the small remaining fraction, the model may provide information about previously discovered or already-patched security vulnerabilities, which Anthropic says is insufficient to meaningfully assist a cyberattacker.

Researchers from the US Department of Commerce’s Center for AI Standards and Innovation (CAISI) independently tested both the original safeguards and the new classifier. CAISI confirmed that the updated safeguards are, in its assessment, extraordinarily strong.

The cost of the new classifier is more false positives. Because the classifier is trained specifically to block the framing technique used in the Amazon report, it is more conservative in how it evaluates related requests. Routine coding and debugging tasks that involve security-adjacent language may now trigger the fallback to Opus 4.8 more often than before the new classifier was deployed. Anthropic says it will continue refining the classifier to reduce false positives over time.

When Fable 5 blocks a request, the user is now notified explicitly. This is a change from the original launch behavior, which Anthropic acknowledged was miscalibrated: the initial version silently fell back to Opus 4.8 without informing the user. After public criticism of that design, Anthropic changed it to notify users of the fallback, matching the transparency standard applied to Mythos 5.

The Role of CAISI in AI Safety Evaluation

The Center for AI Standards and Innovation (CAISI) is a unit within the US National Institute of Standards and Technology (NIST), under the Department of Commerce. It was established to serve as the government’s primary point of contact with the AI industry on testing and safety evaluation, focusing on demonstrable risks in cybersecurity, biosecurity, and chemical weapons.

CAISI had an existing collaborative relationship with Anthropic before the June 12 export control order. Anthropic’s announcement of that collaboration in early 2026 noted that CAISI had evaluated multiple iterations of Anthropic’s Constitutional Classifiers on models including Claude Opus 4 and 4.1 before deployment, and that government red-teamers had identified vulnerabilities that Anthropic then addressed.

CAISI also had pre-deployment testing agreements with OpenAI, Google DeepMind, Microsoft, and xAI as of May 2026, making it the central federal body for evaluating frontier AI risks.

In the context of the Fable 5 crisis, CAISI’s independent verification of the new classifier’s strength was significant. It meant the government’s own technical body confirmed that the updated safeguards were adequate before export controls were lifted. This was not simply Anthropic asserting its own classifiers were sufficient.

Critically, days before Fable 5 launched, reports emerged that the Trump administration had told CAISI to stop making its evaluations of AI models public. This context adds nuance to the subsequent dispute: the government was simultaneously building evaluation infrastructure and restricting transparency around what that infrastructure found.

The June 2 Executive Order: Context for the Crisis

Ten days before the export control order, on June 2, 2026, President Trump signed the Executive Order Promoting Advanced Artificial Intelligence Innovation and Security. Understanding this order is essential to understanding the government’s subsequent action.

The order directed federal agencies to establish a voluntary framework allowing AI developers to provide the government with up to 30 days of pre-release access to frontier models before general release. Earlier drafts of the order had specified 90 days; the 30-day figure in the final version reflected a compromise between national security and pro-innovation factions within the administration.

Crucially, Fable 5 was released on June 9, 2026, just seven days after the executive order was signed, without having gone through any pre-release government access period. The order’s framework had a 60-day implementation deadline, meaning the infrastructure for pre-release testing did not yet exist when Fable 5 launched. The government’s emergency export control action on June 12 can be understood in part as an assertion of authority in the absence of that framework.

The order also established an interagency cybersecurity vulnerability clearinghouse under Section 2(d), which Anthropic has since committed to participating in. NSA, Treasury, and CISA face an August 1, 2026, deadline to deliver a classified benchmark process for designating covered frontier models, which would determine which future models require pre-release government review.
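The dates in this section can be checked with simple date arithmetic. The sketch below uses only the dates given in the article and Python's standard `datetime` module; the variable names are illustrative.

```python
from datetime import date, timedelta

order_signed = date(2026, 6, 2)      # executive order signed
launch = date(2026, 6, 9)            # Fable 5 and Mythos 5 launch
export_order = date(2026, 6, 12)     # emergency export control order
controls_lifted = date(2026, 6, 30)  # export controls lifted

# Gap between the executive order and the launch.
days_order_to_launch = (launch - order_signed).days

# Gap between the executive order and the export control order.
days_order_to_export = (export_order - order_signed).days

# The 60-day implementation deadline counted from signing.
implementation_deadline = order_signed + timedelta(days=60)

# Length of the shutdown: elapsed days, and the inclusive count of calendar days.
days_under_controls = (controls_lifted - export_order).days
calendar_days_inclusive = days_under_controls + 1

print(days_order_to_launch)       # 7
print(days_order_to_export)       # 10
print(implementation_deadline)    # 2026-08-01
print(days_under_controls)        # 18
print(calendar_days_inclusive)    # 19
```

The last two figures explain why the controls are described both as lifted "eighteen days later" and as a "19-day period": the first counts elapsed days, the second counts June 12 and June 30 inclusively.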

The Proposed Industry Framework: A CVSS for AI Jailbreaks

One of the most substantive long-term outcomes of the Fable 5 episode is a framework Anthropic is proposing for standardizing how the AI industry assesses and communicates the severity of jailbreaks.

The analogy Anthropic draws is to the Common Vulnerability Scoring System, known as CVSS, which is the established industry standard for rating the severity of software security vulnerabilities. CVSS gives security researchers, vendors, and governments a shared vocabulary for describing how dangerous a software flaw is. Before CVSS, each organization used its own severity language, creating confusion about when to prioritize fixes and how urgently to respond.

AI jailbreaks currently lack any equivalent standard. When a researcher finds a jailbreak, there is no agreed-upon way to communicate how dangerous it is, how urgently it needs to be fixed, or whether it justifies emergency government action. The events of June 2026 illustrate precisely why this gap matters: Anthropic and the government assessed the same finding very differently, and there was no common framework to resolve the disagreement.

The Four Proposed Scoring Criteria

Criterion	What It Measures and How It Scores
Capability gain	How far beyond existing tools does the jailbreak take the user? If widely available models, including weaker AI systems, can replicate the behavior, the score is low. If the jailbreak unlocks unique capabilities that meaningfully accelerate even expert attackers, the score is high.
Breadth of capability gain	For how many distinct offensive tasks does the same technique work? A jailbreak that only enables one narrow attack type scores low. A jailbreak that works across multiple attack categories and techniques scores high.
Ease of weaponization	How much skilled effort does it take to turn the jailbreak into an attack? If the technique requires extensive specialized prompting and many retries, the score is low. If a single prompt works reliably on the first or second attempt, the score is high.
Discoverability	How easy is it to obtain the technique? If it requires specialist knowledge to find, the score is low. If it is already widely known and published online, the score is high.

The Amazon finding, assessed against these four criteria, would likely score low on capability gain (the same behavior was achievable with weaker models), moderate on breadth (it worked in a specific framing pattern), moderate on ease of weaponization (it required a specific prompting approach), and potentially high on discoverability once published.
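As a rough illustration of how the four criteria could be recorded and compared, here is a minimal Python sketch. The three-level scale, the numeric mapping (low = 1, moderate = 2, high = 3), the `JailbreakAssessment` class, and the unweighted average are all assumptions made for illustration. The proposal as described here does not specify numeric scores or any aggregation rule.

```python
from dataclasses import dataclass, fields
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-level scale; the proposal defines no numeric values."""
    LOW = 1
    MODERATE = 2
    HIGH = 3


@dataclass(frozen=True)
class JailbreakAssessment:
    """One rating for each of the four criteria in the proposed framework."""
    capability_gain: Level
    breadth: Level
    ease_of_weaponization: Level
    discoverability: Level

    def ratings(self) -> dict:
        """Map each criterion name to its rating as a lowercase word."""
        return {f.name: getattr(self, f.name).name.lower() for f in fields(self)}

    def mean_score(self) -> float:
        """Unweighted average: an assumed aggregation, for illustration only."""
        values = [int(getattr(self, f.name)) for f in fields(self)]
        return sum(values) / len(values)


# The informal reading of the Amazon finding given in the paragraph above.
amazon_finding = JailbreakAssessment(
    capability_gain=Level.LOW,
    breadth=Level.MODERATE,
    ease_of_weaponization=Level.MODERATE,
    discoverability=Level.HIGH,
)

print(amazon_finding.ratings())
print(amazon_finding.mean_score())  # (1 + 2 + 2 + 3) / 4 = 2.0
```

A real standard would probably weight the criteria differently, much as CVSS does with its own metrics, so the single averaged number here should be read only as a placeholder.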

For the most severe jailbreaks, those being actively exploited against critical infrastructure, Anthropic commits to immediately deploying mitigations upon confirmation of severity. The company is also creating a 24/7 monitoring team for jailbreak submission channels.

The HackerOne Program

Alongside the proposed framework, Anthropic launched a new disclosure program on HackerOne specifically for cybersecurity jailbreaks in Fable 5. This is a disclosure program, not a traditional bug bounty with monetary rewards. Anthropic is focused on what it considers high and critical severity findings: techniques that cause Fable 5 to produce functional exploit code or working malware, prompting approaches that extract domain-expert-level offensive guidance, and bypasses that work at scale across multiple offensive task categories.

The program applies the same four-criterion scoring framework as the proposed industry standard, making it a live test of that framework’s utility in practice.

Partners in the proposed framework include Amazon, Microsoft, Google, and other Project Glasswing partners. Anthropic is inviting other industry partners and model providers to join.

Expanded Government Collaboration: The Four Commitments

The Fable 5 episode produced not just a technical resolution but a set of formal commitments from Anthropic to the US government that are designed to prevent similar situations in the future.

Pre-Release Government Access and Evaluation

For future models that materially advance the capability frontier in areas relevant to national security, Anthropic will provide designated government partners with expanded early access to both the models and their accompanying safeguards before broad release. Government partners can then run independent capability evaluations and test guardrails. Anthropic technical staff will work alongside government evaluators during these testing periods.

This commitment addresses the core friction in the June 2026 episode: Fable 5 was released without prior government review, and the government acted through export controls after the fact. The new arrangement creates a channel for earlier engagement.

Rapid Information Sharing on Safeguards

When significant jailbreaks or misuse patterns are identified, Anthropic will investigate, triage, and notify appropriate government counterparts promptly. New safeguards built in response to findings will be shared so they can be independently tested by the government. Anthropic will also provide government partners with threat intelligence reporting before public publication and will participate in the interagency cybersecurity vulnerability clearinghouse established by the June 2 Executive Order.

Dedicated Resources for Joint Research

Anthropic is substantially scaling up joint work with government partners on AI security. This includes dedicated Anthropic teams working on shared government priorities, a significant compute allocation for government testing and research, and making safety and red-teaming expertise available to advance AI evaluation methods.

A Common Industry Security Standard

Anthropic commits to working with the government and industry peers toward a shared voluntary security and evaluation standard for frontier model providers. The company will contribute evaluations, tooling, and best practices that the government can apply across the industry.

Anthropic characterizes these commitments as the beginning of a template for effective global coordination on frontier AI risks. The company calls for these standards to be codified in regulation applied equally across all frontier model developers, arguing that voluntary and inconsistent standards create unfair competitive dynamics.

What Changed on June 30: The Terms of Return

Fable 5 Access Conditions

Export controls on Fable 5 were lifted on June 30. Global access began on July 1 across Claude.ai, the Claude Platform, Claude Code, and Claude Cowork. Re-enablement on AWS, Google Cloud, and Microsoft Foundry is pending as of the announcement.

For Pro, Max, Team, and select Enterprise plans, Fable 5 is available for up to 50% of weekly usage limits through July 7. After July 7, Fable 5 access requires usage credits.

Plan	Through July 7	After July 7
Pro, Max, Team	Up to 50% of weekly usage limits at no extra cost	Usage credits required
Enterprise (Standard seats)	No included Fable 5 allowance. All usage via credits. If credits not enabled, Fable 5 will not work.	Same: credits required
Enterprise (Premium seats)	Fable 5 included in subscription through July 7. Draws from seat usage at no additional cost.	Credits required. If credits not enabled, Fable 5 will not work for your users.
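
For teams that want to encode these rules in their own tooling, the table can be read as a small lookup. The sketch below is illustrative only: the plan keys, the `fable5_access` function, and the reading of "through July 7" as including July 7 itself are assumptions, not Anthropic's billing logic. It also does not model the 50% weekly cap on Pro, Max, and Team, and it assumes that any plan without credits enabled loses access after July 7, which the table states explicitly only for Enterprise seats.

```python
from datetime import date

RETURN_DATE = date(2026, 7, 1)  # global Fable 5 access resumed
PROMO_END = date(2026, 7, 7)    # "through July 7" assumed to include this day

# Hypothetical plan keys standing in for the rows of the table above.
INCLUDED_DURING_PROMO = {
    "pro": True,
    "max": True,
    "team": True,
    "enterprise_standard": False,  # all usage via credits, even before July 7
    "enterprise_premium": True,
}


def fable5_access(plan: str, on: date, credits_enabled: bool) -> str:
    """Return 'included', 'credits', or 'unavailable' for a plan on a given day."""
    if plan not in INCLUDED_DURING_PROMO:
        raise ValueError(f"unknown plan: {plan}")
    if on < RETURN_DATE:
        return "unavailable"
    if on <= PROMO_END and INCLUDED_DURING_PROMO[plan]:
        return "included"
    # Outside the included allowance, access depends on usage credits.
    return "credits" if credits_enabled else "unavailable"


print(fable5_access("pro", date(2026, 7, 5), credits_enabled=False))
print(fable5_access("enterprise_standard", date(2026, 7, 5), credits_enabled=False))
print(fable5_access("enterprise_premium", date(2026, 7, 8), credits_enabled=True))
```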

Mythos 5 Access Conditions

Mythos 5 was partially restored on June 26, after the US government specifically approved access for a set of US organizations. Anthropic continues to coordinate with the government to expand access to a broader set of domestic and international partners in the Glasswing program.

Mythos 5 remains restricted to Project Glasswing partners and is not available for general purchase. Its gradual, verified expansion through Glasswing continues.

What the New Classifier Means in Practice

The most tangible change for everyday Fable 5 users is the new cybersecurity classifier and the increased false positive rate it introduces. Security researchers, developers working on penetration testing tools, incident response professionals, and anyone working in adjacent fields may find that certain queries are now routed to Opus 4.8 more frequently than before the ban.

Anthropic has been explicit that this is a tradeoff and that the false positive rate will be refined over time. The company says the trade is worthwhile: the new classifier ensures that the Amazon technique is blocked in more than 99% of cases, giving the government and users confidence that the specific vulnerability has been addressed.
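
The tradeoff can be made concrete with a toy routing sketch. Everything in it is an assumption for illustration: the scores are invented, the model labels are placeholders, and a single adjustable threshold stands in for what was in reality a newly trained classifier. The point is only that catching more of a targeted technique tends to reroute more benign requests as well.

```python
# Toy (classifier score, is_attack) pairs. These numbers are invented for
# illustration and are not measurements of any real classifier.
SAMPLES = [
    (0.95, True), (0.80, True), (0.62, True),
    (0.58, False), (0.45, False), (0.30, False), (0.10, False), (0.05, False),
]


def route(risk_score: float, threshold: float) -> str:
    """Return the label of the model that handles a request (placeholder labels)."""
    return "opus-4.8-fallback" if risk_score >= threshold else "fable-5"


def rates(samples, threshold):
    """Return (share of attacks rerouted, share of benign requests rerouted)."""
    attacks = [score for score, is_attack in samples if is_attack]
    benign = [score for score, is_attack in samples if not is_attack]
    blocked = sum(route(score, threshold) != "fable-5" for score in attacks)
    false_positives = sum(route(score, threshold) != "fable-5" for score in benign)
    return blocked / len(attacks), false_positives / len(benign)


for threshold in (0.7, 0.5):
    block_rate, fp_rate = rates(SAMPLES, threshold)
    print(f"threshold={threshold}: attacks rerouted={block_rate:.2f}, "
          f"benign rerouted={fp_rate:.2f}")
```

In this toy data, tightening the threshold from 0.7 to 0.5 catches the remaining attack sample but also reroutes one of the five benign requests, which is the same shape of tradeoff the announcement describes.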

Broader Implications: What This Episode Means

For the AI Industry

The Fable 5 episode established that the US government is willing to use export control powers against commercially deployed AI models served via cloud APIs. This is a significant expansion of the deemed export doctrine and creates a precedent that will affect how every frontier AI lab evaluates pre-release governance risk going forward.

Any AI model with capabilities that the government could characterize as having national security implications, including advanced cybersecurity, biology, chemistry, or military applications, is now potentially subject to emergency export controls without prior notice. The voluntary pre-release access framework in the June 2 Executive Order is designed to prevent this by creating an orderly channel for review, but the Fable 5 case shows that in the absence of such a channel, the government will act unilaterally.

OpenAI’s response to the same environment was instructive. Days after the Fable 5 export controls were lifted, Axios reported that OpenAI had previewed GPT-5.6 to a small, government-approved group rather than the public, explicitly citing the dual-use risk of a model capable enough to help defenders also being capable enough to help attackers. The Fable 5 episode appears to have accelerated voluntary pre-coordination across the industry.

For Researchers and Security Professionals

The new classifier will produce more false positives on legitimate security research. Professionals who routinely interact with Fable 5 on topics adjacent to cybersecurity should expect occasional fallbacks to Opus 4.8 for work that would not have triggered the original classifier.

The HackerOne disclosure program creates a formal channel for submitting jailbreak findings. This is important because it provides security researchers with a legitimate path that neither requires publishing findings publicly (creating risk) nor leaves them with no recourse when they discover a potential safety issue.

For Enterprise Customers

The 19-day global shutdown exposed the inadequacy of standard enterprise contracts for AI services. Most service agreements lacked specific provisions for immediate, regulator-ordered model suspensions, failover requirements, or indemnities for nonperformance caused by government-mandated shutdowns.

Enterprise procurement teams should review their AI service agreements in light of this episode. Key questions include: What happens to committed usage if a model is suspended? What fallback provisions exist? What notification requirements apply? How are data retention obligations handled during a shutdown period?

For Frontier AI Governance

The Fable 5 episode is likely to accelerate the formalization of pre-release government review for frontier AI models. The June 2 Executive Order created the framework in theory; the June 12 export control order demonstrated the consequence of operating outside it. The 30-day pre-release access window now has a concrete case study illustrating what happens when it is absent.

The proposed four-criterion jailbreak severity framework, if adopted across the industry, could provide exactly the common language that was missing when Anthropic and the government assessed the Amazon finding differently. Whether it gains adoption among competitors and government agencies remains to be seen.

For China and Global AI Competition

Several technology executives and investors raised concern during the 19-day shutdown that restricting Anthropic’s rollout was providing Chinese AI developers with valuable time to close the gap. CNBC noted that Chinese open-source models were proving capable enough to replicate some of the same behaviors that triggered the export controls in the first place.

This creates a genuine policy tension. Export controls designed to prevent foreign access to advanced US AI capabilities may simultaneously erode the US commercial AI advantage while doing little to restrict what is achievable with existing Chinese open-weight models. Anthropic’s own testing showed that Kimi K2.7 could produce the same behaviors the Amazon report identified in Fable 5, without being subject to any equivalent export restriction.

Analysis: What Anthropic Did Well and What Remains Unresolved

What Anthropic Did Well

The transparency of the June 30 announcement was notable. Rather than treating the episode as an embarrassment to minimize, Anthropic published a detailed technical account of what the Amazon report found, how its testing confirmed the finding was not unique to Fable 5, how the new classifier works, and what the tradeoffs are. That level of disclosure, including explicit acknowledgment that the new classifier increases false positives, sets a standard for post-incident AI safety communication.

The proposed jailbreak severity framework is a constructive response to a genuine industry gap. Even if the framework proves imperfect in practice, initiating the conversation and publishing a concrete proposal is more productive than waiting for consensus to emerge organically. The CVSS analogy is apt: CVSS took years to become an established standard and has itself been updated multiple times.

Anthropic’s decision to immediately shut down both models globally, despite the significant commercial cost, demonstrated that the company treats export compliance as a genuine obligation rather than a negotiating position. Choosing a full shutdown over partial compliance that might have preserved some revenue was the correct response given the legal circumstances.

What Remains Unresolved

The deeper question of how to govern Mythos-class capabilities in a world where similar capabilities will proliferate to other developers, including those without comparable safety commitments, has not been answered by the Fable 5 episode. Anthropic acknowledged this explicitly in its June 2 Glasswing expansion announcement: within six to twelve months, other AI companies will likely have Mythos-class models, and some may release them without the safeguards that distinguish Fable 5 from Mythos 5.

The voluntary nature of the pre-release government access framework is a significant limitation. A company without Anthropic’s safety culture and regulatory relationships faces no formal requirement to provide the same pre-release access, yet its models could present equivalent risks. The call in Anthropic’s redeployment announcement for strong regulation applied equally across frontier model developers reflects the genuine structural problem.

The nationality verification problem has also not been solved. If a future export control order were applied to a model, Anthropic would face the same operational impossibility of verifying user nationality in real time. The 19-day shutdown demonstrated that the only practical compliance measure is a total global takedown. This is a critical operational risk for any enterprise building on frontier AI models.

Common Misconceptions

Misconception: The Amazon jailbreak unlocked unique Mythos-level cyber capabilities.

This is incorrect. Anthropic’s testing confirmed that the behaviors produced by the Amazon technique were replicable using models including Claude Opus 4.8, GPT-5.5, and Kimi K2.7, none of which were subject to the same export controls. The technique accessed borderline defensive cybersecurity work within the safety margin, not the advanced offensive capabilities unique to the Mythos class.

Misconception: The export controls were applied because Fable 5 was found to be too dangerous.

The export controls were applied because the government determined that a jailbreak had been demonstrated, creating concern that Fable 5’s safety classifiers could be bypassed. The subsequent investigation and classifier update resolved that concern. The controls were lifted after the new safeguards were independently verified by CAISI, not because the model was made less capable.

Misconception: Fable 5 and Mythos 5 are different models.

They are the same underlying model with different safety surfaces. Fable 5 includes safety classifiers routing high-risk requests to Opus 4.8. Mythos 5 has those classifiers either lifted or reduced in specific domains. The model weights themselves are identical.

Misconception: The new classifier eliminates jailbreak risk.

It does not. The new classifier specifically targets the technique described in the Amazon report and blocks it in more than 99% of cases. Other jailbreak techniques may exist. Anthropic acknowledges that it is probably impossible to make any AI model fully impervious to jailbreaks and that expert safety researchers continue to red-team Fable 5.

Misconception: The HackerOne program is a traditional bug bounty with monetary rewards.

It is a disclosure program, not a paid bounty. Anthropic does not offer monetary rewards for cybersecurity jailbreak submissions under this program. It focuses on high and critical severity findings and uses the proposed four-criterion scoring framework.

Key Takeaways

Claude Fable 5 and Mythos 5 were launched June 9 and suspended June 12, 2026, under an emergency US government export control order. The controls were lifted June 30 and Fable 5 returned to global users July 1, ending a 19-day interruption.
The trigger was an Amazon research report describing a technique to bypass Fable 5’s safety classifiers, causing it to identify software vulnerabilities and, in one case, produce exploit code.
Anthropic’s testing confirmed the same behaviors were achievable with multiple other models not subject to export controls, including Claude Opus 4.8, GPT-5.5, and Kimi K2.7.
The legal mechanism was the deemed export doctrine: providing a foreign national access to a controlled technology via a cloud API is legally equivalent to physically shipping it to their home country.
Anthropic could not verify user nationality in real time and responded with a total global shutdown, affecting all customers regardless of nationality.
A new safety classifier was trained specifically to block the Amazon technique, achieving more than 99% effectiveness as confirmed by CAISI independent testing, at the cost of higher false positive rates on legitimate coding and security work.
Anthropic is proposing a four-criterion jailbreak severity framework (capability gain, breadth, ease of weaponization, discoverability) in partnership with Amazon, Microsoft, Google, and Project Glasswing partners.
Anthropic made four formal commitments to the government: pre-release access to frontier models relevant to national security, rapid information sharing on jailbreaks and safeguards, dedicated joint research resources, and a common voluntary security standard.
A new HackerOne disclosure program accepts cybersecurity jailbreak reports for Fable 5 with no monetary rewards.
The episode establishes a precedent for government use of export controls against commercially deployed AI and has accelerated voluntary pre-release review practices across the AI industry.
Mythos 5 remains restricted to approved Project Glasswing partners. Fable 5 is globally available as of July 1 with usage credit requirements after July 7.

Frequently Asked Questions

What is Claude Fable 5?

Claude Fable 5 is the first publicly available Mythos-class AI model from Anthropic, launched June 9, 2026. It shares the same underlying model as Claude Mythos 5 but includes safety classifiers that route high-risk cybersecurity, biology, chemistry, and model distillation requests to Claude Opus 4.8 as a fallback. It is Anthropic’s most capable publicly released model.

What is Claude Mythos 5?

Claude Mythos 5 is the same underlying model as Fable 5 but with safety classifiers either lifted or significantly reduced in specific domains. It is not available for general purchase and is restricted to approved Project Glasswing partners for defensive cybersecurity use under US government collaboration.

What is Project Glasswing?

Project Glasswing is Anthropic’s controlled deployment program for Mythos-class AI models, launched April 7, 2026. It gives vetted partner organizations access to Mythos-level capabilities for defensive cybersecurity work. Partners include critical infrastructure operators, financial institutions, healthcare organizations, technology companies, and government agencies across more than 15 countries. By early June 2026, approximately 200 organizations had access, and partners had collectively identified more than 10,000 high-severity vulnerabilities.

Why did the US government ban Fable 5 and Mythos 5?

The government banned access for foreign nationals after learning of a report by Amazon researchers describing a technique to bypass Fable 5’s safety classifiers. The concern was that if the classifiers could be defeated, a consumer-facing model would effectively provide access to Mythos-level cybersecurity capabilities without the restrictions designed to prevent misuse. The government acted through emergency export control powers administered by the Commerce Department’s Bureau of Industry and Security.

Why did all users lose access if the ban only applied to foreign nationals?

Anthropic had no reliable way to verify user nationality in real time across its global platform. Attempting partial compliance risked violating the order. The company concluded that a total global shutdown was the only operationally feasible and legally compliant response.

What exactly did the Amazon jailbreak do?

According to Anthropic’s account, the Amazon technique involved prompting Fable 5 in a specific way that caused it to identify software vulnerabilities. In one case, the model produced code demonstrating how a vulnerability could be exploited. Anthropic characterized the behavior as routine defensive cybersecurity work within the safety margin, not a unique Mythos-level capability. Testing confirmed that models including Claude Opus 4.8, GPT-5.5, and Kimi K2.7 could produce the same behaviors using the same technique.

What changed with the new safety classifier?

Anthropic trained a new classifier specifically targeting the prompting technique described in the Amazon report. The new classifier blocks that technique in more than 99% of cases, as independently verified by CAISI. The tradeoff is a higher rate of false positives on legitimate coding and security work, which Anthropic says it will reduce over time.

What is CAISI and why does it matter?

CAISI is the Center for AI Standards and Innovation within the US National Institute of Standards and Technology. It is the government’s primary body for testing and evaluating commercial AI systems for national security risks. CAISI independently tested both Anthropic’s original Fable 5 safeguards and the new post-shutdown classifier. Its confirmation that the updated safeguards are extraordinarily strong was a prerequisite for lifting the export controls.

What is a minor jailbreak versus a universal jailbreak?

A minor jailbreak edges past a safety classifier’s boundary but remains in the safety margin, a zone of probably benign behavior the classifier blocks as a precaution. It does not access genuinely harmful capabilities. A universal jailbreak removes the safety constraints across an entire class of harmful behaviors at once, making it the most dangerous category. As of June 30, 2026, no universal jailbreak had been discovered for Fable 5.

Is Fable 5 available now?

Yes, as of July 1, 2026. Fable 5 is available globally on Claude.ai, the Claude Platform, Claude Code, and Claude Cowork. For Pro, Max, Team, and select Enterprise plans, up to 50% of weekly usage limits are included at no extra cost through July 7, after which usage credits apply. Re-enablement on AWS, Google Cloud, and Microsoft Foundry is pending.

What is the proposed jailbreak severity framework?

Anthropic, with Amazon, Microsoft, Google, and Project Glasswing partners, is developing a framework that scores AI jailbreaks on four criteria: capability gain (how much the jailbreak exceeds existing tools), breadth of capability gain (how many offensive task types it enables), ease of weaponization (how much effort to convert into an attack), and discoverability (how easily the technique can be found). The framework is analogous to CVSS for software vulnerabilities and is designed to give developers and governments a common language for assessing jailbreak severity.

What does this mean for other AI companies?

The episode establishes a precedent for government use of export controls against commercially deployed AI models served via cloud APIs. Any frontier AI model with capabilities the government could characterize as nationally significant in cybersecurity, biology, or related domains may be subject to the same mechanism. The voluntary pre-release access framework in the June 2 Executive Order is designed to prevent future emergency actions by creating an orderly review channel. OpenAI’s subsequent decision to preview GPT-5.6 to a small government-approved group before broader release reflects the episode’s influence on industry practice.

About the Author

I’m Sanwal Zia, an SEO strategist with more than six years of experience helping businesses grow through smart and practical search strategies. I created Optimize With Sanwal to share honest insights, tool breakdowns, and real guidance for anyone looking to improve their digital presence. You can connect with me on YouTube, LinkedIn, Facebook, Instagram, or visit my website to explore more of my work.
