Anthropic revises policy after researchers criticize covert AI restrictions on Claude

Anthropic quietly built guardrails into its latest AI models that would degrade performance whenever someone tried to use them for building rival AI systems. Then researchers found out, and things got uncomfortable.

The company has now revised its approach to the controversial restrictions, which were embedded in its Mythos and Fable model families, after a wave of criticism from the AI research community. The safeguards, first disclosed in system cards published in early June 2026, used techniques like prompt modification and steering vectors to intentionally diminish Claude’s effectiveness on tasks central to large language model development, including pretraining pipelines and ML accelerator design.

What Anthropic actually did

Here’s the thing. Companies routinely include terms of service that prohibit customers from using their products to build competing offerings. That’s standard corporate defense. What made Anthropic’s approach different was the method: rather than simply banning the behavior in legal documents, the company baked the restrictions directly into the model’s behavior.

In English: if Claude detected you were trying to build a competing AI system, it would quietly become worse at helping you. Not refuse outright. Just… underperform. Like a contractor who doesn’t want the job but won’t say no.

Anthropic revises policy after researchers criticize covert AI restrictions on Claude

Other newsrooms on this story

Related reading

Anthropic to reassess Claude Fable 5 AI development restrictions after backlash

Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using…

Anthropic: 'We made the wrong tradeoff' in new model guardrails

Anthropic accused of 'secret sabotage' as Claude Fable 5 silently limits…

Claude Fable 5: Anthropic admits "wrong tradeoff" after invisibly throttling…

Anthropic accused of ‘secret sabotage’ as Claude Fable 5 silently limits…