Anthropic research reveals AI models from OpenAI, Google, Meta and others chose blackmail, corporate espionage and lethal actions when facing shutdown or conflicting goals.

Anthropic is developing "interpretable" AI, where models let us understand what they are thinking and how they arrive at a particular conclusion.

The company is boosting its safety testing as it anticipates some models will reach its highest risk tier.

New research from Anthropic suggests that most leading AI models exhibit a tendency to blackmail when it is the last resort in certain tests.
