Storia in 1 fonti

METR says it can barely measure Claude Mythos, Palo Alto Networks warns of autonomous AI attackers

METR can barely measure Claude Mythos Preview with its current test suite. Only five out of 228 tasks cover the relevant capability range. Meanwhile, Palo Alto Networks reports that frontier models autonomously chain vulnerabilities, shrinking the time from initial access to data exfiltration to just 25 minutes. Evaluation methods are growing more slowly than the models themselves, and that may be the bigger problem.

Raccontata da

the-decoder.com

Timeline cronologica

domenica 10 maggio 2026·the-decoder.com
METR says it can barely measure Claude Mythos, Palo Alto Networks warns of autonomous AI attackers
METR can barely measure Claude Mythos Preview with its current test suite. Only five out of 228 tasks cover the relevant capability range. Meanwhile, Palo Alto Networks reports…