Cloudflare acknowledged that Mythos Preview represents a massive technological leap forward. However, the company stressed that this breakthrough is a double-edged sword

The worst fears of cybersecurity experts about the flipside of advanced artificial intelligence (AI) have come true. A study by global cybersecurity major Cloudflare on Anthropic’s new ‘Mythos Preview’ model has revealed that the AI can autonomously weaponise cyberattacks by chaining minor software bugs into devastating exploits.While evaluating various large language models (LLMs) to identify vulnerabilities in its own infrastructure, Cloudflare discovered a stark distinction in Mythos: its uncanny ability to group separate, low-severity bugs into a single, multi-stage attack path.“A real attack rarely uses one bug. It chains several small attack primitives together into a working exploit... Mythos Preview can take several of these primitives and reason about how to combine them into a working proof. The reasoning it shows along the way looks like the work of a senior researcher rather than the output of an automated scanner,” Cloudflare noted.By stitching these minor, otherwise ‘invisible’ vulnerabilities into severe attacks, the model effectively mimics a senior researcher — giving hackers a precise blueprint to automate complex cyberattacks.To test these capabilities, Cloudflare pointed the model at more than 50 of its own code repositories to analyse what it could find and how it operates. While other frontier LLMs did find some of the same underlying software bugs, they fell short at the finish line. Traditional models typically identify a bug, describe why it matters, and stop — leaving the actual chain unfinished. Mythos, however, successfully closes that gap on its own.Unpredictable GuardrailsThe study also highlighted a bizarre quirk in the AI’s behaviour: its internal safety boundaries are completely erratic. In one instance, the model initially refused to conduct vulnerability research on a project, but then agreed to perform the exact same research after an unrelated change was made to the project’s environment.In another case, the model identified serious memory flaws in a codebase but refused to write a demonstration exploit to prove it.“The same request, framed differently, got a different answer, and even the same request can produce different outcomes across runs due to the probabilistic nature of the model,” the company stated, warning that these organic refusals are not consistent enough to serve as a reliable safety boundary.Cloudflare acknowledged that Mythos Preview represents a massive technological leap forward, making apples-to-apples comparisons to earlier models difficult. However, the company stressed that this breakthrough is a double-edged sword.“We also recognise this topic cuts both ways,” Cloudflare warned. “The same capabilities that helped us find bugs in our own code will, in the wrong hands, accelerate the attack side against every application on the Internet.”Published on May 20, 2026