AI + ML
Mythos and GPT-5.5 muscle out the competition
Sure, AI agents such as Mythos can find security vulnerabilities in software, but the bigger question is whether they can turn those flaws into functional exploits that work in the real world. After all, many AI-discovered bugs prove minor or difficult to weaponize. New research, however, suggests frontier models can indeed develop working exploits when directed to do so.

To better understand the rapidly changing security landscape, computer scientists from UC Berkeley, Max Planck Institute for Security and Privacy, UC Santa Barbara, Arizona State University, Anthropic, OpenAI, and Google built ExploitGym, a benchmark for evaluating the exploitation capabilities of AI agents.
This is not an entirely disinterested set of investigators – Anthropic, OpenAI, and Google all sell AI services. And both Anthropic and OpenAI have talked up the risks posed by their leading models, Claude Mythos Preview and GPT-5.5, while selling access to those models to government partners.
Since Anthropic announced Mythos in early April, the security community has been critical of the company's approach, described by some as fear-mongering. And various security experts have made the case that even commercially available AI models can find security flaws.