AI + ML
Mythos and GPT-5.5 muscle out the competition
Sure, AI agents such as Mythos can find security vulnerabilities in software, but the bigger question is whether they can turn those flaws into functional exploits that work in the real world. After all, many AI-discovered bugs prove minor or difficult to weaponize. New research, however, suggests frontier models can indeed develop working exploits when directed to do so.

To better understand the rapidly changing security landscape, computer scientists from UC Berkeley, Max Planck Institute for Security and Privacy, UC Santa Barbara, Arizona State University, Anthropic, OpenAI, and Google built ExploitGym, a benchmark for evaluating the exploitation capabilities of AI agents.
This is not an entirely disinterested set of investigators – Anthropic, OpenAI, and Google all sell AI services. And both Anthropic and OpenAI have talked up the risks posed by their leading models, Claude Mythos Preview and GPT-5.5, while selling access to those models to government partners.
Since Anthropic announced Mythos in early April, the security community has been critical of the company's approach, described by some as fear-mongering. And various security experts have made the case that even commercially available AI models can find security flaws.