On Wednesday, the AISI, which evaluates AI models for the British government, said both Anthropic’s Claude Mythos Preview and OpenAI’s GPT-5.5 showed progress well above previous trends on cybersecurity testing. Separately, XBOW released data suggesting “frontier models have taken a major step forward in vulnerability discovery.” Meanwhile, Microsoft said its multi-model agentic setup, MDASH, was used to discover 16 CVEs in this week’s Patch Tuesday updates and is the leader on the CyberGym security evaluation framework. [Image: https://platform.theverge.com/wp-content/uploads/sites/2/2026/05/AISI-figure-1.png?quality=90&strip=all]