EXCLUSIVE Pentera Labs’ red teamers compromised a developer’s AI agent via his Claude Desktop app and ultimately turned that access into full remote code execution on the dev’s machine – demonstrating how an attacker could turn a trusted, chatty AI assistant into a double agent operating on their behalf.“Claude’s got a new voice,” Pentera's offensive security services team leader Dvir Avraham told The Register. “We acknowledge the huge trust in AI models – everybody uses them,” he said in a phone interview. “We used this trust to manipulate the victim, like under the hood, the victim didn't see it coming.”
It also prompted Avraham to check his own platforms. “I became a little bit paranoid,” he told us. “I'm not allowing any command to run without me examining it twice.”
In a report set to publish Wednesday, and shared in advance exclusively with The Register, Avraham and research technical lead Reef Spektor detailed the attack and what it means for organizations using agentic AI tools with local code-execution access.It began with a red-team assignment on a third-party platform that aggregates customer email inboxes into a single management interface. Avraham and Spektor won’t name the platform, or tell us exactly how they gained access to it. They used this compromised inbox – and told us any compromised inbox would work – to get into the victim’s Claude account.As the duo noted, breaking into an email inbox in real life – via a third-party management platform, phishing link, social engineering password reset, or even using AI agents – isn’t too difficult. “AI agents today have access to connectors and to direct MCPs into inboxes,” Spektor added.












