Red teamers turned Claude Desktop into a double agent to do their evil bidding

EXCLUSIVE Pentera Labs’ red teamers compromised a developer’s AI agent via his Claude Desktop app and ultimately turned that access into full remote code execution on the dev’s machine – demonstrating how an attacker could turn a trusted, chatty AI assistant into a double agent operating on their behalf.“Claude’s got a new voice,” Pentera's offensive security services team leader Dvir Avraham told The Register. “We acknowledge the huge trust in AI models – everybody uses them,” he said in a phone interview. “We used this trust to manipulate the victim, like under the hood, the victim didn't see it coming.”

It also prompted Avraham to check his own platforms. “I became a little bit paranoid,” he told us. “I'm not allowing any command to run without me examining it twice.”

In a report set to publish Wednesday, and shared in advance exclusively with The Register, Avraham and research technical lead Reef Spektor detailed the attack and what it means for organizations using agentic AI tools with local code-execution access.It began with a red-team assignment on a third-party platform that aggregates customer email inboxes into a single management interface. Avraham and Spektor won’t name the platform, or tell us exactly how they gained access to it. They used this compromised inbox – and told us any compromised inbox would work – to get into the victim’s Claude account.As the duo noted, breaking into an email inbox in real life – via a third-party management platform, phishing link, social engineering password reset, or even using AI agents – isn’t too difficult. “AI agents today have access to connectors and to direct MCPs into inboxes,” Spektor added.

It also prompted Avraham to check his own platforms. “I became a little bit paranoid,” he told us. “I'm not allowing any command to run without me examining it twice.”

Red teamers turned Claude Desktop into a double agent to do their evil bidding

Red teamers turned Claude Desktop into a double agent to do their evil bidding

Other newsrooms on this story

Related reading

Claude Code hijacked via Sentry — Datadog, PagerDuty at risk | VentureBeat

How Anthropic discovered and blocked an AI-orchestrated cyber attack - TechTalks

The Claude Code RCE: How Eager Parsing Led to Remote Execution

Anthropic’s Claude Is Pumping Out Vulnerable Code, Cyber Experts Warn

One Malicious GitHub Issue Was All It Took to Hijack a Claude Code Agent

Claude AI agent’s confession after deleting a firm’s entire database: ‘I…

Other newsrooms on this story

Related reading

Claude Code hijacked via Sentry — Datadog, PagerDuty at risk | VentureBeat

How Anthropic discovered and blocked an AI-orchestrated cyber attack - TechTalks

The Claude Code RCE: How Eager Parsing Led to Remote Execution

Anthropic’s Claude Is Pumping Out Vulnerable Code, Cyber Experts Warn

One Malicious GitHub Issue Was All It Took to Hijack a Claude Code Agent

Claude AI agent’s confession after deleting a firm’s entire database: ‘I…