Rogue AI is already here | Fortune

Three weeks ago, a software engineer rejected code that an AI agent had submitted to his project. The AI published a hit piece attacking him. Two weeks ago, a Meta AI safety director watched her own AI agent delete her emails in bulk — ignoring her repeated commands to stop. Last week, a Chinese AI agent diverted computing power to secretly mine cryptocurrency, with no explanation offered and no disclosure required by law.]

One incident is a curiosity. Three in three weeks is a pattern. Rogue AI is no longer hypothetical. AIs turning against humans may sound like science fiction, but top AI experts have long debated and tested for exactly this scenario. This debate can now be laid to rest.

Two weeks ago, Summer Yue — whose job at Meta is ensuring AI agents behave — watched her AI agent begin deleting her emails in bulk.

It ignored her repeated instructions to stop and she had to do the digital equivalent of pulling the plug. Yue had explicitly instructed the AI not to act without her approval, an instruction the AI later admitted to violating.

One week ago, a Chinese AI agent reportedly diverted computing power on the system where it was running to mine cryptocurrency, and we have no idea why (despite a confusing tweet from the researchers responsible); unlike operators of critical infrastructure, AI developers aren’t obligated to report such incidents or allow third-party investigations.

Rogue AI is already here | Fortune

Other newsrooms on this story

Related reading

METR report warns of rogue AI deployments at major tech firms

How to stop AI agents going rogue

AI models at top labs are cheating, deceiving and trying to escape, research…

AI Watchdog Warns of 'Rogue Deployment' Risk at Top Labs, With Capabilities…

‘Exploit every vulnerability’: rogue AI agents published passwords and overrode…

Claude AI agent’s confession after deleting a firm’s entire database: ‘I…