TL;DR

Anthropic disclosed a 31.5% pre-safeguard hijack rate for the Opus 4.8 browser agent; with safeguards active, the rate drops to about 1%. For AI-integrated crypto platforms and agents that parse untrusted content, this vulnerability metric, along with the transparency gap relative to competitors, should shape security decisions.

Nearly one in three attempts to hijack Anthropic’s newest AI browser agent succeeded before safeguards kicked in. That is not a rumor from a red-team Slack channel. It is a number Anthropic printed in its own system card.

The company released the Claude Opus 4.8 system card on May 28. The document spans 244 pages and covers four agentic surfaces. The pre-safeguard hijack rate for the browser agent clocked in at 31.5%. To put that in plain terms: if a malicious actor pointed a prompt injection attack at the model while it was browsing the web, the attack worked roughly a third of the time, assuming no defensive layers were active.
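To give a rough sense of what those rates mean when an attacker can try more than once, here is a minimal Python sketch. It assumes the reported figure is a per-attempt success rate and that attempts are independent; the article confirms neither, so treat the output as illustration rather than a restatement of the system card. The function name and the attempt counts are hypothetical.

```python
def at_least_one_success(rate: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds.

    Assumption: every attempt succeeds with the same probability `rate`
    and attempts do not influence each other.
    """
    return 1.0 - (1.0 - rate) ** attempts


PRE_SAFEGUARD = 0.315   # rate reported in the article, no defenses active
POST_SAFEGUARD = 0.01   # the article's approximate post-safeguard figure

if __name__ == "__main__":
    # Attempt counts are arbitrary, chosen only for illustration.
    for n in (1, 5, 10):
        pre = at_least_one_success(PRE_SAFEGUARD, n)
        post = at_least_one_success(POST_SAFEGUARD, n)
        print(f"{n:>2} attempts: pre-safeguard {pre:.1%}, post-safeguard {post:.1%}")
```

Under these assumptions, even a low per-attempt rate compounds over repeated attempts, which is one reason the post-safeguard figure still matters for agents that routinely parse untrusted content.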

The transparency gap across frontier labs

Here’s the thing. That 31.5% figure looks bad in isolation. But Anthropic is the only frontier lab that actually gave security professionals a concrete number to work with this spring.

OpenAI published a prompt injection disclosure that covered only one surface: connectors. Google moved the entire subject out of its model card and into a broader safety framework document, effectively diluting the specificity. Meta shipped no closed-model card at all.

cryptobriefing.com

Anthropic reveals 31.5% hijack rate for Opus 4.8 browser agent before safeguards

Anthropic's Opus 4.8 system card reveals a 31.5% browser agent hijack rate before safeguards. Here's why the transparency gap across AI labs matters for crypto.

Monday, June 1, 2026


