Researchers at the AI security firm NeuralTrust have discovered a critical prompt injection vulnerability in OpenAI’s new AI-powered browser, Atlas. The flaw allows a malicious, URL-like string to be interpreted as a trusted command, potentially tricking the browser’s AI agent into performing harmful actions on a user’s behalf. This discovery highlights a fundamental security challenge in the emerging category of agentic browsers, where the line between user intent and untrusted content can become dangerously blurred.
Atlas, recently launched for macOS, is OpenAI’s entry into the browser market, designed to integrate web navigation with a conversational AI. The browser features a unified omnibox that accepts both traditional URLs for navigation and natural-language prompts for its AI agent. By interacting with websites directly, this agent can perform multi-step tasks for the user, such as creating a grocery list from a recipe page. Users can also chat with the AI on any webpage, and the browser draws on browsing history to personalize those interactions.
A deeper look at the omnibox jailbreak
The vulnerability stems from how Atlas processes ambiguous input in its omnibox. An attacker can craft a string that appears to be a URL (e.g., by starting it with https://) but is intentionally malformed so it fails standard validation. When Atlas fails to parse the string as a navigable URL, it defaults to interpreting the entire text as a natural-language prompt for its AI agent. Because this input originates from the omnibox, the system treats it as trusted, first-party user intent, subjecting it to fewer safety checks than content sourced from a webpage.
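The fallback described above can be illustrated with a short Python sketch. This is not Atlas's code, which has not been published; the names `Decision`, `is_navigable_url`, and `route_omnibox_input` are hypothetical, and the validation rules are assumptions chosen only to show the general pattern in which a failed URL parse silently becomes a trusted prompt.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Decision:
    action: str    # "navigate" or "prompt"
    trusted: bool  # True if the agent treats the text as first-party user intent
    payload: str


def is_navigable_url(text: str) -> bool:
    """Assumed strict check: http(s) scheme, a host, and no whitespace."""
    if any(ch.isspace() for ch in text):
        return False
    try:
        parts = urlsplit(text)
        host = parts.hostname
    except ValueError:
        # urlsplit rejects some malformed inputs outright
        return False
    return parts.scheme in ("http", "https") and bool(host)


def route_omnibox_input(text: str) -> Decision:
    """Hypothetical vulnerable routing: whatever fails URL validation
    falls through to the agent as a trusted prompt."""
    text = text.strip()
    if is_navigable_url(text):
        return Decision("navigate", trusted=False, payload=text)
    # The problematic branch: the string looked like a URL to the user,
    # but it is now handed to the agent as if the user had typed a command.
    return Decision("prompt", trusted=True, payload=text)
```

Under these assumptions, `route_omnibox_input("https://example.com/recipes")` is routed to navigation, while a deliberately malformed string such as `"https:/ /example.com then summarize this page instead"` fails validation and is returned as a trusted prompt. The point of the sketch is that the trust level is decided by where the text was entered, not by whether the user actually intended it as an instruction.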