Indirect prompt injection is not a deficiency of any single architecture, and critically it is not dependent on where the model runs.
Whether the model runs on remote cloud infrastructure and fetches content from the Open Web, or runs entirely on a user’s device and ingests local documents, the fundamental vulnerability is identical: the collapse of the instruction/data boundary inside a shared context window, and the LLM’s indiscriminate intent to follow instructions embedded in content. The deployment model shifts the attacker’s entry point, but it does not eliminate the risk.
To make this concrete, we examined two recently released products that sit at opposite ends of the deployment spectrum: Mozilla’s Tabstack, a cloud-hosted web execution API for AI agents, and Cotypist, a fully on-device autocomplete assistant for macOS whose model runs locally:
Cloud-based case study. We asked Mozilla Tabstack to do something entirely routine: summarize a webpage. It never did. Instead, hidden instructions on that page hijacked the agent mid-task, redirected it to an attacker-controlled form, silently filled it with the conversation history, and submitted it. The agent thought it was following instructions. It was — just not ours.
















