Beyond Regex: Building Detection Rules for AI Agent Vulnerabilities

When I started building AgentGuard, the first question was: how do you detect a prompt injection vulnerability in source code?

Unlike traditional vulnerabilities (SQL injection, XSS), prompt injection doesn't have a single signature. It's a pattern of untrusted data flowing into LLM context. The vulnerability isn't in a function call -- it's in how data is constructed.

The Regex Foundation

Every SAST tool starts with pattern matching. AgentGuard's first layer is regex-based rules: