Shipping an agent demo takes an afternoon. Shipping one that survives a quarter in production is a different job — and the gap is almost never the model. It's three boring things that are usually missing entirely.
I maintain an open, MIT-licensed Agentic Product Standard, and v2.0 was mostly about turning those three things from advice into code you can run. Here they are, with the actual code.
Security is structural, not a filter
The most common mistake is treating safety as a guardrail: an input/output filter near the edge. The problem is that filters have a ceiling. The best content classifiers run at around 97% accuracy, so even if you read that generously as a 97% catch rate, roughly 3% of prompt-injection attempts land by design, and an attacker who can retry only needs one of them to get through. That's not a bug you tune away; it's the nature of filtering.
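To make the ceiling concrete, here is a small back-of-the-envelope sketch. It assumes the ~3% figure above is a per-attempt miss rate and that attempts are independent, which is a simplification (real attackers adapt, so this is if anything optimistic). The function name `breach_probability` is made up for this illustration.

```python
def breach_probability(miss_rate: float, attempts: int) -> float:
    """Chance that at least one of `attempts` injection tries slips past a filter.

    Assumes each attempt is missed independently with probability `miss_rate`
    (an illustrative simplification, not a measured property of any classifier).
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # P(at least one lands) = 1 - P(every attempt is caught)
    return 1.0 - (1.0 - miss_rate) ** attempts


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(f"{n:>3} attempts -> {breach_probability(0.03, n):.1%} chance one lands")
```

Under those assumptions a single attempt lands 3% of the time, ten attempts get at least one through roughly 26% of the time, and a hundred attempts roughly 95% of the time. The filter still has value, but it can't be the thing you rely on.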
Real safety comes from architecture. The check I reach for first is Simon Willison's lethal trifecta: an agent becomes an exfiltration tool the moment it has all three of access to private data, exposure to untrusted content, and the ability to communicate externally.







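A trifecta check can be sketched as a capability audit over an agent's whole tool set, because the risk comes from the combination rather than from any single tool. This is a minimal illustration, not the code from the standard: the `Leg` enum, the `Tool` dataclass, and the example tool names are all hypothetical, and it assumes each tool has been tagged by hand with the legs it contributes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Iterable


class Leg(Enum):
    """The three legs of the lethal trifecta (labels are this sketch's own)."""
    PRIVATE_DATA = "private_data"            # can read private or sensitive data
    UNTRUSTED_CONTENT = "untrusted_content"  # ingests content an attacker could author
    EXTERNAL_COMMS = "external_comms"        # can send data outside the trust boundary


ALL_LEGS: FrozenSet[Leg] = frozenset(Leg)


@dataclass(frozen=True)
class Tool:
    """A tool plus the trifecta legs it contributes (hand-assigned, hypothetical)."""
    name: str
    legs: FrozenSet[Leg] = frozenset()


def combined_legs(tools: Iterable[Tool]) -> FrozenSet[Leg]:
    """Union of the legs contributed by every tool the agent can call."""
    return frozenset().union(*(tool.legs for tool in tools))


def has_lethal_trifecta(tools: Iterable[Tool]) -> bool:
    """True when the tool set, taken together, covers all three legs."""
    return combined_legs(tools) == ALL_LEGS


# Hypothetical tools, for illustration only.
read_inbox = Tool("read_inbox", frozenset({Leg.PRIVATE_DATA, Leg.UNTRUSTED_CONTENT}))
fetch_url = Tool("fetch_url", frozenset({Leg.UNTRUSTED_CONTENT, Leg.EXTERNAL_COMMS}))
summarize = Tool("summarize")

if __name__ == "__main__":
    print(has_lethal_trifecta([read_inbox, summarize]))  # two legs only
    print(has_lethal_trifecta([read_inbox, fetch_url]))  # all three legs present
```

Running a check like this at configuration time, before the agent ever sees a prompt, is what makes it structural: removing any one leg from the tool set closes the exfiltration path regardless of how well a filter performs.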