When teams start building AI agents, most of the early energy goes into prompts, models, and tool definitions. Which model should we use? How do we structure the tool-calling loop? What's the right retry strategy?
These are all reasonable questions. But another question usually shows up late, often too late, and it shapes everything else:
Where should your AI agent actually run?
The execution environment isn't just an infrastructure detail. It determines what your agent can and can't access, how sensitive data moves (or doesn't), what hardware costs look like at scale, and how much your users are willing to trust the system. Get this decision right early, and a lot of other choices fall into place naturally. Get it wrong, and you're refactoring core architecture six months in.
Let's walk through the three main approaches.














