In a previous post on the agent execution loop, I showed how agents work: a while loop that calls a model, executes tools, and iterates until done. That loop is the engine or harness. But engines need control routines that govern behaviour.

When you deploy an agent to production, new questions emerge:

- How do I log every model call and tool execution?
- How do I block malicious prompts before they reach the LLM?
- How do I rate limit users to control costs?
- How do I redact PII from outputs before they reach users or broader telemetry systems?
- How do I create audit trails for compliance?

One way to do this is middleware - routines that intercept agent operations at key points in the execution loop (e.g., before/after a model call, before/after tool calls etc.). If you’ve built web applications with Express, Django, or FastAPI, you know this pattern. And it works just as well for agents.

Note: This post is adapted from my book Designing Multi-Agent Systems, where Chapter 4 walks you through building a complete Agent class from scratch. The agent class is part of PicoAgents: a minimal, hackable multi-agent framework which the reader gets to build across the sections of the book. While we will be using the picoagents sample in this post, the same patterns apply across frameworks like LangChain, Microsoft Agent Framework, and others (see examples at the end of the post).

- Digital PDF: buy.multiagentbook.com
- Print on Amazon: amazon.com/dp/B0G2BCQQJY
- Middleware source: picoagents/_middleware.py

Middleware intercepts agent operations before and after they execute. When an agent prepares to call the LLM or execute a tool, that operation first passes through a middleware chain. Each middleware can:

- Inspect the operation (inputs, context, metadata)
- Modify inputs or outputs
- Block the operation entirely by raising an exception
- Log what happened for observability

Here’s the mental model:

User Task