I spent a couple of weeks asking people a pretty basic question. If you are actually running agents, past the demo, in something resembling production, how do you handle memory?
I was expecting a handful of tips. What I got instead was the same frustration over and over, and a problem that, as far as I can tell, nobody has cleanly solved yet. So I am writing it down, because if you build with agents you are going to run straight into it.
The thing everyone starts with
Most agent memory works the same way. Embed everything the agent has seen, store the vectors, and when a new task shows up, pull back whatever is closest and drop it into context.
That is fine right up until it isn't. The catch is that "closest in vector space" really means "sounds related," and sounding related is not the same as having worked last time.
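To make that gap concrete, here is a minimal sketch of the embed, store, and retrieve loop. Everything in it is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, and the `Memory` class, its `worked` flag, and the sample entries are hypothetical, not anyone's actual system.

```python
import math
from collections import Counter
from dataclasses import dataclass


def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Memory:
    text: str
    worked: bool  # hypothetical outcome label; similarity search never reads it


def retrieve(store: list[Memory], task: str, k: int = 2) -> list[Memory]:
    # Rank stored memories purely by closeness to the new task.
    query = embed(task)
    ranked = sorted(store, key=lambda m: cosine(query, embed(m.text)), reverse=True)
    return ranked[:k]


store = [
    Memory("retry the deploy script with the force flag", worked=False),
    Memory("roll back the release and rerun migrations", worked=True),
    Memory("summarize the weekly sales report", worked=True),
]

hits = retrieve(store, "the deploy script failed, retry the deploy", k=2)
for m in hits:
    print(m.worked, "|", m.text)
```

In this toy store, the top hit is the failed retry, because it shares the most words with the new task. Nothing in `retrieve` looks at `worked`, so the memory that sounds most related wins whether or not it helped.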








