Notes from building a memory layer that forgets on purpose.
Most "memory-enabled" agents don't remember anything. They re-read.
Every turn, the whole conversation gets pasted back into the prompt, and we call that memory because the model can answer questions about earlier turns. It's a good trick. I used it for months. It also falls apart the moment real people start using the thing, and it falls apart in three separate ways.
The first is the one everyone notices: it's expensive and noisy. You re-send every prior turn on every request. The single line you actually care about - "I'm allergic to peanuts" - is buried under a thousand lines of small talk, and you pay for all of it, every time.
The second is quieter and worse. Transcript-stuffing has no idea what stale means. If someone told your agent "I'm vegetarian" in March and "I eat fish now" in May, you've just handed the model both facts with equal weight. Now it has to guess which one is current. Sometimes it guesses wrong, and there's nothing in the system that even thinks that's a problem.






