There's a quiet assumption baked into a lot of agent code: that a bigger context window means a better memory. Vendors ship 200K-token, then 1M-token, then 2M-token windows, and the implied promise is "just put everything in and the model will remember." After building agents that run for weeks, I've come to think this conflates two things that are not the same, and that treating them as the same is exactly why long-running agents get dumber over time.
The context window is working memory. Real memory is what survives when the window is gone. Mixing them up is like confusing your desk with your filing cabinet.
Two different clocks
Working memory (the context window) lives for one session, maybe one turn. It's fast, expensive, and volatile. It's where reasoning happens right now.
Durable memory lives across sessions. It's slow, cheap, and persistent. It's what the agent knows when it wakes up tomorrow with an empty window.
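To make the two clocks concrete, here is a minimal Python sketch of the split. Everything in it is an illustrative assumption rather than a real framework: the class names `WorkingMemory` and `DurableMemory`, the `run_session` helper, the 8-token budget, the one-word-per-token count, and the JSON file standing in for a proper store.

```python
import json
import os
import tempfile
from collections import deque


class WorkingMemory:
    """Stand-in for the context window: bounded, and gone when the session ends."""

    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.messages = deque()

    def _tokens(self, text):
        # Crude assumption: one token per whitespace-separated word.
        return len(text.split())

    def add(self, text):
        self.messages.append(text)
        # Evict the oldest messages while over budget, keeping at least one.
        while (
            sum(self._tokens(m) for m in self.messages) > self.max_tokens
            and len(self.messages) > 1
        ):
            self.messages.popleft()


class DurableMemory:
    """Stand-in for the filing cabinet: a JSON file that outlives any session."""

    def __init__(self, path):
        self.path = path

    def load(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path, encoding="utf-8") as f:
            return json.load(f)

    def remember(self, key, value):
        facts = self.load()
        facts[key] = value
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(facts, f)


def run_session(store, notes, fact=None):
    """Run one session: a fresh window every time, the same store every time."""
    window = WorkingMemory(max_tokens=8)  # hypothetical, deliberately tiny budget
    for note in notes:
        window.add(note)
    if fact is not None:
        key, value = fact
        store.remember(key, value)
    return list(window.messages)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "facts.json")
    store = DurableMemory(path)
    today = run_session(
        store,
        ["user prefers metric units", "the deploy target is staging not prod"],
        fact=("units", "metric"),
    )
    tomorrow = run_session(store, [])
    print(today)         # what was left in the window at the end of session one
    print(tomorrow)      # the window the agent wakes up with
    print(store.load())  # what it still knows
```

The point of the sketch is the asymmetry: each call to `run_session` builds a new window, so anything that only lived there is lost, while whatever was explicitly written to the store is still readable in the next session. A larger `max_tokens` delays eviction within a session but does nothing for the session after it.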






