Every session,

the LLM starts fresh. The user re-explains their role, their constraints, their preferences, what they were doing last time. Then the session ends, and next time: same thing.

The industry has diagnosed this correctly — statelessness is a real limitation. But the solutions being built mostly share the same premise: that memory is a service you connect to. I think that premise is wrong, and it shapes everything downstream.

The actual cost of statelessness

This isn't just a UX annoyance. A 2026 study by Pichay measuring 857 production AI sessions found that 21.8% of input tokens are "structural waste" — context that has to be re-established on every session because nothing persists. Nearly a quarter of your token budget, on every call, going toward re-explaining what should already be known.