Memory turns assistants from reactive to persistent, but it is also where many systems quietly rot. Surveys argue the short-term versus long-term split is no longer enough for modern agent memory; OpenAI and LangGraph SDKs point to a simpler stack — working memory, durable state, and retrieval.
Assistants need working memory for the current run, durable state for stable facts and preferences, and retrieval memory for relevant supporting context. My slightly opinionated view is that structured state is underused, vector retrieval is overused, and most memory failures come from promotion and injection policy rather than storage choice.
The other important point is that memory does not automatically fix long context. LoCoMo shows that very long-term conversational recall remains hard, and "Lost in the Middle" shows that simply throwing more tokens at the model can degrade performance when relevant information lands in the middle of the prompt. Good memory systems are selective, layered, and explicit about precedence.
This guide sits in the AI Systems Memory hub as the cross-framework map for the memory layer inside AI Assistant Architecture.
How to think about assistant memory






