RAG and Long Context Aren't Enough for Agent Memory. δ-mem Is a Third Option


Sunday, May 31, 2026


An 8×8 online state lifted Qwen3-4B from 46.79% to 51.66%, with the backbone untouched.

δ-mem stores an LLM’s conversation history inside an 8×8 matrix and uses it to steer attention.

The backbone stays frozen: the prompt does not grow, and the base model's weights are not fine-tuned. Only the small adapter is trained.
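To make the idea of a fixed-size online state concrete, here is a minimal sketch in plain Python. It assumes a classical delta-rule associative memory, which the name δ-mem hints at but this article does not confirm, so the update rule and every name below (`delta_update`, `read`, `beta`) are illustrative rather than the paper's method. The sketch also leaves out how the state is coupled to attention. It only shows that new information is written into an 8×8 matrix whose size never changes.

```python
# Illustrative only: a classical delta-rule associative memory, NOT the
# paper's confirmed update. All function and variable names are hypothetical.

D = 8  # the state is D x D, matching the 8x8 size quoted in the article


def new_state(d=D):
    """Return a d x d zero matrix as nested lists."""
    return [[0.0] * d for _ in range(d)]


def read(state, key):
    """Compute state @ key: the value currently associated with `key`."""
    return [sum(row[j] * key[j] for j in range(len(key))) for row in state]


def delta_update(state, key, value, beta=1.0):
    """In-place delta rule: S <- S + beta * (value - S @ key) key^T."""
    predicted = read(state, key)
    for i, row in enumerate(state):
        err = value[i] - predicted[i]  # what the state gets wrong for this key
        for j in range(len(key)):
            row[j] += beta * err * key[j]
    return state


def one_hot(i, d=D):
    """Unit-norm key; distinct one-hot keys are orthogonal."""
    return [1.0 if j == i else 0.0 for j in range(d)]


state = new_state()
k1, k2 = one_hot(0), one_hot(3)
v1 = [float(i) for i in range(D)]
v2 = [1.0] * D

delta_update(state, k1, v1)  # write the first association
delta_update(state, k2, v2)  # write a second one; the state stays 8x8

recalled = read(state, k1)   # equals v1, because k1 and k2 are orthogonal
state_size = (len(state), len(state[0]))
```

With orthogonal unit keys and `beta=1.0`, each write is recalled exactly, and writing a new value under an old key overwrites the previous one. Real keys are not orthogonal, so a state this small would have to trade off older associations against newer ones.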

On Qwen3-4B-Instruct, that small matrix lifts the average score across five benchmarks from 46.79% to 51.66%, with 4.87M trainable parameters (0.12% of the model).
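The headline numbers can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above. The variable names are illustrative, and the "implied model size" is a back-of-the-envelope value derived from the rounded 0.12% figure, not a number reported in the article.

```python
# Figures quoted in the article; nothing here comes from outside it.
baseline_pct = 46.79          # average score of Qwen3-4B-Instruct alone
with_memory_pct = 51.66       # average score with the 8x8 state attached
trainable_params = 4.87e6     # trainable parameters
trainable_fraction = 0.0012   # 0.12% of the model

gain_points = with_memory_pct - baseline_pct
relative_gain_pct = 100 * gain_points / baseline_pct
implied_total_params = trainable_params / trainable_fraction

print(f"absolute gain: {gain_points:.2f} points")
print(f"relative gain: {relative_gain_pct:.1f}%")
print(f"implied model size: {implied_total_params / 1e9:.2f}B parameters")
```

The absolute gain is 4.87 percentage points, or about 10.4% relative to the baseline. The implied model size of roughly 4.06B parameters is consistent with a 4B-class backbone. The match between the 4.87-point gain and the 4.87M parameter count is a coincidence of the reported figures.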

The adapter is public on Hugging Face under CC-BY-4.0. The arXiv paper landed on May 12, 2026.
