Knowledge that isn't pruned starts to mislead you
Part of the ForgeFlow series — building a coding agent that runs its execution loop locally on an M5 Max, and writing down what actually breaks. Planning runs through a separate planning step; code generation runs on a local model via Ollama, test-driven inside a Docker sandbox.
For months, my agent got better by accumulating rules.
Every time a project failed in a way I understood, I'd write the lesson down as a rule and feed it back in — don't do this, always do that — so the agent wouldn't repeat the mistake. To be clear about what "rules" means here: not fine-tuning, not weights. A plain, human-readable rulebook the system injects into the agent's context, built up from past failures. For a long time, adding to it worked. Autonomy went up. The same mistakes stopped coming back.
Then the agent started getting worse, and the rulebook was the reason.






