TL;DR

I spent 6 months building a self-improving coding agent on top of Claude Code — an orchestrator that hands work to sub-agents, persists its own state, and rewrites its own prompts when it gets things wrong. Here are 5 lessons I wish someone had told me on day one, with the war stories that taught me each one.

The Problem

I kept hitting the same wall with single-shot LLM coding sessions.

The model would do great work for 30 minutes, then forget a decision it made 10 minutes earlier. It would happily "fix" the same bug I'd asked it to leave alone. It would invent a function name, call it three times, and only notice it didn't exist when the test runner blew up. Re-prompting helped — for one turn. Then we drifted again.