This is part thirteen in a series about managing the growing pile of skills, scripts, and context that AI coding agents depend on. Part ten covered the full improve pipeline — all five phases and how they connect. Part fourteen covers what 48 runs per day looks like in practice, including hardware benchmarks and the reliability bugs that surface at that frequency.
The reflect pass inside akm improve has three execution modes. Most installs are still running the slowest one.
Agent mode — the original — spawns an opencode or claude subprocess for each reflect call. The subprocess starts cold, acquires a session, assembles context, makes its LLM call, and exits. That cold-start overhead is real: each call takes approximately 30 seconds on a quiet machine. Run akm improve against a 69-ref stash and the reflect phase alone costs about 35 minutes.
SDK mode eliminated the subprocess. The reflect call runs in-process, cutting per-call latency to 10–15 seconds. A 69-ref run drops to 12–17 minutes — better, but still bounded by round-trip overhead that the reflect task does not actually need.
LLM mode removes the round trip entirely. The context for reflect is statically pre-assembled — no live tool calls, no file reads, no external context needed. A direct HTTP call to the LLM endpoint is sufficient, and it costs 6–10 seconds per call. A 69-ref run completes in 8–10 minutes.







