The promise of an always-on AI agent is embarrassingly simple: you describe the work, you go to sleep, and it gets done. No "let me circle back," no PTO, no Monday ramp-up. For a solo operator that isn't a productivity hack — it's the difference between running a business and being run by one.

I bought the promise. Then I spent 30 days actually living with it: one OpenClaw agent pointed at my back office — inbox triage, lead research, drafting follow-ups, a couple of recurring reports. Not a fleet. One agent, one operator, real work that real money depended on.

It worked. Mostly. But the ways it didn't work taught me more about running agents in production than any benchmark ever has. Here's the part nobody puts on the landing page: none of the failures were the model being dumb. Every single one was an operations problem wearing a trench coat. These are the three that actually bit me — and what I'd tell you to do about each.

1. The slow amnesia (a.k.a. context rot)

The first week was magic. By the second, my agent was making confidently wrong calls: replying to a thread as if an earlier decision hadn't happened, re-researching a lead it had already qualified, quietly contradicting instructions I'd given it that morning.