Last Tuesday my API gateway caught a NullReferenceException, streamed it to a dashboard in real time, and pushed a draft code fix to the on-call engineer's browser tab, all before I finished reading the error myself. That sentence used to be vendor marketing. Now it's just my Program.cs.
This is the architecture post-mortem. I built it on weekends. It runs in Docker. It cost me exactly $0 in LLM credits during development because Groq's free tier is generous and Ollama works as a swap-in. The repo is here — issues and PRs welcome.
The problem most .NET teams have
Production errors are caught, logged to a file, and forgotten. Engineers find out from a Slack ping twenty minutes later, if at all. By the time someone looks, the original request context is gone, the user's session has expired, and the stack trace is buried four layers deep in System.* calls.
"Self-healing" is a word vendors use to mean "auto-restart the pod." I wanted something better. The actual ask:






