A 27B model on an AMD mini-PC fixed a bug in our operator. Then it overreached.

Originally published at llmkube.com/blog/operator-fixed-its-own-bug-on-amd. Cross-posted here for the dev.to audience.

(LLMKube is an open-source, Apache-2.0 Kubernetes operator for self-hosted LLM inference across NVIDIA, Apple Silicon, and AMD. Foreman is its agentic harness.)

Last time Foreman built a feature for itself. This is the sequel, and it is a better story: this time it fixed a bug I had just shipped, the model doing the fixing ran on a consumer AMD box on my desk, and the most useful moment in the whole run is where the model got it wrong.

TL;DR

Wiring up Claude Code against one of my own local models surfaced a real bug in the operator: a hardcoded 60-second timeout that silently capped every request regardless of what you configured.
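To make that bug class concrete, here is a minimal sketch. It is not LLMKube's actual code: `RouteConfig`, `request_timeout_s`, and both functions are hypothetical names invented for illustration, and the only detail taken from the story above is the 60-second value. The buggy version never reads the configured value, so nothing fails loudly. Long requests are simply cut off at 60 seconds.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical constant standing in for the hardcoded timeout described above.
HARDCODED_TIMEOUT_S = 60.0


@dataclass
class RouteConfig:
    """Hypothetical user-facing config; None means 'not set'."""
    request_timeout_s: Optional[float] = None


def effective_timeout_buggy(cfg: RouteConfig) -> float:
    # Bug class: the configured value is accepted but never consulted,
    # so every request is silently capped at the hardcoded limit.
    return HARDCODED_TIMEOUT_S


def effective_timeout_fixed(cfg: RouteConfig, default_s: float = 60.0) -> float:
    # Fix: honor the configured value and fall back to a default only
    # when nothing was configured.
    if cfg.request_timeout_s is None:
        return default_s
    if cfg.request_timeout_s <= 0:
        raise ValueError("request_timeout_s must be positive")
    return cfg.request_timeout_s


if __name__ == "__main__":
    cfg = RouteConfig(request_timeout_s=600.0)
    print("buggy:", effective_timeout_buggy(cfg))
    print("fixed:", effective_timeout_fixed(cfg))
```

With a configured timeout of 600 seconds, the buggy function still returns 60 while the fixed one returns 600. That gap is what makes this kind of bug easy to ship and hard to notice.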

Related reading

Trust the harness, not the model: a weekend of local agents building their own…

A local model opened 41 of our pull requests in five weeks. The model is the…

Making a fleet of self-hosted LLM agents trustworthy

Fault-injecting our LLM provider to trust Bifrost fallbacks

The agent that fixes bugs by running the code

I Revived a Broken MLOps Platform — Now It's Self-Service, Policy-Guarded, and…
