Originally published on kuryzhev.cloud
We thought setting up a self-hosted Ollama homelab for DevOps assistance would take an afternoon. Three OOM crashes, one exposed API endpoint, and a silent CPU fallback later, here's what actually works — and what we'd do differently from day one.
Context: Why We Wanted a Local LLM for Homelab Automation
Our homelab runs a fairly typical self-hosted DevOps stack: Gitea for source control, Drone CI for pipelines, Proxmox for VM management, and a handful of Docker Compose services scattered across two physical hosts. Day-to-day tasks involve writing Ansible playbooks, debugging systemd units, generating Dockerfiles, and occasionally reverse-engineering someone else's Terraform. All of that involves a lot of back-and-forth with language models.
The problem with cloud-hosted LLMs (ChatGPT, Claude, Gemini) is that using them means pasting real infrastructure configs into a third-party service: hostnames, internal IP ranges, service account names, secrets that slipped into a playbook comment. None of that should leave the network. We also have air-gapped lab segments that can't reach the internet at all, and we wanted the same tooling in both the connected and the air-gapped environments.






