Honestly, i Cut My LLM Bill 40x: A Backend Engineer's Migration Notes
Last month I opened my OpenAI invoice and did a double-take. Not because the number was huge, but because I'd been too lazy to optimize it. My little side project — a RAG pipeline that scrapes docs and answers questions — was chewing through GPT-4o at $10.00 per million output tokens. That's not insane if you're shipping enterprise software. It's pretty dumb if you're running a hobby app on a Hetzner box in your basement.
So I did what any self-respecting backend engineer does: I opened a spreadsheet, ran the numbers, and migrated the whole stack in an afternoon. Here's what I learned, what broke, and what I'd do differently next time.
Fwiw, this isn't a "10x AI productivity" LinkedIn post. It's notes from someone who actually has to pay the bill.
The Math That Made Me Move






