The problem: I loved AI summaries until I got the bill

Last month I was working on a SaaS product that needed to summarize long articles for users. Think of it like a TL;DR generator. I built a first prototype using GPT-4 with a straightforward prompt: "Summarize this article in 3 bullet points." It worked beautifully. The summaries were crisp, accurate, and users loved them.

Then the API bills arrived. One month of moderate usage cost me over $1,200. That's not sustainable for a side project. I had to fix it or kill the feature.

What I tried that didn't work

First, I tried switching to GPT-3.5-turbo. The price dropped dramatically, but the quality tanked. Summaries became vague, sometimes missing key points. I tried prompt engineering — adding "be specific" or "include numbers" — but nothing reliably matched GPT-4's output.