Over three posts I built three fine-tuned models for the same banking-intent task: full fine-tuning on a 270M model, LoRA on a 1.5B model, and QLoRA on a 7B model. They all landed at roughly the same accuracy.
Which raises an honest, slightly uncomfortable question: if a 270M model on my laptop already worked, why reach for a 7B model at all?
The answer most "bigger is better" content skips
For this task, you wouldn't. A good engineer picks the smallest model that clears the bar, not the biggest one available. The small model is cheaper to serve, runs in milliseconds, and is fully yours to own. Choosing the 7B here would be over-engineering.
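To make "smallest model that clears the bar" concrete, here is a minimal sketch of that selection rule. Everything in it is illustrative: the `Candidate` class, the `smallest_that_clears` helper, the 0.88 bar, and especially the accuracy values are placeholders I made up for the example, not the numbers from my runs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    params_millions: int
    accuracy: float  # PLACEHOLDER value, not a measured result


def smallest_that_clears(
    candidates: Sequence[Candidate], bar: float
) -> Optional[Candidate]:
    """Return the smallest candidate whose accuracy meets the bar, or None."""
    passing = [c for c in candidates if c.accuracy >= bar]
    # min(..., default=None) covers the case where nothing clears the bar.
    return min(passing, key=lambda c: c.params_millions, default=None)


# Hypothetical candidates mirroring the three setups; accuracies are
# deliberately identical placeholders ("around the same accuracy").
candidates = [
    Candidate("full-ft-270M", 270, 0.90),
    Candidate("lora-1.5B", 1500, 0.90),
    Candidate("qlora-7B", 7000, 0.90),
]

choice = smallest_that_clears(candidates, bar=0.88)  # bar is an assumed threshold
print(choice.name if choice else "no candidate clears the bar")
```

The point of the rule is that size is a tiebreaker you resolve downward: a larger model only wins when the smaller ones fall below the bar, in which case the function returns the next size up, or `None` if nothing qualifies.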
Reaching for a bigger model isn't a flex. It's a response to a requirement the small one can't meet. Here are the four cases where small stops being enough:







