AI Engineer World's Fair Coverage
Workshops at this year's AI Engineer World's Fair shifted noticeably away from RAG and prompt engineering toward evals and open models. This transition highlights a broader change in our industry: an increased focus on how to measure and trust AI outputs. It's an important problem to tackle.
The more immediate issue is that most engineers simply refuse to consider any model other than the latest and most powerful frontier model for their day-to-day tasks. I spend an inordinate amount of time trying to convince people that frontier models are not always necessary. Engineers tend to default to them even for trivial tasks like checking the weather. Developer behavior and beliefs need to change before fast models can succeed.
Collectively, we don't seem ready to trust AI yet. Defaulting to the most powerful option is a safe hedge. But today's fast models are equivalent to what the frontier was six months ago. Sonnet 4.6 performs comparably to Opus 4.1, Gemini Flash 3.5 competes with Gemini Pro 3.1, and GPT-5.4 Mini matches the performance of GPT-5.1. Fast models cost a fraction as much, and they respond substantially faster than a frontier model producing a max-thinking response.







