Recently, I came across an essay titled "On the Death of Scaling" by Sara Hooker (co-founder of Adaption Labs). In it, Sara explains the shortcomings of the simple recipe that frontier labs have followed to lead the market. She discusses where the notion that "scaling is dead" comes from and what to consider next.
Over the last decade, LLMs have emerged as the favored path to attaining AI, or what experts call AGI, yet they have turned out to be less accurate than hoped. LLM-focused labs have largely followed one brute-force rule: add more parameters and more compute to outperform other available models. Up to a point, this has worked: with more compute and data, each generation of LLMs has outperformed its predecessors and competitors. But the landscape is changing. Much smaller, more recent models, many of them under 13B parameters, are now outperforming older models with enormous parameter counts. For example, Falcon 180B is easily outperformed by Llama 3 8B, Command R 35B, and Gemma 3 27B. Similarly, Aya 23 8B and Aya Expanse 8B have outperformed BLOOM 176B with roughly 95% fewer parameters.
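As a quick sanity check on the size gaps mentioned above, here is a minimal Python sketch that computes how many fewer parameters the smaller model in each pair has. It assumes the nominal sizes in the model names (8B, 27B, 35B, 176B, 180B) are close enough to the true parameter counts; the helper name `param_reduction` is made up for illustration and says nothing about benchmark quality.

```python
def param_reduction(small_b: float, large_b: float) -> float:
    """Return the fraction of parameters saved by the smaller model.

    Both sizes are in billions of parameters, taken from the model
    names (an assumption; exact counts differ slightly).
    """
    if large_b <= 0 or small_b < 0:
        raise ValueError("sizes must be non-negative, and large_b must be positive")
    return 1.0 - small_b / large_b


# Nominal sizes (in billions) for the pairs named in the text.
pairs = {
    "Aya 23 8B vs BLOOM 176B": (8, 176),
    "Llama 3 8B vs Falcon 180B": (8, 180),
    "Gemma 3 27B vs Falcon 180B": (27, 180),
    "Command R 35B vs Falcon 180B": (35, 180),
}

for label, (small, large) in pairs.items():
    print(f"{label}: {param_reduction(small, large):.1%} fewer parameters")
```

With these nominal sizes, the 8B models come out at about 95% smaller than the 176B to 180B models, which is in line with the figure quoted for Aya versus BLOOM.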
The image above, taken from the Hugging Face Open LLM Leaderboard, shows smaller models significantly outperforming larger ones, with both groups approaching a performance plateau (consistent with transformer performance itself levelling off). This suggests that a bigger model does not always guarantee better performance.








