(Image credit: Framework)
For heavy AI users, the economics of the current boom are starting to bite. Over the past year, major labs have nudged prices upward while tightening the screws on usage, whether through stricter rate limits, smaller context windows on lower tiers, or the gradual reshuffling of features behind more expensive plans. Even where per-token costs have fallen in headline terms, the reality for users is more complicated: higher volumes, more complex workflows, and new tooling expectations mean monthly bills are creeping up, not down.
At the same time, open-weight models have improved rapidly, consumer hardware has become more capable, and tools like LM Studio, Ollama, and llama.cpp have made local deployment far more accessible than it was even a year ago. The result is a renaissance in running models on your own machines.
Chris Stokel-Walker is a Tom's Hardware contributor who focuses on the tech sector and its impact on our daily lives, online and offline. He is the author of How AI Ate the World, published in 2024, as well as TikTok Boom, YouTubers, and The History of the Internet in Byte-Sized Chunks.















