Google announced several changes to Gemini's usage limits after subscribers complained about hitting caps with only a handful of prompts.
The company switched to a compute-based usage system at I/O 2026 last week, moving away from the previous prompt-based limits. Rather than counting individual prompts, the system weighs factors like request complexity, tool usage, and conversation length to determine how quickly a user's quota is consumed.
Among the changes taking effect now, Google said it is placing a ceiling on the amount of quota any single prompt can consume when using Gemini 3.1 Pro, a response to users finding that complex requests with large files rapidly drained their allowances. According to Gemini lead Josh Woodward, the company is now "capping the amount of quota a single prompt can use so you get more out of the Pro model."
Interactions using Flash-Lite have been carved out of the quota system altogether, so those exchanges no longer draw down a user's allowance. The company also clarified that failed requests will not be charged against a user's quota.