Google is tightening the spigot on its Gemini AI platform as demand from developers, enterprises, and fellow tech giants threatens to overwhelm available capacity. The move comes as Gemini API requests more than doubled between March and August 2025, forcing Google to rethink how it allocates one of the most sought-after resources in tech: raw AI compute.

Among those feeling the squeeze is Meta, which has been in discussions with Google Cloud about leveraging Gemini models for its advertising business. The situation illustrates a strange new dynamic in Silicon Valley, where bitter rivals are quietly becoming each other’s customers in the AI arms race.

What Google actually changed

Starting May 17, 2026, Google imposed compute-based usage limits on Gemini Apps. Think of it like a cellular data plan: instead of unlimited requests, users now operate under rolling 5-hour refresh windows and weekly caps.

The limits apply broadly, not just to one company. Google has documented rate limits and spending tiers designed to ensure fair API usage across all customers during what the company characterizes as a rapid growth phase.