Google adjusts Gemini’s new usage limits in response to complaints

At I/O 2026 last week, the Gemini app switched to compute-based usage limits. In response to “feedback about hitting limits too quickly,” Google today announced some changes.
The new “compute-used” approach to usage, in which limits refresh every five hours until a weekly limit is met, is meant to take into account the complexity of prompts, what tools are used, and chat length. Google noted last week that “a simple text prompt uses far less compute than a complex video or coding prompt.” In the future, Google will let Gemini app users buy pay-as-you-go top-up AI credits.
For Gemini 3.1 Pro, Gemini lead Josh Woodward shared today that Google is “capping the amount of quota a single prompt can use so you get more out of the Pro model.” The change responds to complex prompts with large files quickly depleting limits.
Google clarified that errors don’t count against the limits: “If a request fails, you won’t be charged. Our system mistakes are on us, not you. Your quota is used only for successful completions.”
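Google has not published how this accounting works internally, so the following is only an illustrative sketch of the rules described above: a five-hour window inside a weekly limit, a per-prompt cap, and no charge for failed requests. The class name `QuotaTracker`, the unit values, and the assumption that the cap simply truncates what a single prompt is charged are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QuotaTracker:
    """Toy model of compute-based limits; all numbers are hypothetical."""
    window_limit: float    # units available per 5-hour window (assumed)
    weekly_limit: float    # units available per week (assumed)
    per_prompt_cap: float  # most a single prompt can be charged (assumed)
    window_used: float = 0.0
    weekly_used: float = 0.0

    def charge(self, compute_used: float, succeeded: bool) -> float:
        """Record one prompt and return the units charged for it."""
        if not succeeded:
            # Failed requests are not charged.
            return 0.0
        # A single prompt never costs more than the per-prompt cap.
        cost = min(compute_used, self.per_prompt_cap)
        self.window_used += cost
        self.weekly_used += cost
        return cost

    def remaining(self) -> float:
        """Units left right now: the tighter of the window and weekly limits."""
        window_left = self.window_limit - self.window_used
        weekly_left = self.weekly_limit - self.weekly_used
        return max(0.0, min(window_left, weekly_left))

    def refresh_window(self) -> None:
        """Reset the 5-hour window; weekly usage keeps accumulating."""
        self.window_used = 0.0


tracker = QuotaTracker(window_limit=100, weekly_limit=300, per_prompt_cap=40)
tracker.charge(10, succeeded=True)    # simple text prompt
tracker.charge(500, succeeded=True)   # heavy prompt, charged only the cap
tracker.charge(500, succeeded=False)  # failed request, charged nothing
print(tracker.remaining())
```

In this toy model, the heavy prompt is charged only the cap rather than its full compute, and the failed request leaves the balance untouched.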
Heavy tasks like Deep Research “require more compute,” so Google is going to provide “more detailed usage breakdowns and notifications to help you maximize your limits.” As is, the gemini.google.com/usage dashboard only provides a high-level overview.
Meanwhile, 3.1 Flash-Lite prompts are now “free and won’t count against your quota.” Google also notes:
“When you select a specific model, we remember that choice across all future sessions. It will only change if you manually adjust it or hit a cap that triggers an automatic fallback to a lighter model.”
Finally, Google has addressed a bug where “just one or two Omni videos” would drain quotas for “certain people.” Google AI Ultra users now get double the number of Omni generations.
“We fixed this and will continue to look for opportunities to increase the amount of Omni you get.”