Gemini chat on Flash 3.5:
The Claims vs. The Hard Numbers
Google’s marketing spin relies heavily on comparing 3.5 Flash to heavier “frontier” models like Gemini 3.1 Pro, rather than its actual Flash predecessors. When you lay the pricing out side-by-side, the math reveals exactly why your quotas are vanishing.
| Model | Input Price (per 1M tokens) | Output Price (per 1M tokens) |
|---|---|---|
| Gemini 3.1 Flash-Lite | $0.25 | $1.50 |
| Gemini 3 Flash (Preview) | $0.50 | $3.00 |
| Gemini 3.5 Flash | $1.50 | $9.00 |
| Gemini 3.1 Pro | ~$2.00 | ~$12.00 |
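The ratios implied by the table can be checked with a few lines of Python. This is only a sketch: the prices are copied from the table above (the 3.1 Pro figures are approximate there), and the names `PRICES` and `price_ratio` are illustrative, not part of any Google API.

```python
# Prices in USD per 1M tokens, copied from the table above.
# The Gemini 3.1 Pro figures are marked approximate ("~") in the table.
PRICES = {
    "Gemini 3.1 Flash-Lite": (0.25, 1.50),
    "Gemini 3 Flash (Preview)": (0.50, 3.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "Gemini 3.1 Pro": (2.00, 12.00),
}


def price_ratio(model: str, baseline: str) -> tuple[float, float]:
    """Return (input_ratio, output_ratio) of `model` relative to `baseline`."""
    model_in, model_out = PRICES[model]
    base_in, base_out = PRICES[baseline]
    return model_in / base_in, model_out / base_out


# Compare 3.5 Flash against every other model in the table.
for baseline in PRICES:
    if baseline != "Gemini 3.5 Flash":
        r_in, r_out = price_ratio("Gemini 3.5 Flash", baseline)
        print(f"3.5 Flash vs {baseline}: {r_in:.2f}x input, {r_out:.2f}x output")
```

A ratio above 1 means 3.5 Flash is the more expensive of the two per token; a ratio below 1 means it is cheaper.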
The Reality Check
Here is why Gemini 3.5 Flash feels so expensive despite Google’s claims:
- The “Flash” Name is a Decoy: Google is keeping the “Flash” naming convention but entirely ditching the ultra-cheap pricing tier. At $1.50 for input and $9.00 for output, Gemini 3.5 Flash is 3x the price of the Gemini 3 Flash Preview and a whopping 6x the price of 3.1 Flash-Lite.
- The “Agentic” Token Trap: You noted that 3.5 Flash is eating up tokens at a significantly faster rate. Independent benchmarkers (like Artificial Analysis) have already confirmed this today. Because 3.5 Flash is heavily tuned for “agentic” capabilities (it thinks, plans, and loops through multi-step reasoning before answering), it is far more verbose and consumes vastly more tokens per interaction.
- Approaching Pro-Level Costs: The base price of 3.5 Flash is only about 25% cheaper than Gemini 3.1 Pro ($1.50 vs. ~$2.00 for input, $9.00 vs. ~$12.00 for output). The moment 3.5 Flash goes on one of its “agentic turns” and churns through extra tokens to process a request, your real-world cost per task easily catches up to, or surpasses, what you would have spent just using 3.1 Pro directly.
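To make the "catches up to Pro" point concrete, here is a minimal sketch of the per-task arithmetic. The token counts are hypothetical, and it makes the simplifying assumption that 3.5 Flash uses the same multiple of extra tokens on both input and output; only the prices come from the table above. Under that assumption the break-even multiplier is 2.00 / 1.50, roughly 1.33x: beyond that, the task costs more on 3.5 Flash than on 3.1 Pro.

```python
def task_cost(input_tokens: float, output_tokens: float,
              input_price: float, output_price: float) -> float:
    """Cost in USD for one task, given prices in USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Prices (USD per 1M tokens) from the table; the Pro figures are approximate.
FLASH_PRICE = (1.50, 9.00)
PRO_PRICE = (2.00, 12.00)

# Hypothetical task size on 3.1 Pro: these token counts are made up.
PRO_TOKENS = (10_000, 2_000)  # (input, output)


def flash_cost(token_multiplier: float) -> float:
    """Cost on 3.5 Flash if it uses `token_multiplier` times Pro's tokens."""
    scaled = (PRO_TOKENS[0] * token_multiplier, PRO_TOKENS[1] * token_multiplier)
    return task_cost(*scaled, *FLASH_PRICE)


pro_cost = task_cost(*PRO_TOKENS, *PRO_PRICE)
break_even = PRO_PRICE[0] / FLASH_PRICE[0]  # same ratio for input and output

print(f"3.1 Pro cost per task: ${pro_cost:.4f}")
print(f"Break-even token multiplier: {break_even:.2f}x")
for multiplier in (1.0, 1.5, 2.0):
    print(f"3.5 Flash at {multiplier:.1f}x tokens: ${flash_cost(multiplier):.4f}")
```

If input and output tokens grow at different rates in practice, the break-even point shifts, so treat the single multiplier as a rough guide rather than a measurement.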
The Bottom Line
Google is technically telling the truth when they say 3.5 Flash is cheaper than 3.1 Pro per token. However, they are conveniently ignoring the fact that it is 3x to 6x more expensive than older Flash models, and its token-hungry behavior can wipe out any anticipated savings in real-world usage. You aren’t imagining things; the economics of this new model are punishing for strict subscription quotas.