The blue spike is Gemini 3 Flash (unexplained cost explosion), and the orange is Gemini 3.1 Flash Lite Preview (forced migration from 2.5 Flash Lite, which also comes with a massive price increase).
Hi everyone,
I’m experiencing a sudden and unexplained cost explosion with Gemini 3 Flash Preview (gemini-3-flash-preview) over the last few days.
My setup
- Several production apps using the Gemini API for image analysis
- Thinking is explicitly disabled (thinking: false)
- Each call: one optimized image + text prompt → structured JSON response
- Nothing has changed in my code, prompts, or image processing
What I’m seeing
- March 16: €1.96 for ~409 generations
- March 17: €4.39 for ~556 generations (+36% volume, but +124% cost)
- March 18: costs continuing to climb
The cost per generation has jumped by roughly 65%, from ~€0.0048 to ~€0.0079 per call, with no change on my end. Output sizes are stable (~1,700–2,100 chars of JSON). At this rate I’m heading toward €200/month instead of my usual €30/month.
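For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the per-call cost and the day-over-day changes from the two daily figures above. The variable and function names are my own, and the generation counts are the approximate ones from the dashboard.

```python
# Daily figures from the billing dashboard (euros, approximate generation counts).
days = {
    "March 16": {"cost_eur": 1.96, "generations": 409},
    "March 17": {"cost_eur": 4.39, "generations": 556},
}


def per_call(day):
    """Average cost of one generation on a given day, in euros."""
    return day["cost_eur"] / day["generations"]


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


before, after = days["March 16"], days["March 17"]

volume_change = pct_change(before["generations"], after["generations"])
cost_change = pct_change(before["cost_eur"], after["cost_eur"])
per_call_change = pct_change(per_call(before), per_call(after))

print(f"Per call: {per_call(before):.4f} -> {per_call(after):.4f} EUR")
print(f"Volume: {volume_change:+.0f}%")
print(f"Cost: {cost_change:+.0f}%")
print(f"Cost per call: {per_call_change:+.0f}%")
```

The per-call change is the number that matters here, since it removes the effect of the higher volume on March 17.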
Additional context
I already migrated from Gemini 2.5 Flash Lite to 3.1 Flash Lite Preview: that’s the huge orange bar appearing on the graph. This migration is FORCED since Google WILL DELETE 2.5 Flash Lite on March 31, and the new model is x3.75 MORE EXPENSIVE on output. But the cost increase on gemini-3-flash-preview is separate and unexplained: no change to this model was announced.
This looks familiar
Back in August 2025, a similar billing bug was reported where Google’s metering system was miscategorizing “thinking” tokens as “image output” tokens, causing 5–40x cost spikes. I really hope this isn’t the same issue happening again.
Questions
- Has the pricing or token counting for gemini-3-flash-preview changed recently?
- Is anyone else seeing similar cost increases on this model?
Any help appreciated. This is becoming unsustainable for a small indie developer.
Update: Per-token price analysis
After digging into my billing CSV and comparing it with the official pricing docs, it seems the gemini-3-flash-preview model was silently updated to a version with a new, much higher pricing structure, without any notice or migration period. The issue is that the pricing page says “including thinking tokens” for output, so there is no longer a separate, cheaper rate for non-thinking output.
Combined with the forced migration from 2.5 Flash Lite (€0.34/M output) to 3.1 Flash Lite Preview (€1.27/M output, x3.75), the total bill has gone from ~€30/month to a projected ~€200/month with no code changes on my end.
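To make the effect of the output pricing concrete, here is a rough Python sketch of how the output side of one call is priced when thinking tokens are billed at the output rate. The two prices are the €/M output figures quoted above. The token counts are purely hypothetical placeholders, not values from my billing data, and the function name is my own.

```python
def output_cost_eur(output_tokens, thinking_tokens, price_per_million_eur):
    """Output-side cost of one call when thinking tokens are billed as output."""
    billable_tokens = output_tokens + thinking_tokens
    return billable_tokens * price_per_million_eur / 1_000_000


OLD_PRICE = 0.34  # EUR per million output tokens, 2.5 Flash Lite
NEW_PRICE = 1.27  # EUR per million output tokens, 3.1 Flash Lite Preview

# Ratio of the two euro prices (about 3.7).
price_ratio = NEW_PRICE / OLD_PRICE

# Hypothetical token counts, for illustration only.
json_tokens = 500      # visible JSON response
hidden_thinking = 500  # thinking tokens, if any were billed despite being disabled

old = output_cost_eur(json_tokens, 0, OLD_PRICE)
new = output_cost_eur(json_tokens, 0, NEW_PRICE)
new_with_thinking = output_cost_eur(json_tokens, hidden_thinking, NEW_PRICE)

print(f"Price ratio: x{price_ratio:.2f}")
print(f"Old model, JSON only: {old:.6f} EUR")
print(f"New model, JSON only: {new:.6f} EUR")
print(f"New model, with thinking billed: {new_with_thinking:.6f} EUR")
```

If the billing CSV shows more output tokens per call than the JSON size can account for, that would point toward thinking tokens being counted even with thinking disabled. That is only a hypothesis on my side until someone from Google confirms.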
