Since March 16–17, 2026, I have noticed a significant increase in the cost of API calls to the gemini-3-flash-preview model. We use neither the caching system nor the Google search system.
Our cost calculations (expressed in user credits, UCR) are based on the token counts (promptTokenCount, thoughtsTokenCount, and candidatesTokenCount), and since March 16–17, 2026 they no longer match the costs displayed in Google AI Studio. The number of requests is as expected and has not changed.
Before March 16, we had approximately 100k UCR available for a $1 spend.
On March 16, we saw this value change; we went down to 35k UCR for $1.
From March 17 until today, we have been at 8k UCR for $1.
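To put those three figures on one scale, here is a minimal sketch that turns "UCR per $1" into an implied cost multiplier relative to the period before March 16. The dictionary keys and variable names are only illustrative, and it assumes the amount of work behind one UCR stayed the same across the three periods.

```python
# UCR obtained for $1 in each period, taken from the figures above.
ucr_per_dollar = {
    "before March 16": 100_000,
    "March 16": 35_000,
    "since March 17": 8_000,
}

baseline = ucr_per_dollar["before March 16"]

# Fewer UCR per dollar means each UCR costs more; the ratio to the
# baseline is the implied cost multiplier for that period.
multipliers = {
    period: baseline / value for period, value in ucr_per_dollar.items()
}

for period, factor in multipliers.items():
    print(f"{period}: x{factor:.2f}")
```

So the same usage appears to cost a bit under 3 times more on March 16 and 12.5 times more since March 17.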
On reflection, I get the impression that before March 16 we were not being charged the actual rate for the gemini-3-flash-preview model, but a lower one. I just checked the expected cost based on the tokens sent and consumed, and the price we see today seems fair. It was the price displayed before March 16 that was much lower than what we should have paid for this model.
Have you calculated the actual expected cost based on the tokens sent and used, using the current rate schedule?
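For anyone who wants to run that check, here is a minimal sketch of the calculation. It assumes thinking tokens (thoughtsTokenCount) are billed at the output rate along with candidatesTokenCount; the function name is hypothetical, and the rates in the example are placeholders, not real prices, so substitute the values from the current rate schedule.

```python
def expected_cost_usd(usage, input_rate_per_million, output_rate_per_million):
    """Estimate the cost of one response from its usage metadata.

    `usage` is a dict holding the token counts returned with the response.
    Rates are in USD per one million tokens and must come from the
    current pricing page; nothing is hard-coded here.
    """
    input_tokens = usage.get("promptTokenCount", 0)
    # Assumption: thinking tokens are billed as output tokens.
    output_tokens = (
        usage.get("candidatesTokenCount", 0)
        + usage.get("thoughtsTokenCount", 0)
    )
    return (
        input_tokens * input_rate_per_million
        + output_tokens * output_rate_per_million
    ) / 1_000_000


# Example with placeholder rates (NOT the real prices).
sample_usage = {
    "promptTokenCount": 1_000_000,
    "candidatesTokenCount": 500_000,
    "thoughtsTokenCount": 500_000,
}
print(expected_cost_usd(sample_usage, 1.0, 2.0))
```

Summing this over all requests for a day and comparing it with the billed amount for the same day should show whether the gap comes from the rate or from the token counts.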
No, it seems not. When I analyze my SKUs on Google Cloud from March 16 onward, the Flash 3 model used far more output tokens than it did before, roughly 4x to 5x compared to the previous period. The same goes for the 3.1 Flash Lite preview model.
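One way to cross-check the SKU figures is to compare the average output tokens per request in your own logs before and after March 16. This is only a sketch: the function names are made up, and it assumes you store the usage metadata of each response and count thinking tokens as output.

```python
from statistics import mean


def avg_output_tokens(usages):
    """Average output tokens per request, counting thinking tokens as output."""
    return mean(
        u.get("candidatesTokenCount", 0) + u.get("thoughtsTokenCount", 0)
        for u in usages
    )


def output_growth(before, after):
    """Ratio of average output tokens per request, after vs. before."""
    return avg_output_tokens(after) / avg_output_tokens(before)


# Made-up numbers, for illustration only.
before = [{"candidatesTokenCount": 100, "thoughtsTokenCount": 100}]
after = [{"candidatesTokenCount": 100, "thoughtsTokenCount": 700}]
print(output_growth(before, after))
```

If the ratio from your logs is close to 1 while the SKU report shows 4x to 5x, the extra output tokens are not visible in the usage metadata you record; if the logs show the same jump, the model is really producing more output (or more thinking) tokens per request.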