Outputs get cut off/truncated on Gemini 2.5 Flash, even on paid Tier 1

Hi,

Given the recent quota decrease for the free tier and our need to ship our product quickly, we decided to run our final tests on the paid tier.

Yet I must say the quality on the paid tier is much worse than it was a week ago on the free tier.
Output is also around 3 times slower, taking around 10 seconds for a simple request.

To limit usage while testing, I reduced the gemini-flash-2.5 token and RPD quotas by 90% in Google Cloud Console. That is still more than enough for our needs: apart from embedding with the latest gemini-embedding-001, I see we have barely used any tokens for prompts.

Every API call returns a truncated message.

I’d rather not test further unless the issue is resolved as I don’t want to use up more tokens unnecessarily.

I would really appreciate it if someone could look into this issue.

We need to ship, and the recent changes really made us reassess our strategy.

Sincerely,

Hi @P_S, thanks for reaching out to us.

Could you please share the exact model name and parameters you are using, along with one example of the truncated response? Additionally, could you check your current quota settings in Google Cloud Console, especially the token and RPD limits?
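If it helps with gathering those details, here is a minimal sketch of how the relevant fields could be pulled out of a saved response. It assumes the response is the JSON body returned by the REST `generateContent` endpoint, with `finishReason`, `modelVersion`, and `usageMetadata` fields. The helper name `summarize_response` and the sample payload are hypothetical and only for illustration; nothing here calls the API or uses tokens.

```python
import json


def summarize_response(resp: dict) -> dict:
    """Collect the fields that usually help explain a cut-off reply."""
    candidates = resp.get("candidates") or [{}]
    first = candidates[0]
    usage = resp.get("usageMetadata", {})

    # Join all text parts of the first candidate to measure the visible output.
    parts = first.get("content", {}).get("parts", [])
    text = "".join(part.get("text", "") for part in parts)

    finish = first.get("finishReason")
    return {
        "model_version": resp.get("modelVersion"),
        "finish_reason": finish,
        # "MAX_TOKENS" means generation stopped at the output token limit.
        "hit_token_limit": finish == "MAX_TOKENS",
        "visible_chars": len(text),
        "prompt_tokens": usage.get("promptTokenCount"),
        "output_tokens": usage.get("candidatesTokenCount"),
        "thinking_tokens": usage.get("thoughtsTokenCount"),
    }


# Hypothetical payload, invented for illustration (not a real API result).
example = {
    "candidates": [
        {
            "content": {"parts": [{"text": "The answer starts here but"}]},
            "finishReason": "MAX_TOKENS",
        }
    ],
    "usageMetadata": {
        "promptTokenCount": 12,
        "candidatesTokenCount": 6,
        "thoughtsTokenCount": 250,
    },
    "modelVersion": "gemini-2.5-flash",
}

print(json.dumps(summarize_response(example), indent=2))
```

If `finish_reason` comes back as `MAX_TOKENS`, the output limit was reached, and it may be worth checking how much of the budget went to thinking tokens rather than visible text. Any other value would point away from the token limit as the cause.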