Strange Issue with Missing Large Token Responses

I used gemini-2.0-flash for system testing, and everything worked normally across 2,000 interactions. Then I switched to gemini-2.5-flash-preview-05-20/gemini-2.5-pro-preview-05-06 for testing. All responses with fewer than 10,000 tokens were returned correctly. However, any response larger than 10,000 tokens was never received — even though the billing records show the interactions were successful.

My network environment is admittedly unstable, as I was using unlimited mobile data for testing.
Has anyone encountered a similar issue?
Is there any way to investigate the root cause of this?
Would it be possible to split large responses to ensure successful reception?

I’d greatly appreciate any help analyzing this issue.
Thank you!

Hello Hong_jackey,

Thanks for raising this. The issue of missing large token responses is typically tied to network interruptions, especially when using mobile data or unstable connections.

Try Breaking Down the Query: Very large responses — in your case anything above roughly 10,000 tokens — are more prone to truncation or delivery failures over an unstable connection, simply because the transfer takes longer and has more opportunity to be interrupted. A good approach is to break the input data into smaller, more manageable chunks so each individual response stays well below that size.
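As a rough illustration of the chunking idea, here's a minimal sketch (the helper name and word-based splitting are my own; real token counts differ per model, so words are only a coarse proxy):

```python
def chunk_text(text: str, max_words: int = 2000) -> list[str]:
    """Split long input text into word-bounded chunks.

    Each chunk is sent as its own request, so each response stays
    small enough to survive an unreliable connection. Word count is
    a rough stand-in for tokens; adjust max_words to taste.
    """
    words = text.split()
    return [
        " ".join(words[i : i + max_words])
        for i in range(0, len(words), max_words)
    ]
```

You would then loop over the chunks and issue one generate-content call per chunk, stitching the partial results back together on your side.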

Check for Network Issues: If you’re experiencing network instability, it might be worth checking your connection, especially if you’re working remotely or using mobile networks.

For further reading, refer to the Gemini API documentation, which outlines the rate limits and token usage. If you run into further issues, don’t hesitate to get in touch again!

Happy coding :slight_smile:

Hello @Krish_Varnakavi1,

I’m encountering the same issue with Gemini 2.5-flash that I didn’t experience with earlier versions. When the response hits the max_tokens limit, it returns completely empty—no text, no error, nothing. In contrast, previous versions would at least return a partial response, which was still usable.
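To show concretely what I mean, here's a hedged sketch of the check I run on each response (treating the response as a plain dict for illustration; the field names `candidates`, `parts`, and `finishReason` mirror the API's JSON response shape, but the helper itself is my own):

```python
def extract_text(response: dict) -> str | None:
    """Pull the generated text out of a Gemini-style response dict.

    Returns None when the response carries no text at all -- which is
    what I'm seeing on 2.5-flash whenever finishReason is MAX_TOKENS,
    whereas 2.0 used to return a usable partial string.
    """
    candidates = response.get("candidates") or []
    if not candidates:
        return None
    cand = candidates[0]
    parts = cand.get("content", {}).get("parts", [])
    text = "".join(p.get("text", "") for p in parts)
    if not text and cand.get("finishReason") == "MAX_TOKENS":
        # 2.5 behavior: limit hit, but no partial text to salvage
        return None
    return text or None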
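For clarity, a normal completion versus the empty-at-limit case I'm describing looks like this (again using a dict-shaped response purely for illustration):

```python
def is_truncated_and_empty(response: dict) -> bool:
    """True when generation stopped at the token limit with no text.

    This is the condition I hit on 2.5-flash: finishReason is
    MAX_TOKENS but the candidate's parts contain nothing usable.
    """
    candidates = response.get("candidates") or []
    if not candidates:
        return False
    cand = candidates[0]
    parts = cand.get("content", {}).get("parts", [])
    text = "".join(p.get("text", "") for p in parts)
    return not text and cand.get("finishReason") == "MAX_TOKENS"
```

A normal response with text in its parts should not trip this check, regardless of finish reason.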

Could you clarify if this behavior is expected or if it might be a bug?

Thanks!

Hi @Asaff_Arieli,

To help us resolve the issue, could you please provide the following additional details:

  1. The exact prompt you’re using when you encounter this behavior.
  2. Any relevant configuration settings, such as max_tokens or stop_sequences, in your request.
  3. If possible, a request log or any error messages you’ve received.

I will try to reproduce this from my end to see what’s going on.