Thanks for sharing @Oscar_Hoffmann.
So far I have encountered several versions of this issue:
- it doesn't return any text, or returns an empty response
- it returns a couple of words and then ------ until the token limit is reached; this is worse because no error is detected
- it directly returns -------
- but mostly it returns an empty response, sometimes after a couple of minutes of waiting.
We can't use either 2.5 Flash or 2.5 Pro in production.
They both seemed very promising, but not reliable at all.
I've been visiting here for months hoping it would get resolved, but I'm still disappointed; I thought things would be different after I/O 2025. A very market-immature LLM.
response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "content": {
            "role": "model"
          },
          "finish_reason": "STOP",
          "index": 0
        }
      ],
      "usage_metadata": {
        "prompt_token_count": 518,
        "total_token_count": 6628
      },
      "model_version": "models/gemini-2.5-flash-preview-05-20"
    }),
)
Same issue: empty response when using 2.5 Flash. It makes the model very hard to rely on.
Getting the same issue here. One thing to add: this problem became dominant (roughly 2 out of 3 calls) after I added MCP tool support to my framework.
When I switched back to the stable model instead of the preview one, the problem seemed to be solved.
@GUNAND_MAYANGLAMBAM Any updates on this? The seemingly random yet frequent occurrence of this issue is hard to work around in production. Currently we are seeing it in about 1 out of 4 calls, exclusively with search-grounded calls.
Added: Switching to the Gemini 2.5 stable models has not resolved the situation for us; empty responses still occur at roughly the same frequency.
Yes, the problem doesn't seem to exist (or is suppressed) in the stable models, but they underperform too much compared to the latest preview models, both Flash and Pro.
The problem still exists for us with the stable models. I've been using Flash a lot, and the problem seems worse there than with Pro (which still has the issue).
The problem seems to have gotten worse.
Did you implement the retry logic suggested by @Bryan_Hughes? It is a workable stopgap for now. We increased the retry limit to 10 (!!!) and this seems to always produce a valid output at some point. The problem for our use case is that IF Gemini 2.5 Pro generates output, it is by far superior AND the cheapest compared to all other LLMs out there. This thread is getting pretty long and persistent. I expect more involvement and communication from Google.
Yes, I did, but it still sometimes returns an empty response, and sometimes even outputs garbage such as -------- or a few words followed by ----------. These cases make it difficult to even catch the error. I am using it to transcribe handwritten texts and then turn those texts into structured outputs.
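Because these failures come back with HTTP 200 and finish_reason STOP, they have to be caught by validating the text itself before retrying. A rough heuristic sketch that flags the failure modes reported in this thread (empty output, or long runs of dashes); the dash-run threshold is an assumption and should be tuned for your own transcripts:

```python
import re

def looks_like_garbage(text: str, max_dash_run: int = 10) -> bool:
    """Flag the failure modes reported in this thread: empty output,
    or output containing long runs of dashes."""
    if not text.strip():
        return True
    # a run of max_dash_run or more consecutive dashes counts as garbage
    return re.search(r"-{%d,}" % max_dash_run, text) is not None

assert looks_like_garbage("")
assert looks_like_garbage("a few words ----------------")
assert not looks_like_garbage("A normal transcription of the page.")
```

Wiring this check into the retry loop lets the dash-garbage case trigger a retry just like the plain empty responses do.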
I am still facing this issue with the gemini-2.5-flash model. Is there any solution for this?
Ran into the same issue. After some prompt engineering, I managed to get results out of Gemini.
Still facing this issue on a regular basis. For additional information:
- only happens with grounded search (Python google.genai library)
- when it fails for a specific prompt, it typically fails repeatedly for that prompt
- the same exact prompt (with same model, settings and sys message) will work without issues in AI Studio
- despite no response text being returned, we are still seeing token charges in the billing console
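To substantiate the billing point, it helps to log the usage metadata on every call, including the failed ones; even when no text comes back, the token counts are populated (as in the payloads quoted in this thread). A minimal sketch over the parsed response dict, handling both the camelCase and snake_case key spellings seen in this thread:

```python
def log_usage(response: dict) -> dict:
    """Extract the token counts that will show up on the bill,
    whether or not any text came back."""
    usage = response.get("usageMetadata") or response.get("usage_metadata") or {}
    prompt = usage.get("promptTokenCount", usage.get("prompt_token_count", 0))
    total = usage.get("totalTokenCount", usage.get("total_token_count", 0))
    # everything beyond the prompt is billable output and/or thought tokens
    return {"prompt_tokens": prompt, "output_and_thought_tokens": total - prompt}

# The empty STOP response earlier still reports 518 prompt / 6628 total tokens:
usage = log_usage({"usage_metadata": {"prompt_token_count": 518,
                                      "total_token_count": 6628}})
assert usage == {"prompt_tokens": 518, "output_and_thought_tokens": 6110}
```

Persisting these per-call numbers alongside the empty-response flag is one way to tie charges to failed prompts even though the billing console does not break costs down per request.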
I can confirm that we are getting same random issue in Node.js using @google/genai.
Same issue hitting 2.5-flash with searchTools using Ruby:
HTTP Status Code: 200
HTTP Status Message: OK
Response Headers:
content-type: application/json; charset=UTF-8
vary: Origin, X-Origin, Referer
server: scaffolding on HTTPServer2
x-xss-protection: 0
x-frame-options: SAMEORIGIN
x-content-type-options: nosniff
connection: close
transfer-encoding: chunked
Response Body Encoding (before force_encoding): ASCII-8BIT
Response Body (raw, potentially with encoding issue): nil
GeminiService error: "\xC3" from ASCII-8BIT to UTF-8
Hello everyone,
Sincere apologies for the inconvenience you are all experiencing. Please be advised that resolving the ongoing errors is being given top priority. The infrastructure is at an early stage of development and we are doing everything we can to make your experience less stressful. Thank you for your continued comments; they help us identify the areas requiring an immediate response.
Thank you
Thanks for the update. Do you have any response to the fact that many of us who get empty responses back still seem to be charged? Unfortunately, because Google Cloud does not bill on a per-prompt basis, it is hard to isolate, other than that it feels like I am getting charged. We are in production and cannot just experiment.
Cheers,
Bryan
For me, the model suddenly began returning empty text with finishReason set to MAX_TOKENS.
Switching to the lite model temporarily resolved the issue, but it started failing again the next day.
{
  "Response Body": {
    "candidates": [
      {
        "content": {
          "parts": [
            {
              "text": ""
            }
          ],
          "role": "model"
        },
        "finishReason": "MAX_TOKENS",
        "index": 0
      }
    ],
    "usageMetadata": {
      "promptTokenCount": 316,
      "totalTokenCount": 715,
      "promptTokensDetails": [
        {
          "modality": "TEXT",
          "tokenCount": 316
        }
      ],
      "thoughtsTokenCount": 399
    },
    "modelVersion": "gemini-2.5-flash"
  }
}
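The arithmetic in that payload is telling: 316 prompt tokens plus 399 thought tokens equals the 715 total, so the entire maxOutputTokens budget was consumed by internal "thinking" before any visible text was produced, which is why the text is empty with finishReason MAX_TOKENS. A quick diagnostic sketch for this case, over the parsed dict (the function name is hypothetical):

```python
def thinking_ate_budget(response: dict) -> bool:
    """True when MAX_TOKENS was hit and the output budget went entirely
    to thought tokens, leaving no visible text (the case shown above)."""
    candidate = response["candidates"][0]
    usage = response["usageMetadata"]
    parts = candidate.get("content", {}).get("parts", [])
    text = "".join(p.get("text", "") for p in parts)
    hit_cap = candidate.get("finishReason") == "MAX_TOKENS"
    # visible output tokens = total - prompt - thoughts
    visible = (usage["totalTokenCount"] - usage["promptTokenCount"]
               - usage.get("thoughtsTokenCount", 0))
    return hit_cap and not text.strip() and visible <= 0
```

When this returns True, raising maxOutputTokens (or capping the thinking budget, where the API exposes that) is a more targeted fix than blind retries.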
It turns out Cursor had renamed "generationConfig" to "config" and I had missed this update.
The issue was very confusing, as the LLM call seemed to work initially despite being misconfigured.
const response = await this.gemini.models.generateContent({
    model: this.model,
    contents: fullPrompt,
    // generationConfig: {
    config: {
        temperature: temperature,
        maxOutputTokens: maxTokens
    }
});
I've been using the same prompt setup with 2.5 Pro for several weeks. A few days ago I started getting this same problem of a response with no content but finish_reason STOP. I have a retry loop already set up, and it works, but yeah, it's getting worse, and are we being charged? I suspect this is traffic related, since simply retrying works. I should probably add exponential backoff to space out the attempts, to avoid racking up as many (pricey) failures…
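For reference, the retry-with-backoff idea can be sketched in Python like this; `call_gemini` stands in for whatever wrapper you use around the API, and the default attempt count mirrors the limit of 10 mentioned earlier in the thread:

```python
import random
import time

def generate_with_retry(call_gemini, max_attempts: int = 10,
                        base_delay: float = 1.0) -> str:
    """Retry empty responses with exponential backoff plus jitter.

    `call_gemini` is any zero-argument callable returning the response text,
    or None/"" on the empty-response failures described in this thread.
    """
    for attempt in range(max_attempts):
        text = call_gemini()
        if text and text.strip():
            return text
        # back off: base, 2*base, 4*base, ... plus jitter to spread retries
        time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
    raise RuntimeError(f"Empty response after {max_attempts} attempts")
```

The jitter keeps many clients from retrying in lockstep; if the failures really are traffic related, spacing attempts out this way should also reduce the number of billed-but-empty calls compared to immediate retries.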