Ah thanks, now I know what you mean!
For the past two weeks, these three types of errors—empty responses, 500 errors, and 429 errors—have almost paralyzed API requests for gemini-2.5-pro. I hope the Google team fixes these problems as soon as possible.
A 429 means you’ve exceeded the rate limit, like TPM, RPM, or RPD for your tier. It’s connected to the 500s and empty responses, but it’s not really an API bug, I suppose.
Haha, thank you. I know what a 429 is. I meant the 429 is caused by retrying after too many empty responses and 500 errors.
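For what it’s worth, capping and backing off the retries is what keeps me out of 429 territory. A rough sketch of what I use (plain requests against the public REST endpoint; the helper name and delay values are just my own choices):

import time
import requests

API_URL = ("https://generativelanguage.googleapis.com/v1beta/"
           "models/gemini-2.5-pro:generateContent")

def generate_with_backoff(payload, api_key, max_attempts=5):
    # Retry empty/500 responses with exponential backoff so the retries
    # themselves don't trip the 429 rate limits.
    delay = 2.0  # seconds, doubled after each failed attempt
    for attempt in range(max_attempts):
        resp = requests.post(API_URL, params={"key": api_key},
                             json=payload, timeout=60)
        if resp.status_code == 429:
            time.sleep(delay * 4)  # rate-limited: back off hard
        elif resp.status_code >= 500:
            time.sleep(delay)      # server error: plain backoff
        else:
            data = resp.json()
            candidates = data.get("candidates") or [{}]
            if candidates[0].get("content", {}).get("parts"):
                return data        # got actual content, done
            time.sleep(delay)      # "successful" but empty: retry
        delay *= 2
    raise RuntimeError(f"no usable response after {max_attempts} attempts")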
Hey guys, I’m wondering: are failed or empty requests billed by the API?
I’ve done some tests today and I noticed I was only charged for successful ones.
Does anyone have an idea of how long they will take to fix this issue? I’m relying on this AI to run a client’s system for me, and I’ve tried OpenAI and Claude, and neither has the reasoning capabilities to generate the type of output I need.
I think it will get better, i.e. about 85-90% of requests will be processed smoothly, within a day or two. I think this is because, firstly, the problem is completely undermining the product’s credibility; secondly, it is already starting to resonate; and thirdly, they will soon start losing money as developers switch to competitors. These three factors are exacerbated by the current AI race, and Google simply cannot allow its flagship model to behave this way. In addition, Logan Kilpatrick has appeared in the thread, so the problem is definitely a priority for the Gemini team - P0.

They will most likely allocate more resources and roll back the updates that caused such a massive drop, which will be like taking an antipyretic: the fever goes down, but the disease remains. And judging by the fact that the problem has been known about for several months, it is extremely difficult to detect and debug, and Google is a giant company where changes take time, including in the Gemini API. I believe it will take months to fully resolve the issue. However, this is just my opinion.
I’m absolutely furious about this API mess that’s been dragging on for months. It’s beyond frustrating that basic functionality has been broken for so long, with no end in sight. How can anyone seriously promote this as a leading, or even viable, option when the API is completely unusable week after week? It’s disappointing on every level, almost comical in how unreliable it is, and it feels like a total disregard for the users who depend on it. This kind of ongoing failure is hard to wrap my head around in 2025. Please, just get it sorted out already.
I have been using this model for two months, and the issue has been happening for two weeks. I don’t know whether it has been going on for months.
It is hilarious how bad this is. More than half of my requests simply return something like this:
{'candidates': [{'content': {'role': 'model'}, 'finishReason': 'STOP', 'index': 0}],
 'usageMetadata': {'promptTokenCount': 2477,
                   'totalTokenCount': 2496,
                   'promptTokensDetails': [{'modality': 'TEXT', 'tokenCount': 1961},
                                           {'modality': 'IMAGE', 'tokenCount': 516}],
                   'thoughtsTokenCount': 19},
 'modelVersion': 'gemini-2.5-pro',
 'responseId': 'Sh-oaKrbMqGqkdUPyJuduAQ'}
And no amount of retrying or changing the prompt fixes it. It even does this in two-turn chats with “Hello/How are you doing?”-type questions.
And they unironically call this “Production ready”…
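In case it’s useful, here’s the quick check I run on the raw response dict (the same shape as the example above) to catch these before parsing:

def is_empty_response(data: dict) -> bool:
    # True when the model "finished" but sent back no content parts,
    # i.e. candidates[0]['content'] is just {'role': 'model'} as above.
    candidates = data.get("candidates") or []
    if not candidates:
        return True
    return not (candidates[0].get("content") or {}).get("parts")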
I just found this post and a few others, and it looks like I am having the same issue with 2.5 Pro. Good to know it’s not just me, bad to know it’s an issue. I have posted about what looks like the same problem here: Gemini 2.5 Pro - Empty Response - Status 200
Yeah, a complete turn-off for Gemini.
Somebody (Mohamed, but I can’t find his post, so apologies to Mohamed and also credit to him) suggested using 2.5 Pro in Vertex AI instead of the standard Gemini API, and it worked. I ran multiple requests with no issue, then flipped back and forth between the standard API and Vertex on 2.5 Pro, and only the Vertex one worked.
Setting up Vertex was a little tricky, especially finding my way around the Google Cloud settings. Aside from setting up the service account for your project, you have to ensure the iam.gserviceaccount.com service account is set up and also ENABLED, which did not seem to happen by default. You also have to make sure that the “Vertex AI API” has the user permission set up, or assign the Vertex AI Administrator role, in the IAM roles. It was tricky. Cursor AI managed to make the necessary integration after half a dozen attempts, including one where it decided to put a super-low token limit in my request!
Vertex also has security enabled by default, which actually stops it from working out of the box, so while setting it up I did indeed hit this issue. There are a couple of videos out there that helped me set it up and fix the security issue.
It looks like Google does also charge a tiny, tiny amount per 1,000 tokens for using Vertex.
Anyway, it looks like this is a way to get 2.5 Pro working. Hope the above is helpful to those not too familiar with the Vertex and Cloud setup.
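For reference, once the service account and IAM roles are in place, the call itself is short. A minimal sketch using the google-genai SDK in Vertex mode (the project ID and region below are placeholders for your own):

from google import genai

# Assumes GOOGLE_APPLICATION_CREDENTIALS points at the service-account key
# and the Vertex AI API is enabled on the project.
client = genai.Client(vertexai=True,
                      project="your-project-id",   # placeholder
                      location="us-central1")      # placeholder

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Hello",
)
print(response.text)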
Switching to Pro 2.5 via Vertex AI worked perfectly for me — no more failures.
Appreciate the tip!
Subject: Bug Report: Gemini 2.5 Pro returns empty response via OpenAI-compatible API due to suspected silent tool call failure.
Model: gemini-2.5-pro
Environment: Accessed via an OpenAI-compatible API layer.
Problem Description:
When sending simple, open-ended prompts (e.g., “Hello”, “What should I do now?”), the API returns a response with an empty choices array, completion_tokens: 0, and a finish_reason of “stop”, but total_tokens shows a non-zero value.
Steps to Reproduce:
1. Make a standard /v1/chat/completions call to the model.
2. Use a simple, non-specific prompt.
3. Do not include the tool_choice parameter.
Workaround:
The issue is consistently resolved by adding “tool_choice”: “none” to the JSON request body. This forces the model to avoid using tools and generates a proper text response.
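For anyone who wants to reproduce the workaround, here is a minimal sketch with the openai SDK; the base URL below assumes Google’s own OpenAI-compatible endpoint, but other compatibility layers should look similar:

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_GEMINI_API_KEY",  # placeholder
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

completion = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[{"role": "user", "content": "Hello"}],
    tool_choice="none",  # without this, choices sometimes comes back empty
)
print(completion.choices[0].message.content)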
Hypothesis:
The behavior suggests that the model is attempting a default, silent tool call which fails, causing the generation process to terminate prematurely without an error message.