I have been using the 2.5 Pro API endpoint for well over a month now with no issues. Around 3 days ago it stopped working: it returns a status 200, but the content is totally empty. If I ask it what 1+1 is, it'll give me the answer, but anything much beyond that comes back empty. I have asked it why it is doing this and sometimes I get an answer, with it suggesting it was filtering because the prompt contained PII. However, even a prompt with no PII at all, literally just a question about why I might be getting empty responses, comes back empty.
However, Gemini 2.5 Flash is working fine with the same API keys. It is a free account for testing, so I have definitely not hit a usage limit, and to be 100% sure I set up a new account and got the same issue.
I have attached a screenshot of the response; it's not giving me any clues. Anybody have any ideas?
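In case it helps anyone else debug this, here is roughly the check I have been running (a minimal sketch using the google-generativeai Python SDK; the API key, model ID and prompt below are placeholders, not my real values). Even when the text comes back empty, the prompt feedback and each candidate's finish reason usually hint at whether something was blocked:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")          # same key that works for 2.5 Flash
model = genai.GenerativeModel("gemini-2.5-pro")  # placeholder model ID

resp = model.generate_content("Summarise this paragraph: ...")

# These fields usually explain why a 200 response carries no text (safety block, etc.)
print("prompt_feedback:", resp.prompt_feedback)
for cand in resp.candidates:
    print("finish_reason:", cand.finish_reason)
    print("safety_ratings:", cand.safety_ratings)

# resp.text raises a ValueError when the candidate has no parts, so guard it
if resp.candidates and resp.candidates[0].content.parts:
    print(resp.text)
else:
    print("(empty response body)")
```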
For a few days we had 10-20% failures per day (now it is a complete 100% failure), and I thought there was some issue with my prompt. Oh boy, then I realized there are a lot of people complaining.
Oh, that is really good to know! I never thought of that. Now I just need to figure out how to integrate Vertex instead of my standard Gemini API endpoint. Is it the same process, just changing the endpoint?
Yes, that worked. I have a feeling that people are misusing the Gemini API, and Gemini probably can't handle the workload. Vertex comes under the paid tier, so they may have fixed it there first.
@Mohamed_Amine I still see some failures with Vertex AI. Do you see the issue completely gone for your use case? I am just curious because I am processing audio files and the issue still happens.
No failures on my part, it works perfectly on Vertex. I have been coding for like 4 hours with no errors, but I noticed that Gemini does make a lot of stupid mistakes that it didn't make when I was on the Gemini API, like failed edits and wrong code in general.
Not working for me. And I noticed something when using Vertex AI: I think it's more expensive than the Gemini API or something, because I swear any little task I run on Vertex gets billed like $2, whereas on the Gemini API I rarely get billed $2.
After Mohamed's suggestion of using 2.5 Pro in Vertex AI instead of the standard API endpoint, I tried it and it worked. I ran multiple requests with no issue, then flipped back and forth between the standard endpoint and Vertex on 2.5 Pro, and only the Vertex one worked.
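For anyone wanting to reproduce the flip test, this is roughly what it looks like in Python (a sketch only; the API key, model ID, project and region are placeholders, and the Vertex side assumes you have already authenticated to Google Cloud):

```python
import google.generativeai as genai
import vertexai
from vertexai.generative_models import GenerativeModel as VertexModel

PROMPT = "Summarise this paragraph: ..."

# Path 1: standard Gemini API endpoint (API key)
genai.configure(api_key="YOUR_API_KEY")
try:
    print("Gemini API:", genai.GenerativeModel("gemini-2.5-pro").generate_content(PROMPT).text)
except ValueError as err:
    print("Gemini API returned no text:", err)

# Path 2: the same model through Vertex AI (Google Cloud project credentials)
vertexai.init(project="my-gcp-project", location="us-central1")
print("Vertex AI:", VertexModel("gemini-2.5-pro").generate_content(PROMPT).text)
```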
Setting up Vertex was a little tricky, especially finding my way around the Google Cloud settings. Aside from setting up the service account for your project, you have to ensure the IAM service account (the gserviceaccount) is set up and also ENABLED, which did not seem to happen by default. You also have to make sure the service account has the Vertex AI User role (or Vertex AI Administrator) in the IAM roles so it can use the "Vertex AI API". It was tricky. Cursor AI managed to make the necessary integration after half a dozen attempts, including it deciding to put a super low token limit in my request!
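For reference, this is roughly the shape of the wiring it ended up with (a sketch only; the key path, project ID, region and model ID are placeholders for my own values):

```python
from google.oauth2 import service_account
import vertexai
from vertexai.generative_models import GenerativeModel

# Service account key downloaded from the Cloud console; the account needs the
# Vertex AI User role (roles/aiplatform.user) and the Vertex AI API enabled.
creds = service_account.Credentials.from_service_account_file(
    "/path/to/service-account-key.json",
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

vertexai.init(project="my-gcp-project", location="us-central1", credentials=creds)

model = GenerativeModel("gemini-2.5-pro")
print(model.generate_content("Sanity check: reply with OK.").text)
```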
It also has security enabled by default that actually stops it working out of the box, so while setting it up I did indeed hit this issue. Here are a couple of videos that helped me set it up and fix the security issue.
It looks like Google do also charge a tiny, tiny amount per 1,000 tokens for using Vertex.
Anyway, it looks like this is a way to get 2.5 Pro working. Hope the above is helpful to those not too familiar with the Vertex and Cloud setup.
I have been using Vertex since my post with no complaints from my small group of end-user testers. I do have a failover to another LLM if it fails, so I will double-check the logs tomorrow.