Same, very frequent 499 occurrences on the 2.0 Flash multimodal API!
On my side I implemented a retry for 499 errors, but sometimes I get a 503 instead. Are the two related?
We also started getting a mix of 499s and 503s (service unavailable). Google support relayed an answer from the product team saying that the 499 means the server is exhausted and that it should be treated like a 429 error. I'm surprised that this happened so suddenly without us increasing throughput. We're already sending requests across regions.
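For anyone wanting to act on the "treat 499 like 429" advice, here is a minimal sketch of retrying with exponential backoff and jitter. It is not tied to any particular SDK: `ApiError` is a hypothetical stand-in for whatever exception your client raises, and the set of retryable codes and the delay values are assumptions, not anything Google has published.

```python
import random
import time

# Assumption: codes worth retrying, based on what people report in this thread.
RETRYABLE_CODES = {429, 499, 503, 504}


class ApiError(Exception):
    """Hypothetical stand-in for the exception your API client raises."""

    def __init__(self, code, message=""):
        super().__init__(f"{code}: {message}")
        self.code = code


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying retryable ApiErrors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except ApiError as exc:
            # Give up on non-retryable codes or once attempts are exhausted.
            if exc.code not in RETRYABLE_CODES or attempt == max_attempts:
                raise
            # 1s, 2s, 4s, ... capped at max_delay, plus up to 50% jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
```

You would wrap your actual request in a zero-argument function and pass it as `fn`. The jitter is there so that many clients hitting the same overloaded backend do not all retry at the same instant.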
@Logan_Kilpatrick can you confirm that this is the case and we should expect that this error rate is normal?
Thank you @Logan_Kilpatrick - Any update on this? I keep randomly receiving it.
Hey folks, Team is actively working on it. Will be fixed soon.
How can a bug like this stay open for 14 days? We don't need a new model every other week so you can say "we beat the benchmarks".
We need stable models. I have now started receiving empty answers, with no code changes. It's just unusable.
Hey guys, I'm running an experiment for a school paper and I've included Gemini models via the API. In total I needed to generate 128 results with Gemini 3 Pro and 128 results with Gemini 3 Flash. I'm on tier 1 billing and was able to generate most of the responses throughout yesterday afternoon, but at one point yesterday the API started responding with either 499 operation cancelled or 504 deadline exceeded. It was happening intermittently, with a repeated request usually getting a proper response, but now, with only 16 results remaining, I am not able to finish the data set.
Naturally, during testing of my script and tweaking of the variables needed for the experiment, the total number of requests was much higher than 256, but I checked the dashboard and I have not reached any of the limits, neither RPM, TPM, nor RPD (although I am close, with 225/250 on RPD on Gemini 3 Pro).
As I am writing this, I'm trying the 4 remaining results for Gemini 3 Pro and only 1 came back successful, 2 returned 504 and 1 returned 499. With Gemini 3 Flash I am not able to get a successful response at all. The timeout is set to 1 minute. Is there something I can do from my end?
Two things: the issue is still happening as of 2/24/2026, and I noticed you still get billed for API usage even if the request fails with a 499. I'm on Tier 3 billing.
Error: API Error: {"error":{"message":"{\n "error": {\n "code": 499,\n "message": "The
operation was cancelled.",\n "status": "CANCELLED"\n }\n}\n","code":499,"status":"Client
Closed Request"}}]
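If your client only hands you the error as a string like the one above, a rough way to decide whether to retry is to pull the status code out of the text. This is only a sketch: `extract_status_code` is a hypothetical helper, and it assumes the payload contains a `"code": NNN` field (escaped or not) as in the message pasted here. If your SDK exposes the code as an attribute on the exception, use that instead.

```python
import re

# Matches "code": 499 as well as the escaped form \"code\": 499.
_CODE_RE = re.compile(r'\\?"code\\?"\s*:\s*(\d{3})')


def extract_status_code(error_text):
    """Return the first three-digit status code found in error_text, or None."""
    match = _CODE_RE.search(error_text)
    return int(match.group(1)) if match else None
```

The regex approach is deliberately forgiving, because the inner JSON is embedded in a string and may not parse cleanly once it has been logged or copied.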
We are also still seeing this (especially on newer models). Someone else recently opened an issue on GitHub that suggests we're not the only ones: HTTP 499 CANCELLED should be in the default retryable status codes · Issue #2506 · googleapis/python-genai · GitHub