Hello,
I am building a consumer-facing product that relies on several LLM calls. I really like the intelligence/latency trade-off of Gemini under good conditions. However, every day at around 9am PST, the latency of the models I use (gemini 3 flash and gemini 3.1 flash lite) more than triples, so I have to switch to another provider. The degradation lasts several hours. This has been going on for a while but has worsened in the last week.
Question for the Gemini API team: are you aware of this, and are you planning to fix it? Or is the non-Vertex API meant only for testing and other non-production use cases, with Vertex being the recommended route for production? I am not 100% sure that switching would help either, since I have also seen the Vertex API degrade at around the same time (although not by as much).