Gemini 2.5 Pro and 1.5 Pro APIs take forever to respond when input tokens/min exceed 15K to 16K

15K is far lower than the tokens-per-minute limit listed for Gemini 2.5 Pro and 1.5 Pro on the "Rate limits | Gemini API | Google AI for Developers" page, yet API calls stall once the input exceeds roughly 15K tokens per minute. No error or exception is returned; the code just keeps waiting for the Gemini API to return a response.

The code is running in a Docker environment on EC2 (t3.large).
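For anyone trying to reproduce this, here is a rough sketch of how the stall could be made visible instead of hanging forever. It is only an illustration under assumptions: `TokenWindow` and `call_with_timeout` are hypothetical helper names, not part of any Gemini SDK, and `fn` stands in for whatever client call actually sends the request. It uses only the Python standard library.

```python
import concurrent.futures
import time
from collections import deque


class TokenWindow:
    """Hypothetical helper: tracks input tokens sent in a sliding window."""

    def __init__(self, window_s=60.0):
        self.window_s = window_s
        self.events = deque()  # (timestamp, tokens) pairs, oldest first

    def add(self, tokens, now=None):
        # Record that `tokens` input tokens were sent at time `now`.
        now = time.monotonic() if now is None else now
        self.events.append((now, tokens))

    def total(self, now=None):
        # Drop entries older than the window, then sum what remains.
        now = time.monotonic() if now is None else now
        while self.events and now - self.events[0][0] > self.window_s:
            self.events.popleft()
        return sum(tokens for _, tokens in self.events)


def call_with_timeout(fn, timeout_s, *args, **kwargs):
    """Hypothetical helper: run fn in a worker thread and stop waiting
    after timeout_s seconds.

    Raises concurrent.futures.TimeoutError if fn has not returned in time.
    The worker thread itself is not killed; this only unblocks the caller.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    finally:
        # Do not block here waiting for a stalled call to finish.
        pool.shutdown(wait=False)
```

The idea would be to call `window.add(n_input_tokens)` before each request, wrap the request in `call_with_timeout`, and log `window.total()` whenever the timeout fires, to check whether the stalls really line up with roughly 15K input tokens per minute.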


Hey @shivanis, just getting back to you. Are you still seeing the same issue?