API calls to Gemini 2.5 Pro and 1.5 Pro stall when the input exceeds approximately 15K tokens per minute. That is far below the tokens-per-minute limits listed for these models on the "Rate limits | Gemini API | Google AI for Developers" page. No error or exception is returned; the code just keeps waiting for the Gemini API to return a response.
The code is running in a Docker environment on EC2 (t3.large).
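Because the call never returns or raises, one way to make the stall visible is to wrap the request in a client-side time limit. The following is only a minimal sketch using the Python standard library. `call_with_timeout`, `CallStalled`, and `fake_gemini_call` are hypothetical names introduced for illustration, and `fake_gemini_call` is a stand-in that merely sleeps; it is not part of any Gemini SDK.

```python
import threading
import time


class CallStalled(Exception):
    """Raised when the wrapped call does not finish within the time limit."""


def call_with_timeout(fn, timeout_s, *args, **kwargs):
    """Run fn(*args, **kwargs) in a daemon thread, waiting at most timeout_s seconds."""
    outcome = {}

    def worker():
        try:
            outcome["value"] = fn(*args, **kwargs)
        except Exception as exc:  # keep the real error so the caller sees it
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    started = time.monotonic()
    thread.start()
    thread.join(timeout_s)
    if thread.is_alive():
        # The call is still blocked. The daemon thread is abandoned, not killed.
        elapsed = time.monotonic() - started
        raise CallStalled(f"no response after {elapsed:.1f}s")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


def fake_gemini_call(prompt, delay_s):
    """Hypothetical placeholder for the real API request; it only sleeps."""
    time.sleep(delay_s)
    return f"echo: {prompt}"
```

In real use, the actual API request would replace `fake_gemini_call`, and logging the prompt's token count alongside each `CallStalled` would help confirm whether the stalls really begin around 15K tokens per minute. Note that the timed-out thread keeps running in the background, so this only detects the stall and does not cancel the underlying request.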