I am trying out gemini-2.5-pro on the Gemini API instead of Google Vertex AI, because Vertex AI produces too many unpredictable resource-exhausted errors.
Usually my requests pass (with the exception of some 503 errors, so I guess this endpoint is also not free of resource-exhaustion issues). However, one of the responses from the server was:
Gemini API error: {
"error": {
"code": 400,
"message": "Unable to submit request because the input token count is 106244 but model only supports up to 65536. Reduce the input token count and try again. You can also use the CountTokens API to calculate prompt token count and billable characters. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models",
"status": "INVALID_ARGUMENT"
}
}
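The error message suggests using the CountTokens API to check the prompt size before sending. A minimal sketch of that pre-check, assuming the `google-genai` Python SDK (`pip install google-genai`) and a `GEMINI_API_KEY` environment variable; the 65,536 limit is the one reported by the error above, not an official constant:

```python
import os

INPUT_TOKEN_LIMIT = 65_536  # limit reported by the 400 error in this thread

def fits_input_limit(token_count: int, limit: int = INPUT_TOKEN_LIMIT) -> bool:
    """True if a prompt of `token_count` tokens fits the model's input window."""
    return token_count <= limit

# Live check against the API (skipped when no key is configured).
if os.environ.get("GEMINI_API_KEY"):
    from google import genai

    client = genai.Client()  # picks up GEMINI_API_KEY from the environment
    resp = client.models.count_tokens(
        model="gemini-2.5-pro",
        contents="…your prompt here…",
    )
    if not fits_input_limit(resp.total_tokens):
        print(f"Prompt is {resp.total_tokens} tokens; trim it before sending.")
```

Running this before `generate_content` would have flagged the 106,244-token prompt up front instead of burning a failed request.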
I am also getting the same error when using gemini-2.5-pro via Vertex AI (paid plan). However, I am not getting the error when using gemini-2.5-flash with the exact same input tokens.
I’m also hitting this error when using video understanding. I assumed the error message was just miswritten, but reducing the FPS of the input (and thus the input token count) made the error go away.
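The FPS workaround above can be sketched as follows. This is an illustrative sketch, assuming the `google-genai` Python SDK, its `VideoMetadata.fps` sampling option, and a rough per-frame cost of 258 tokens (an assumed figure for estimation only); the `fps=0.5` value and the file URI placeholder are examples, not values from this thread:

```python
import math
import os

def estimate_video_tokens(duration_s: float, fps: float,
                          tokens_per_frame: int = 258) -> int:
    """Rough estimate: number of sampled frames times an assumed per-frame cost."""
    return math.ceil(duration_s * fps) * tokens_per_frame

# Live request with a lowered sampling rate (skipped when no key is configured).
if os.environ.get("GEMINI_API_KEY"):
    from google import genai
    from google.genai import types

    client = genai.Client()
    resp = client.models.generate_content(
        model="gemini-2.5-pro",
        contents=types.Content(parts=[
            types.Part(
                file_data=types.FileData(file_uri="…uploaded video URI…"),
                video_metadata=types.VideoMetadata(fps=0.5),  # sample fewer frames
            ),
            types.Part(text="Describe this video."),
        ]),
    )
    print(resp.text)
```

Halving the FPS roughly halves the frame-derived token count, which matches the observation that lowering FPS made the 400 error go away.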
We were unable to reproduce the error using gemini-2.5-pro or any other model. According to the documentation, gemini-2.5-flash-image has an input token limit of 65,536. Could you please confirm if gemini-2.5-flash-image was inadvertently used instead of gemini-2.5-pro?
If you continue to experience this issue, please provide the following details to assist us in our investigation:
Complete Error Message
Model name
Sample code snippet with prompts used to reproduce the error
@Siddharth_Naik Yes, I was using gemini-2.5-pro. It is exactly the same code that normally works and only randomly threw this message.
I have already provided the exact response message and the model, but I can pull an exact example of a failing request from the server logs for you again:
model: gemini-2.5-pro
response:
Failed to analyze video with Gemini API. HTTP code: 400. {
"error": {
"code": 400,
"message": "Unable to submit request because the input token count is 106244 but model only supports up to 65536. Reduce the input token count and try again. You can also use the CountTokens API to calculate prompt token count and billable characters. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models",
"status": "INVALID_ARGUMENT"
}
}