I am trying out gemini-2.5-pro on the Gemini API instead of Google Vertex AI, because the latter was producing too many unpredictable resource-exhausted errors.
Usually my requests go through (apart from the occasional 503, so I guess this endpoint is also not entirely free of resource-exhaustion issues). However, one of the answers from the server was:
```
Gemini API error: {
  "error": {
    "code": 400,
    "message": "Unable to submit request because the input token count is 106244 but model only supports up to 65536. Reduce the input token count and try again. You can also use the CountTokens API to calculate prompt token count and billable characters. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models",
    "status": "INVALID_ARGUMENT"
  }
}
```
But none of the models has such a small input token limit (https://ai.google.dev/gemini-api/docs/models); all the models listed there have an output limit of 65,536 tokens, not an input limit.
What could be wrong?
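For reference, this is roughly how I plan to check the prompt size via the CountTokens API that the error message suggests. This is a minimal sketch of my assumption of the REST shape (public Gemini API `:countTokens` endpoint, `contents` structured like a `generateContent` call); I haven't confirmed it is the exact request my client library sends:

```python
import json

# Sketch (assumed REST shape, not verified against my client library):
# build a countTokens request so the prompt's token count can be checked
# before sending the real generateContent request. Actually sending it
# would be something like:
#   requests.post(ENDPOINT, params={"key": API_KEY}, json=payload)
MODEL = "gemini-2.5-pro"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:countTokens"
)

# Same "contents" structure as a generateContent call.
payload = {"contents": [{"parts": [{"text": "your full prompt here"}]}]}
body = json.dumps(payload)
print(ENDPOINT)
print(body)
```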