```
Error: {"error":{"code":400,"message":"The input token count (135538) exceeds the maximum number of tokens allowed (131072).","status":"INVALID_ARGUMENT"}}
```
So odd. All of the docs boast about and point me to the 1M context window, but on input I am getting 400 errors. 2.5 Pro works just fine.
I thought I had just resolved most of my issues with 3.1 Preview. Looks like I will be degrading this model on my platform.
The only similar errors that I see are from 2.5 back in October.
Is anybody else seeing this error? Would love to know I am not alone.
I have a custom limit on Vertex AI, so other rate limits are not the issue here.
I just recently added the header for thinking-level: medium, but that was only to speed up the model. It was working fine last week on these high-input tasks.
EDIT: Looks like I am not the only one
I have a feeling the github comment may be correct. gemini-3.1-flash-live-preview has a token limit of 131,072.
One interesting thing I’ve noticed is the input token size is a lot lower than I’d expect. I’m wondering if some kind of pre-processing step using gemini-3.1-flash-live-preview is being used? Total guess, of course.
My workaround is reverting to Gemini 2.5 Pro for the time being, unfortunately.
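In case it helps anyone, here is a rough sketch of the fallback routing I'm using until this is fixed. Everything here is an assumption: the 131,072 figure is the undocumented limit suggested by the GitHub comment, the model names are illustrative, and `pick_model` expects you to have already counted input tokens with whatever counter your SDK provides.

```python
# Sketch: route oversized prompts to a larger-context model instead of
# letting the API return a 400 INVALID_ARGUMENT.
# Assumptions: the 131,072 limit is from the GitHub comment (not official
# docs), and the model names below are illustrative placeholders.

PREVIEW_MODEL = "gemini-3.1-pro-preview"  # hypothetical preview model name
FALLBACK_MODEL = "gemini-2.5-pro"         # known-good fallback from this thread
PREVIEW_INPUT_LIMIT = 131_072             # observed hard limit, undocumented


def pick_model(input_token_count: int) -> str:
    """Return the preview model unless the prompt would trip the 400 error."""
    if input_token_count >= PREVIEW_INPUT_LIMIT:
        return FALLBACK_MODEL
    return PREVIEW_MODEL


# The failing request from the error above (135,538 input tokens)
# gets routed to the fallback model instead of erroring out:
print(pick_model(135_538))  # -> gemini-2.5-pro
print(pick_model(50_000))   # -> gemini-3.1-pro-preview
```

Obviously this just papers over the limit rather than fixing it, but it keeps high-input jobs running.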
Yes, this is a painful issue. 3.0 Pro Preview was fine; it's 3.1 that caused the problem. Right now I'm falling back to 3.0 Flash instead of 2.5 Pro. Do we know when the next fix release cycle is expected?