Thinking ate all the tokens and hit MAX_TOKENS


Is this a bug, or can this happen? The thinking phase consumed the entire output budget and the request finished with `MAX_TOKENS`, leaving no content parts:

```json
{
  "candidates": [
    {
      "content": {
        "parts": [],
        "role": "model"
      },
      "finishReason": "MAX_TOKENS",
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE",
          "blocked": null
        },
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE",
          "blocked": null
        },
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE",
          "blocked": null
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "NEGLIGIBLE",
          "blocked": null
        }
      ],
      "citationMetadata": {
        "citationSources": []
      },
      "tokenCount": null,
      "index": 0,
      "avgLogprobs": null,
      "groundingAttributions": [],
      "groundingMetadata": null,
      "logprobsResult": null,
      "urlRetrievalMetadata": null
    }
  ],
  "promptFeedback": null,
  "usageMetadata": {
    "promptTokenCount": 17759,
    "candidatesTokenCount": null,
    "totalTokenCount": 83294,
    "cachedContentTokenCount": null,
    "toolUsePromptTokenCount": null,
    "thoughtsTokenCount": 65535,
    "promptTokensDetails": [
      {
        "tokenCount": 17759,
        "modality": "TEXT"
      }
    ],
    "cacheTokensDetails": [],
    "candidatesTokensDetails": [],
    "toolUsePromptTokensDetails": []
  },
  "modelVersion": "gemini-2.5-pro"
}
```

Hi @NeonByteNomad, thanks for reaching out to us.

For Gemini 2.5 Pro, you can explicitly set a lower `thinkingBudget` (for example, `thinking_budget=1024`) in your request configuration to prevent the model's internal reasoning from consuming all of the output tokens.
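A minimal sketch with the `google-genai` Python SDK (the prompt is a placeholder, and a valid API key is assumed to be set in the `GEMINI_API_KEY` environment variable):

```python
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Summarize the attached document.",  # placeholder prompt
    config=types.GenerateContentConfig(
        # Cap internal reasoning at 1024 tokens so the thinking phase
        # cannot consume the entire output budget before any answer
        # text is produced.
        thinking_config=types.ThinkingConfig(thinking_budget=1024),
    ),
)
print(response.text)
```

With the budget capped, `usageMetadata.thoughtsTokenCount` should stay at or below roughly 1024, leaving the rest of the output limit for the actual answer.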

Additionally, you can try Gemini 3 Pro, which has a `thinking_level` parameter to control the maximum depth of the model's internal reasoning process. You can set it to `low` to minimize token usage.