I’m referring to the Vertex API specifically, but I see the same `includeThoughts` option referenced in the Gemini API docs as well.
For reference: I’m using 2.5 Pro with `includeThoughts` set to false, and I don’t get a chain of thought in the output (a sketch of roughly how I’m calling it is below, after the questions). Since Gemini 2.5 Pro requires thinking either way, this raises some questions:
- Does having `includeThoughts` set to true affect the cost on its own, simply for the thought text being returned, or is the extra cost only incurred when those extra tokens are included back into the messages?
- If `includeThoughts` is false, is the model only reasoning over the "conclusions" in the context, vs. reasoning over "conclusions + CoT" in the context when it's true?
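For context, here's roughly the kind of call I mean. This is just a minimal sketch assuming the google-genai Python SDK against Vertex (where the REST field `includeThoughts` maps to `include_thoughts`); the project, location, and prompt are placeholders, not my actual setup.

```python
from google import genai
from google.genai import types

# Vertex AI backend; project and location are placeholders.
client = genai.Client(vertexai=True, project="my-project", location="us-central1")

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Explain the tradeoffs of B-trees vs. LSM-trees.",  # placeholder prompt
    config=types.GenerateContentConfig(
        # Same as includeThoughts=false in the REST request body.
        thinking_config=types.ThinkingConfig(include_thoughts=False),
    ),
)

# With include_thoughts=False I only see the final answer parts, no thought summaries.
for part in response.candidates[0].content.parts:
    print(part.text)
```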
My Vertex charges break down roughly as:
- Gemini 2.5 Pro Thinking Text Output - Predictions: ~70% of my total cost
- Gemini 2.5 Pro Text Input - Predictions: ~20%
- Gemini 2.5 Pro Text Output - Predictions: ~10%

and, like I said, `includeThoughts` is set to false.
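If it's useful for the billing question, this is how I'd expect to see thinking tokens show up even when no thoughts are returned. Another hedged sketch, assuming the same SDK and that `thoughts_token_count` is the usage field reporting reasoning tokens:

```python
from google.genai import types


def report_token_usage(response: types.GenerateContentResponse) -> None:
    """Print the token counts from a generate_content response."""
    usage = response.usage_metadata
    print("prompt tokens:  ", usage.prompt_token_count)
    print("output tokens:  ", usage.candidates_token_count)
    # Assumption: thoughts_token_count reports reasoning tokens. If it's nonzero
    # even with include_thoughts=False, the model is still thinking, and those
    # tokens are presumably what lands under the Thinking Text Output SKU.
    print("thinking tokens:", usage.thoughts_token_count)
```

Calling `report_token_usage(response)` on the response from the first sketch is how I'd check whether the Thinking Text Output charges line up with `includeThoughts` at all.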