Issue
The Gemini API’s OpenAI-compatible layer does not expose reasoning_tokens in the usage field, which makes it difficult for applications to track or control reasoning spend.
To reproduce, make any request to a reasoning model (e.g. Gemini 2.5 Flash) via the OpenAI-compatible layer.
The usage field does not break out reasoning tokens; they are only reflected indirectly in total_tokens (18 completion + 15 prompt = 33, yet the total is 175):
"usage": {
"completion_tokens": 18,
"prompt_tokens": 15,
"total_tokens": 175
}
Compare this to a response from the same model on Vertex AI, where reasoning tokens are reported explicitly and the counts add up (21 + 78 + 14 = 113):
"usage": {
"completion_tokens": 21,
"completion_tokens_details": {
"reasoning_tokens": 78
},
"extra_properties": {
"google": {
"traffic_type": "ON_DEMAND"
}
},
"prompt_tokens": 14,
"total_tokens": 113
}
Suggested solution
Expose reasoning tokens under completion_tokens_details.reasoning_tokens, matching the convention used by Vertex AI (and by the OpenAI API itself).