Gemini 3 Preview (OpenAI-compatible) rejects reasoning_effort: "medium" — works with low and high

Starting today, requests to Gemini 3 Preview using the OpenAI compatibility interface fail when reasoning_effort (thinking level) is set to "medium". The same request succeeds with "low" and "high".

This appears to be a model capability/config regression or a validation bug specific to the "medium" level.

Product / API: Gemini 3 Preview via OpenAI compatibility layer (Chat Completions API)

Issue type: Regression / breaking change

Note: gemini-3-flash-preview is not affected

Date first observed: 2025-12-19 (worked until 2025-12-16)


Steps to reproduce

  1. Send a request to Gemini 3 Preview (OpenAI compatibility endpoint) with:

    • model: "gemini-3-pro-preview" (or your exact Gemini 3 preview model id)

    • reasoning_effort: "medium"

    • Any basic prompt/messages

  2. Observe the error response.

  3. Repeat the same request with reasoning_effort: "low" → succeeds.

  4. Repeat with reasoning_effort: "high" → succeeds.


Minimal repro (example payload)

POST https://generativelanguage.googleapis.com/v1beta/openai/v1/chat/completions

{
  "model": "gemini-3-pro-preview",
  "reasoning_effort": "medium",
  "messages": [
    { "role": "user", "content": "Say hello." }
  ]
}
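
The same repro as a script, a sketch using the OpenAI Python SDK pointed at the documented compatibility base URL (an API key in the GEMINI_API_KEY environment variable is assumed):

# Sweep all three reasoning_effort levels against the Pro preview;
# at the time of writing, only "medium" is rejected.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

for effort in ("low", "medium", "high"):
    try:
        resp = client.chat.completions.create(
            model="gemini-3-pro-preview",
            reasoning_effort=effort,
            messages=[{"role": "user", "content": "Say hello."}],
        )
        print(effort, "-> OK:", resp.choices[0].message.content)
    except Exception as exc:
        print(effort, "-> ERROR:", exc)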


Actual result

Request fails with:

Thinking level MEDIUM is not supported for this model. Please retry with other thinking level

Hi @csekas
Thank you for reaching out!

Yes, you are correct. I have tested this on my end using "reasoning_effort": "medium" and encountered the same error. According to the Gemini documentation, the Pro model supports only the low and high reasoning levels, whereas the Flash model supports low, medium, and high.
I recommend going through this document, as it provides clearer and more detailed information.
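
As a quick check of that documented difference, the same "medium" request goes through against the Flash preview (same sketch setup as the repro above; GEMINI_API_KEY assumed):

# Identical "medium" request, but against the Flash preview, which
# the docs list as supporting low, medium, and high.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)
resp = client.chat.completions.create(
    model="gemini-3-flash-preview",
    reasoning_effort="medium",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)  # succeeds; the Pro preview errors instead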

This looks like a bug presented as a feature. I mean, the Gemini API has a thinkingBudget field that accepts a numeric value. So how is it possible not to support the medium thinking level with an 8,192-token thinking budget?
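
For reference, this is roughly how the native API exposes that field via the google-genai Python SDK (a sketch; whether the Pro preview honors an arbitrary numeric budget is exactly what is in question here):

# Native Gemini API call with an explicit numeric thinking budget,
# which is what makes the "no medium level" restriction surprising.
import os
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="Say hello.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=8192),
    ),
)
print(response.text)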

In any case, thank you for taking the time, @Shivam_Singh2.

The problem is that each model has a different level of adherence to these kinds of parameters. Initially, we wanted to provide full flexibility via the thinking budget, but it didn't work super consistently before, so we opted to go with thinking levels. In the case of Pro, we did a bunch of eval work, and medium didn't produce good results, nor was it consistent. I agree in general that this is basically a model bug at the moment. Hoping it gets fixed in the next rev of the model.


So this is a model constraint, not an API one.
Makes sense! Thank you, Logan.