How can I get Thinking Content from Gemini-2.5-pro or flash with OpenAI SDK

So, I've been working on a project for a long time, and I've used the OpenAI SDK throughout. Now I'm integrating Gemini models and want to include Gemini-2.5-pro in my application, but since my whole app revolves around the OpenAI SDK, I can't find the "reasoning/thinking" content in the responses of the Gemini-2.5 models, even with `reasoning_effort` set to `high`. Can anyone tell me whether I'm doing something wrong, or is Google late again?

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

async function main() {
    const completion = await openai.chat.completions.create({
        model: "gemini-2.5-pro",
        reasoning_effort: "high",
        messages: [
            { "role": "system", "content": "You are a helpful assistant." },
            { "role": "user", "content": "Hello!" }
        ],
        stream: true,
    });

    for await (const chunk of completion) {
        console.log(JSON.stringify(chunk));
    }
}

main();

Hi,
in your request you need to add the `extra_body` field, for example:

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Explain to me how AI works"}],
    extra_body={
      "extra_body": {
        "google": {
          "thinking_config": {
            "thinking_budget": 800,
            "include_thoughts": True
          }
        }
      }
    }
)

About the response: in stream mode the thinking content starts in the first chunk with the XML tag <thought> and ends, depending on how long the thought is, in a later chunk with the closing tag </thought>. The response follows after that.
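Since the <thought> tag can open in one chunk and close in a different one, a practical approach is to accumulate the streamed text and split it afterwards. Here is a minimal sketch of that idea; the chunk strings below are illustrative placeholders, not real API output:

```python
import re

def split_thoughts(full_text: str) -> tuple[str, str]:
    """Split accumulated streamed text into (thoughts, answer).

    Assumes the <thought>...</thought> convention described above.
    Because the tags may arrive in different chunks, this parses the
    joined text rather than individual chunks.
    """
    match = re.search(r"<thought>(.*?)</thought>", full_text, flags=re.DOTALL)
    if not match:
        # No thinking content found (e.g. include_thoughts was not set).
        return "", full_text
    thoughts = match.group(1).strip()
    # Everything outside the tag pair is the visible answer.
    answer = (full_text[:match.start()] + full_text[match.end():]).strip()
    return thoughts, answer

# Hypothetical chunk deltas, joined before parsing:
chunks = ["<thought>The user greets ", "me; reply politely.</thought>", "Hello! How can I help?"]
thoughts, answer = split_thoughts("".join(chunks))
```

In a real streaming loop you would append each chunk's `delta.content` to a buffer and call `split_thoughts` once the stream ends (or incrementally, once `</thought>` has been seen).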

Here is the thinking documentation for the OpenAI compatibility layer: OpenAI compatibility  |  Gemini API  |  Google AI for Developers

Ciao
