It is not possible to disable thinking in gemini-2.5-flash-preview-09-2025, even when using thinking_config=genai_types.ThinkingConfig(thinking_budget=0).
This only happens with the preview-09-2025 model.
Hello
Welcome to the forum!
Are you still running into this issue with thinking_budget=0 being ignored? If so, would you mind sharing a small snippet of reproducible code? It would really help in investigating this further.
Thanks
I’ve been battling this for some time now. I have a snippet from earlier, since I have reported this before.
from google import genai
from google.genai import types
import PIL.Image
from pydantic import BaseModel, Field
from typing import Optional

class StructuredOutputSchema(BaseModel):
    main_entry: str = Field(description="The main entry of the catalog card. The main entry is the text on the first line of the card, often underlined")
    bibliographic_information: str = Field(description="The full OCR text extracted from the image, including the main entry, excluding subject headings and shelf mark.")
    subject_headings: Optional[str] = Field(description="A list of subject headings found on the card. The subject headings are handwritten notes in the top left corner of the card.")
    shelf_mark: Optional[str] = Field(description="The shelf mark or call number of the book. The shelf mark or call number is a handwritten note in the top right corner of the card.")

API_KEY = "API_KEY_HERE"

generation_config = types.GenerateContentConfig(
    temperature=0.1,
    top_p=0.95,
    top_k=40,
    max_output_tokens=2000,
    response_mime_type="application/json",
    response_json_schema=StructuredOutputSchema.model_json_schema(),
    system_instruction="You are an OCR interpreter. Your task is to extract all text from the provided image of a library catalog card. Ensure that the extracted text is accurate and complete, preserving the original formatting as much as possible. Also, ensure that all characters are captured and returned correctly, even those outside the standard ASCII range. The card contains bibliographic information such as title, author, publication year, and location. The information can appear in various languages which may include special characters.",
    http_options={"timeout": 60000},
    thinking_config=types.ThinkingConfig(thinking_budget=0),
)

image_path = "003_00009.jpg"
# model = "gemini-2.5-flash"
model = "gemini-2.5-flash-preview-09-2025"

image = PIL.Image.open(image_path)
prompt = "Return the extracted text in JSON format according to the specified schema."
contents = [image, prompt]

client = genai.Client(api_key=API_KEY)
result = client.models.generate_content(model=model, contents=contents, config=generation_config)
print(result)
The referenced image is a scan of a library catalog card.
Response:
sdk_http_response=HttpResponse(
headers=<dict len=11>
) candidates=[Candidate(
content=Content(
parts=[
Part(
text="""{
"main_entry": "Achrelius, Daniel",
"bibliographic_information": "Achrelius, Daniel Memoria amplissimi viri Enevaldi Svenonii, s.s. theologiæ doctoris, professoris ejusdem facultatis primarii... solenni oratione, ab oblivione & tenebris vindicata, postridie ex-eqviarum. 8:o /Åbo/ 1689.",
"subject_headings": "<Svenonius, Enevald>",
"shelf_mark": "(Bn) Biogr. Sw."
}"""
),
],
role='model'
),
finish_reason=<FinishReason.STOP: 'STOP'>,
index=0
)] create_time=None model_version='gemini-2.5-flash-preview-09-2025' prompt_feedback=None response_id='l4pBabbPMb70xN8PyqfP2Q0' usage_metadata=GenerateContentResponseUsageMetadata(
candidates_token_count=135,
prompt_token_count=365,
prompt_tokens_details=[
ModalityTokenCount(
modality=<MediaModality.TEXT: 'TEXT'>,
token_count=107
),
ModalityTokenCount(
modality=<MediaModality.IMAGE: 'IMAGE'>,
token_count=258
),
],
thoughts_token_count=606,
total_token_count=1106
) automatic_function_calling_history=[] parsed={'main_entry': 'Achrelius, Daniel', 'bibliographic_information': 'Achrelius, Daniel Memoria amplissimi viri Enevaldi Svenonii, s.s. theologiæ doctoris, professoris ejusdem facultatis primarii... solenni oratione, ab oblivione & tenebris vindicata, postridie ex-eqviarum. 8:o /Åbo/ 1689.', 'subject_headings': '<Svenonius, Enevald>', 'shelf_mark': '(Bn) Biogr. Sw.'}
Note that thoughts_token_count is 606, even though thinking_budget was set to 0.
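For reference, this is roughly how I verify it programmatically (a minimal sketch; result is the response object from the snippet above, and I'm assuming thoughts_token_count can be None when no thinking occurred):

# Check the usage metadata of the response returned above.
# With thinking_budget=0 we would expect thoughts_token_count to be 0 or None.
usage = result.usage_metadata
thoughts = usage.thoughts_token_count or 0
print(f"prompt={usage.prompt_token_count}, candidates={usage.candidates_token_count}, "
      f"thoughts={thoughts}, total={usage.total_token_count}")
if thoughts > 0:
    print("Thinking was NOT disabled despite thinking_budget=0")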
A related issue, at least for me:
Switching to gemini-2.5-flash results in endless newline characters whenever the model tries to generate the letter “Å” (I have thousands of examples of this). I was told this was fixed by the structured output improvements in the 09-preview version. But, and this is what this thread is about, the preview version ignores thinking_budget.
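As a temporary guard against that gemini-2.5-flash failure mode, I check the raw text before parsing it; a rough sketch (the newline threshold is my own arbitrary choice, not anything from the SDK):

import json

# Detect the runaway-newline failure: a long run of consecutive newlines in the
# raw candidate text means the generation degenerated and should be retried.
raw = result.candidates[0].content.parts[0].text
if raw is None or "\n" * 20 in raw:
    raise ValueError("Degenerate output (repeated newlines); retry the request")
record = json.loads(raw)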