The Gemini API behaves non-deterministically with the `gemini-2.5-pro` model: identical requests produce different outputs, even when a fixed `seed` and a constant `temperature` are provided.

This behavior has been reliably reproduced and violates the API’s core contract for deterministic generation, making the model unreliable for production use.

Steps to Reproduce:

  1. API Call: Send a request to the Gemini API (via the Google AI Studio paid tier).
  2. Model: gemini-2.5-pro
  3. Generation Config:
    • temperature: 0.1
    • thinking_budget: 256
    • seed: 42
    • response_mime_type: "application/json"
    • response_schema: list[str]
  4. Contents:
    • Prompt: The full prompt text is provided below.
    • Image: The image file is attached as IMG_701015.JPG.
  5. First Execution: Execute the API call. The request successfully returns the expected, accurate JSON output ([]).
  6. Second Execution: Execute the exact same API call again with no changes.
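
The steps above can be sketched as a small repeat-and-compare harness. This is only an illustration under stated assumptions: `call_model` is a hypothetical zero-argument function that would wrap the real request (prompt, image, and the generation config listed above) and return the raw response text, and the `GENERATION_CONFIG` dict layout is illustrative rather than any SDK's exact request shape. The stand-in at the bottom replays the two responses from this report so the harness runs without network access.

```python
from collections import Counter
from typing import Callable

# Values copied from the report; the dict layout itself is illustrative
# and is not claimed to match any particular SDK's request format.
MODEL = "gemini-2.5-pro"
GENERATION_CONFIG = {
    "temperature": 0.1,
    "thinking_budget": 256,
    "seed": 42,
    "response_mime_type": "application/json",
    "response_schema": list[str],
}


def count_outputs(call_model: Callable[[], str], runs: int = 2) -> Counter:
    """Call the model `runs` times and tally each distinct raw response."""
    return Counter(call_model() for _ in range(runs))


def is_repeatable(outputs: Counter) -> bool:
    """True when every run returned exactly the same text."""
    return len(outputs) <= 1


if __name__ == "__main__":
    # Hypothetical stand-in for the real API call: it replays the two
    # responses observed in this report instead of contacting the service.
    replay = iter(["[]", '["11"]'])
    tally = count_outputs(lambda: next(replay), runs=2)
    print(tally, "repeatable:", is_repeatable(tally))
```

Raising `runs` above 2 would give a rough idea of how often the outputs diverge, rather than only whether they do.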

Observed Result:
The second execution produces a different, incorrect JSON output (["11"]).

Expected Result:
The output of the first and second executions must be absolutely identical. The seed parameter must ensure a fully deterministic and repeatable outcome. The correct output for this specific image and prompt is [].
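
To rule out differences that are purely cosmetic (whitespace, key order), the two responses can be compared by parsed content rather than by raw text. The sketch below uses only the Python standard library; the helper names `canonical` and `same_result` are hypothetical and introduced here for illustration.

```python
import json


def canonical(raw: str) -> str:
    """Parse a JSON response and re-serialize it in one fixed form."""
    return json.dumps(json.loads(raw), sort_keys=True, separators=(",", ":"))


def same_result(first: str, second: str) -> bool:
    """Compare two responses by content, ignoring whitespace and key order."""
    return canonical(first) == canonical(second)


if __name__ == "__main__":
    # A formatting-only difference still counts as the same result.
    print(same_result("[]", "[ ]"))      # True
    # The two outputs from this report differ in content.
    print(same_result("[]", '["11"]'))   # False
```

By this check, the outputs observed here (`[]` and `["11"]`) differ in content, not just in formatting.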

Full Prompt Text:
You are a hyper-precise visual analysis system with a single function: to return a JSON array of motorcycle racing numbers that meet a strict, non-negotiable standard of quality.

To ensure 100% accuracy, you must follow a new, two-stage protocol. This protocol is absolute.

INTERNAL PROTOCOL (DO NOT OUTPUT)


STAGE 1: FORENSIC QUALITY VERDICT (Prerequisite Stage)

This is your first and most important task. For every potential number candidate on a validly oriented motorcycle, you must render a binary verdict.

  1. Isolate the Candidate Area: Look ONLY at the front number plate area.
  2. Ask the Critical Question: “Is there a numerical figure in this area that is perfectly sharp, with clear, unambiguous edges, free of significant motion blur or compression artifacts?”
  3. Render the Verdict: Based on the question above, your internal verdict for the candidate MUST be one of two options:
    • VERDICT: PASS (The number is of forensic quality, 100% readable without guessing).
    • VERDICT: FAIL (The number is blurry, indistinct, artifacted, or in any way ambiguous. Any doubt whatsoever means it is a FAIL).

This stage is absolute. If the verdict for a candidate is FAIL, it is immediately and permanently rejected. You will not proceed to Stage 2 for that candidate.


STAGE 2: DIGIT EXTRACTION (Conditional Stage)

You will only ever perform this stage if a candidate received a VERDICT: PASS in Stage 1.

  1. Extract Digits: For the candidate that passed, identify and record the digits.
  2. Final Check: Ensure the extracted digits are consistent with the high-quality image that was approved.

FINAL OUTPUT REQUIREMENT

Your entire output must be a single, valid JSON array of strings. It will contain ONLY the numbers from candidates that received a VERDICT: PASS in Stage 1 and were successfully extracted in Stage 2. If no candidates pass Stage 1, return an empty array []. Do not include any explanatory text, markdown, or any characters outside of the final JSON object.