This behavior has been reliably reproduced and violates the API’s core contract for deterministic generation, making it unreliable for production use.
Steps to Reproduce:
- API Call: Make an API call using the Gemini API (via Google AI Studio paid tier).
- Model: gemini-2.5-pro
- Generation Config:
  - temperature: 0.1
  - thinking_budget: 256
  - seed: 42
  - response_mime_type: "application/json"
  - response_schema: list[str]
- Contents:
  - Prompt: The full prompt text is provided below.
  - Image: The image file is attached as IMG_701015.JPG.
- First Execution: Execute the API call. The request successfully returns the expected, accurate JSON output ([]).
- Second Execution: Execute the exact same API call again with no changes.
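For anyone trying to reproduce this, the request can be sketched roughly as follows. This is a minimal sketch that assumes the google-genai Python SDK and an API key available in the environment; the report does not say which client was actually used. The helper names (build_generation_config, run_once) are made up for illustration.

```python
MODEL = "gemini-2.5-pro"


def build_generation_config() -> dict:
    """Return the settings listed in the report as plain data."""
    return {
        "temperature": 0.1,
        "thinking_budget": 256,
        "seed": 42,
        "response_mime_type": "application/json",
        "response_schema": list[str],
    }


def run_once(prompt: str, image_path: str) -> str:
    """Send one request and return the raw response text.

    Assumption: the google-genai SDK is installed. It is imported lazily
    so that the config helper above works without it.
    """
    from google import genai
    from google.genai import types

    cfg = build_generation_config()
    client = genai.Client()  # picks up the API key from the environment

    with open(image_path, "rb") as f:
        image = types.Part.from_bytes(data=f.read(), mime_type="image/jpeg")

    response = client.models.generate_content(
        model=MODEL,
        contents=[prompt, image],
        config=types.GenerateContentConfig(
            temperature=cfg["temperature"],
            seed=cfg["seed"],
            response_mime_type=cfg["response_mime_type"],
            response_schema=cfg["response_schema"],
            thinking_config=types.ThinkingConfig(
                thinking_budget=cfg["thinking_budget"]
            ),
        ),
    )
    return response.text
```

Calling run_once twice with the same prompt and IMG_701015.JPG corresponds to the first and second execution described above.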
Observed Result:
The second execution produces a different, incorrect JSON output (["11"]).
Expected Result:
The output of the first and second executions must be absolutely identical. The seed parameter must ensure a fully deterministic and repeatable outcome. The correct output for this specific image and prompt is [].
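The comparison between runs can be automated with a small harness like the one below. It is only a sketch: the call argument stands for whatever function sends the request (hypothetical here), and responses are compared after JSON normalization so that whitespace differences are not counted as a mismatch.

```python
import json
from typing import Callable


def canonical(raw: str) -> str:
    """Normalize a JSON string so formatting differences are ignored."""
    return json.dumps(json.loads(raw), sort_keys=True, separators=(",", ":"))


def distinct_outputs(call: Callable[[], str], runs: int = 2) -> list[str]:
    """Invoke call() `runs` times and return the distinct normalized
    outputs in the order they were first seen."""
    seen: list[str] = []
    for _ in range(runs):
        out = canonical(call())
        if out not in seen:
            seen.append(out)
    return seen
```

If the seed behaved as expected, distinct_outputs would return a single element. The behavior reported here would give two: [] followed by ["11"].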
Full Prompt Text:
You are a hyper-precise visual analysis system with a single function: to return a JSON array of motorcycle racing numbers that meet a strict, non-negotiable standard of quality.
To ensure 100% accuracy, you must follow a new, two-stage protocol. This protocol is absolute.
INTERNAL PROTOCOL (DO NOT OUTPUT)
STAGE 1: FORENSIC QUALITY VERDICT (Prerequisite Stage)
This is your first and most important task. For every potential number candidate on a validly oriented motorcycle, you must render a binary verdict.
- Isolate the Candidate Area: Look ONLY at the front number plate area.
- Ask the Critical Question: “Is there a numerical figure in this area that is perfectly sharp, with clear, unambiguous edges, free of significant motion blur or compression artifacts?”
- Render the Verdict: Based on the question above, your internal verdict for the candidate MUST be one of two options:
VERDICT: PASS (The number is of forensic quality, 100% readable without guessing).
VERDICT: FAIL (The number is blurry, indistinct, artifacted, or in any way ambiguous. Any doubt whatsoever means it is a FAIL).
This stage is absolute. If the verdict for a candidate is FAIL, it is immediately and permanently rejected. You will not proceed to Stage 2 for that candidate.
STAGE 2: DIGIT EXTRACTION (Conditional Stage)
You will only ever perform this stage if a candidate received a VERDICT: PASS in Stage 1.
- Extract Digits: For the candidate that passed, identify and record the digits.
- Final Check: Ensure the extracted digits are consistent with the high-quality image that was approved.
FINAL OUTPUT REQUIREMENT
Your entire output must be a single, valid JSON array of strings. It will contain ONLY the numbers from candidates that received a VERDICT: PASS in Stage 1 and were successfully extracted in Stage 2. If no candidates pass Stage 1, return an empty array []. Do not include any explanatory text, markdown, or any characters outside of the final JSON object.
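(End of prompt text. The following is not part of the prompt.) The output requirement above can also be checked on the client side before comparing runs. A minimal sketch, with is_valid_output as an illustrative name:

```python
import json


def is_valid_output(raw: str) -> bool:
    """True only if raw is a JSON array whose elements are all strings."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(data, list) and all(isinstance(x, str) for x in data)
```

Both outputs in this report, [] and ["11"], pass this check, so the problem is that the results differ between runs, not that the format is malformed.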