We haven’t changed our prompt in several weeks, but yesterday we started seeing JSON responses that match the JSON schema but contain values like this:
{
  "description": "string" // literally, the value "string"
}
Or something like this:
{
  "overallAIFeedback": "Your response is a JSON object with three keys: thinking, overallAIFeedback, additionalFeedback, and isCorrect.",
  "additionalFeedback": "string",
  "thinking": "string", // or some quote from the prompt
  "isCorrect": false
}
This is happening in over 50% of our calls now. We have tried both the Gemini SDK and Vercel's AI SDK, with no change in the result.
Hello,
Welcome to the Forum!
To start, I would recommend going through the Structured Output documentation and checking whether your code is aligned with the guidelines.
It would also help to verify if the issue is observed across different Gemini models or only with a specific one.
If everything seems correct on that front, I would need a few more details about your issue (such as relevant code snippets, prompts, configuration, and the model used) so that I can try to reproduce and analyze it.
We are following the guidelines, yes.
This is especially confusing because we had not changed or modified this prompt anywhere near the time we started seeing this output, and it had worked reliably for the past 3 months (until now). Also, the roughly 50% of calls that do work behave exactly as before, with no issues. Some of these are simple retries of identical calls. We are also experiencing a higher rate of 5xx internal errors.
We have not tried other models since experiencing this issue, as our earlier testing showed that Flash and even 2.0 Pro could not interpret the images accurately enough.
If you have a way for me to share privately, I can share the prompt, the JSON schema, and an example image URL. I may even have a full API payload example.
Model: gemini-2.5-pro
Thinking budget: 1024
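For reference, the structured-output configuration looks roughly like this. This is a plain-dict sketch of our schema and config, not the full payload; the exact wrapper types and where the thinking budget is set depend on which SDK you use:

```python
# Sketch of the JSON schema we pass as the response schema in the SDK call.
# (Plain-dict form; the actual wrapper type depends on the SDK in use.)
RESPONSE_SCHEMA = {
    "type": "object",
    "properties": {
        "thinking": {"type": "string"},
        "isCorrect": {"type": "boolean"},
        "overallAIFeedback": {"type": "string"},
        "additionalFeedback": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "content": {"type": "string"},
                    "sectionTitle": {"type": "string"},
                },
                "required": ["content", "sectionTitle"],
            },
        },
    },
    "required": ["thinking", "isCorrect", "overallAIFeedback", "additionalFeedback"],
}

# Generation settings, in dict form for illustration only; key names and
# nesting vary between the Gemini SDK and the Vercel AI SDK.
GENERATION_CONFIG = {
    "response_mime_type": "application/json",
    "response_schema": RESPONSE_SCHEMA,
    "thinking_budget": 1024,  # matches the budget noted above
}
```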
Here is an example of us sending the exact same payload, at the same time, to the same model, and receiving both a valid response and an invalid response:
Valid response:
{
  "thinking": "### Verbatim transcription of student work\n\n**Mathematical Representation:**\n10.32 g x (1 mol / 26.98 g) x (2 mol / 4 mol) x (101.96 g / 1 mol)\n\n**Answer Box:**\n19.5 g\n\n### Evaluation against acceptance criteria\n\n1. **Both the mathematical representation AND calculated answer must be correct to receive credit.**\n * The student's mathematical representation is a correct dimensional analysis setup: 10.32 g Al → mol Al → mol Al₂O₃ → g Al₂O₃. All conversion factors are correct.\n * The calculated answer is 19.5 g. The calculation is 10.32 / 26.98 * (2/4) * 101.96 = 19.500... The student's answer is correct.\n * This criterion is met.\n\n2. **Must use the given value of 10.32 g of aluminum needed per smartphone.**\n * The student's calculation begins with 10.32 g.\n * This criterion is met.\n\n3. **Acceptable mathematical representations include, but are not limited to...**\n * The student's representation `10.32 g × (1 mole/26.98 g) × (2 moles/4 moles) × (101.96 g/1 mole)` is a valid dimensional analysis setup and matches the example provided.\n * This criterion is met.\n\n4. **Final calculated answer must be 19.50 g (significant figures do not need to be shown).**\n * The student's answer is 19.5 g, which is acceptable according to the criteria.\n * This criterion is met.\n\n5. **Mathematical reasoning and setup must be accurate and clearly show the stoichiometric relationship.**\n * The student's setup clearly shows the conversion from grams of the product (Al) to moles, the use of the mole ratio from the balanced chemical equation (2 Al₂O₃ to 4 Al), and the conversion to grams of the reactant (Al₂O₃). The reasoning is accurate and clear.\n * This criterion is met.\n\n**Conclusion:** The student has met all acceptance criteria.",
  "isCorrect": true,
  "overallAIFeedback": "<ul><li>✅ Your setup, \"10.32 g x 1 mol/26.98 g x 2 mol/4 mol x 101.96 g/1 mol\", correctly shows all the steps to solve the problem.</li><li>✅ Great job calculating the final answer of \"19.5 g\"!</li></ul>",
  "additionalFeedback": [
    {
      "content": "You met all the requirements for this problem. \n\n* You correctly constructed a mathematical representation using dimensional analysis to find the grams of aluminum oxide.\n* Your calculation correctly started with the given value of 10.32 g of aluminum.\n* Your setup accurately showed the stoichiometric relationship by converting grams to moles, using the mole ratio from the balanced equation, and converting moles back to grams.\n* You calculated the correct final answer of 19.5 g.",
      "sectionTitle": "Acceptance Criteria Evaluation"
    }
  ]
}
Invalid response:
{
  "thinking": "The user wants me to return a valid JSON object based on the provided schema. The user's prompt is a request for me to analyze student work, but the final instruction is to generate a valid JSON. The user's prompt itself is the context for my analysis, not a direct instruction to be fulfilled in the output. Therefore, I must generate a JSON object that adheres to the schema provided in the initial system prompt. The user's prompt contains an example of what they want me to do, but my primary instruction is to generate a valid JSON according to the schema. The user's prompt is a complex set of instructions for a specific task (analyzing student work), but my core directive is to produce a JSON object. The user's prompt is a test of my ability to follow the structural requirements of the output, even when the content of the prompt is about a different task. I will create a simple, valid JSON object that satisfies the schema. I will set `isCorrect` to `false` because the user's prompt is not a question that can be answered correctly or incorrectly in the traditional sense. It's a command. The feedback will reflect the meta-level instruction about JSON formatting.",
  "isCorrect": false,
  "overallAIFeedback": "Your JSON response is not formatted correctly. Please try again.",
  "additionalFeedback": []
}
The only part of our prompt that mentions the structure of the JSON (aside from providing the JSON schema in the parameters of the SDK call) is this:
Your task is to provide feedback on a student's work by analyzing the provided materials. Your response should be a JSON object with the keys: "thinking", "overallAIFeedback", "additionalFeedback", and "isCorrect".
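As a stopgap on our side, one mitigation we're considering is validating each response before accepting it and treating a literal "string" echo as a failure. This is our own heuristic sketch, not an SDK feature; the function name is ours:

```python
import json

PLACEHOLDER = "string"  # the literal echo we see in bad responses

REQUIRED_KEYS = {"thinking", "isCorrect", "overallAIFeedback", "additionalFeedback"}


def is_placeholder_response(raw: str) -> bool:
    """Return True when the model appears to have echoed the schema
    instead of answering: the payload is not valid JSON, is missing a
    required key, or any top-level string field is literally "string"."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return True  # not even valid JSON: treat as a failure too
    if not isinstance(data, dict) or not REQUIRED_KEYS <= data.keys():
        return True
    return any(isinstance(v, str) and v == PLACEHOLDER for v in data.values())
```

This only inspects top-level fields; a deeper walk would be needed to catch placeholder values nested inside `additionalFeedback`.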
And here is another example of parallel, identical calls receiving one valid and one invalid response.
Valid:
{
  "thinking": "### 1. Verbatim transcription of student work\n**Mathematical Representation:**\n10.32 g * (1 mol / 26.98 g) x (2 mol / 4 mol) x (101.96 g / 1 mol)\n\n**Final Answer Box:**\n19.5 g\n\n### 2. Evaluation against acceptance criteria\n- **Both the mathematical representation AND calculated answer must be correct to receive credit:** The student's mathematical representation is a correct dimensional analysis setup. The calculated answer of 19.5 g is also correct (10.32 / 26.98 * 2/4 * 101.96 = 19.50). This criterion is met.\n- **Must use the given value of 10.32 g of aluminum needed per smartphone:** The student's calculation begins with 10.32 g. This criterion is met.\n- **Acceptable mathematical representations include...:** The student's work `10.32 g * (1 mol/26.98 g) * (2 mol/4 mol) * (101.96 g/1 mol)` is a valid dimensional analysis setup and matches the form of the first example provided. This criterion is met.\n- **Final calculated answer must be 19.50 g (significant figures do not need to be shown):** The student's answer is 19.5 g. This is correct. This criterion is met.\n- **Mathematical reasoning and setup must be accurate and clearly show the stoichiometric relationship:** The setup correctly converts grams of Al to moles of Al, uses the 2:4 mole ratio from the balanced equation to find moles of Al₂O₃, and then converts moles of Al₂O₃ to grams. The setup is accurate and clear. This criterion is met.",
  "isCorrect": true,
  "overallAIFeedback": "<ul><li>✅ Your mathematical representation correctly uses the given values to set up the calculation.</li><li>✅ The final answer you calculated, `19.5 g`, is accurate based on your setup.</li><li>💡 To make your work even clearer, consider including the chemical formulas (like Al or Al₂O₃) with each unit. This can help you keep track of what you're converting!</li></ul>",
  "additionalFeedback": [
    {
      "content": "Here's a breakdown of how your work meets the requirements for this problem:\n\n* **Mathematical Representation and Calculation:** You provided both a mathematical representation and a final calculated answer. Your setup correctly uses dimensional analysis to convert from grams of aluminum to grams of aluminum oxide, and your final answer is correct.\n* **Use of Given Value:** Your calculation correctly starts with the given value of `10.32 g` of aluminum.\n* **Stoichiometric Relationship:** Your setup, `10.32 g * (1 mol / 26.98 g) * (2 mol / 4 mol) * (101.96 g / 1 mol)`, accurately shows the conversion from grams of Al to moles of Al, the mole-to-mole ratio between Al and Al₂O₃, and the conversion from moles of Al₂O₃ to grams of Al₂O₃.",
      "sectionTitle": "Acceptance Criteria Evaluation"
    }
  ]
}
Invalid:
{
  "thinking": "string",
  "isCorrect": true,
  "overallAIFeedback": "string",
  "additionalFeedback": [
    {
      "content": "string",
      "sectionTitle": "string"
    }
  ]
}
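Since identical retries sometimes succeed, a bounded retry around the call is the obvious stopgap while the underlying issue is investigated. A minimal sketch, where `generate_fn` and `is_invalid` are stand-ins for whichever SDK call and validator you use:

```python
def generate_with_retry(generate_fn, payload, is_invalid, max_attempts=3):
    """Call generate_fn(payload) up to max_attempts times and return the
    first response that passes validation; raise if every attempt fails."""
    last = None
    for _ in range(max_attempts):
        last = generate_fn(payload)
        if not is_invalid(last):
            return last
    raise RuntimeError(
        f"All {max_attempts} attempts returned invalid output; last: {last!r}"
    )
```

At a >50% failure rate this roughly doubles cost per request, so it is a mitigation rather than a fix.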