We use the gemini-1.5-pro model via the Generative AI SDK for audio analysis, following specific guidelines. The model runs with a temperature of 0 and is instructed to return responses as JSON conforming to a specified schema.
However, when we invoke the API repeatedly with identical input and prompts, we observe inconsistent outputs: some match the expected result, while others deviate significantly.
What strategies can we implement to ensure consistent outputs across all API calls?
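For reference, this is a minimal sketch of the kind of `generationConfig` that reduces sampling randomness. Field names follow the Gemini REST API; the `responseSchema` shown here is a hypothetical example, not our actual schema:

```python
import json

# Illustrative generationConfig for a Gemini request (REST-style field names).
# temperature=0 plus topK=1 makes decoding greedy at the sampling stage.
generation_config = {
    "temperature": 0,      # no randomness from temperature scaling
    "topK": 1,             # always pick the single highest-probability token
    "topP": 1.0,
    "responseMimeType": "application/json",
    "responseSchema": {    # hypothetical schema, for illustration only
        "type": "object",
        "properties": {
            "transcript": {"type": "string"},
            "sentiment": {"type": "string"},
        },
    },
}

print(json.dumps(generation_config, indent=2))
```

Even with these settings, the backend may not be bit-for-bit reproducible across calls, as discussed below in the thread.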
Hi @srinivas_mummidi, Welcome to the forum.
Have you tried using the gemini-2.0-flash-exp model with the temperature set to 0?
Hi @GUNAND_MAYANGLAMBAM, We tried using the 2.0 flash model but found that the audio analysis and reasoning capabilities of the 1.5 Pro were better at that time.
When I tested with the 2.0 Flash model, I observed consistent output when the temperature was set to 0, though I didn't specifically check its audio analysis and reasoning capabilities.
If possible, could you share the prompt you’re using so I can reproduce it on my end?
Sadly, there is no way to make the model fully deterministic. Setting temperature to 0 and topK to 1 can help, but the output will never be completely deterministic; in the end there is still a lot of randomness in how these models are served.
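One client-side mitigation is to request several responses and keep the majority answer after canonicalizing the JSON, so that outputs differing only in key order or whitespace compare equal. A small self-consistency sketch (the `responses` list stands in for the actual API calls, which are not shown):

```python
import json
from collections import Counter

def canonicalize(raw: str) -> str:
    """Parse a JSON response and re-serialize it with sorted keys,
    so semantically identical outputs compare equal as strings."""
    return json.dumps(json.loads(raw), sort_keys=True, separators=(",", ":"))

def majority_response(responses: list[str]) -> str:
    """Return the most frequent canonical JSON among several responses."""
    counts = Counter(canonicalize(r) for r in responses)
    return counts.most_common(1)[0][0]

# Three hypothetical API responses; the first two agree in content
# even though the key order differs.
responses = [
    '{"sentiment": "positive", "score": 0.9}',
    '{"score": 0.9, "sentiment": "positive"}',
    '{"sentiment": "neutral", "score": 0.5}',
]
print(majority_response(responses))
```

This doesn't make the model deterministic, but it makes the pipeline's final output much more stable when most samples already agree.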