I’m creating an LLM application with Google models, crafting the system prompt in Google AI Studio. This gives very good results for my use case with both gemini-2.5-flash-preview-04-17 and gemini-2.5-pro-preview-04-17. But when I run the same setup through the google-adk framework, both models misbehave terribly.
I have matched the temperature, topP, topK, and max_output_tokens between both systems. My system prompt is quite large, since I’m testing the whole setup before splitting it into agents. What I experience is both models losing track of the coherence of the output and producing garbage like Python programs, or stories about growing up in Afghanistan.
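For reference, this is roughly how I’m passing the matched parameters through google-adk. The agent name, parameter values, and prompt below are placeholders for my actual setup:

```python
# Minimal sketch of my ADK setup; names and values are placeholders.
from google.adk.agents import LlmAgent
from google.genai import types

LARGE_SYSTEM_PROMPT = "..."  # my actual multi-page system prompt

root_agent = LlmAgent(
    name="assistant",
    model="gemini-2.5-flash-preview-04-17",
    instruction=LARGE_SYSTEM_PROMPT,
    # Matched to the values I use in AI Studio.
    generate_content_config=types.GenerateContentConfig(
        temperature=0.7,
        top_p=0.95,
        top_k=40,
        max_output_tokens=8192,
    ),
)
```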
Anyone else experiencing this? Any workarounds?
Even when parameters like temperature, topP, topK, and max_output_tokens are matched, subtle differences in environment configurations, token handling, or prompt processing can lead to varied results.
I would love to get insight from the Google developer team on how different these models are in the backend.
In the meantime, try these steps:
- Experiment with slight adjustments to parameters like temperature or max_output_tokens to see if they influence output coherence.
- Try streaming mode in the Gemini API to receive incremental responses, which might help in maintaining context (see the first sketch after this list).
- Check the logs for any warnings or errors that might indicate issues during processing (see the logging snippet below).
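For the streaming point, here is a minimal sketch using the google-genai SDK; the model name, user message, and LARGE_SYSTEM_PROMPT are placeholders, and it assumes GOOGLE_API_KEY is set in your environment:

```python
# Rough streaming sketch with the google-genai SDK; placeholders throughout.
from google import genai
from google.genai import types

LARGE_SYSTEM_PROMPT = "..."  # your actual system prompt

client = genai.Client()  # reads GOOGLE_API_KEY from the environment

stream = client.models.generate_content_stream(
    model="gemini-2.5-flash-preview-04-17",
    contents="Your user message here",
    config=types.GenerateContentConfig(
        system_instruction=LARGE_SYSTEM_PROMPT,
        temperature=0.7,
    ),
)
for chunk in stream:
    if chunk.text:  # some chunks carry no text
        print(chunk.text, end="", flush=True)
```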
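And for the logs: as far as I know, ADK components log through Python’s standard logging module, so turning on DEBUG output before running the agent should surface warnings and request details:

```python
import logging

# Surface ADK's internal warnings and request details on stderr.
logging.basicConfig(level=logging.DEBUG)
```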
Let me know if these steps resolve the issue.