Currently, in Gemini 2.5, topK is fixed at 64 (based on the official documentation and my own experiments).
Lowering the temperature to increase output accuracy and consistency significantly exacerbates repetition, a chronic issue with LLMs. In contrast, setting topK to 2 and then raising the temperature yields stable, more diverse responses with less repetition than lowering the temperature alone.
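To make the workaround concrete, here is a minimal sketch of how I would set these parameters with the google-genai Python SDK; the model name, prompt, and values are illustrative, and today the top_k setting would effectively be ignored since it is pinned at 64:

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

response = client.models.generate_content(
    model="gemini-2.5-flash",  # illustrative model choice
    contents="Summarize the main ideas of this document.",
    config=types.GenerateContentConfig(
        # Restrict sampling to only the 2 most likely tokens
        # (currently has no effect: topK is fixed at 64).
        top_k=2,
        # Then raise the temperature to spread probability mass
        # between those two candidates. This adds diversity without
        # opening up the long tail of low-probability tokens that
        # hurts accuracy at high temperatures.
        temperature=1.2,
    ),
)
print(response.text)
```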
The current topK is too large, which forces us to rely solely on adjusting the temperature. However, I keep receiving complaints from users of my service: raising the temperature significantly compromises answer accuracy, while lowering it leads to repetition and other anomalies in the responses.
Please reconsider the decision to force the topK parameter to 64.