Which Gemini API Models Support top_k Sampling?

I’m trying to determine which Gemini API models support top_k sampling. The official Google AI API documentation states:

"For Top-k sampling.

Top-k sampling considers the set of topK most probable tokens. This value specifies default to be used by the backend while making the call to the model. If empty, indicates the model doesn’t use top-k sampling, and topK isn’t allowed as a generation parameter."

However, it’s unclear from the documentation which specific Gemini models allow the top_k parameter in generation requests.

Could anyone clarify which models currently support top_k, or if there is an alternative method for achieving similar behavior in models that do not? Thanks in advance!

Hi @user1365

Welcome to the forum.

Did you check the list of models returned by the API? Where a model supports it, the topK value is documented per model. Here’s an example entry from the models.list response:

    {
      "name": "models/gemini-2.0-flash-thinking-exp-1219",
      "version": "2.0",
      "displayName": "Gemini 2.0 Flash Thinking Experimental",
      "description": "Gemini 2.0 Flash Thinking Experimental",
      "inputTokenLimit": 1048576,
      "outputTokenLimit": 65536,
      "supportedGenerationMethods": [
        "generateContent",
        "countTokens"
      ],
      "temperature": 0.7,
      "topP": 0.95,
      "topK": 64,
      "maxTemperature": 2
    },
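If you’d rather check programmatically, here is a minimal sketch using the google-generativeai Python SDK (the API key is a placeholder you’d replace with your own): it walks the models list and reports which models advertise a topK default, which, per the documentation you quoted, is the signal that the parameter is accepted.

    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")  # placeholder; substitute your own key

    # Walk the models list. Per the docs quoted above, a model whose
    # topK field is empty doesn't accept topK as a generation parameter.
    for model in genai.list_models():
        if "generateContent" in model.supported_generation_methods:
            if model.top_k:
                print(f"{model.name}: topK supported (default {model.top_k})")
            else:
                print(f"{model.name}: no topK support")

Once you’ve confirmed a model supports it, you can pass the parameter in the generation config, e.g. `generation_config=genai.GenerationConfig(top_k=64)` on a `generate_content` call.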

Cheers