In the latest Gemini 3 preview models (3 Flash and 3.1 Pro), any request with candidateCount > 1 returns a 400 INVALID_ARGUMENT error with the message: "Multiple candidates is not enabled for this model".
This is a significant regression for creative writing and brainstorming applications.
With candidateCount restricted to 1, we have to make multiple sequential or parallel API calls to get several alternatives, which:
- Increases perceived latency for the end user, especially when the calls run sequentially.
- Increases overhead and cost (since the same prompt is re-processed on every call).
- Breaks the efficiency of the “generate many, select one” paradigm that LLMs are traditionally great at.
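
For anyone hitting the same error, here is a minimal sketch of the parallel fan-out workaround described above. It is only an illustration: `generate_candidates` and `generate_one` are hypothetical names, and `generate_one` stands in for whatever function wraps a single-candidate request in your client code (no specific SDK call is assumed). Note that every call still sends the full prompt, which is exactly the overhead mentioned in the list.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def generate_candidates(
    generate_one: Callable[[str], str],  # hypothetical wrapper around one candidateCount=1 request
    prompt: str,
    n: int,
) -> List[str]:
    """Emulate candidateCount = n by issuing n single-candidate calls in parallel."""
    if n < 1:
        raise ValueError("n must be at least 1")
    with ThreadPoolExecutor(max_workers=n) as pool:
        # Each submission re-sends the same prompt, so the prompt is processed n times.
        futures = [pool.submit(generate_one, prompt) for _ in range(n)]
        # Collect in submission order; .result() re-raises any exception from a call.
        return [future.result() for future in futures]
```

Calling `generate_candidates(my_single_call, prompt, 4)` returns four strings, but as four billed requests instead of one, and total latency is bounded by the slowest of the four calls.
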
Are there plans to re-enable multiple candidates for these models, or is this a permanent architectural shift? If it’s the latter, it severely limits the viability of Gemini for creative assistive tools compared to other providers, such as OpenAI, that still support n > 1.
Looking forward to your clarification.