Response Generation Stops Prematurely

Hello everyone,

I am using the new Gemini API (google-genai) for Python to send chat message using the send_message_streaming function of the chat class. During the conversation, the model suddenly starts to send incomplete responses. The response structure is correct, but the candidate text contains either a word or just the first syllable.

When this happens, if I reply it like “What?”, “Continue.”, or “Go on” etc., the model keeps sending incomplete responses. But if I reply with an irrelevent message or question, the model stops acting like this and send a complete response.

In the chunk data, the finising reason is normal. There are no errors or no quota-related problems. The model is gemini-2.0-flash. What could be the problem?

What an inactive forum it is. Anyway, it looks like switching to the OpenAI library has solved my problem. I’m still using the Gemini 2.0 Flash model, but with OpenAI’s Python package. Don’t use the google-genai package’s chat class for streamed responses, because it is broken.