How to robustly handle images in the streaming API to avoid “Chunk too big” errors?

Hi everyone,

I’m running into a challenging issue while using the Gemini streaming API to implement a feature that streams mixed text and images.

My Goal:
I want the model to stream text for immediate user feedback, while also generating and displaying images inline as part of the response, like a real-time tutorial.

The Problem:
The text streaming works perfectly, but as soon as the model attempts to send an image within the stream, the program crashes because the underlying HTTP client raises a “Chunk too big” error. This makes the streaming API feel unreliable for any use case that involves image generation.
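
For context, here is a minimal sketch of the kind of loop I’m running (simplified to the sync API; the model name and config are placeholders for whatever image-capable setup you use, and `handle_image` is a stand-in for my actual rendering code):

```python
from google import genai
from google.genai import types

client = genai.Client()  # API key picked up from the environment


def handle_image(data: bytes) -> None:
    # Stand-in for my real inline-image rendering.
    print(f"\n[received image, {len(data)} bytes]")


stream = client.models.generate_content_stream(
    model="gemini-2.0-flash-preview-image-generation",  # placeholder model name
    contents="Write a short step-by-step tutorial and illustrate each step with an image.",
    config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
)

for chunk in stream:
    candidate = chunk.candidates[0] if chunk.candidates else None
    if not candidate or not candidate.content or not candidate.content.parts:
        continue
    for part in candidate.content.parts:
        if part.text:
            # Text parts arrive incrementally and work perfectly.
            print(part.text, end="", flush=True)
        elif part.inline_data:
            # As soon as an image part is due, the underlying HTTP client
            # raises "Chunk too big" before this branch is ever reached.
            handle_image(part.inline_data.data)
```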

My Questions:

  1. Is this a known design limitation? Is streaming inherently unsuitable for handling large, monolithic data chunks like images?

  2. Aside from abandoning the streaming experience entirely (falling back to the non-streaming API, which hurts the UX; I’ve sketched that fallback right after this list) or implementing a more complex “Tool Calling” / “Function Calling” pattern to handle images separately, is there a more direct or officially recommended way to solve this?
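
To make the first option concrete, the fallback I’d rather avoid is a plain non-streaming call that only returns once the whole response, images included, has been generated (reusing the client, `types`, and `handle_image` from the sketch above; the model name is still a placeholder):

```python
# Non-streaming fallback: no "Chunk too big", but the user stares at a spinner
# until every piece of text and every image has been generated.
response = client.models.generate_content(
    model="gemini-2.0-flash-preview-image-generation",  # placeholder model name
    contents="Write a short step-by-step tutorial and illustrate each step with an image.",
    config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
)

for part in response.candidates[0].content.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        handle_image(part.inline_data.data)
```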

I’ve considered manually increasing the buffer size in the underlying library (like aiohttp), but this feels like a temporary workaround that doesn’t address the root cause and introduces memory risks.
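
For reference, the knob I have in mind is the session-level read buffer in aiohttp, roughly as sketched below. The helper name is just for illustration, how (or whether) such a session can be plumbed into the SDK’s transport is an assumption on my part, and the size is arbitrary:

```python
import aiohttp


async def make_big_buffer_session() -> aiohttp.ClientSession:
    # Assumption: the "Chunk too big" error is aiohttp's stream reader
    # overflowing its read buffer on one very long SSE line (a base64-encoded
    # image). read_bufsize defaults to 64 KiB; raising it postpones the error
    # at the cost of buffering multi-megabyte lines in memory per connection.
    return aiohttp.ClientSession(read_bufsize=8 * 1024 * 1024)  # 8 MiB, arbitrary
```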

I’m very interested to hear how others in the community are handling this common scenario, or whether there’s a best practice recommended by the official team. Thanks!