Gemini API returning inconsistent streaming responses with multimodal input

KRows · January 26, 2025, 9:50pm

I’m working with the Gemini API and noticed that when streaming responses with both text and image inputs, the chunks sometimes arrive out of order. My code works fine with text-only inputs, but multimodal requests seem to trigger this behavior.

Current implementation in TypeScript:

const response = await model.generateContentStream({
  contents: [
    {
      role: "user",
      // Text and image travel in the same turn as separate parts
      parts: [
        {text: userPrompt},
        {inlineData: {mimeType: "image/jpeg", data: base64Image}}
      ]
    }
  ]
});

for await (const chunk of response.stream) {
  console.log(chunk.text());
}

Has anyone found a reliable way to handle streaming with mixed content types? Looking for best practices or workarounds!

GD_Coders · January 27, 2025, 7:22am

let textChunks = "";
let imageChunks: unknown[] = [];

for await (const chunk of response.stream) {
  // Check if the chunk is text or binary (image) and process accordingly
  if (chunk.text) {
    textChunks += chunk.text(); // Collect text data
  } else if (chunk.Binary) {
    // `Binary` is assumed here; check your SDK's chunk type for the actual accessor
    imageChunks.push(chunk.Binary()); // Collect binary image data
  }

  // Process both text and image when both parts are ready
  if (textChunks && imageChunks.length) {
    // Here you can handle the complete text and image data
    console.log("Text:", textChunks);
    console.log("Image Data:", imageChunks);
    textChunks = ""; // Reset text after processing
    imageChunks = []; // Reset image data after processing
  }
}
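
If the model only returns text (the image being input rather than output), a simpler variant may be enough: append each chunk's text in the order the async iterator yields it and use the result once the stream ends. The sketch below is only an illustration under that assumption. `TextChunk`, `collectStreamText`, and `fakeStream` are hypothetical names, not part of any SDK, and the only thing assumed about a chunk is that it exposes `text()` as in the question's code.

```typescript
// Minimal chunk shape assumed for this sketch: only a text() accessor is used.
interface TextChunk {
  text(): string;
}

// Hypothetical helper: concatenates chunk text in the order the iterator yields it.
async function collectStreamText(
  stream: AsyncIterable<TextChunk>,
  onChunk?: (piece: string) => void
): Promise<string> {
  let fullText = "";
  for await (const chunk of stream) {
    const piece = chunk.text();
    fullText += piece;
    onChunk?.(piece); // Optional hook, e.g. for incremental display
  }
  return fullText;
}

// Stand-in for response.stream so the sketch runs without network access.
async function* fakeStream(pieces: string[]): AsyncGenerator<TextChunk> {
  for (const piece of pieces) {
    yield { text: () => piece };
  }
}
```

With a real response this would be called as `const fullText = await collectStreamText(response.stream);`. Because `for await` consumes one chunk at a time, the text is assembled in exactly the order the stream delivers it.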

KRows · January 27, 2025, 4:43pm

OK, thanks for the reply! I tried it and it worked completely. Thanks again!
