I’m using the OpenAI-compatible endpoint to summarize videos, with a request like:
{
  "model": "gemini-2.5-flash",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Provide a very short summary on one line. Keep it short!"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "data:video/mp4;base64,<ENCODED_FILE>"
          }
        }
      ]
    }
  ]
}
Yesterday it was working fine and the result was a very short summary. Today the same request, with the same video, ignores the instructions and returns a very long, multi-line response describing every part of the video. It seems like a bug.
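For reference, the payload above can be built programmatically before POSTing it to the chat completions route. This is a minimal sketch: the placeholder bytes stand in for the real MP4 file, and the actual HTTP call (endpoint URL, auth header) is omitted since it depends on your setup.

```python
import base64
import json

# Placeholder bytes stand in for the real MP4 file (an assumption for this sketch).
video_bytes = b"\x00\x00\x00\x18ftypmp42"
encoded = base64.b64encode(video_bytes).decode("ascii")

payload = {
    "model": "gemini-2.5-flash",
    "messages": [
        {
            "role": "user",
            "content": [
                # Item 1: the instruction text.
                {"type": "text",
                 "text": "Provide a very short summary on one line. Keep it short!"},
                # Item 2: the video, base64-encoded into a data: URL.
                {"type": "image_url",
                 "image_url": {"url": f"data:video/mp4;base64,{encoded}"}},
            ],
        }
    ],
}

# The serialized payload is what gets POSTed to the OpenAI-compatible endpoint.
body = json.dumps(payload)
```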
We’ve looked into the issue you reported, and it appears to be working correctly on our end.
For your specific situation, we recommend modifying your prompt and setting the temperature to zero so you get a similar response every time.
I did a little more testing and can confirm that gemini-2.0-flash and gemini-1.5-flash follow the instructions properly; the issue only occurs with gemini-2.5-flash. See above for a screenshot and the request ID. Thanks
But as noted, it was working the first way before, and that form still works with gemini-2.0-flash and gemini-1.5-flash. The documentation also says the first form should be valid. My guess is that something is causing the content array to collapse so that only one item reaches the model.
My apologies, I recreated your issue using the Gemini framework.
It seems your current configuration is set up for image understanding. If I am understanding correctly, you’re looking for the model to describe video content instead. Is that accurate?
Yes, and including other file types via the “image_url” field on the OpenAI-compatible endpoint seems to work fine; you can include PDFs and other types, too. (It’s the only field/content type available for submitting files in OpenAI-compatible form, so there’s no other way.) The model processes files submitted that way without trouble; the problem is that when the “content” array contains both the file and the text, the instructions in the text are sometimes ignored. I imagine it might also happen with images in some cases, but I haven’t seen it myself; maybe it only happens with larger files. It’s strange, though, because the same files that worked before now show the issue.
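To illustrate the point about other file types: any file can be wrapped in a data: URL and placed in the “image_url” field. A small helper (hypothetical name, written for this sketch) makes that concrete; the PDF bytes here are a stand-in, not a real document.

```python
import base64

def to_data_url(data: bytes, mime: str) -> str:
    """Wrap raw file bytes in a data: URL suitable for the image_url field."""
    return f"data:{mime};base64," + base64.b64encode(data).decode("ascii")

# A PDF submitted through the same image_url field (stand-in bytes).
pdf_part = {
    "type": "image_url",
    "image_url": {"url": to_data_url(b"%PDF-1.7 ...", "application/pdf")},
}
```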
Support for OpenAI’s libraries is still in beta, and new features are continually being developed. However, there currently isn’t an OpenAI-compatible feature for video understanding.
You mentioned exploring an image understanding feature, and while it might have worked for some of your needs, it’s primarily designed for images and not videos.
I’m familiar with the native alternatives, but my program only supports OpenAI-compatible requests right now. The workaround is working for now: creating two separate “messages” items, each with one item in “content”, instead of one message with two items in “content”. With that structure the instructions are followed and the response is very short, as requested.
It may be worth forwarding this to the dev team so they can identify the issue, as it may pop up randomly for other users. I may follow up with a request ID if an error occurs with an image, since that wouldn’t be off-label the way videos are, although videos and other file types do seem to be processed fine when included in place of images.