How to do batch Inference on Prompt Image pairs with Gemini API without getting errors

Welcome to the forum!

The problem, I believe, is this specification:

You cannot raise max_output_tokens past the model's own limit, which you can read from list_models(). For Gemini 1.5 (both Pro and Flash) that limit is 8192 tokens. The one-million-token context window applies to input tokens only.
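A minimal sketch of how to check those limits yourself, assuming the google-generativeai Python SDK and an API key in the GOOGLE_API_KEY environment variable:

```python
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

for m in genai.list_models():
    # Only models that support generate_content() matter here.
    if "generateContent" in m.supported_generation_methods:
        print(m.name, m.input_token_limit, m.output_token_limit)
```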

That effectively forces you to partition your input into small chunks (probably two prompt/image pairs per request, so the combined responses stay under the output limit) and to issue enough generate_content() requests to work through the dataset, as sketched below.
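Here is a rough sketch of that chunked loop, again assuming the google-generativeai SDK, PIL for image loading, and a hypothetical `pairs` list of (prompt, image_path) tuples standing in for your dataset:

```python
import os
import google.generativeai as genai
from PIL import Image

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

CHUNK_SIZE = 2  # pairs per request, to keep output under the 8192-token limit

pairs = [("Describe this image.", "cat.jpg"),
         ("Describe this image.", "dog.jpg"),
         ("Describe this image.", "bird.jpg")]

results = []
for i in range(0, len(pairs), CHUNK_SIZE):
    chunk = pairs[i:i + CHUNK_SIZE]
    # Interleave prompts and images into one multimodal request.
    parts = []
    for prompt, path in chunk:
        parts.append(prompt)
        parts.append(Image.open(path))
    response = model.generate_content(parts)
    results.append(response.text)

print("\n---\n".join(results))
```

You could also send one pair per request, which costs more calls but makes it easier to match each answer back to its image.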
