For my input to the Gemini 1.5 Flash model, the output is definitely more than 8192 tokens, so the model has to return multiple responses to complete my request. How can I do that? Is there an option in the API docs for this, so that on each iteration the model continues the output from where the previous response left off?
The usual approach to generating longer content is to structure an outline and then fill in the outline buckets. This cookbook example shows one way to do it: Google Colab
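The outline-then-fill idea can be sketched as a loop: one request produces the outline, then each section is generated in its own request, so no single response has to fit under the per-response token limit. In this sketch, `ask_model` is a stand-in for a real Gemini API call (the canned responses are purely illustrative):

```python
# Outline-then-fill sketch: generate an outline first, then fill each
# section with a separate request. `ask_model` is a placeholder for a
# real Gemini API call; the canned dict just makes the sketch runnable.

def ask_model(prompt: str) -> str:
    canned = {
        "outline": "1. Introduction\n2. Methods\n3. Results",
        "1. Introduction": "Intro text...",
        "2. Methods": "Methods text...",
        "3. Results": "Results text...",
    }
    return canned[prompt]

# One request per outline section keeps each response small.
outline = ask_model("outline").splitlines()
document = "\n\n".join(ask_model(section) for section in outline)
print(document)
```

Each section is an independent request, so the total length is bounded only by the number of sections, not by a single response's output limit.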
Hi @OrangiaNebula, thanks a lot for your reply and your intention to help.
In my case, I am converting XML from an Informatica mapping into a SQL query, so the output varies with the XML and its size. I don't think we can decide the outline structure in advance.
Since the output limit is 8192 tokens, a single response is not enough for the output we need. The next iteration for my initial input has to be generated automatically, without manual intervention.
So I am looking for a solution for that kind of continuous generation.
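One common pattern when the output cannot be outlined in advance is a continuation loop: check whether the response was cut off by the output limit (with the Gemini API, the candidate's finish reason being `MAX_TOKENS` indicates truncation) and, if so, send a follow-up turn asking the model to continue exactly where it left off, concatenating the chunks. A minimal sketch of the loop, with `fake_model` standing in for the real API call so the example is self-contained:

```python
# Automatic continuation loop for outputs longer than one response.
# `fake_model` is a stand-in for a real Gemini call; with the actual
# API you would instead inspect the candidate's finish reason
# (MAX_TOKENS means the response was truncated) and send a new chat
# turn such as "Continue exactly where you left off."

CHUNK = 40  # pretend each response can only emit 40 characters

FULL_SQL = "SELECT col_a, col_b FROM src_table WHERE col_a > 100 ORDER BY col_b;"

def fake_model(prompt: str, generated_so_far: str) -> tuple[str, bool]:
    """Return the next output chunk and whether it was truncated."""
    remaining = FULL_SQL[len(generated_so_far):]
    return remaining[:CHUNK], len(remaining) > CHUNK

def generate_full_output(prompt: str) -> str:
    result = ""
    truncated = True
    while truncated:
        # Real version: resend via the same chat session so the model
        # keeps context, rather than re-passing the text explicitly.
        chunk, truncated = fake_model(prompt, result)
        result += chunk
    return result

print(generate_full_output("Convert this Informatica XML to SQL: <mapping .../>"))
```

Keeping the exchange in one chat session means the model already has the XML and its partial SQL in context, so each "continue" turn only needs a short instruction, not a resend of the whole input.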