Tips on how to increase token output size in GenerateContentResponse?

Does anyone have any tips or tricks for increasing output generation size for inferences on gemini-1.5-pro or flash? I'm trying to generate a large report using a couple of input files for context and a well-designed prompt that should have no problem producing large output, but I can't seem to get above 1800 tokens, no matter how large the input is. I've tried lots of prompting tricks, including a target_word_count variable and the like, but nothing really budges it. I can't get any generation over about 1200 words (despite the stated 8K token output window), except for things like batch processing of image descriptions.

With the new gemini-1.5-pro-002 model specifically touting its reduced verbosity, this has gotten even worse.

Has anyone had better success with some tips or tricks?

Well… yes. It's a model, and the cap is supposedly 8K, so at the very least it can reach that if well prompted.
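One thing worth double-checking before blaming the prompt: make sure the request actually raises the output cap, since the effective default can be lower than the model maximum. A minimal sketch of the config, assuming the google-generativeai Python SDK (which accepts a plain dict for `generation_config`; the 8192 value is the stated gemini-1.5 output cap from the thread, not something I've re-verified here):

```python
# Sketch: explicitly raise the output cap in the request config.
# Parameter names follow the google-generativeai SDK's generation_config dict.
generation_config = {
    "max_output_tokens": 8192,  # stated 8K output window for gemini-1.5 models
    "temperature": 1.0,
}

# The real call would look roughly like:
#   model = genai.GenerativeModel("gemini-1.5-pro",
#                                 generation_config=generation_config)
#   response = model.generate_content(prompt)
```

If the cap is left at a lower value, no amount of prompting will push the response past it, so this is the first knob to check.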

In fact, you can get it to go way further, heh.
Let me put together a quick example.

Here you go