Does anyone have tips or tricks for increasing output length when running inference on gemini-1.5-pro or gemini-1.5-flash? I'm trying to generate a large report from a couple of input files for context, with a well-designed prompt that should have no problem producing long output, but I can't seem to get above ~1800 output tokens no matter how large the input is. I've tried lots of prompting tricks, including a target_word_count variable and similar, but nothing really budges it. I can't get any generation over ~1200 words (despite the stated 8K-token output window), except for things like batch processing of image descriptions.
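For context, here's a stripped-down sketch of the kind of call I'm making with the Python SDK (the prompt and file contents are placeholders, and I'm already setting max_output_tokens to the documented maximum):

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel(
    "gemini-1.5-pro-002",
    generation_config=genai.GenerationConfig(
        # 8192 is the documented output ceiling, but generations
        # stop well short of it in practice
        max_output_tokens=8192,
        temperature=1.0,
    ),
)

response = model.generate_content(
    [
        "<contents of my context files>",
        "<report prompt, including a target_word_count instruction>",
    ]
)

print(response.text)
# candidates_token_count here tops out around ~1800 for me,
# regardless of input size or prompt wording
print(response.usage_metadata)
```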
With the new gemini-1.5-pro-002 model, which is specifically touted for its reduced verbosity, this has gotten even worse.
Has anyone had better luck with this, or found tricks that actually work?