I uploaded a sizable PDF for Gemini to turn into semantic data suitable for a RAG system. On ingestion, the PDF takes up around 162k tokens of the context window. I am trying to create 100 semantically dense chunks, each carrying a lot of metadata.
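For reference, my request looks roughly like this (a minimal sketch, not my exact code; the model name, file name, and metadata fields are placeholders):

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# Upload the PDF once; it lands in the input context (~162k tokens in my case).
doc = client.files.upload(file="report.pdf")

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents=[
        doc,
        "Split this document into 100 semantically dense chunks. Return a JSON "
        'array of objects with keys "id", "title", "summary", "keywords", "text".',
    ],
    config=types.GenerateContentConfig(max_output_tokens=65536),
)
print(response.text)  # generation stops well short of the limit (~34k total, including thinking)
```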
It seems like Gemini is stopping well before its 65,536-token output limit. I understand that the reasoning tokens eat into the usable output, but it still looks like it is stopping at around 34k output tokens total, including the reasoning. Thus I need to break its output down into smaller chunk requests.
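The workaround I have in mind looks roughly like this (again a sketch; the batch size of 20 is just my guess at something that stays under the ~34k ceiling I'm seeing):

```python
import json

from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")
doc = client.files.upload(file="report.pdf")  # the same file handle is reused across calls

all_chunks = []
batch_size = 20  # 5 requests of 20 chunks each instead of 1 request for all 100

for start in range(1, 101, batch_size):
    end = start + batch_size - 1
    response = client.models.generate_content(
        model="gemini-2.5-pro",
        contents=[
            doc,
            f"Split this document into 100 semantically dense chunks, but return "
            f"ONLY chunks {start} through {end} as a JSON array of objects with "
            f'keys "id", "title", "summary", "keywords", "text".',
        ],
        config=types.GenerateContentConfig(
            max_output_tokens=65536,
            response_mime_type="application/json",  # raw JSON, no markdown fences
        ),
    )
    all_chunks.extend(json.loads(response.text))
```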
This is such a powerful model; I'm just curious what is constraining it.
Thanks!