2.5 Pro output length soft limit?

I uploaded a sizable PDF for Gemini to turn into semantic data suitable for a RAG system. On ingestion of the PDF, the context window is around 162k tokens. I am trying to create 100 chunks that are semantically dense, with a lot of metadata.

It seems like Gemini is stopping well before its 65,536-token output limit. I understand the reasoning part takes away from usable output, but it still looks like it is stopping at around 34k tokens of output total, including the reasoning. Thus I need to break its output down into smaller chunk requests.
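One way to work around the ceiling is to stop asking for all 100 chunks in a single response and instead split the job into several smaller requests, each asking for a range of chunks. The sketch below shows the batching arithmetic only; the prompt wording and batch size are illustrative assumptions, not the actual prompt used here, and the real call to the Gemini API is left out.

```python
# Hypothetical sketch: request the 100 chunks in batches so each
# response stays well under the observed ~34k-token ceiling.

def batch_ranges(total_chunks: int, batch_size: int):
    """Yield (start, end) chunk index pairs, one pair per request."""
    for start in range(1, total_chunks + 1, batch_size):
        yield start, min(start + batch_size - 1, total_chunks)

def build_prompt(start: int, end: int) -> str:
    # Illustrative instruction text, not the poster's actual prompt.
    return (f"From the attached PDF, produce chunks {start}-{end} "
            f"of the 100 semantically dense chunks, with metadata, as JSON.")

# Five requests of 20 chunks each instead of one 100-chunk request.
prompts = [build_prompt(s, e) for s, e in batch_ranges(100, 20)]
```

Each prompt would then be sent as its own `generate_content` call (with the PDF attached or cached), and the five partial JSON outputs merged afterwards.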

This is such a powerful model, I am just curious as to what is constraining it.

Thanks!


Very interesting. Can you share the prompt?
I am compressing literature myself and need a RAG setup to experiment with.