Python SDK Support for Detecting Output Length Overrun

Does the Python SDK support raising exceptions when the output is cut off due to the current 8192 token limit?

I’m trying to convert PDFs to text. Some files are large; although they fit within the input context window, the output gets truncated. I expect the SDK to raise an exception in such cases.

Hi @farhanhubble, at present I think Gemini truncates the output when the output token count exceeds the max limit instead of raising an exception. If you want an exception to be raised, you can write code to handle this case yourself. For example:

from google import genai

client = genai.Client(api_key=api)  # `api` holds your API key

response = client.models.generate_content_stream(
    model="gemini-2.0-flash",
    contents=["explain about ai"],
)
for chunk in response:
    print(chunk.text, end="")
    # usage_metadata reports token counts; 100 here is just an example threshold
    total_tokens = chunk.usage_metadata.total_token_count
    if total_tokens > 100:
        raise RuntimeError("token limit exceeded")

Thank You.

Hi @farhanhubble

Welcome to the forum.

Look for the finishReason in the response object; a value of MAX_TOKENS means the output was cut off at the max output token limit.
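A minimal sketch of that check using the same google-genai client as the snippet above (the model name and prompt are just placeholders); in the Python SDK the field is exposed as finish_reason and can be compared against types.FinishReason.MAX_TOKENS:

from google import genai
from google.genai import types

client = genai.Client(api_key=api)  # `api` holds your API key

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=["explain about ai"],
)

# finish_reason tells you why generation stopped; MAX_TOKENS means the
# output was truncated at the max output token limit
if response.candidates[0].finish_reason == types.FinishReason.MAX_TOKENS:
    raise RuntimeError("Output truncated: max output token limit reached")

print(response.text)

With streaming, the same check can be applied to the last chunk's candidate instead of raising on a token-count threshold.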

Cheers