Gemini API Empty Response Bug Report

I am extracting structured data from 50 scanned historical PDFs (German corporate directories, 1930s) using the Gemini API via the google-genai Python SDK with generate_content_stream. 45 files extract correctly, but 5 specific files (36-43 pages, 6-12 MB each) consistently return empty responses: no error is raised and the stream completes normally, but no text chunks are yielded. The same files with the same prompt work correctly in the AI Studio web interface and return the expected extracted data. We have tried reducing concurrent workers from 50 to 5 to 1, retrying multiple times, increasing timeouts, and re-splitting the PDFs to keep page counts under 45. None of these attempts resolved the issue; the same 5 files fail every time via the API while working in AI Studio.

Actual Code

from google import genai
from google.genai import types

client = genai.Client(api_key=API_KEY)

# Read PDF as bytes (no file upload API used)
with open(pdf_path, "rb") as f:
    pdf_data = f.read()

contents = [
    types.Content(
        role="user",
        parts=[
            types.Part.from_bytes(
                mime_type="application/pdf",
                data=pdf_data,
            ),
            types.Part.from_text(text="Extract all companies from this PDF following the schema."),
        ],
    ),
]

generate_config = types.GenerateContentConfig(
    temperature=0.0,
    system_instruction=[types.Part.from_text(text=system_prompt)],
)

# Stream response
response = client.models.generate_content_stream(
    model="gemini-3-pro-preview",
    contents=contents,
    config=generate_config,
)

extracted_text = ""
for chunk in response:
    if chunk.text:
        extracted_text += chunk.text

# Result: extracted_text is empty string for 5 specific files
# No exception raised, stream completes normally
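For debugging cases like this, it may help to record why the stream stopped instead of only concatenating text. Below is a minimal sketch of a hypothetical helper, `collect_stream`, which assumes streamed chunks expose `candidates[0].finish_reason` as the google-genai SDK's response objects do; a finish reason of MAX_TOKENS with empty text would point at the output budget being exhausted rather than a genuinely empty answer:

```python
def collect_stream(chunks):
    """Accumulate streamed text and report why generation stopped.

    Returns (text, finish_reason_name). The finish reason is read from
    the last chunk that carries candidates; "MAX_TOKENS" would suggest
    the output token budget was exhausted (e.g. by long thinking).
    """
    text = ""
    finish_reason = None
    for chunk in chunks:
        if chunk.text:
            text += chunk.text
        # The final streamed chunk normally carries the finish reason.
        if getattr(chunk, "candidates", None):
            reason = getattr(chunk.candidates[0], "finish_reason", None)
            if reason is not None:
                finish_reason = getattr(reason, "name", str(reason))
    return text, finish_reason
```

Usage would be `text, reason = collect_stream(response)`, logging `reason` whenever `text` comes back empty.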

Kicked off a thread with the team to investigate this, will follow up. We should ~never return an empty response; it should always be either a correct response or an error. Stay tuned for the fix!

Hello Logan, thank you for the investigation. I think I have figured out the issue. I had accidentally set the temperature to 0 for the Gemini 3.0 Pro model, and it is plausible that the thinking tokens sometimes grow so long that they exceed the output limit (65,536 tokens), at which point the API returns an empty response. I only realized this when AI Studio warned that temperature-0 API calls can degrade reasoning ability; in my particular instance it seems to have led to excessively long thinking with no response. My system instruction and prompt are also necessarily very complicated, which generally leads to ~4-5 minutes of thinking before a response. Thank you for your time!!!
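If long thinking really is the culprit, one mitigation sketch is to restore the default temperature and cap the thinking budget. This assumes the SDK's `types.ThinkingConfig(thinking_budget=...)` is honored for this model, and the budget value here is purely illustrative:

```python
generate_config = types.GenerateContentConfig(
    temperature=1.0,  # default; 0.0 can degrade reasoning on thinking models
    system_instruction=[types.Part.from_text(text=system_prompt)],
    # Illustrative cap so thinking cannot consume the entire output budget.
    thinking_config=types.ThinkingConfig(thinking_budget=8192),
)
```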

Furthermore, if I may suggest something: it would be fruitful to return a distinct API error when the max output token limit is reached and the response is cut off, rather than an empty response (or perhaps I am just not seeing the errors correctly)?
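Until such an error exists server-side, a client-side guard can approximate it. This is a hedged sketch: `TruncatedResponseError` and the `(text, finish_reason)` shape are assumptions for illustration, not SDK features:

```python
class TruncatedResponseError(RuntimeError):
    """Hypothetical client-side error: response cut off at the token limit."""


def check_result(text, finish_reason):
    """Raise instead of silently returning "" when the model hit MAX_TOKENS.

    `finish_reason` is the finish-reason name gathered from the stream.
    """
    if not text and finish_reason == "MAX_TOKENS":
        raise TruncatedResponseError(
            "Model hit the output token limit before emitting any text "
            "(likely consumed by thinking tokens)."
        )
    return text
```

Callers would then see an explicit exception for the truncated-empty case while ordinary responses pass through unchanged.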