Issue Summary
I’m experiencing intermittent timeouts when calling the Gemini API with the gemini-2.0-flash
model and PDF files encoded as base64. The client raises a RetryError once its 600-second timeout
is exceeded, with 503 GOAWAY as the last exception, even though a successful response is eventually received.
Technical Details
- Model: gemini-2.0-flash
- Input: PDF files sent as base64 encoded data
- Framework: LangChain with LangSmith for monitoring
- Error: RetryError: Timeout of 600.0s exceeded, last exception: 503 GOAWAY received
- Observed Duration: ~950 seconds (logged in LangSmith) when issue occurs
- Frequency: Intermittent - not every request
Request Pattern
Using LangChain to send a GenerateContent call to the Gemini API with PDF files converted to base64 and included in the request payload.
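For reference, here is a rough sketch of what such a request body looks like at the REST level, with the PDF inlined as base64. This is not my actual LangChain code: the function name build_pdf_request is made up for illustration, and the endpoint URL and the inline_data field layout are my understanding of the public v1beta generateContent API, so treat them as assumptions.

```python
import base64
import json

# Assumed endpoint shape for the public Gemini REST API (v1beta).
GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-2.0-flash:generateContent"
)


def build_pdf_request(pdf_bytes: bytes, prompt: str) -> dict:
    """Build a generateContent body with the PDF inlined as base64.

    Hypothetical helper for illustration; it only builds the payload
    and does not send anything over the network.
    """
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    # The PDF travels inside the JSON body as base64 text.
                    {"inline_data": {"mime_type": "application/pdf", "data": encoded}},
                    {"text": prompt},
                ]
            }
        ]
    }


if __name__ == "__main__":
    body = build_pdf_request(b"%PDF-1.4 placeholder bytes", "Summarize this document.")
    print("POST", GEMINI_URL)
    print(len(json.dumps(body)), "bytes of JSON")
```

Because base64 expands the file by roughly a third, the JSON body is noticeably larger than the PDF itself, which may be worth keeping in mind when comparing request sizes.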
Key Observations
- Not quota-related: The errors don’t correlate with “GenerateContent input token count limit per model per minute” quota issues
- Eventually succeeds: Responses are eventually received despite the timeout errors
- Consistent duration: When the issue occurs, LangSmith consistently logs ~950 seconds duration
- PDF-specific: Issue seems to occur specifically with PDF base64 inputs
- Retry behavior: Most notably, when retrying the exact same PDF just a minute after the first request, it receives a response in just a few seconds as expected
- Not size-dependent: The issue doesn’t necessarily correlate with large prompts - it occurred, for example, with a request containing ~12,000 prompt tokens and generating only 49 completion tokens
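Since a retry a minute later returns within seconds, one workaround I'm considering is to fail fast with a short per-attempt timeout and retry, instead of waiting out the full 600 seconds. Below is a minimal sketch of that idea, not a confirmed fix. The send callable is hypothetical: it stands for whatever function performs one request and raises if the given timeout (in seconds) is exceeded. The timeout values and the 60-second pause are arbitrary assumptions.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def call_with_short_timeouts(
    send: Callable[[float], T],
    timeouts: Sequence[float] = (60.0, 120.0, 300.0),
    pause: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try send(timeout) with each timeout in turn and return the first result.

    `send` is a hypothetical caller-supplied function that performs one
    request and raises an exception when `timeout` seconds have passed.
    """
    last_error = None
    for attempt, timeout in enumerate(timeouts, start=1):
        started = time.monotonic()
        try:
            return send(timeout)
        except Exception as exc:  # narrow this to timeout / 503 errors in real code
            elapsed = time.monotonic() - started
            print(f"attempt {attempt} failed after {elapsed:.1f}s: {exc}")
            last_error = exc
            # Pause between attempts, but not after the final one.
            if attempt < len(timeouts):
                sleep(pause)
    raise RuntimeError("all attempts failed") from last_error
```

With LangChain, send would wrap the model call and forward the timeout to the client; how that timeout is passed depends on the client version, so I have left it out of the sketch.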
Questions
- Is this a known issue with PDF processing in gemini-2.0-flash?
- Could this be related to PDF processing latency on Google’s backend?