Gemini 2.0 Flash API: Long Response Times and 503 GOAWAY Errors with PDF Base64 Input

Issue Summary

I’m experiencing intermittent timeout issues when calling the Gemini API using the gemini-2.0-flash model with PDF files encoded as base64. The timeouts occur after 600 seconds, followed by 503 GOAWAY errors, despite eventually receiving successful responses.

Technical Details

  • Model: gemini-2.0-flash
  • Input: PDF files sent as base64 encoded data
  • Framework: LangChain with LangSmith for monitoring
  • Error: RetryError: Timeout of 600.0s exceeded, last exception: 503 GOAWAY received
  • Observed Duration: ~950 seconds (logged in LangSmith) when issue occurs
  • Frequency: Intermittent - not every request

Request Pattern

Using LangChain to send a GenerateContent call to the Gemini API with PDF files converted to base64 and included in the request payload.
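For reference, a minimal sketch of how a PDF might be turned into the base64 payload described above. The helper name `pdf_to_media_block` is hypothetical, and the dictionary keys simply mirror the code posted further down in this thread; they are not taken from any official schema.

```python
import base64


def pdf_to_media_block(pdf_bytes: bytes) -> dict:
    """Hypothetical helper: wrap raw PDF bytes as a base64 content block."""
    # b64encode returns bytes; decode to str so it can go into a JSON payload.
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return {
        "type": "media",
        "source_type": "base64",
        "data": encoded,
        "mime_type": "application/pdf",
    }


# Placeholder bytes for illustration; real code would read the file instead,
# e.g. pdf_bytes = open("document.pdf", "rb").read()
block = pdf_to_media_block(b"%PDF-1.4 placeholder")
```

Note that base64 encoding grows the payload by roughly one third compared to the raw file.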

Key Observations

  1. Not quota-related: The errors don’t correlate with “GenerateContent input token count limit per model per minute” quota issues
  2. Eventually succeeds: Responses are eventually received despite the timeout errors
  3. Consistent duration: When the issue occurs, LangSmith consistently logs ~950 seconds duration
  4. PDF-specific: Issue seems to occur specifically with PDF base64 inputs
  5. Retry behavior: Most notably, when retrying the exact same PDF just a minute after the first request, it receives a response in just a few seconds as expected
  6. Not size-dependent: The issue doesn’t necessarily correlate with large prompts - it occurred, for example, with a request containing ~12,000 prompt tokens and generating only 49 completion tokens

Questions

  1. Is this a known issue with PDF processing in gemini-2.0-flash?
  2. Could this be related to PDF processing latency on Google’s backend?

@naktyl ,

Welcome to the community, and thank you for bringing this up.

I tried to reproduce this but was not able to, either natively with google-genai or through LangChain.
Do you still see this issue?

Also, Gemini suggests using the Files API if the total request size exceeds 20 MB.
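A rough way to check whether a PDF is a candidate for the Files API rather than inline base64 is sketched below. It assumes an inline request limit of about 20 MB and assumes the limit applies to the encoded payload; both should be checked against the current Gemini documentation. The function names and the `other_bytes` parameter are hypothetical.

```python
import base64

# Assumed inline request limit (~20 MB); verify against current documentation.
INLINE_LIMIT_BYTES = 20 * 1024 * 1024


def base64_size(raw_len: int) -> int:
    """Length of the padded base64 encoding of raw_len bytes."""
    # Every 3 input bytes (rounded up) become 4 output characters.
    return 4 * ((raw_len + 2) // 3)


def should_use_files_api(pdf_len: int, other_bytes: int = 0,
                         limit: int = INLINE_LIMIT_BYTES) -> bool:
    """True if the encoded PDF plus the rest of the request exceeds the limit."""
    return base64_size(pdf_len) + other_bytes > limit
```

For example, a 16 MiB PDF already exceeds a 20 MiB limit once encoded, because base64 inflates it by about one third. Size is probably not the cause of the timeouts reported here, since the failing request had only ~12,000 prompt tokens.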

I still see this issue happening regularly.
Here is the code I’m using to call Gemini through LangChain:

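from langchain_core.messages import HumanMessage, SystemMessage
from langchain_google_genai import ChatGoogleGenerativeAI

# Excerpt from inside a class method, hence the use of self.
self.model = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    temperature=0,
    timeout=60,  # per-request timeout, in seconds
)

initial_system_message = SystemMessage(content="""The system message""")
initial_human_message = HumanMessage(
    content=[
        {
            # pdf_content holds the base64-encoded PDF
            "type": "media",
            "source_type": "base64",
            "data": pdf_content,
            "mime_type": "application/pdf",
        },
        {
            "type": "text",
            "text": "The human message",
        },
    ]
)

all_messages = [initial_system_message, initial_human_message]
response = self.model.invoke(all_messages)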

@naktyl ,

I have tried this many times with a PDF of about 120K tokens and am still not able to reproduce it.

If this is critical to your workflow, I would suggest you DM me your email or project ID so I can ask the team to have a look at the logs for your specific case.


@naktyl ,

Also, have you tried this with the latest 2.5 Flash endpoint? It was rolled out recently (6/17).
It is supposed to be better in general, with all the updates and improvements over the previous models.

Hey, I came across this while researching the same issue. I’ve been hitting these random timeouts and 503 GOAWAY errors too.
From what I’ve figured out, it’s likely a backend overload or queueing issue on Google’s side, not something wrong with the request itself. It seems especially common on the free tier or during peak times.
For now, the best workaround seems to be catching this particular error and retrying the request after a short delay.
I’m still looking into it, but figured I’d share what I’ve found so far.
Hope it helps!
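A minimal sketch of that catch-and-retry workaround, using exponential backoff with jitter. The names `is_transient` and `call_with_retry` are hypothetical, and matching on the strings "503" and "GOAWAY" in the exception message is an assumption based on the error text quoted in this thread, not a documented contract of LangChain or the Gemini API.

```python
import random
import time


def is_transient(exc: Exception) -> bool:
    """Heuristic: treat errors mentioning 503 or GOAWAY as retryable."""
    text = str(exc)
    return "503" in text or "GOAWAY" in text


def call_with_retry(fn, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fn(); on a transient error, wait and retry up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Re-raise non-transient errors, and give up after the last attempt.
            if not is_transient(exc) or attempt == max_attempts:
                raise
            # Exponential backoff (2s, 4s, 8s, ...) plus up to 50% jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
```

With the code posted earlier in the thread, this could be used as `call_with_retry(lambda: self.model.invoke(all_messages))`. A retry like this only helps if the underlying call fails reasonably quickly, so it would need to be combined with a short per-request timeout rather than a 600-second wait.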

Thanks for sharing! It has not happened for us since July 2nd, although we didn’t change anything. So I’m guessing something was fixed on Google’s side.

Okay, but I’m still facing the issue. I’m using the free tier, so as far as I understand, that’s likely the reason. Am I mistaken?

Yes, probably. I’m on a paid tier.