500 Errors for Gemini Flash Lite with multi turn tool calling

We’ve recently swapped our Agentic Tool decision model from Kimi K2 to Gemini Flash Lite.

Basically, the model has access to 5 available tools

  • librarySearch - internal rag
  • academicSearch - exa search for papers
  • internetSearch - exa for internet
  • youtubSearch - exa for youtube
  • stop - this is an actual tool that the LLM can decide to call to stop the response

We typically encounter error 500 in the following scenario.

1st run → model decides a tool to call → executes tool
2nd run → tool call and tool result are added to history → 500 error sporadically

Unrelated - We’ve also discovered separately that implicit caching is unreliable in these cases.