Good afternoon, I'm having trouble with my project: I can't reduce the latency of web searches. I'm currently using the Gemini 2.5 Flash model. The system is built with LangGraph: an orchestrator agent routes decisions and tool usage, and a final node synthesizes the response. Could someone suggest techniques or methods to reduce this latency?