Hello! I am currently building an automated classification and data extraction workflow using the Gemini API (v1beta), calling the models via HTTP requests (through n8n). I am encountering two major issues with gemini-3-pro-preview and gemini-3-flash-preview.
Issue 1: Extreme latency. Sometimes requests complete normally, but yesterday every request took 3-5 minutes. This happens even with very small payloads (e.g., a 60 KB image), as long as the Google Search tool is enabled. To avoid this latency, I tried falling back to the 2.5 models. While they are much faster, I cannot use responseSchema to force strict JSON output when the Google Search tool is enabled. The model often adds conversational text around the JSON, which breaks my automated pipeline.
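For reference, here is roughly the request body I send to the `:generateContent` endpoint from n8n. The prompt text and the schema fields below are simplified placeholders, not my real ones; the point is the combination of the `google_search` tool with `responseSchema` in `generationConfig`, which the 2.5 models don't seem to honor together:

```json
{
  "contents": [{
    "parts": [{ "text": "Classify this document and extract the fields." }]
  }],
  "tools": [
    { "google_search": {} }
  ],
  "generationConfig": {
    "responseMimeType": "application/json",
    "responseSchema": {
      "type": "OBJECT",
      "properties": {
        "category": { "type": "STRING" },
        "summary": { "type": "STRING" }
      },
      "required": ["category"]
    }
  }
}
```

With the 3.x preview models this request works (just very slowly); with the 2.5 models I have to drop `responseSchema` to use `google_search`, and then rely on prompt instructions alone for the JSON format.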
Questions: Are there known latency issues with the v3 preview endpoints?
Is there a way to get both the speed of 2.5 and strict JSON formatting while using web search?