If you’re seeing dropped connections or “hung” requests with the @google/genai Node.js SDK, check your Node.js fetch timeouts before assuming it’s a Gemini problem.
Node 18 and later bundle undici to power the global fetch() API, which is what the SDK uses for all of its requests. The default undici settings are tuned for fast API calls, not for long-running generative AI requests:
- headersTimeout: 30 seconds. undici stops waiting for response headers after 30 seconds and aborts the request, surfacing only a generic "fetch failed" error. Image generation and editing with Gemini can take 60–120+ seconds before the model starts sending a response back, so if you're on Node 18/20 and not overriding this, you're almost certainly hitting this limit on longer jobs.
- bodyTimeout: 300 seconds. Less of an issue, but still worth setting explicitly for large image responses.
The SDK’s httpOptions.timeout field is separate — it controls an AbortController at the request level — but it doesn’t touch undici’s connection-level timeouts. You need to configure the global dispatcher directly.
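The request-level timeout is essentially an AbortController armed with a timer. A minimal sketch of that pattern (the `withTimeout` helper is hypothetical, not the SDK's actual internals):

```javascript
// Sketch of a request-level timeout: abort the operation after `ms`
// milliseconds. This is the AbortController pattern that an SDK-level
// timeout uses; it cannot extend undici's own connection-level timeouts.
async function withTimeout(ms, fn) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    // `fn` receives the signal and should pass it to fetch() etc.
    return await fn(controller.signal);
  } finally {
    clearTimeout(timer);
  }
}
```

Usage would look like `withTimeout(300_000, (signal) => fetch(url, { signal }))`. Note the whole request is aborted when the timer fires, regardless of how far along undici is.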
Fix:

npm install undici

```js
import { Agent, setGlobalDispatcher } from "undici";

setGlobalDispatcher(
  new Agent({
    headersTimeout: 330_000, // 5.5 minutes
    bodyTimeout: 660_000, // 11 minutes
    keepAliveTimeout: 60_000,
    connections: 20,
  })
);
```
Put this at the very top of your server entry point, before any SDK clients are initialized. It applies to all fetch() calls in the process, including the SDK’s internal ones.
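If you'd rather not change timeouts for the whole process, undici's fetch() also accepts a per-request dispatcher. This only helps for requests you make yourself, not the SDK's internal calls (which is why the global dispatcher is the fix for SDK traffic), and note the dispatcher init option is a non-standard undici extension. A sketch, with a placeholder URL:

```javascript
import { Agent, fetch } from "undici";

// A dedicated agent with long timeouts, used only where you opt in.
// Everything else in the process keeps undici's defaults.
const longAgent = new Agent({
  headersTimeout: 330_000,
  bodyTimeout: 660_000,
});

// Hypothetical endpoint for illustration only.
const res = await fetch("https://example.invalid/generate", {
  dispatcher: longAgent,
});
```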
Also set an explicit SDK timeout so requests don’t hang indefinitely if Gemini itself stalls:
```js
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({
  apiKey: process.env.GEMINI_API,
  httpOptions: { timeout: 300_000 }, // 5 minutes
});
```
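Because these timeout failures are often transient (a retry at a less congested moment usually succeeds, as noted below), it's worth wrapping SDK calls in a small retry with backoff. A sketch with a hypothetical `retryWithBackoff` helper:

```javascript
// Hypothetical helper: retry an async operation with exponential backoff.
// Useful around SDK calls that can fail on transient timeouts.
async function retryWithBackoff(fn, { attempts = 3, baseDelayMs = 1000 } = {}) {
  let lastErr;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (i < attempts - 1) {
        // Wait baseDelayMs, 2x, 4x, ... between attempts.
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastErr;
}
```

You'd use it as, for example, `retryWithBackoff(() => ai.models.generateContent({ ... }))`, tuning attempts and delay for your job lengths.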
Symptoms of the undici headersTimeout issue:
- Jobs fail consistently at or just after the 30-second mark
- Errors surface as hung or stuck requests rather than clean error messages
- Failures cluster: multiple in-flight requests fail simultaneously when they all hit the timeout at the same time
- Retrying the same request usually succeeds because the retry lands at a less congested moment
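One diagnostic hint: fetch() wraps undici failures in a generic `TypeError: fetch failed`, with the underlying error attached as `cause`. A small classifier (hypothetical helper, checking the error codes undici uses for its two timeouts):

```javascript
// Hypothetical helper: detect undici connection-level timeouts.
// fetch() surfaces them as TypeError("fetch failed") with the real
// undici error (and its code) on the `cause` property.
function isUndiciTimeout(err) {
  const cause = err?.cause ?? err;
  return (
    cause?.code === "UND_ERR_HEADERS_TIMEOUT" ||
    cause?.code === "UND_ERR_BODY_TIMEOUT"
  );
}
```

Logging this distinction in your error handler makes it obvious whether a failure came from undici's defaults or from the API itself.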
This is not a Gemini API bug — it’s a Node.js default that made sense before LLM workloads existed.