I’m seeing intermittent 400 INVALID_ARGUMENT errors when passing YouTube URLs as video input to gemini-2.5-flash via Vertex AI (Python SDK, us-central1). The error is transient — the exact same request succeeds on retry — but the API returns it as a 400.
Error response:
{"error": {"code": 400, "message": "Request contains an invalid argument.", "status": "INVALID_ARGUMENT"}}
REPRODUCTION — 100 identical requests, same video, same prompt:
I tested a single public YouTube video with 100 generate_content calls using two Part construction patterns. The request payload never changes between attempts.
Pattern A (FileData + VideoMetadata):
from google.genai import types  # google-genai SDK

part = types.Part(
    file_data=types.FileData(file_uri=url, mime_type="video/mp4"),
    video_metadata=types.VideoMetadata(start_offset="0s", end_offset="30s"),
)
Pattern B (from_uri + VideoMetadata):
part = types.Part.from_uri(file_uri=url, mime_type="video/mp4")
part.video_metadata = types.VideoMetadata(start_offset="0s", end_offset="30s")
Pattern A (FileData + VideoMetadata): 91 OK, 6 FAIL_400, 2 FAIL_504, 1 timeout
Pattern B (from_uri + VideoMetadata): 95 OK, 3 FAIL_400, 2 timeouts
The 400s are scattered with no visible pattern (attempts 11, 15, 31, 59, 82, and 88, which matches the count of six Pattern A failures). I also ran a broader test across 95 distinct public YouTube videos and saw the same behavior: videos that return 400 on one attempt succeed on the next.
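For anyone who wants to reproduce the tallies, here is a minimal sketch of the kind of harness that produces them. It is not my exact script: `classify` and `run_trials` are names made up for this post, `call` stands in for whatever function issues the generate_content request, and reading the status from a `code` attribute on the exception is an assumption about how the SDK surfaces errors.

```python
from collections import Counter
from typing import Callable


def classify(exc: Exception) -> str:
    """Map an exception to a label such as FAIL_400 or TIMEOUT."""
    if isinstance(exc, TimeoutError):
        return "TIMEOUT"
    # Assumption: API errors expose the HTTP status as an integer `code`.
    code = getattr(exc, "code", None)
    return f"FAIL_{code}" if isinstance(code, int) else "FAIL_OTHER"


def run_trials(
    call: Callable[[], object], attempts: int = 100
) -> tuple[Counter, dict[str, list[int]]]:
    """Invoke `call` repeatedly; return outcome counts and failing attempt numbers."""
    tally: Counter = Counter()
    failures: dict[str, list[int]] = {}
    for attempt in range(1, attempts + 1):  # 1-based, like the attempt numbers above
        try:
            call()
            outcome = "OK"
        except Exception as exc:
            outcome = classify(exc)
            failures.setdefault(outcome, []).append(attempt)
        tally[outcome] += 1
    return tally, failures
```

Passing a zero-argument function that sends the identical request each time gives per-outcome counts plus the attempt numbers at which each failure type occurred.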
WHY THIS SHOULD NOT BE A 400:
- The request is valid. The exact same bytes avoid the 400 on 94-97% of attempts. A 400 means "your request is malformed", and mine isn't: it works on retry.
- 400 is non-retryable by convention. Most HTTP clients, retry frameworks, and circuit breakers treat 400 as permanent. The google-genai SDK's own retry logic skips it too.
- It causes silent data loss. I process thousands of YouTube videos daily and was losing ~3-6% to an error that would have succeeded on a second attempt. I've had to add a workaround that special-cases this specific 400 as retryable.
- Other transient Gemini errors use correct codes. Rate limits return 429, overload returns 503, timeouts return 504. This 400 is the odd one out.
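For reference, the workaround looks roughly like the sketch below. It is a simplified illustration, not my production code: `is_retryable` and `call_with_retry` are hypothetical names, the exception is assumed to expose `code`, `status`, and `message` attributes, and the match on the generic message text is a heuristic that could also catch a genuinely malformed request.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

RETRYABLE_CODES = {429, 503, 504}
GENERIC_400_MESSAGE = "Request contains an invalid argument."


def is_retryable(exc: Exception) -> bool:
    """Treat the usual transient codes, plus the generic 400, as retryable."""
    # Assumption: the error object carries `code`, `status`, and `message`.
    code = getattr(exc, "code", None)
    if code in RETRYABLE_CODES:
        return True
    if code == 400:
        status = getattr(exc, "status", None)
        message = getattr(exc, "message", None) or str(exc)
        return status == "INVALID_ARGUMENT" and GENERIC_400_MESSAGE in message
    return False


def call_with_retry(
    call: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying retryable errors with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            # Give up on the last attempt or on errors that look permanent.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

The cost of this approach is that a request that really is invalid gets retried `max_attempts` times before the error surfaces, which is exactly the ambiguity a retryable status code would remove.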
ASK
Could this return 503 (or a different retryable status) when the failure is server-side during YouTube video ingestion? I have CSV results across 95 videos with metadata (duration, region, definition) showing no correlation between video properties and failure rate — happy to share.