YouTube URL video input returns intermittent 400 INVALID_ARGUMENT — should be 503 or retryable error code

I’m seeing intermittent 400 INVALID_ARGUMENT errors when passing YouTube URLs as video input to gemini-2.5-flash via Vertex AI (Python SDK, us-central1). The error is transient — the exact same request succeeds on retry — but the API returns it as a 400.

Error response:

{"error": {"code": 400, "message": "Request contains an invalid argument.", "status": "INVALID_ARGUMENT"}}

REPRODUCTION — 100 identical requests, same video, same prompt:

I tested a single public YouTube video with 100 generate_content calls using two Part construction patterns. The request payload never changes between attempts.

Pattern A (FileData + VideoMetadata):

from google.genai import types

# url is the public YouTube watch URL under test
part = types.Part(
    file_data=types.FileData(file_uri=url, mime_type="video/mp4"),
    video_metadata=types.VideoMetadata(start_offset="0s", end_offset="30s"),
)


Pattern B (from_uri + VideoMetadata):

# Same part built with the from_uri helper; video_metadata is assigned afterwards
part = types.Part.from_uri(file_uri=url, mime_type="video/mp4")
part.video_metadata = types.VideoMetadata(start_offset="0s", end_offset="30s")


Results, 100 attempts per pattern:

Pattern A (FileData + VideoMetadata): 91 OK, 6 FAIL_400, 2 FAIL_504, 1 timeout
Pattern B (from_uri + VideoMetadata): 95 OK, 3 FAIL_400, 2 timeouts

The 400s are scattered with no visible pattern (attempts 11, 15, 31, 59, 82, 88). A broader test across 95 distinct public YouTube videos showed the same behavior: a video that returns 400 on one attempt succeeds on the next.
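For anyone who wants to reproduce the counts, the attempt loop can be sketched roughly as below. This is a minimal sketch and not the exact script I ran: the names `classify` and `tally` are illustrative, it assumes SDK errors expose an integer `code` attribute, and it assumes client-side timeouts surface as the built-in `TimeoutError` (adapt that check if your HTTP stack raises its own exception type).

```python
from collections import Counter
from typing import Callable


def classify(exc: Exception) -> str:
    """Map an exception to a bucket label such as FAIL_400 or timeout."""
    if isinstance(exc, TimeoutError):
        return "timeout"
    # Assumption: API errors carry the HTTP status as an integer `code` attribute.
    code = getattr(exc, "code", None)
    return f"FAIL_{code}" if isinstance(code, int) else "FAIL_OTHER"


def tally(call: Callable[[], object], attempts: int = 100) -> Counter:
    """Invoke the same zero-argument call repeatedly and count outcomes."""
    results: Counter = Counter()
    for _ in range(attempts):
        try:
            call()
            results["OK"] += 1
        except Exception as exc:  # every failure is bucketed, none are re-raised
            results[classify(exc)] += 1
    return results
```

With an existing `client` and `prompt` (both assumed here), a run would look like `tally(lambda: client.models.generate_content(model="gemini-2.5-flash", contents=[part, prompt]))`, with `part` built by either pattern above.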

WHY THIS SHOULD NOT BE A 400:

  1. The request is valid. The exact same bytes go through without this error on 94-97% of attempts. A 400 means “your request is malformed”; mine isn’t, because it works on retry.

  2. 400 is non-retryable by convention. HTTP clients, retry frameworks, and circuit breakers generally treat 400 as permanent. The google-genai SDK’s own retry logic skips it too.

  3. It causes silent data loss. I process thousands of YouTube videos daily and was losing ~3-6% to an error that would’ve succeeded on a second attempt. I’ve had to add a workaround that special-cases this specific 400 as retryable.

  4. Other transient Gemini errors use correct codes. Rate limits return 429, overload returns 503, timeouts return 504. This 400 is the odd one out.
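The workaround mentioned in point 3 looks roughly like the sketch below. It is a simplified illustration, not my production code: `is_transient` and `call_with_retry` are hypothetical names, the backoff values are arbitrary, and it assumes the raised error exposes `code` and `status` attributes matching the JSON error body shown earlier.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Status codes that are conventionally safe to retry.
RETRYABLE_CODES = {429, 503, 504}


def is_transient(exc: Exception) -> bool:
    """Return True for errors worth retrying, including the special-cased 400."""
    # Assumption: API errors expose `code` (int) and `status` (str) attributes.
    code = getattr(exc, "code", None)
    status = getattr(exc, "status", None)
    if code in RETRYABLE_CODES:
        return True
    # Special case: treat 400 INVALID_ARGUMENT as retryable for YouTube input.
    return code == 400 and status == "INVALID_ARGUMENT"


def call_with_retry(
    call: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying transient failures with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            # Give up on the last attempt or on errors that are not transient.
            if attempt == max_attempts or not is_transient(exc):
                raise
            sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay))
    raise AssertionError("unreachable")
```

The cost of this approach is that a genuinely malformed request is also retried until `max_attempts` runs out, which is exactly why a distinct retryable status from the server would be preferable.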

ASK:

Could this return 503 (or a different retryable status) when the failure is server-side during YouTube video ingestion? I have CSV results across 95 videos with metadata (duration, region, definition) showing no correlation between video properties and failure rate — happy to share.