Summary
Uploading video files (MP4, WebM, etc.) to the Google AI Studio web interface consistently fails. The system either gets permanently stuck on “Extracting…” or returns a 500 Internal Server Error. This makes the advertised multimodal video analysis feature completely unusable through the web UI.
Environment
- Platform: Google AI Studio Web UI (aistudio.google.com)
- Browser: Google Chrome (latest stable), also reproduced on Edge
- OS: Windows 11
- Account Type: Paid tier (pay-as-you-go billing enabled)
- Date: April 2026 (ongoing issue for months)
Steps to Reproduce
- Go to aistudio.google.com
- Create a new prompt or open an existing chat
- Click the attachment/upload button
- Select a video file (MP4, WebM, etc.) and wait for it to be processed
Actual Behavior
- One of the following occurs:
- Infinite “Extracting…” loop: The progress indicator spins indefinitely (tested by waiting 30+ minutes) and never completes
- 500 Internal Server Error: The server returns HTTP 500 after a variable amount of processing time
- Silent failure: The upload appears to complete, but the video content is not accessible for prompting
Impact
- This is a critical bug for paying users who rely on multimodal video analysis
- The same video files work correctly when processed via the Gemini API directly (using the google.generativeai Python SDK with upload_file())
- This confirms the issue is specific to the AI Studio web frontend, not the underlying Gemini model infrastructure
Community Reports
- Multiple developers on Reddit (r/GoogleGeminiAI, r/singularity) and this forum have reported identical symptoms since late 2025, indicating this is a systemic, unresolved issue rather than an isolated case.
Workaround
- Using the Gemini API Python SDK directly (genai.upload_file()) successfully processes the same video files. However, this defeats the purpose of having a web-based interface for quick prototyping and testing.
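For anyone who needs the workaround in the meantime, here is a rough sketch of what it might look like with the google.generativeai SDK. It is not an official recipe. The helper names (wait_until_active, describe_video), the GEMINI_API_KEY environment variable, the model name, and the polling and timeout values are all assumptions made for illustration; adjust them to your own setup.

```python
import os
import time


def wait_until_active(get_file, name, poll_seconds=5.0, timeout_seconds=600.0,
                      sleep=time.sleep):
    """Poll get_file(name) until the uploaded file leaves the PROCESSING state.

    get_file is injected so the polling logic can be exercised without the SDK.
    """
    waited = 0.0
    file = get_file(name)
    while file.state.name == "PROCESSING":
        if waited >= timeout_seconds:
            raise TimeoutError(f"{name} still processing after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
        file = get_file(name)
    if file.state.name != "ACTIVE":
        # Any state other than ACTIVE (for example FAILED) is treated as an error.
        raise RuntimeError(f"{name} ended in state {file.state.name}")
    return file


def describe_video(path, prompt, model_name="gemini-1.5-flash"):
    """Upload a video through the API and ask the model about it.

    model_name is a placeholder assumption; use whichever model you have access to.
    """
    # Imported lazily so the polling helper above has no SDK dependency.
    import google.generativeai as genai

    # Assumption: the API key is stored in the GEMINI_API_KEY environment variable.
    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    uploaded = genai.upload_file(path=path)
    ready = wait_until_active(genai.get_file, uploaded.name)
    model = genai.GenerativeModel(model_name)
    return model.generate_content([ready, prompt]).text
```

Calling describe_video("clip.mp4", "Summarize this video.") with a local file (the file name here is only an example) would upload it, wait for processing to finish, and return the model's text response. The explicit timeout is there so a stuck upload raises an error instead of spinning forever like the web UI does.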
Request
- Please prioritize fixing the AI Studio web UI’s video processing pipeline. As a paying customer, the inability to use a core advertised feature through the primary interface is unacceptable.