Undocumented Behavior: Gemini 2.5 Flash Can Parse DOCX Files and Arbitrary Binary Data

Nils_Koehler · January 22, 2026, 9:16am

Description

According to the official Vertex AI documentation for Gemini 2.5 Flash, the only supported MIME types for document inputs are:

application/pdf
text/plain
[docs.cloud…google.com]

Additionally, a Google representative stated in the forum that unsupported file types are extracted as plain text, without preserving structure.
[discuss.ai…google.dev]
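To make "text extraction without structure" concrete, here is a minimal sketch of what such an extraction could look like for a DOCX file. I have no knowledge of how the Gemini backend actually handles these files; the function name `docx_to_plain_text` is my own, and the approach (unzip the archive, read `word/document.xml`, keep only the text runs) is an assumption for illustration only.

```python
import io
import zipfile
from xml.etree import ElementTree

# WordprocessingML namespace used by word/document.xml inside a DOCX archive.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_plain_text(docx_bytes: bytes) -> str:
    """Return the paragraph text of a DOCX file, one paragraph per line.

    Hypothetical helper: headings, tables, styles, and images are discarded,
    so only unstructured text survives. This is NOT a description of what
    the Gemini backend does.
    """
    # A DOCX file is a ZIP archive; the main body lives in word/document.xml.
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ElementTree.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Concatenate every text run (<w:t>) that belongs to this paragraph.
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

If the backend does something similar, that would explain why questions about the content can be answered while layout information is lost.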

Observed Behavior (Potential Issue)

While testing the API, I observed the following:

- I can upload a DOCX file as raw binary data, and Gemini 2.5 Flash answers questions about its content, even though DOCX is not listed as a supported MIME type.
- I then uploaded the raw binary of a compiled C program (.out / ELF). The model extracted and returned the main() function signature, even though:
  - Executable binaries are not documented as supported inputs.
  - The API would be expected to reject unsupported MIME types, as in the StackOverflow report of .docx uploads causing a 400 Unsupported MIME type error.
    [stackoverflow.com]

This suggests the model backend attempts automatic text extraction from arbitrary binary data, which is not documented.
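For comparison, here is a small sketch of the kind of string extraction I suspect may be happening, similar in spirit to the Unix `strings` utility. The function name and the minimum length of 4 are my own choices, not anything taken from Google's documentation. Note that an unstripped ELF normally keeps symbol names such as `main`, while parameter types are generally only present when debug information is included, so the model may also be partly inferring the signature.

```python
import re


def extract_printable_strings(data: bytes, min_length: int = 4) -> list[str]:
    """Return runs of printable ASCII characters found in binary data.

    Hypothetical helper for illustration; it only mimics what a naive
    text extractor could recover from an arbitrary binary file.
    """
    # Match min_length or more consecutive bytes in the printable range
    # 0x20 (space) through 0x7e (tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


if __name__ == "__main__":
    # Tiny synthetic sample, not a real executable.
    sample = b"\x7fELF\x02\x01\x00\x00main\x00\x00printf\x00ab\x00hello world\x00"
    print(extract_printable_strings(sample))
```

Running a helper like this against the same binary that was uploaded would show which identifiers and literals a plain extractor could see, which makes it easier to judge how much of the model's answer comes from the file.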

Why This May Be Important

The behavior is not described in the official documentation.
It differs from what users would expect based on the stated MIME limitations.
It may have security or privacy implications if the model automatically extracts strings from binary files.
It is unclear whether this is:
- intended behavior,
- an undocumented feature,
- or a backend oversight.

Request

Could Google clarify:

Whether extracting text from unsupported formats (including binary executables) is intended behavior?
Whether this automatic extraction should be considered safe/production‑ready?
Whether future documentation will explicitly address this behavior?
Whether the API should reject non‑PDF, non‑text files more strictly?
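Until this is clarified, one option is to enforce the documented limits on the client side before uploading anything. The sketch below is only an illustration under my own assumptions: the names `sniff_format` and `is_documented_input` and the non-MIME labels are hypothetical, and the check relies on well-known file signatures (`%PDF-` for PDF, `\x7fELF` for ELF, `PK\x03\x04` for ZIP containers such as DOCX) rather than on anything the API provides.

```python
# MIME types listed as supported for document inputs in the post above.
DOCUMENTED_TYPES = {"application/pdf", "text/plain"}


def sniff_format(data: bytes) -> str:
    """Guess a coarse format label from the leading bytes of a file.

    Hypothetical helper: labels other than the two documented MIME types
    are informal names chosen for this example.
    """
    if data.startswith(b"%PDF-"):
        return "application/pdf"
    if data.startswith(b"\x7fELF"):
        return "elf-binary"
    if data.startswith(b"PK\x03\x04"):
        # ZIP container; DOCX, XLSX, and PPTX files all start this way.
        return "zip-container"
    if b"\x00" in data:
        return "unknown-binary"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown-binary"
    return "text/plain"


def is_documented_input(data: bytes) -> bool:
    """Return True only for content that looks like PDF or plain text."""
    return sniff_format(data) in DOCUMENTED_TYPES
```

A caller could run `is_documented_input(payload)` and refuse to send the request when it returns False. This does not answer whether the server should reject such files, but it keeps an application within the documented behavior regardless.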

Thank you!
