Hi,
According to the official documentation, video/quicktime is listed as a supported MIME type for video understanding with Vertex AI models.
However, when analyzing a .mov video recorded on an iPhone 16 Pro Max with spatial audio (which is enabled by default), the API returns a generic HTTP 400 "invalid argument" error with no further details. FFmpeg identifies an apac audio stream in this video.
If I switch the iPhone settings to record in stereo only, or re-encode the video with ffmpeg to remove the apac stream, the video is accepted and analyzed successfully.
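For anyone hitting the same error, here is a rough sketch of the preprocessing workaround described above. It is not an official fix. It assumes ffprobe and ffmpeg are on the PATH, and that the apac stream appears as an audio stream whose codec_name or codec_tag_string is "apac" (this may vary between FFmpeg versions). The helper names (probe, split_audio, strip_command) are made up for illustration. It uses stream copy rather than a full re-encode, which I have not verified in every case.

```python
import json
import subprocess


def probe(path):
    """Run ffprobe on a file and return its stream description as a dict."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-of", "json", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def split_audio(probe_result):
    """Return (keep, drop): indices of non-apac and apac audio streams."""
    keep, drop = [], []
    for stream in probe_result.get("streams", []):
        if stream.get("codec_type") != "audio":
            continue
        # Assumption: apac shows up in the codec name or the codec tag string.
        is_apac = "apac" in (stream.get("codec_name"),
                             stream.get("codec_tag_string"))
        (drop if is_apac else keep).append(stream["index"])
    return keep, drop


def strip_command(src, dst, keep):
    """Build an ffmpeg command that keeps video plus the listed audio streams."""
    cmd = ["ffmpeg", "-i", src, "-map", "0:v"]
    for idx in keep:
        cmd += ["-map", f"0:{idx}"]
    # Stream copy: no re-encoding, the apac stream is simply not mapped.
    return cmd + ["-c", "copy", dst]
```

Typical use would be to call split_audio(probe("input.mov")), and if the drop list is non-empty, run subprocess.run(strip_command("input.mov", "output.mov", keep), check=True) before uploading output.mov.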
I believe this is either a validation bug or an undocumented limitation.
Could you please clarify:
- Is apac audio explicitly unsupported?
- Should .mov files from iOS with spatial audio be preprocessed before being sent to Vertex?
- Can the error response be improved to identify unsupported streams?
Thanks.