Hi,
According to the official documentation, video/quicktime is listed as a supported MIME type for video understanding with Vertex AI models.
However, when analyzing a .mov video recorded on an iPhone 16 Pro Max using spatial audio (which is enabled by default), the API returns a generic HTTP 400 with "invalid argument" and no further details. This video includes an apac audio stream, as identified by FFmpeg.
If I switch the iPhone settings to record in stereo only - or re-encode the video using ffmpeg to remove the apac stream - the video is accepted and analyzed successfully.
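For anyone hitting the same issue, here is a minimal sketch of how the ffmpeg workaround could be scripted. It makes several assumptions: `ffprobe` and `ffmpeg` are on the PATH, the apac stream shows up in ffprobe's JSON output with either `codec_name` or `codec_tag_string` equal to `apac` (this may vary by FFmpeg version), and the file also contains a regular audio track to keep. The helper names (`is_apac`, `build_strip_command`, `probe_streams`) are made up for illustration. Note that it stream-copies the remaining tracks rather than doing a full re-encode.

```python
import json
import subprocess


def is_apac(stream: dict) -> bool:
    """Return True if an ffprobe stream entry looks like an apac audio stream.

    Assumption: the stream is identifiable by codec_name or codec_tag_string.
    """
    names = (stream.get("codec_name", ""), stream.get("codec_tag_string", ""))
    return stream.get("codec_type") == "audio" and "apac" in names


def build_strip_command(streams: list, src: str, dst: str) -> list:
    """Build an ffmpeg command that keeps video and non-apac audio streams."""
    cmd = ["ffmpeg", "-i", src]
    for s in streams:
        # Keep only video and audio; skip data/metadata tracks and apac audio.
        if s.get("codec_type") in ("video", "audio") and not is_apac(s):
            cmd += ["-map", f"0:{s['index']}"]
    # Copy the selected streams without re-encoding them.
    cmd += ["-c", "copy", dst]
    return cmd


def probe_streams(path: str) -> list:
    """Return the list of stream entries reported by ffprobe for a file."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-of", "json", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)["streams"]


def strip_apac(src: str, dst: str) -> None:
    """Write a copy of src to dst with any apac audio streams removed."""
    subprocess.run(build_strip_command(probe_streams(src), src, dst), check=True)
```

Calling `strip_apac("input.mov", "output.mov")` should produce a file without the apac stream, which can then be sent to Vertex. Whether this is the intended preprocessing step is exactly what I am asking about below.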
I believe this is either a validation bug or an undocumented limitation.
Could you please clarify:
- Is apac audio explicitly unsupported?
- Should .mov files from iOS with spatial audio be preprocessed before being sent to Vertex?
- Can the error response be improved to identify unsupported streams?
Thanks.
