Hi everyone,
I’ve been working with the Gemini API, and I appreciate that it allows uploading media files separately from the prompt input, making media reusable across multiple requests and prompts (as noted in their documentation). One great feature I’ve found is that when uploading a video file, a field called videoMetadata is returned, which includes the video duration.
However, I noticed that when an audio file is uploaded, there’s no equivalent audioMetadata to provide details like the audio duration.
I’m curious to know whether Google plans to add support for this feature in the near future, as it would greatly improve the handling of audio files in the API.
Has anyone else come across this or found a workaround?
I’d love to hear your thoughts.
Thanks in advance!
Hi @Mamoun_Hourani
Welcome to the community.
Interesting! I will escalate this feature request.
Thank you for your response @Susarla_Sai_Manoj !
In our use-case, we use our API servers as a proxy to forward audio and video streams from customers directly to the Google AI Files API. This setup is crucial for ensuring HIPAA and GDPR compliance by avoiding the need to store or process sensitive files on our server, thereby minimizing security risks and maintaining regulatory compliance.
We rate limit our customers’ requests based on the duration of media processed, rather than tokens, and having direct access to the audio duration from the API would greatly streamline our operations. Currently, because no audioMetadata is provided, we are forced to store the files in object storage and add them to a queue for processing, one by one, only to retrieve the audio duration. This introduces significant overhead in processing time and delays the overall workflow.
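For illustration only, here is a minimal sketch of the kind of duration lookup described above, done in memory rather than through object storage. It assumes the payload is an uncompressed PCM WAV file and uses Python's standard wave module; compressed formats such as MP3 or AAC would need a different parser. The function names wav_duration_seconds and make_silent_wav are hypothetical helpers, not part of any Google API.

```python
import io
import wave


def wav_duration_seconds(data: bytes) -> float:
    """Return the duration in seconds of an in-memory PCM WAV payload.

    Assumption: `data` is a complete, uncompressed WAV file.
    """
    with wave.open(io.BytesIO(data), "rb") as wav_file:
        frame_count = wav_file.getnframes()
        frame_rate = wav_file.getframerate()
    # Duration is the number of sample frames divided by frames per second.
    return frame_count / frame_rate


def make_silent_wav(seconds: float, frame_rate: int = 8000) -> bytes:
    """Build a mono, 16-bit silent WAV in memory (hypothetical test helper)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(frame_rate)
        wav_file.writeframes(b"\x00\x00" * int(seconds * frame_rate))
    return buffer.getvalue()


if __name__ == "__main__":
    sample = make_silent_wav(2.0)
    print(f"duration: {wav_duration_seconds(sample):.2f} s")
```

This still requires the proxy to hold the audio bytes in memory, so it does not replace a duration field returned by the API itself.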
Additionally, audio duration is key for rate limiting. Without it, we face challenges in efficiently managing our API usage to ensure we don’t overload Google’s systems with large payloads.
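As a rough sketch of what duration-based rate limiting could look like once a duration value is available, the example below budgets seconds of media per customer over a sliding window. The class name DurationRateLimiter, its parameters, and the limits used are all assumptions for illustration, not our production setup or anything provided by the Gemini API.

```python
import time
from collections import defaultdict, deque


class DurationRateLimiter:
    """Sliding-window limiter that budgets seconds of media per customer.

    Hypothetical illustration; the names and limits are assumptions.
    """

    def __init__(self, max_seconds: float, window_seconds: float, clock=time.monotonic):
        self.max_seconds = max_seconds        # media seconds allowed per window
        self.window_seconds = window_seconds  # length of the sliding window
        self.clock = clock                    # injectable clock for testing
        # customer_id -> deque of (timestamp, media_seconds)
        self._usage = defaultdict(deque)

    def try_consume(self, customer_id: str, media_seconds: float) -> bool:
        """Record the request and return True if it fits the remaining budget."""
        now = self.clock()
        entries = self._usage[customer_id]
        # Drop entries that have aged out of the window.
        while entries and now - entries[0][0] >= self.window_seconds:
            entries.popleft()
        used = sum(seconds for _, seconds in entries)
        if used + media_seconds > self.max_seconds:
            return False
        entries.append((now, media_seconds))
        return True


if __name__ == "__main__":
    limiter = DurationRateLimiter(max_seconds=600, window_seconds=3600)
    print(limiter.try_consume("customer-a", 400))
    print(limiter.try_consume("customer-a", 300))
```

The limiter needs the media duration before the request is forwarded, which is why a duration field on the upload response matters.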
Having a feature similar to videoMetadata for audio files would not only improve rate-limit accuracy and reduce processing overhead but also help us maintain optimal Google AI API usage.
Thanks again for escalating this feature request!
And big kudos to the Google team for the amazing work on the AI APIs so far! We’re really impressed with the capabilities and excited to see how they keep evolving.