Frustrated: audio analysis was working great, now it's not. Help!

J_Ray · May 14, 2024, 1:33am

So, as you can see in the screenshots, I was able to have Gemini very accurately analyze several .mp3 files: music, one in Arabic, and the last garbled nonsense. It was able to “hear” each file and answer my prompts about it. As of a few hours ago, it won't work with any uploaded music file. It keeps saying it's a text-only model, and when I ask why it could analyze those files a few hours ago, it says it was “role playing” and just got into it. It just talks in loops. I've tried this in the same prompt as the earlier music classification and in a new prompt. It's frustrating, as this was the basis of an app I'm working on. Anyone got any input? It's funny, because ChatGPT could do this last summer: it could name a song, talk about it, anything. Then one day it said exactly this, that it's a text-only model. I'm so frustrated! @GoogleLLC

OrangiaNebula · May 14, 2024, 3:06am

Welcome aboard! I tried one of my standard test cases, the preamble to the US Constitution, and today Gemini 1.5 truncated the transcription to

We the people of the United States in order to form a more perfect union establish justice ensure domestic tranquility provide for the common defense P

It used to reliably generate the entire preamble, which is well recorded in the audio file.

Another audio file that used to work was rejected with a RECITATION block.

OrangiaNebula · May 18, 2024, 6:21pm

Updating the status of this issue. The preamble test case was resolved.
This audio file used to work reliably before the May 14 model update: Arthur the Rat – Dictionary of American Regional English – UW–Madison (actual audio file: Audio file in mp3 format).

Since the update, it has reliably produced a RECITATION block. The audio file is useful for evaluating the model, since a known-good transcript is also available (Reference transcript for audio file).
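
For anyone who wants to turn "it used to work" into a number, here is a minimal sketch of scoring a model transcript against a known-good reference with word error rate (WER). It is not tied to the Gemini API; the function name `word_error_rate` and the normalization (lowercase, punctuation dropped) are my own assumptions, and the sample strings are toy inputs, not real model output.

```python
import re


def _words(text: str) -> list[str]:
    # Assumed normalization: lowercase, keep only letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = _words(reference)
    hyp = _words(hypothesis)
    if not ref:
        raise ValueError("reference transcript has no words")

    # Classic dynamic-programming Levenshtein distance, computed over words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = cur[j - 1] + 1  # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


# Toy usage: a truncated transcript shows up as a high error rate.
print(word_error_rate("one two three four", "one two"))
```

A truncated transcription like the preamble case shows up as a run of deletions, so the score rises with how much of the audio was dropped.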

afirstenberg · May 19, 2024, 1:03am

The screen image shows a red triangle next to the Model response.

If you clicked on that - what did it say?
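
If you are calling the API rather than using the UI, roughly the same information should be in the response itself. Below is a rough sketch that works on the response parsed into a plain dict. It assumes the REST-style field names `promptFeedback.blockReason`, `candidates[].finishReason` and `candidates[].safetyRatings[]` as I understand them; the helper name `explain_block` and the sample dict are made up for illustration and are not a real log.

```python
def explain_block(response: dict) -> list[str]:
    """Collect human-readable reasons a response was blocked or cut short."""
    reasons = []

    # Assumed field: set when the prompt itself was blocked.
    block_reason = response.get("promptFeedback", {}).get("blockReason")
    if block_reason:
        reasons.append(f"prompt blocked: {block_reason}")

    for idx, candidate in enumerate(response.get("candidates", [])):
        finish = candidate.get("finishReason")
        # Anything other than a normal STOP is worth reporting (e.g. RECITATION, SAFETY).
        if finish and finish != "STOP":
            reasons.append(f"candidate {idx} stopped: {finish}")
        for rating in candidate.get("safetyRatings", []):
            if rating.get("blocked") or rating.get("probability") in ("MEDIUM", "HIGH"):
                reasons.append(
                    f"candidate {idx} flagged: {rating.get('category')} "
                    f"({rating.get('probability')})"
                )
    return reasons


# Hypothetical, hand-written response used only to exercise the helper.
sample = {
    "candidates": [
        {
            "finishReason": "RECITATION",
            "safetyRatings": [
                {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "probability": "HIGH"}
            ],
        }
    ]
}
for line in explain_block(sample):
    print(line)
```

Logging these fields for each failing file makes it easier to tell a RECITATION stop apart from a safety block.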

Madi · June 5, 2024, 11:49am

It said “Sexually Explicit” language.

Hosanna_Sookra · January 14, 2025, 11:15pm

Before Christmas I was also able to transcribe some audio files I had personally recorded, and it worked great. Now every file I input is recognized as hate speech or something bad, when it's just a film project for school with a little cussing here and there. Why so many pointless restrictions, Google?
