Gemini: A Constant Slew of Current Issues and Bugs (Thoughts?)

TLDR is below. :smiling_face:

Hi there! This is my second post on the forums, and I've decided to address some issues that are quite noticeable, and either very annoying or just plain unsatisfying, when using Gemini on mobile devices… but also just in general. If you have concerns of your own, please raise them with Google, or reply here if this gets traction.


  1. Emulated Fluctuation: I don't have as big an issue with this since it's not quite the standard yet, but now that Google is advertising a new system for proper fluctuation and natural tonal emulation… it SHOULD be a default in the standard systems, right? At least in Gemini Live (which runs on a 1.5 Flash model). We already have it in NotebookLM's new interactive mode, and considering there's no usage limit there… I don't really see a reason to hold it back anymore. Adding the more fluid system would not only make responses feel more natural and appropriate for the situation, but also make recognition less prone to misinterpreting words than the current speech-to-text and text-to-speech pipeline is. We're starting a new year, and if they can implement it in their other tools with essentially unlimited use, and those are pretty much the exact pieces we'd want in the main Gemini app… I don't see a reason why not (besides development time, which arguably wouldn't be that long). Users want it, and it would absolutely tower over the other competitors in the AI scene like ChatGPT, which charges a ridiculous $200 a month for unlimited use of its advanced voice mode. I'm almost certain users would switch their average $20-a-month subscriptions over to Google One for that alone.

  2. Memory Issues: Maybe on the surface it seems like a complex issue, but in reality… it's actually quite a simple fix. I absolutely understand that a lot of this is supposed to be resolved when Astra is implemented within the Gemini app (with the DeepMind team boasting a 10-minute short-term memory context window, and an unspecified but presumably lengthy long-term memory for specific retention requests), but I'm getting quite fed up with the models (including both available 2.0 models) losing the train of thought so fast it's impressive. I could be talking about one thing, switch topics very subtly, and then when I bring the subject back up… it's like a teenage daughter who was texting on her phone while her parents told her which chores to do: she retained barely any of it, then hallucinates a response to sound like she was listening while being visibly off base. This could be solved in a wide variety of ways, but for right now I'll give one: a simple underlying priority system (I've put a rough sketch of the idea below, after this list). It could file information about the current subject into a store and order it as a hierarchy. If the topic is bikes and mechanics, it reorders that to the top… and uses the topmost information in the hierarchy to affect its weights. If the user briefly switches to how cars work, yet still mechanics… mechanics remains at the top as the MAIN subject, current car information sits right below it… and then bikes right below that, since it was pretty recent. The longer the conversation goes on, the highest-priority and most on-topic information is retained, while the other stuff slowly sinks down the hierarchy and, if determined to be unnecessary, eventually falls off. Maybe this sounds too complex on the surface… but in practice, even a really basic, trivial implementation of this would beat the current system, which absolutely dissolves any awareness of the topic once it gets switched… or even in the middle of it! It seriously does; if you aren't familiar with it, go test it for yourself. For a company like Google, this is quite embarrassing… If only one thing gets SOME sort of a fix, let it be this, so I and thousands of others can stop ripping our hair out when it forgets to tell me to add an ingredient midway through a beef wellington recipe, and then a message later thinks we're making pasta. No problem with pasta, but not when I'M RIGHT IN THE MIDDLE OF A F#%&ING BEEF WELLINGTON. That's a me issue for not using more reliable sources, but my point stands. We move on.

  3. Deep Research: Honestly, not a lot of issues with this. The only thing I'm actually "complaining" about is that it's not on mobile. Seriously, it's really handy for resource gathering and pulling together proper information. If I hadn't been making a damn beef wellington on my phone, I would've absolutely used this for it. It's handy, it's informative, and it plays to Gemini's strengths, and although some areas of its output could be better tuned, like the potential addition of links and maybe embedded YouTube videos… it's overall really impressive, and I wish I could use it away from home without being forced to awkwardly open Chrome and enable the desktop site just to get to it. Here's to hoping it comes to mobile soon. :pray:

  4. Headphone Issue: There's a specific issue on mobile I found: when you have headphones connected and start a Gemini Live session, the audio ends up getting run through your call audio for some reason, and the quality sounds like a really bad phone call (my guess at the cause is sketched below, after this list). The Live function just isn't optimized for headphone/earbud support. A slight workaround I found: if you switch the audio from headphones to speaker, and then back to headphones, it works… but if your headphones or earbuds don't have a dedicated volume button, you're stuck at whatever volume it's at. You can't even turn it up or down beforehand from the phone with the default speaker selected. Also, if you pause or close the Live session at any time and then start it again… it reverts back to call audio and sounds like a radio again. Honestly, just something they need to fix in a patch.

  5. LIVE on Desktop: This is the last thing I'm going to touch here. Having the Live function on desktop, after it gets tuned and updated a bit more, would be amazing. The stream realtime function in AI Studio is cool but has its own flaws (which I went over in my first post if you want more detail; hopefully that bug gets fixed), but genuine live voice chat in a desktop app would be so helpful for work. If this is truly the age of agentic AI and multimodal capability, where technology and humans finally start to work not just as user and tool but as partners… then this is definitely a step they need to make. It would help Google compete in the AI race by pulling more attention their way, and the appeal would be broad. It's just a personal niche thing for me, but it would be really nice… especially if they introduce the fluctuation and natural tone recognition and responses.
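
Since I promised a sketch in point 2: here's a minimal, purely hypothetical illustration of the priority-hierarchy idea. Nothing here reflects how Gemini actually manages context; the class name, decay factor, and capacity cap are all invented for illustration:

```kotlin
// Purely hypothetical sketch of the topic-priority idea from point 2.
// Nothing here reflects Gemini's real internals; names and numbers are invented.
class TopicStack(private val capacity: Int = 8) {

    // Topic -> relevance score; higher score = nearer the top of the hierarchy.
    private val scores = LinkedHashMap<String, Double>()

    // Call once per user turn with whatever topics were detected in it.
    fun observe(topics: List<String>) {
        scores.replaceAll { _, s -> s * 0.8 }        // decay: stale topics drift down
        for (t in topics) {
            scores[t] = (scores[t] ?: 0.0) + 1.0     // bump topics mentioned this turn
        }
        while (scores.size > capacity) {             // over capacity: weakest "falls off"
            scores.remove(scores.entries.minByOrNull { it.value }!!.key)
        }
    }

    // Highest-scoring topics first; these would get the most weight in context.
    fun ranked(): List<String> =
        scores.entries.sortedByDescending { it.value }.map { it.key }
}

fun main() {
    val stack = TopicStack()
    stack.observe(listOf("mechanics", "bikes"))  // talking about bikes
    stack.observe(listOf("mechanics", "cars"))   // subtle switch to cars
    println(stack.ranked())                      // [mechanics, cars, bikes]
}
```

Even something this trivial keeps "mechanics" on top while cars and bikes trade places below it, which is exactly the behavior I described above.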
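
And for point 4, my guess at the cause, and I stress it's only a guess: on Android, Bluetooth audio has a narrowband "call" profile (SCO/HFP) and a high-quality media profile (A2DP), and radio-quality playback is the classic symptom of being stuck on the call profile. These are real Android APIs, but everything about how Gemini Live actually routes its audio is assumed on my part:

```kotlin
import android.content.Context
import android.media.AudioAttributes
import android.media.AudioManager
import android.media.MediaPlayer

// Hedged guess at point 4: replies sound like a phone call because playback
// stays on the narrowband Bluetooth SCO ("call") link instead of A2DP.
fun playReplyOverA2dp(context: Context, player: MediaPlayer) {
    val am = context.getSystemService(Context.AUDIO_SERVICE) as AudioManager
    am.stopBluetoothSco()               // release the narrowband call link
    am.mode = AudioManager.MODE_NORMAL  // leave "in call" audio mode
    // Must be set before prepare(): media usage routes playback over A2DP.
    player.setAudioAttributes(
        AudioAttributes.Builder()
            .setUsage(AudioAttributes.USAGE_MEDIA)
            .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
            .build()
    )
}
```

The speaker-and-back workaround I described fits this theory, since forcing a device switch makes the system re-route playback.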

But, that’s everything I had to say. If you have your own thoughts or criticisms, please feel free to reply! A lot of what I suggested and said is a bit off the wall, but overall I hope that the development teams see it and can get some new angles on how they could take the software and technology to new heights. Thanks for reading!

Love you guys, and thanks for the beautiful tools, Google!


TLDR:

  • Natural Tone (like NotebookLM): Essential for Gemini Live to improve clarity and compete with expensive alternatives.
  • Memory Fix Needed: Gemini forgets context mid-conversation. A simple topic prioritization system would drastically improve memory.
  • Deep Research: Mobile Please! This useful feature needs mobile availability.
  • Headphone Audio Bug: Gemini Live on mobile has broken headphone audio, needs a patch.
  • LIVE on Desktop: Desktop voice chat would be a game-changer for agentic AI and attract users.
