Hi everyone,
First-time poster here, though I’ve been learning from this community for a while. I’m hoping to get some advice on a perplexing issue I’m facing while integrating the Gemini API into an Android project.
The Context: I’m working on a proof-of-concept feature I’ve codenamed Merge Fellas. The goal is to use the Gemini Pro Vision model (via the Android SDK) to analyze a real-time camera feed and suggest placement for interactive UI elements.
The Problem (the “Unlimited Shake”): The core issue is a visual instability I can only describe as an “unlimited shake”. When the model returns coordinates for an overlay, the element does not settle; it jitters rapidly in a tight loop on the screen. It feels like a runaway feedback loop between the camera input and the model’s response.
What I’ve Tried So Far: I’ve spent a good amount of time trying to isolate the cause. Here’s what I’ve done:
- Simplified the Prompt: I’ve stripped my prompt down to the bare minimum to ensure the model’s response structure isn’t overly complex.
- Throttled API Calls: I implemented a simple throttling mechanism to limit requests to one per second, but the element still shakes at each newly returned position.
- Checked Threading: I’ve confirmed that the Gemini API calls are on a background thread and the UI updates are correctly posted to the main thread.
- Tested with Static Images: When I feed the model a static image instead of the camera stream, the returned coordinates are perfect and the UI element is completely stable. This strongly suggests the problem is in the real-time pipeline.
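To make the throttling bullet concrete, here is a minimal sketch of a one-request-per-interval gate. It is an illustration under assumptions, not the exact code from my project: `RequestThrottle` and `tryAcquire` are hypothetical names, and the timestamp is passed in explicitly so the logic can be checked without a real clock.

```java
/**
 * Hypothetical helper: lets a request through only if at least
 * minIntervalMs has elapsed since the last accepted request.
 */
final class RequestThrottle {
    private final long minIntervalMs;
    private long lastAcceptedMs;
    private boolean hasAccepted = false;

    RequestThrottle(long minIntervalMs) {
        if (minIntervalMs < 0) {
            throw new IllegalArgumentException("minIntervalMs must be >= 0");
        }
        this.minIntervalMs = minIntervalMs;
    }

    /**
     * @param nowMs current time in milliseconds from a monotonic clock
     * @return true if the caller may send a request now, false if it should skip this frame
     */
    synchronized boolean tryAcquire(long nowMs) {
        if (hasAccepted && nowMs - lastAcceptedMs < minIntervalMs) {
            return false; // too soon since the last accepted request
        }
        lastAcceptedMs = nowMs;
        hasAccepted = true;
        return true;
    }
}
```

On Android, `nowMs` would typically come from `SystemClock.elapsedRealtime()`. A gate like this only limits how often a frame is sent; it does nothing to the coordinates that come back, which would be consistent with the shake surviving the throttle.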
To keep this post tidy, I’ve put my Gemini API configuration, build.gradle dependencies, and a sanitized Logcat snippet showing the coordinate updates into a Gist.
My Question: I’m starting to wonder if I’m missing a fundamental pattern for managing this kind of real-time generative feedback.
- Has anyone experienced similar visual instability or “shaking” when processing a real-time stream with Gemini?
- Is there a recommended best practice for smoothing or stabilizing the output from the model for AR/overlay applications?
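To make the smoothing question concrete, this is the kind of thing I have in mind: an exponential moving average over the returned coordinates, plus a small dead zone so that tiny changes do not move the overlay at all. It is a minimal sketch and has not been validated against my pipeline; `OverlaySmoother`, `alpha`, and `deadZonePx` are hypothetical names and parameters, and the values in the usage comment are guesses.

```java
/** Hypothetical helper: exponential moving average with a dead zone for 2D overlay coordinates. */
final class OverlaySmoother {
    private final float alpha;      // in (0, 1]; smaller values smooth more but lag more
    private final float deadZonePx; // moves shorter than this distance are ignored
    private boolean initialized = false;
    private float x;
    private float y;

    OverlaySmoother(float alpha, float deadZonePx) {
        if (alpha <= 0f || alpha > 1f) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        if (deadZonePx < 0f) {
            throw new IllegalArgumentException("deadZonePx must be >= 0");
        }
        this.alpha = alpha;
        this.deadZonePx = deadZonePx;
    }

    /** Feeds one raw coordinate pair and returns the smoothed position as {x, y}. */
    float[] update(float rawX, float rawY) {
        if (!initialized) {
            // The first sample is taken as-is so the overlay does not slide in from (0, 0).
            x = rawX;
            y = rawY;
            initialized = true;
        } else {
            float dx = rawX - x;
            float dy = rawY - y;
            // Only move when the new sample is at least deadZonePx away from the current position.
            if (Math.hypot(dx, dy) >= deadZonePx) {
                x += alpha * dx;
                y += alpha * dy;
            }
        }
        return new float[] {x, y};
    }
}

// Usage (values are guesses):
//   OverlaySmoother smoother = new OverlaySmoother(0.3f, 4f);
//   float[] pos = smoother.update(rawX, rawY); // position the overlay at pos[0], pos[1]
```

I realize a filter like this would only hide jitter that comes from noisy coordinates. If the shake comes from somewhere else in the real-time pipeline, it would mask the symptom without fixing the cause, so I would still like to know what the recommended pattern is.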
Any advice on what to check next would be a huge help. Thanks in advance!