Hi everyone,
I’m currently using the Gemini API in a mobile app project, and I’ve noticed that response times vary a lot depending on the prompt size and model version. I’m curious:
- What strategies have you used to consistently reduce latency?
- Does batching requests or pre-processing input make a noticeable difference?
- Are there any recent (2025) updates that improved performance for you?
I’d love to hear your experiences and suggestions. Thanks in advance!
Hi @XyzApk_Hot,
Welcome to the forum!
Could you please let us know which Gemini model you are using?