We are using the Gemini 1.5 Flash model for logistics-related queries. We use it only for function calling: it communicates with external APIs and shows real-time data to the user. However, it randomly starts behaving incorrectly and hallucinating. What can be the solution for this? We are hesitant to rely on it for our services.
Kindly help us improve its accuracy and performance.
Welcome to the forum. There are two changes that will generally help:
- reduce the temperature setting to a value closer to 0.0
- switch to a more capable model (flash → pro)

Neither change will guarantee hallucination-free operation, but together they should produce a cumulative reduction in the error rate that you might find acceptable.
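As a minimal sketch of the two changes above: the dictionary below mirrors the `GenerationConfig` fields accepted by the Gemini API, with a low temperature and a pro-tier model name. The specific values and the `-002` revision are illustrative assumptions, not a recommendation for your exact workload.

```python
# Generation settings to pass when instantiating the model; the keys
# mirror GenerationConfig fields in the Gemini API.
generation_config = {
    "temperature": 0.0,        # deterministic-leaning output for function calling
    "top_p": 0.95,             # illustrative value; tune for your workload
    "max_output_tokens": 1024, # cap responses to what the UI actually needs
}

# A more capable tier than flash; the pinned -002 revision is illustrative.
model_name = "gemini-1.5-pro-002"
```

A temperature of exactly 0.0 does not make the model fully deterministic, but it strongly biases it toward its highest-probability output, which matters when the output is a structured function call.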
Also, focus on prompt-engineering techniques:
- Write clear and specific prompts.
- Provide context and relevant information.
- Use few-shot learning with accurate examples.
- Use chain-of-thought prompting, which can help identify potential inconsistencies.

These techniques can help reduce hallucinations.
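For function calling specifically, "clear and specific" applies to the tool schema as much as to the prompt: vague function descriptions are a common cause of incorrect or invented calls. Below is a hypothetical logistics tool declaration in the JSON-schema shape the Gemini function-calling API expects; the tool name, fields, and wording are assumptions for illustration.

```python
# A function declaration with explicit descriptions and constraints.
# The schema shape follows Gemini's function-calling format; the
# track_shipment tool itself is hypothetical.
track_shipment = {
    "name": "track_shipment",
    "description": (
        "Return the current status of a shipment. "
        "Only call this when the user supplies a tracking ID; "
        "never invent a tracking ID."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "tracking_id": {
                "type": "string",
                "description": "The carrier tracking ID exactly as the user provided it",
            },
        },
        "required": ["tracking_id"],
    },
}
```

Spelling out when the tool should and should not be called, and marking required parameters, gives the model far less room to hallucinate arguments than a one-line description would.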
When you instantiate the model via the API, what exact model name do you use? If you use the generic alias (gemini-1.5-flash), it resolves to the latest stable model. The 002 stable release just came out, and it behaves differently from the earlier 001. If this is the case, name gemini-1.5-flash-001 explicitly and see if that solves your issue.
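A small sketch of that pinning strategy: keep the floating alias and the pinned revision side by side, and pass the pinned one to the SDK so a new stable release cannot silently change behavior. The helper function is hypothetical, not part of any SDK.

```python
# Floating alias vs. pinned revision for the Gemini model name.
FLOATING_ALIAS = "gemini-1.5-flash"      # resolves to the latest stable release
PINNED_MODEL = "gemini-1.5-flash-001"    # fixed revision; behavior will not drift

def model_name(pin: bool = True) -> str:
    """Return the model identifier to pass when instantiating the model."""
    return PINNED_MODEL if pin else FLOATING_ALIAS
```

If the randomness started around the time 002 became the default, pinning to 001 is a quick way to confirm whether the model update is the cause before investing in prompt or config changes.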