I would like to summarize a critical issue I’ve observed over the past few days regarding data processing in custom Gems.
The Issue:
I uploaded a complete dataset (JSON logs) to a custom Gem. The model only processed a portion of the logs. When I re-uploaded the full file manually, the model claimed (once more) that the data from the last few days was missing. However, when I copy-pasted the raw file content directly into the chat, the model correctly recognized the data. The file itself was identical in data, date, and length.
My Observation:
It appears that Gemini’s RAG pipeline may be serving cached versions of files rather than re-processing them whenever the file metadata (name, size, date) matches an existing Knowledge Base entry. Retrieval also appears nondeterministic: the same query returns different, seemingly arbitrary chunks, without the consistency of previous model versions.
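To illustrate the failure mode I suspect, here is a minimal sketch of how a cache keyed only on file metadata could serve a stale parse when a re-uploaded file happens to match on name and length. All names here are hypothetical; this does not reflect Gemini's actual internals, only the hypothesized behavior.

```python
import hashlib

class MetadataKeyedCache:
    """Hypothetical cache keyed on (name, size) only -- the unsafe scheme."""

    def __init__(self):
        self._store = {}

    def get_or_process(self, name, content: bytes):
        key = (name, len(content))  # metadata-only key: collides for same-length files
        if key not in self._store:
            self._store[key] = self._process(content)
        return self._store[key]

    @staticmethod
    def _process(content: bytes):
        # Stand-in for parsing/chunking the uploaded file.
        return content.decode().splitlines()

class ContentKeyedCache(MetadataKeyedCache):
    """Same cache, but keyed on a content hash -- stale serving is impossible."""

    def get_or_process(self, name, content: bytes):
        key = hashlib.sha256(content).hexdigest()
        if key not in self._store:
            self._store[key] = self._process(content)
        return self._store[key]

# Two uploads: same filename, same byte length, but the second has newer log lines.
old = b"2024-06-01 ok\n2024-06-02 ok"
new = b"2024-06-01 ok\n2024-06-03 ok"

meta = MetadataKeyedCache()
meta.get_or_process("logs.json", old)
stale = meta.get_or_process("logs.json", new)   # silently returns the OLD data

safe = ContentKeyedCache()
safe.get_or_process("logs.json", old)
fresh = safe.get_or_process("logs.json", new)   # re-processes the new bytes
```

A content-hash key would explain why pasting the raw text into the chat works: the text bypasses the file cache entirely and is processed fresh.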
Impact:
If files are being served from a cache rather than processed anew, this poses a significant risk to data integrity and is unacceptable for professional use cases. Due to this instability, I have been forced to migrate my workflow to Anthropic’s Claude platform to continue my studies.
Additional Feedback:
I have also noticed a significant increase in hallucinations in the latest model version compared to the previous one. Currently, relying solely on this environment is no longer viable for my longitudinal case study.