Context: I have a Gemini Text Generator that I use as a chatbot (all developed in Python). The model takes a prompt and a message history as input, and the idea is that the bot answers the last user message.
Issue: As my project grew, so did my prompt, and this decreased the model's performance. I tried to solve this by first classifying the user message (with another prompt/text-generator step) into a set of categories. Then, based on the classification, I route the user message to a shorter prompt/text-generator step specific to that category.
The issue I'm facing is when the user asks about multiple categories in a single message.
Potential Solutions:
- Change the whole approach and explore using a RAG system
- Create a specific prompt for when the user asks multiple questions, which brings me back to the huge-prompt issue
- Add logic that identifies all categories present in a single message, then consolidates the partial answers into a final message.
What are your thoughts on this?
Welcome to the forum.
Both your solution approaches 1 and 3 are promising; the details of the problem space matter. Solution 3 is “agentic”: you use one agent, commonly referred to as a “router”, to send the original user prompt (or parts of it) to specialized agents that answer it (usually referred to as “domain experts”), then collect their responses and feed them as context, together with the original prompt, to a so-called “summarizer” agent responsible for generating the final answer. The agentic approach is less deterministic and handles input prompts that you haven’t explicitly designed for.
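The router → domain experts → summarizer flow could be sketched roughly like this. This is a minimal illustration, not a full implementation: `call_llm` is a placeholder for whatever Gemini call you already have (any prompt-in/text-out callable works), and the category names and prompt wording are made up.

```python
# Placeholder domain-expert prompts; each is short and category-specific.
CATEGORY_PROMPTS = {
    "billing": "You are a billing expert. Answer only billing questions.",
    "shipping": "You are a shipping expert. Answer only shipping questions.",
}

def route(message, call_llm):
    """Router agent: ask the LLM which categories the message touches."""
    prompt = (
        "Classify the user message into one or more of these categories: "
        + ", ".join(CATEGORY_PROMPTS)
        + ".\nReturn a comma-separated list.\nMessage: " + message
    )
    raw = call_llm(prompt)
    return [c.strip() for c in raw.split(",") if c.strip() in CATEGORY_PROMPTS]

def answer(message, call_llm):
    """Route, query each matched domain expert, then summarize if needed."""
    categories = route(message, call_llm)
    partials = {
        c: call_llm(CATEGORY_PROMPTS[c] + "\nUser: " + message)
        for c in categories
    }
    if len(partials) == 1:
        return next(iter(partials.values()))
    # Summarizer agent consolidates the partial answers into one reply.
    context = "\n".join(f"[{c}] {a}" for c, a in partials.items())
    return call_llm(
        "Combine these partial answers into one coherent reply.\n"
        + context + "\nOriginal question: " + message
    )
```

Because the LLM call is injected, each stage can be unit-tested with a stubbed model before wiring in the real Gemini client.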
Solution 1 (RAG) is more deterministic and will work better if you have a specific knowledge base from which the answers are to be derived.
P.S. In the agentic approach, there’s nothing stopping you from configuring the “domain expert” agents to use RAG, which would make it into a mixture of solutions.
Thank you for the response, it was really useful.
I'm going for the Domain Expert/RAG combo, at least as a first step. What I have in mind is the following:
- Store knowledge in a vector database, where the knowledge documents are basically the prompts for each Domain Expert.
- Then, based on the last user message, retrieve the top k prompts from the database.
- Build a final prompt from the retrieved prompts.
- Give the Gemini Text Generator the final prompt and the user/agent conversation to generate an answer.
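The steps above could be sketched like this. It is a toy illustration only: bag-of-words cosine similarity stands in for real embeddings and a vector database, and the expert prompts are invented examples.

```python
import math
from collections import Counter

# Toy "vector store": each document is a Domain Expert prompt.
EXPERT_PROMPTS = {
    "returns": "You are a returns expert. Explain refund and return policy.",
    "orders": "You are an orders expert. Help with order status and tracking.",
    "account": "You are an account expert. Help with login and profile issues.",
}

def _vec(text):
    """Bag-of-words stand-in for a real embedding."""
    return Counter(text.lower().split())

def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_prompts(message, k=2):
    """Step 2: return the k expert prompts most similar to the message."""
    q = _vec(message)
    scored = sorted(
        EXPERT_PROMPTS.values(),
        key=lambda p: _cosine(q, _vec(p)),
        reverse=True,
    )
    return scored[:k]

def build_final_prompt(message, history, k=2):
    """Steps 3-4: merge retrieved prompts, attach the conversation."""
    merged = "\n\n".join(retrieve_prompts(message, k))
    return merged + "\n\nConversation:\n" + history + "\nUser: " + message
```

In production the `_vec`/`_cosine` pair would be replaced by an embedding model and a vector-database query, but the retrieve-merge-generate shape stays the same.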
Will let you know how it goes!