Hi everyone,
I am building a conversational agent using the Node SDK (@google/generative-ai). I am running into a persistent issue: in the second step, after receiving a functionResponse, the model repeats itself instead of simply continuing the text it generated previously.
In practice, the first-step text ends with an introduction to the tool results (“Here is a selection of …:”), but the second-step text also starts with a similar introduction (“here are some …”).
The Workflow:
- Step 1: User asks a question (e.g., “How to choose a pair of shoes?”).
- Step 1 Output: The model streams text advice AND a functionCall (e.g., suggestProducts) in the same generation.
- Step 2: Function is executed client-side.
- Step 2 Input: I construct the full history (User → Model → FunctionResponse) and run a new generation to present data.
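To make the Step 2 input concrete, here is a minimal sketch of how the history can be assembled. The `Part` and `Content` types are local stand-ins that mirror the JSON shown in this post (they are not imported from the SDK), and `buildStepTwoContents` is a hypothetical helper name.

```typescript
// Local types mirroring the JSON shape used in this post (assumption: not the SDK's own types).
type Part =
  | { text: string }
  | { functionCall: { name: string; args: Record<string, unknown> }; thoughtSignature?: string }
  | { functionResponse: { name: string; response: Record<string, unknown> } };

interface Content {
  role: "user" | "model";
  parts: Part[];
}

// Hypothetical helper: assembles the Step 2 history (User -> Model -> FunctionResponse).
function buildStepTwoContents(
  userText: string,
  modelParts: Part[],
  toolName: string,
  toolResult: Record<string, unknown>,
): Content[] {
  return [
    { role: "user", parts: [{ text: userText }] },
    // The model turn is passed back unchanged, including any thoughtSignature.
    { role: "model", parts: modelParts },
    // The function result goes back in a "user" turn, as in the sample further down.
    { role: "user", parts: [{ functionResponse: { name: toolName, response: toolResult } }] },
  ];
}

// Usage with shortened placeholder values.
const contents = buildStepTwoContents(
  "How to choose a pair of shoes?",
  [
    { text: "Here is a selection of ...:\n\n" },
    { functionCall: { name: "suggestProducts", args: { universe: "sport" } } },
  ],
  "suggestProducts",
  { products: [] },
);
```

The resulting array is what gets passed as `contents` to the second generation.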
Here is a sample context input, generated with the Gemini API (generateContent with function calls):
[
{
"role": "user",
"parts": [
{
"text": "Can you explain how to choose running shoes and then find me some highly cushioned ones?"
}
]
},
{
"role": "model",
"parts": [
{
"text": "Choosing the right pair of running shoes is essential for comfort and preventing injury. Here are key factors to consider:\n\n1. **Terrain**: Are you running on the road, on trails, or on a treadmill?\n * **Road shoes**: Designed for pavement with good cushioning to absorb impact.\n * **Trail shoes**: Offer better grip and protection for uneven surfaces.\n2. **Pronation**: This is how your foot rolls when you run.\n * **Neutral**: The foot rolls slightly inward. Most runners fall here.\n * **Overpronation**: The foot rolls inward excessively. You might need stability shoes.\n3. **Cushioning**: Ranging from barefoot feel to maximum cushion. Maximum cushion is great for long distances and joint protection.\n4. **Size**: Feet swell during runs, so ideally, leave a half thumb's width of space at the front of the shoe.\n\nHere is a selection of running shoes with maximum cushioning for your comfort:\n\n"
},
{
"functionCall": {
"name": "suggestProducts",
"args": {
"toAvoid": [
"minimalist",
"racing flats",
"hard sole",
"barefoot"
],
"universe": "sport",
"minPrice": 80,
"exclude": [],
"descriptions": [
"highly cushioned road running shoes",
"max cushion sneakers for long distance",
"comfortable running shoes shock absorption"
],
"maxPrice": 250,
"categories": [
"running shoes",
"sport shoes"
],
"targetPrice": 150
}
},
"thoughtSignature": "..."
}
]
},
{
"role": "user",
"parts": [
{
"functionResponse": {
"name": "suggestProducts",
"response": {
"products": [
{
"id": "shoes-run-cloud-101",
"name": "CloudStratus Max",
"description": "Double layers of CloudTec® for maximum cushioning on the road. Perfect for medium to long runs.",
"price": "169.95€",
"url": "https://example.com/products/cloudstratus-max"
},
{
"id": "shoes-run-foam-202",
"name": "Ultra Fly Foam 3",
"description": "Designed with our softest foam yet, these shoes provide a plush, stable ride for every mile.",
"price": "150.00€",
"url": "https://example.com/products/ultra-fly-foam"
}
]
}
}
}
]
}
]
The code to trigger the second step:
const response = await gemini.models.generateContent({
  model: model,
  contents,
  config: {
    thinkingConfig: { thinkingLevel: ThinkingLevel.LOW },
  },
});
And the result:
{
"parts": [
{
"text": "To help you find the best match, here are some highly cushioned road running shoes designed for maximum shock absorption:\n\n1. **CloudStratus Max (169.95€)**: These feature double layers of CloudTec® cushioning. They are designed specifically for road running and are perfect for medium to long distances where you need the most protection from impact.\n2. **Ultra Fly Foam 3 (150.00€)**: These shoes use a specialized soft foam to provide a \"plush\" and stable ride. They are an excellent choice if you are looking for a very soft feel underfoot during every mile.\n\nWhen trying these on, make sure to wear the socks you plan to run in to ensure the fit is exactly right!"
}
],
"role": "model"
}
I tried several things without any noticeable improvement:
- using a different role for the function response (‘model’ and ‘function’)
- different models and different thinking modes
- prompting: specific instructions, asking the model to complete the last sentence, even providing that sentence explicitly
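For illustration, the "complete the last sentence" attempt can be sketched roughly as follows. The helper names and the instruction wording are hypothetical, not the exact prompt used; the resulting string would presumably be passed as `systemInstruction` in the `config` object.

```typescript
// Hypothetical helper: returns the last non-empty line of the Step 1 text,
// i.e. the lead-in sentence that Step 2 tends to repeat.
function lastNonEmptyLine(text: string): string {
  const lines = text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  return lines[lines.length - 1] ?? "";
}

// Hypothetical wording for an instruction asking the model to continue rather than restart.
function buildContinuationInstruction(stepOneText: string): string {
  const leadIn = lastNonEmptyLine(stepOneText);
  return [
    "Your previous message was already shown to the user.",
    `It ended with: "${leadIn}"`,
    "Continue directly after that sentence. Do not add a new introduction.",
  ].join("\n");
}

// Usage with a shortened Step 1 text.
const instruction = buildContinuationInstruction(
  "Some advice...\n\nHere is a selection of running shoes with maximum cushioning for your comfort:\n\n",
);
```

Even with the last sentence spelled out this way, the second step still opened with a fresh introduction.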
Is there a specific way to structure the history, or a specific system instruction, that forces Gemini to continue the flow of the existing text?
Thanks!