Invalid argument provided to Gemini: 400 Please ensure that function call turn comes immediately after a user turn or after a function response turn

Please help me with this issue; I get it randomly. As far as I understood, there is some sequencing issue, but I don’t understand how it happens or how to fix it. It’s been days without a solution. I’m including the code below as well for reference. Any guidance/solution is appreciated.

Here’s the code:

from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.schema.messages import SystemMessage
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain.agents import AgentExecutor
from langchain.agents import create_tool_calling_agent
from assistant.memory import memory


class Assistant:
    def __init__(self, llm, tools):
        self.agent = self._create_inference_chain(llm, tools)

    def answer(self, question: str, user_id: str) -> str:
        """
        Process a user's question and generate a response.

        :param question: The user's input question.
        :param user_id: The unique identifier for the user.
        :return: The AI-generated response to the question.
        """
        if not question:
            return ""  # declared -> str, so return an empty string rather than None

        previous_memories = memory.search(question, user_id=user_id, limit=3)

        relevant_memories_text = "\n".join(
            mem["memory"] for mem in previous_memories["results"]
        )

        prompt = f"""
        User input: {question}

        Relevant memories: 
        
        {relevant_memories_text}
        """
        response = self.agent.invoke(
            {"prompt": prompt},
            config={"configurable": {"session_id": "unused"}},
        )

        return response["output"]

    def _create_inference_chain(self, llm, tools):
        SYSTEM_PROMPT = """
        You are an AI personal assistant with context awareness, long-term memory, and the ability to take and interpret screenshots as well as capture images using the device camera. Your job is to assist the user, handle queries about the screen and images taken with the camera, remember key details from conversations, and provide personalized support. Use past interactions to adapt responses and make future conversations more efficient. Respond naturally like a human, without explaining the reasoning behind your responses or why you chose them.
        """

        prompt_template = ChatPromptTemplate.from_messages(
            [
                SystemMessage(content=SYSTEM_PROMPT),
                MessagesPlaceholder(variable_name="chat_history", n_messages=3),
                (
                    "human",
                    [
                        {"type": "text", "text": "{prompt}"},
                    ],
                ),
                MessagesPlaceholder(variable_name="agent_scratchpad"),
            ]
        )

        agent = create_tool_calling_agent(llm, tools, prompt=prompt_template)
        agent_executor = AgentExecutor(agent=agent, tools=tools)

        chat_message_history = ChatMessageHistory()
        return RunnableWithMessageHistory(
            agent_executor,
            lambda _: chat_message_history,
            input_messages_key="prompt",
            history_messages_key="chat_history",
        )

Edit: I’ve added verbose=True in AgentExecutor and got logs showing that the assistant was able to recognise me correctly, but it still somehow couldn’t give the response and threw the error langchain_google_genai.chat_models.ChatGoogleGenerativeAIError: Invalid argument provided to Gemini: 400 Please ensure that function call turn comes immediately after a user turn or after a function response turn.

Screenshot (debug)

This does seem pretty odd. I’m not too familiar with the Python version of LangChain (much more familiar with the js version), but a few things raise some questions:

  • You don’t show either the llm or tool configurations. Do you have it configured to call parallel tools?
  • What version of the library are you using? I’m pretty sure create_tool_calling_agent is deprecated.
  • Similarly, I think the notion of “agent_scratchpad” is no longer necessary.
  • And I’m wondering if that is what is causing the issue - that the scratchpad isn’t handling more current operations such as parallel tool calling.
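To make that first constraint concrete: Gemini requires that any assistant turn issuing a function call come directly after a user turn or a function-response turn. Here’s a rough standalone sketch of that check — plain role strings, hypothetical, not LangChain message types:

```python
# Sketch of Gemini's placement rule for function-call turns (my reading of
# the 400). Roles are plain strings here, not LangChain message classes;
# "ai_call" marks an assistant turn that issues a function call.

def violates_call_placement(roles):
    """Return True if a function-call turn does not directly follow
    a user ("human") turn or a function-response ("tool") turn."""
    prev = None
    for role in roles:
        if role == "ai_call" and prev not in ("human", "tool"):
            return True
        prev = role
    return False

# A well-formed history passes; one where an assistant message sneaks in
# between the user turn and the function call fails.
assert violates_call_placement(["human", "ai_call", "tool", "ai"]) is False
assert violates_call_placement(["human", "ai", "ai_call"]) is True
```

If anything in the chain (history trimming, the scratchpad, parallel calls) reorders or drops turns so this invariant breaks, Gemini rejects the whole request.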

Check out the tool howto for the latest guidance on using tools with langchain.


Here’s some info regarding package versions and the missing code snippets:

System Information

OS: Darwin
OS Version: Darwin Kernel Version 24.0.0: Mon Aug 12 20:49:48 PDT 2024; root:xnu-11215.1.10~2/RELEASE_ARM64_T8103
Python Version: 3.12.5 (v3.12.5:ff3bc82f7c9, Aug 7 2024, 05:32:06) [Clang 13.0.0 (clang-1300.0.29.30)]

Package Information

langchain_core: 0.3.9
langchain: 0.3.2
langchain_community: 0.3.1
langsmith: 0.1.129
langchain_google_genai: 2.0.0
langchain_text_splitters: 0.3.0

# llm.py
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash-002",
    temperature=0.2,
)

from langchain_core.tools import tool

@tool
def exponentiate(x: float, y: float) -> float:
    """Raise 'x' to the 'y'."""
    return x**y


@tool
def add(x: float, y: float) -> float:
    """Add 'x' and 'y'."""
    return x + y


@tool
def subtract(x: float, y: float) -> float:
    """Subtract 'x' from 'y'."""
    return y - x


tools = [
    exponentiate,
    add,
    subtract,
    create_github_repo,
    clone_github_repo,
    take_screenshot_and_query_ai,
    capture_photo_and_query_ai,
    recognize_face,
    remember_person,
]
assistant = Assistant(llm, tools)

hey @afirstenberg, thanks for letting me know about the deprecation. I quickly migrated as per the docs; however, I still face a similar issue: Invalid argument provided to Gemini: 400 Please ensure that function response turn comes immediately after a function call turn. And the number of function response parts should be equal to number of function call parts of the function call turn.
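If I’m reading the new message right, the shape Gemini wants is: an assistant turn that makes N function calls must be immediately followed by exactly N function-response turns. A quick standalone sketch of that check (plain tuples, hypothetical roles, not LangChain types):

```python
# Sketch of the response-count rule from the 400. Each turn is a
# (role, n_calls) tuple; n_calls > 0 marks an assistant turn that
# issues that many function calls.

def response_counts_match(turns):
    """Return True if every function-call turn is immediately followed
    by exactly as many function-response ("tool") turns as calls."""
    i = 0
    while i < len(turns):
        role, n_calls = turns[i]
        if role == "ai" and n_calls > 0:
            responses = turns[i + 1 : i + 1 + n_calls]
            if len(responses) != n_calls or any(r != "tool" for r, _ in responses):
                return False
            i += 1 + n_calls
        else:
            i += 1
    return True

# Two calls answered by two responses is fine; a missing response is not.
assert response_counts_match([("human", 0), ("ai", 2), ("tool", 0), ("tool", 0), ("ai", 0)]) is True
assert response_counts_match([("human", 0), ("ai", 2), ("tool", 0)]) is False
```

So if history trimming drops a ToolMessage but keeps the AIMessage that called the tool (or vice versa), this is the error you get.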

Here’s the updated code:

from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.schema.messages import SystemMessage
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt.chat_agent_executor import AgentState
from langgraph.prebuilt import create_react_agent
from assistant.memory import mem0

class Assistant:
    def __init__(self, llm, tools):
        self.agent = self._create_inference_chain(llm, tools)
        self.llm = llm

    def answer(self, question: str, user_id: str) -> str:
        """
        Process a user's question and generate a response.

        :param question: The user's input question.
        :param user_id: The unique identifier for the user.
        :return: The AI-generated response to the question.
        """
        if not question:
            return ""  # declared -> str, so return an empty string rather than None

        previous_memories = mem0.search(question, user_id=user_id, limit=3)

        relevant_memories_text = "\n".join(
            mem["memory"] for mem in previous_memories["results"]
        )

        prompt = f"""
        User input: {question}

        Relevant memories: 
        
        {relevant_memories_text}
        """
        response = self.agent.invoke(
            {"messages": [("human", prompt)]},
            config={"configurable": {"thread_id": "test-thread"}},
        )

        triage_prompt = f"""
        The user asked: {question}
        The assistant responded: {response["messages"][-1].content}

        Should this conversation be stored in long-term memory? 
        """

        triage_messages = [
            {
                "role": "system",
                "content": """
                You are an AI assistant with access to long-term memory, which allows you to recall and remember key information from previous conversations. 
                Your task is to evaluate whether the current conversation contains important details that should be stored for future reference. 
                Prioritize storing information that includes:
                - Personal user details (preferences, goals, life events, or specific requests)
                - Ideas, suggestions, or new insights
                - Any conversation that may be referenced later for context
                - Plans, strategies, or key decisions

                If the conversation contains general inquiries, routine questions, or temporary matters that are unlikely to be relevant in the future, it should not be stored.

                Answer with one of the following options: NEEDS_TO_BE_STORED or NOT_NEEDS_TO_BE_STORED.
                """,
            },
            {"role": "human", "content": triage_prompt},
        ]

        triage_response = self.llm.invoke(triage_messages)

        # Exact match: a plain substring check would also match "NOT_NEEDS_TO_BE_STORED".
        if triage_response.content.strip() == "NEEDS_TO_BE_STORED":
            mem0.add(
                f"User: {question}\nAssistant: {response['messages'][-1].content}",
                user_id=user_id,
            )

        return response["messages"][-1].content

    def _create_inference_chain(self, llm, tools):
        SYSTEM_PROMPT = """
        You are an AI personal assistant named onyx with advanced capabilities, including:
        - Context awareness to track and manage the current conversation history
        - Long-term memory to recall important information from past interactions
        - The ability to clone remote repositories to a local environment
        - Handling GitHub-related tasks, such as creating new repositories
        - The ability to take and interpret screenshots to answer screen-related queries
        - The capability to take pictures using the device's camera for specific requests
        - User recognition by analyzing images (e.g., a user's photo)

        Your primary task is to assist the user efficiently and accurately by:
        - Using context from the ongoing conversation and relevant memories from long-term memory
        - Employing the appropriate tools (e.g., cloning repos, taking screenshots, using the camera) when necessary
        - Providing responses that are natural, clear, and directly related to the user's request

        **Guidelines for accurate tool usage**:
        - If the user asks for actions involving repositories (e.g., cloning, creating repos), make sure to handle Git operations appropriately.
        - If a question involves the current screen or screen-related queries, take a screenshot and interpret it to respond.
        - If user recognition or camera input is needed, capture the image, analyze it, and respond accordingly.
        - When using tools, ensure that the result aligns with the user's request. If unsure, ask the user for clarification instead of making assumptions.

        **Memory and context handling**:
        - Use long-term memory to recall relevant past interactions that may help personalize your response.
        - Pay attention to both current conversation context and stored memories. Prioritize accuracy when integrating these.
        - If a past interaction or memory is irrelevant, focus on the current query without over-relying on past data.

        **Conversational flow**:
        - Respond naturally, like a human assistant, but without explaining why you are making certain decisions.
        - Adapt to user preferences, tone, and style. If the user has specific goals or projects, track these across conversations.
        - Always strive to provide precise and actionable responses.
        - If a task requires further inputs or clarification, do not hesitate to ask the user.
        
        Respond to all queries with precision and balance between using tools and relying on memory to improve the overall user experience.
        """

        prompt_template = ChatPromptTemplate.from_messages(
            [
                SystemMessage(content=SYSTEM_PROMPT),
                MessagesPlaceholder(variable_name="messages", n_messages=5),
            ]
        )

        def _modify_state_messages(state: AgentState):
            return prompt_template.invoke({"messages": state["messages"]}).to_messages()

        memory = MemorySaver()
        langgraph_agent_executor = create_react_agent(
            llm,
            tools,
            state_modifier=_modify_state_messages,
            checkpointer=memory,
        )

        return langgraph_agent_executor


I’ve resolved this myself. After tracing it properly, storing every state, and comparing them, I figured out that because of the param n_messages=5, earlier messages get dropped, so any type of message (HumanMessage, AIMessage, or ToolMessage) can end up first in the window. But Gemini needs the history to start with a HumanMessage, and for tool calling the function-call and function-response messages must stay paired. I think that’s what caused the error; everything works fine now without that param.
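For anyone who lands here: the fix boils down to trimming history on turn boundaries rather than keeping a raw count of the last N messages. A minimal standalone sketch of the idea (plain role strings, hypothetical, not the actual LangChain API):

```python
def safe_trim(messages, n):
    """Keep at most the last n (role, content) messages, then drop leading
    messages until the window starts on a "human" turn, so no orphaned
    function-call or function-response turns are left at the front."""
    window = list(messages[-n:])
    while window and window[0][0] != "human":
        window.pop(0)
    return window

history = [
    ("human", "hi"),
    ("ai", "calls recognize_face"),   # function-call turn
    ("tool", "face result"),          # function-response turn
    ("ai", "hello, you!"),
    ("human", "what's 2 ** 10?"),
    ("ai", "1024"),
]

# A naive history[-5:] would start on the function-call turn and trigger
# the 400; safe_trim shrinks the window so it starts on a human turn.
assert safe_trim(history, 5) == [("human", "what's 2 ** 10?"), ("ai", "1024")]
```

As far as I know, LangChain ships a helper for this same idea — trim_messages with start_on="human" (and include_system=True to keep the system prompt) — if you’d rather not hand-roll it.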
