Multimodal Live API Returns Executable Code Instead of Expected Function Call Response

Unable to Make Function Calling Work with Multimodal Live API

Current Issue

I am unable to get function calling to work with the Multimodal Live API. Instead of a function call, the API keeps returning executable code and attempting to execute it, as shown in this response format:

{
  "modelTurn": {
    "parts": [
      {
        "executableCode": {
          "language": "PYTHON",
          "code": "..." // code
        }
      }
    ]
  }
}

Configuration Code

Function Declaration

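# Assumes the google-genai SDK; these types are used by the snippets below.
from google.genai.types import Content, FunctionDeclaration, LiveConnectConfig, Tool

add_item = FunctionDeclaration(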
    name="add_item",
    description="Add an item to the inventory list. Takes a complete item object containing all required fields.",
    parameters={
        "type": "OBJECT",
        "properties": {
            "item": {
                "type": "OBJECT",
                "description": "Complete item information",
                "properties": {
                    "Item": {
                        "type": "STRING",
                        "description": "Name of the detected item",
                    },
                    "Make": {
                        "type": "STRING",
                        "description": "Manufacturer name or NA",
                    },
                    "Model": {
                        "type": "STRING",
                        "description": "Model identifier or NA",
                    },
                    "Year": {
                        "type": "STRING",
                        "description": "Manufacturing year or NA",
                    },
                    "Serial_Number": {
                        "type": "STRING",
                        "description": "Unique identifier or NA",
                    },
                    "Description": {
                        "type": "STRING",
                        "description": "Includes cleanliness assessment, color and other general details",
                    },
                    "Approximate_price": {
                        "type": "STRING",
                        "description": "Web-searched current market value (with currency symbol)",
                    },
                    "where_was_this_detected": {
                        "type": "STRING",
                        "description": "Location + timestamp in HH:MM:ss format",
                    },
                    "questions": {
                        "type": "STRING",
                        "description": "Missing or uncertain information, price variations, or 'None'",
                    },
                },
                "required": [
                    "Item",
                    "Make",
                    "Model",
                    "Year",
                    "Serial_Number",
                    "Description",
                    "Approximate_price",
                    "where_was_this_detected",
                    "questions",
                ],
            }
        },
        "required": ["item"],
    },
)

Tool Configuration

add_item_tool = Tool(function_declarations=[add_item])

Live API Connection Configuration

live_connect_config = LiveConnectConfig(
    response_modalities=["TEXT"],
    tools=[add_item_tool],
    system_instruction=Content(parts=[{"text": SYSTEM_INSTRUCTION}]),
)

I’m having the same issue, and on top of that it writes invalid code!

Handling message: {
  "serverContent": {
    "modelTurn": {
      "parts": [
        {
          "executableCode": {
            "language": "PYTHON",
            "code": "queryManual.query(query=\"What is the recommended torque for tightening wheel bolts?\")\n"
          }
        }
      ]
    }
  }
}

Here, queryManual is my tool. For some reason the model often thinks it should call it as queryManual.query(…arguments) instead of (I guess) just queryManual(…arguments). I’ve also experienced it trying to print variables that don’t exist, such as print(manual_result), instead of just using the result that was provided to it as a tool result.
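
For anyone debugging the same thing, a small helper like the one below can log which kind of part the server actually sent. This is only a sketch: describe_message is a hypothetical name, and the "toolCall" / "functionCalls" keys are an assumption about how a real function call arrives on the wire; only the "serverContent" / "modelTurn" / "executableCode" shape is taken from the message above.

```python
def describe_message(message):
    """Return (kind, detail) tuples describing what a server message contains."""
    kinds = []

    # Assumption: proper function calls arrive in a top-level "toolCall" field.
    for call in message.get("toolCall", {}).get("functionCalls", []):
        kinds.append(("functionCall", call.get("name")))

    # Shape taken from the logged message: serverContent -> modelTurn -> parts.
    parts = message.get("serverContent", {}).get("modelTurn", {}).get("parts", [])
    for part in parts:
        if "executableCode" in part:
            # The model wrote code instead of requesting a function call.
            kinds.append(("executableCode", part["executableCode"].get("language")))
        elif "text" in part:
            kinds.append(("text", None))
        else:
            kinds.append(("other", None))
    return kinds
```

Calling describe_message on every incoming message makes it easy to see how often the model falls back to executableCode instead of requesting the tool.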

Hi,

Looks like it’s mixing Function Calling with Code Execution.
Don’t forget that the Multimodal Live API is still marked as Experimental.

Hmm, the BiDiGenerateContent method has been removed recently. Maybe it’s related to that.

Anyway, see the https://github.com/google-gemini/cookbook/blob/main/quickstarts/Get_started_LiveAPI_tools.ipynb notebook on GitHub.

It also seems that your code needs to handle the response containing the function call(s), or any other tool you specified, e.g. code_execution or google_search.
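
To make "handle the response" concrete, here is a minimal sketch of a dispatcher that turns an incoming tool call into the reply you would send back. It assumes the server delivers calls as {"toolCall": {"functionCalls": [{"id", "name", "args"}]}} and expects {"toolResponse": {"functionResponses": [...]}} in return; check those field names against the notebook. INVENTORY, HANDLERS, build_tool_response and the body of add_item are hypothetical stand-ins, not part of the SDK.

```python
INVENTORY = []  # hypothetical in-memory store for the example


def add_item(item):
    """Hypothetical local implementation backing the add_item declaration."""
    INVENTORY.append(item)
    return {"status": "ok", "count": len(INVENTORY)}


# Map declared function names to the local callables that implement them.
HANDLERS = {"add_item": add_item}


def build_tool_response(message):
    """Run every requested function and build the reply message.

    Returns None when the message carries no tool call.
    """
    tool_call = message.get("toolCall")
    if not tool_call:
        return None

    responses = []
    for call in tool_call.get("functionCalls", []):
        handler = HANDLERS.get(call["name"])
        if handler is None:
            result = {"error": "unknown function: " + call["name"]}
        else:
            result = handler(**call.get("args", {}))
        # Echo the call id back so the model can match result to request.
        responses.append(
            {"id": call.get("id"), "name": call["name"], "response": result}
        )
    return {"toolResponse": {"functionResponses": responses}}
```

In a receive loop you would call build_tool_response on each message and, when it returns something other than None, send that dict back over the same connection before waiting for the next model turn.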

Cheers