The API documentation mentions that Gemini 2.0 Flash supports multi-tool usage. However, when I send a request with both Web Search grounding and custom functions defined, I get a 400 Bad Request error saying, “Search Grounding can’t be used with other tools”. Is this feature supported only through the Python SDK?
Also, when I try to use Google Search grounding without any other tools, it still says:
Unable to submit request because Please use google_search field instead of google_search_retrieval field.. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/gemini
When I changed the field to google_search, as requested, I get the error:
Invalid JSON payload received. Unknown name \"dynamicRetrievalConfig\" at 'tools[0].google_search': Cannot find field.
This happens even though the documentation lists it as a valid field.
When I removed the retrieval config, the request started working. But now the same JSON does not work for 1.5 Flash. Why is this different for Gemini 2.0 Flash? Will this be fixed in the full release?
Gemini 2.0 is awesome, but I think the lack of documentation is disappointing.
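For anyone hitting the same errors, here is a minimal sketch of how the tools payload could be chosen per model, based only on the behaviour described above: 2.0 Flash wants google_search with no retrieval config, while 1.5 Flash keeps google_search_retrieval. The helper name build_search_tools, the model-name prefix check, and the mode/dynamicThreshold keys inside dynamicRetrievalConfig are assumptions for illustration, not something confirmed in this thread.

```python
import json
from typing import Optional


def build_search_tools(model: str, dynamic_threshold: Optional[float] = None) -> list:
    """Return a `tools` list for Google Search grounding (hypothetical helper).

    Assumption: model names starting with "gemini-2.0" need the newer
    `google_search` field, everything else the older `google_search_retrieval`.
    """
    if model.startswith("gemini-2.0"):
        # Per the errors in this thread, 2.0 Flash rejects dynamicRetrievalConfig
        # under google_search, so the tool is sent as an empty object.
        return [{"google_search": {}}]

    retrieval = {}
    if dynamic_threshold is not None:
        # Assumed shape of the retrieval config; check the model reference.
        retrieval["dynamicRetrievalConfig"] = {
            "mode": "MODE_DYNAMIC",
            "dynamicThreshold": dynamic_threshold,
        }
    return [{"google_search_retrieval": retrieval}]


if __name__ == "__main__":
    for name in ("gemini-2.0-flash-exp", "gemini-1.5-flash"):
        print(name, json.dumps(build_search_tools(name, dynamic_threshold=0.3)))
```

Note that the threshold is silently ignored for 2.0 models in this sketch, which matches what worked for me but may change once the full release is out.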
Now, with the new “googleSearch” tool, I tried adding function declarations. I just have two tools: Google Search grounding, and a custom function with no parameters. Both work with 2.0 Flash individually. But when I combine them in one request, it just gives me a vague “Request contains an invalid argument.” error.
Simultaneous use isn’t supported by the backend. We have official confirmation from the Google engineer here: “codeExecution and function calling - #3 by GUNAND_MAYANGLAMBAM”. It would still be helpful if the documentation made this point more clearly.
I see. The issue is that they explicitly say in the documentation:
With Gemini 2.0, you can enable multiple tools at the same time, and the model will decide when to call them. Here’s an example that enables two tools, Grounding with Google Search and code execution, in a request using the Multimodal Live API.
With an example:
prompt = """
Hey, I need you to do three things for me.
1. Turn on the lights.
2. Then compute the largest prime palindrome under 100000.
3. Then use Google Search to look up information about the largest earthquake in California the week of Dec 5 2024.
Thanks!
"""
tools = [
    {'google_search': {}},
    {'code_execution': {}},
    {'function_declarations': [turn_on_the_lights_schema, turn_off_the_lights_schema]}
]
await run(prompt, tools=tools, modality="AUDIO")
Hey @uralstech, thanks for sharing the document link. Currently, the multi-tool feature is only compatible with the Multimodal Live (realtime) API.
Thanks
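To summarize the thread as code: until the backend accepts mixed tools on standard requests, a client-side check can fail early with a clearer message than “Request contains an invalid argument.” This is only a sketch. The function check_tools and its live_api flag are hypothetical names, and the rule it encodes (more than one tool kind is allowed only on the Multimodal Live API) is taken from the replies above rather than from any official validation logic.

```python
def tool_kinds(tools):
    """Collect the distinct tool keys, e.g. {'google_search', 'function_declarations'}."""
    kinds = set()
    for tool in tools:
        kinds.update(tool.keys())
    return kinds


def check_tools(tools, live_api=False):
    """Raise ValueError if several tool kinds are combined outside the Live API.

    Assumption from this thread: mixing tools works only with the
    Multimodal Live API, so standard requests get one tool kind each.
    """
    kinds = tool_kinds(tools)
    if len(kinds) > 1 and not live_api:
        raise ValueError(
            "Combining tools (%s) is only expected to work with the "
            "Multimodal Live API" % ", ".join(sorted(kinds))
        )
    return tools


if __name__ == "__main__":
    mixed = [{"google_search": {}}, {"function_declarations": []}]
    check_tools(mixed, live_api=True)  # passes
    try:
        check_tools(mixed)  # standard request: rejected locally
    except ValueError as err:
        print(err)
```

Calling check_tools before each standard request means the two tools have to be sent in separate calls, which is the workaround that currently works with 2.0 Flash.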