I have been uploading images in the chat on OpenRouter to have Gemini 3 Flash use its ‘agentic vision’ tools to analyze them. It works incredibly well, pretty amazing actually.
BUT, when I try it with the API, I do not get anywhere near the same results. I tried messing with the thinking tokens, the SDK, etc., but I cannot get the API to replicate it, even when I am as direct as I can be in telling the agent to use the tools. I have also enabled internet access, but nothing seems to get it to work. Does anyone have any idea?