For our bounding-box detection tasks on documents, we found that gemini-2.5-flash performs quite well, but only with thinking_budget=0.
With thinking_budget>0, the predicted bounding boxes are much worse and are sometimes far away from the object we are trying to detect.
Has anybody made similar observations?
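
For anyone who wants to compare numbers rather than impressions, here is a minimal sketch of how the difference could be measured: score the predicted boxes against hand-labelled ones with intersection over union (IoU), once per thinking_budget setting. It assumes the model returns boxes as [ymin, xmin, ymax, xmax] normalized to 0-1000, which is my reading of the Gemini docs, and that the ground truth is in pixels. The helper names (`to_pixels`, `iou`, `mean_iou`) are made up for illustration and are not part of any SDK.

```python
from typing import Iterable, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (ymin, xmin, ymax, xmax)


def to_pixels(box: Sequence[float], width: int, height: int,
              scale: float = 1000.0) -> Box:
    """Rescale a [ymin, xmin, ymax, xmax] box from 0..scale to pixels.

    Assumption: the model output is normalized to 0..1000.
    """
    ymin, xmin, ymax, xmax = box
    return (ymin * height / scale, xmin * width / scale,
            ymax * height / scale, xmax * width / scale)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two (ymin, xmin, ymax, xmax) boxes."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_iou(preds: Iterable[Sequence[float]], truths: Iterable[Box],
             width: int, height: int) -> float:
    """Average IoU of normalized predictions against pixel ground truth.

    Predictions and ground-truth boxes are paired by position.
    """
    scores = [iou(to_pixels(p, width, height), t)
              for p, t in zip(preds, truths)]
    return sum(scores) / len(scores) if scores else 0.0
```

Running `mean_iou` on the same pages with thinking_budget=0 and with thinking_budget>0 would give two numbers that are easy to compare across setups.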