I am trying to get bounding boxes for clothes from outfit images. I’ve recreated the exact code given in the documentation.
The cupcake example works as expected with the exact bounding_box_system_instructions and user prompt. But when I adapt it to my use case by adding fashion context, it returns erroneous bounding boxes at least 40% of the time.
I kept my changes to the example prompt minimal so that it would behave the same way, but it is not working. I also tried giving specific instructions to detect clothes, and that did not help either. Has anyone else faced this? How should I approach prompt engineering here?
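For anyone trying to reproduce this, a check along the following lines could help separate malformed boxes from misplaced ones. It is only a minimal sketch, assuming the model returns a JSON list in which each item has a `box_2d` of `[ymin, xmin, ymax, xmax]` normalized to 0-1000 and a `label`, as in the documentation example. `parse_boxes` is a hypothetical helper name, not part of any library.

```python
import json


def parse_boxes(response_text, width, height):
    """Split model output into usable pixel boxes and rejected items.

    Assumes (not confirmed for every model version) that each item looks like
    {"box_2d": [ymin, xmin, ymax, xmax], "label": "..."} with coordinates
    normalized to the range 0-1000.
    """
    items = json.loads(response_text)
    valid, rejected = [], []
    for item in items:
        box = item.get("box_2d")
        # Reject anything that is not a list of four numbers.
        if not (isinstance(box, list) and len(box) == 4
                and all(isinstance(v, (int, float)) for v in box)):
            rejected.append(item)
            continue
        ymin, xmin, ymax, xmax = box
        # Reject out-of-range values and inverted or zero-area boxes.
        if not all(0 <= v <= 1000 for v in box) or ymin >= ymax or xmin >= xmax:
            rejected.append(item)
            continue
        # Scale to pixel coordinates as (left, top, right, bottom).
        valid.append({
            "label": item.get("label", ""),
            "pixels": (
                round(xmin * width / 1000),
                round(ymin * height / 1000),
                round(xmax * width / 1000),
                round(ymax * height / 1000),
            ),
        })
    return valid, rejected
```

This only catches structurally bad boxes (wrong length, out of range, inverted corners). A box that is well formed but drawn around the wrong region still has to be judged by eye or against labeled data.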