Hi Team,
I need to extract a value from an uploaded document together with its text coordinates (X1, Y1, X2, Y2) in the original document. Please suggest how I can get them. Right now, with the help of prompts, I am able to extract coordinates, but they do not match the coordinates in the original uploaded document.
Please advise.
Welcome to the forum.
If I understood correctly, you are supplying the document as an image and you want the model to answer with a precise bounding box for the text (the model correctly identifies and returns the text, but the bounding box it returns is not precise enough). If that is the case, also check for any answers to this post: Bounding Box Alignment Problems and Image Rescaling
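One thing worth ruling out before blaming the model is a plain scaling mismatch. The sketch below assumes the model returns boxes as [ymin, xmin, ymax, xmax] normalized to a 0-1000 grid, which is how the Gemini documentation describes its bounding box output; if your prompt asks for a different format, adjust accordingly. The function name `to_pixel_box` and its parameters are made up for illustration. It only converts normalized values to pixel coordinates of the original image and does nothing about the model's own imprecision.

```python
def to_pixel_box(box, width, height, scale=1000):
    """Convert a normalized box to pixel coordinates of the original image.

    Assumption: `box` is [ymin, xmin, ymax, xmax] on a 0..scale grid.
    Returns (x1, y1, x2, y2) in pixels.
    """
    ymin, xmin, ymax, xmax = box
    # Multiply before dividing to keep the arithmetic exact for integer inputs.
    x1 = round(xmin * width / scale)
    y1 = round(ymin * height / scale)
    x2 = round(xmax * width / scale)
    y2 = round(ymax * height / scale)
    return (x1, y1, x2, y2)


# Example: a 2000 x 1000 pixel page (width x height of the ORIGINAL image,
# not of any resized copy sent to the model).
pixel_box = to_pixel_box([100, 200, 300, 400], width=2000, height=1000)
print(pixel_box)
```

Note the axis order: under the assumed format the y values come first, so reading the output as (X1, Y1, X2, Y2) without swapping would by itself produce boxes that do not line up with the document.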
Hope that helps.
Can you please help me with the answer? I need only the bounding box. The post you linked does not have any answers.
I don’t have a solution for getting exact bounding box coordinates. I have tried and failed (the bounding box returned is approximate, never precise). If a Google engineer has such a solution, they will likely provide it. If they don’t have a solution to share, you can assume it’s outside the model’s capabilities at this time.