Need text coordinates of the extracted value from the original uploaded document

amit_phadtare · September 11, 2024, 6:19am

Hi Team,

I need to extract a value from an uploaded document along with its text coordinates (X1, Y1, X2, Y2) in the original document. Please suggest how I can get them. Right now, with the help of prompts, I am able to extract coordinates, but they do not match the coordinates in the original uploaded document.

Please suggest.

OrangiaNebula · September 11, 2024, 6:27am

Welcome to the forum.

If I understood correctly, you are supplying the document as an image and you want the model to answer with a precise bounding box for the text (the model properly identifies and returns the text, but the bounding box it returns is not precise enough). If that is the case, then also check for any answers to this post: "Bounding Box Alignment Problems and Image Rescaling".

Hope that helps.
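
One possible cause of a mismatch like this, not confirmed anywhere in this thread, is that the returned coordinates are on a normalized grid rather than in the pixel space of the original file. Below is a minimal sketch of the rescaling step, assuming the model returns boxes as [ymin, xmin, ymax, xmax] on a 0 to 1000 grid (please verify this against the documentation for your model and prompt). The helper name `to_pixel_box` and the `scale` parameter are hypothetical, introduced only for illustration.

```python
def to_pixel_box(norm_box, width, height, scale=1000):
    """Map a normalized box to pixel coordinates of the original image.

    norm_box: [ymin, xmin, ymax, xmax] on a 0..scale grid (assumed convention).
    width, height: pixel size of the ORIGINAL document image.
    Returns (x1, y1, x2, y2) in pixels.
    """
    ymin, xmin, ymax, xmax = norm_box
    # Multiply before dividing to keep the arithmetic as exact as possible.
    x1 = round(xmin * width / scale)
    y1 = round(ymin * height / scale)
    x2 = round(xmax * width / scale)
    y2 = round(ymax * height / scale)
    return x1, y1, x2, y2


if __name__ == "__main__":
    # Hypothetical box on a 2000 x 1000 pixel page.
    print(to_pixel_box([100, 200, 300, 400], width=2000, height=1000))
```

This only corrects a scale or axis-order mismatch. It does not make an approximate box from the model any more precise.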

amit_phadtare · September 11, 2024, 6:34am

Can you please help me out with the answer? I need only the bounding box. The post you just linked does not have any answers.

OrangiaNebula · September 11, 2024, 7:05am

I don’t have a solution for getting exact bounding box coordinates. I have tried and failed (the bounding box returned is approximate, never precise). If a Google engineer has such a solution, they will likely provide it. If they don’t have a solution to share, you can assume it’s outside the model’s capabilities at this time.
