If I convert the DICOM to PNG and ask MedGemma to detect text, it seems to work really well.
Below is the output from one of the runs. I only asked it to detect PHI text, but it also seems to infer a category for each detected string, such as date or patient name.
I will create 1000+ DICOM images with simulated PHI, test on those, and report back in a month or so.
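For the simulated PHI, something like the following could generate the text records to burn into the test images. This is only a rough sketch: the name pools, the field names, and the `make_phi_records` helper are all made up for illustration, and the `DD.MM.YYYY` date format is a guess based on the "01.09.2012" string in the run below. Writing the text into actual DICOM pixel data is not shown.

```python
import random
from datetime import date, timedelta

# Hypothetical name pools; any fictional names would do.
SURNAMES = ["DAVIDSON", "NGUYEN", "OKAFOR", "PETROV", "SANTOS"]
GIVEN_NAMES = ["DOUGLAS", "MAI", "CHIDI", "ANNA", "LUCAS"]


def make_phi_records(n, seed=0):
    """Return n dicts of simulated PHI (name, sex, exam date)."""
    rng = random.Random(seed)  # seeded so the test set is reproducible
    start = date(2000, 1, 1)
    records = []
    for i in range(n):
        # Pick a random exam date within roughly 20 years of the start date.
        exam_date = start + timedelta(days=rng.randrange(365 * 20))
        records.append({
            "id": i,
            "patient_name": f"{rng.choice(SURNAMES)} {rng.choice(GIVEN_NAMES)}",
            "sex": rng.choice(["M", "F"]),
            # Assumed DD.MM.YYYY, mirroring the overlay text in the sample run.
            "exam_date": exam_date.strftime("%d.%m.%Y"),
        })
    return records


records = make_phi_records(1000)
print(len(records), records[0])
```

Keeping the generated records alongside the images gives a ground truth to score the detected text and labels against.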
------ MedGemma logging message ------
The user wants me to identify and locate all visible text in the provided medical image and provide bounding boxes and labels for each text element.
1. **Analyze the image:** The image is a semi-upright portable chest X-ray.
2. **Identify text elements:**
\* "DAVIDSON DOUGLAS" - This is the patient's name.
\* "\[M\]" - This is the patient's gender.
\* "01.09.2012" - This is the date of the exam.
\* "Semi-Upright" - This is the positioning of the patient.
\* "Portable" - This is the type of X-ray.
3. **Determine bounding boxes:**
\* \*\*DAVIDSON DOUGLAS:\*\* The text is located in the top left corner.
\* Top-left corner (y0, x0): (100, 100)
\* Bottom-right corner (y1, x1): (150, 150)
\* Bounding box: \[100, 100, 150, 150\]
\* \*\*\[M\]:\*\* The text is located just below the patient's name.
\* Top-left corner (y0, x0): (100, 160)
\* Bottom-right corner (y1, x1): (150, 170)
\* Bounding box: \[100, 160, 150, 170\]
\* \*\*01.09.2012:\*\* The text is located below the patient's gender.
\* Top-left corner (y0, x0): (100, 180)
\* Bottom-right corner (y1, x1): (150, 210)
\* Bounding box: \[100, 180, 150, 210\]
\* \*\*Semi-Upright:\*\* The text is located in the top right corner.
\* Top-left corner (y0, x0): (600, 100)
\* Bottom-right corner (y1, x1): (700, 150)
\* Bounding box: \[100, 600, 700, 150\]
\* \*\*Portable:\*\* The text is located below the positioning information.
\* Top-left corner (y0, x0): (600, 160)
\* Bottom-right corner (y1, x1): (700, 190)
\* Bounding box: \[100, 600, 700, 190\]
4. **Assign labels:**
\* DAVIDSON DOUGLAS -> "patient name"
\* \[M\] -> "patient gender"
\* 01.09.2012 -> "exam date"
\* Semi-Upright -> "patient positioning"
\* Portable -> "X-ray type"
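A note on the log above: for "Semi-Upright" and "Portable", the final bounding box does not match the corner coordinates listed just before it, so the boxes probably need a sanity check before being used for redaction. Here is a minimal sketch of such a check, assuming the `[y0, x0, y1, x1]` ordering the log states. The `check_box` helper and the `DETECTIONS` list are my own illustration, with values copied by hand from the log.

```python
def box_from_corners(top_left, bottom_right):
    """Build [y0, x0, y1, x1] from (y0, x0) and (y1, x1) corner tuples."""
    (y0, x0), (y1, x1) = top_left, bottom_right
    return [y0, x0, y1, x1]


def check_box(top_left, bottom_right, reported):
    """Return a list of problems found with a reported bounding box."""
    problems = []
    if box_from_corners(top_left, bottom_right) != reported:
        problems.append("box does not match corners")
    y0, x0, y1, x1 = reported
    if not (y0 < y1 and x0 < x1):
        problems.append("box has non-positive height or width")
    return problems


# (text, top-left, bottom-right, reported box), transcribed from the log.
DETECTIONS = [
    ("DAVIDSON DOUGLAS", (100, 100), (150, 150), [100, 100, 150, 150]),
    ("[M]", (100, 160), (150, 170), [100, 160, 150, 170]),
    ("01.09.2012", (100, 180), (150, 210), [100, 180, 150, 210]),
    ("Semi-Upright", (600, 100), (700, 150), [100, 600, 700, 150]),
    ("Portable", (600, 160), (700, 190), [100, 600, 700, 190]),
]

for text, top_left, bottom_right, reported in DETECTIONS:
    problems = check_box(top_left, bottom_right, reported)
    print(f"{text}: {'ok' if not problems else '; '.join(problems)}")
```

The first three boxes pass both checks, while the last two fail both. The coordinates themselves also look like round-number estimates, so comparing them against the known text positions in the simulated images would be worth doing.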