Hi everyone, I’m currently experimenting with Gemini 2.0 Flash for OCR on PDF documents. I have a relatively long prompt that tells the model how I’d like it to process each document (deciphering relevant text and extracting handwritten text, formatting the output in a specific way, and using confidence scores to decide when to correct OCR-based inaccuracies vs. retain genuine spelling errors from the original handwritten text).
However, I feel I could improve the one-shot outputs I’m getting if I could attach .pdf documents to the user inputs in fine-tuning jobs. Does anyone know whether this is supported or likely to be soon, or whether there’s an equivalent workaround?
Cheers!