gadgeteer.co.za

Use gImageReader to Extract Text From Images and PDFs on Linux

gImageReader is a front-end for the Tesseract open-source OCR engine. Tesseract was originally developed at HP and was open-sourced in 2005. In short, the OCR (Optical Character Recognition) engine lets you extract text from an image or a PDF file. It recognizes several languages by default and also supports Unicode characters. However, the Tesseract...