Monday, May 21, 2007
Glory Be! or Finally Working Free OCR!
I hope this post doesn't spam planetkde. If it does, enjoy my enlightened opinions once again. And blame Google.
The project I'm working on will gather information from numerous sources, with the goal of tracking tasks from inception to invoicing. That is actually the secondary goal; the primary one is learning Qt and C++ while keeping an interesting challenge floating in the back of my mind. So far I've worked on reading caller ID data from a modem, using threads and mutexes, communicating the results to the mother ship over D-Bus, plugin interfaces, and all the other neat stuff that Qt 4 provides. Great fun. Next, I want to set something up that will scan and OCR supplier invoices. Scanning is the first challenge, although there is a libscan in KDE. Maybe that will be the impetus to migrate from Qt to KDE4, which has been my intention all along.
I have been watching gocr and ocrad for a while. They are still quite a way from being useful. I started considering Wine and some Windows tools. Ugh. Then I ran across tesseract, the OCR engine originally developed at HP, which was later freed and picked up by Google. It works. I had a few test scans in PNM format. Tesseract requires TIFF input, so I converted them and ran the OCR. Very nice. There are a few errors, mostly in areas where the font was small and blurry, but it definitely works. So now I can scan documents, OCR them, and use QScript to grab the important data.
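For the curious, the convert-then-OCR steps above could be sketched roughly like this. It is only an illustration in Python (chosen for brevity, not the Qt/C++ the project itself uses), and it rests on a few assumptions that are not from my actual setup: that ImageMagick's `convert` does the PNM-to-TIFF step, that `tesseract` is on the PATH and is called as `tesseract image outputbase` (writing `outputbase.txt`), and that a made-up "Total" pattern stands in for whatever field an invoice really needs. The function names are hypothetical.

```python
import re
import subprocess
from pathlib import Path
from typing import List, Optional


def build_ocr_commands(scan: Path) -> List[List[str]]:
    """Return the two command lines: PNM -> TIFF, then TIFF -> text."""
    tiff = scan.with_suffix(".tif")
    out_base = scan.with_suffix("")  # tesseract appends ".txt" itself
    return [
        # Assumption: ImageMagick's convert handles the format conversion.
        ["convert", str(scan), str(tiff)],
        # tesseract <image> <outputbase> writes <outputbase>.txt
        ["tesseract", str(tiff), str(out_base)],
    ]


def ocr_scan(scan: Path) -> str:
    """Run both commands and return the recognized text."""
    for cmd in build_ocr_commands(scan):
        subprocess.run(cmd, check=True)  # raises if a tool fails
    return scan.with_suffix(".txt").read_text()


# Hypothetical pattern: a line starting with "Total", followed by an amount.
TOTAL_RE = re.compile(r"(?im)^\s*total\b[^\d\n]*(\d+(?:[.,]\d{2})?)")


def extract_total(text: str) -> Optional[str]:
    """Pull the first 'Total' amount out of OCR text, or None if absent."""
    match = TOTAL_RE.search(text)
    return match.group(1) if match else None
```

Calling `extract_total(ocr_scan(Path("invoice.pnm")))` would then chain the whole thing together, with the caveat that small, blurry fonts will produce misreads that a simple pattern like this cannot correct.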
This really means I don't have any more excuses. I've got to get this thing to the point where I can begin using it.
In the May 14, 2007 LugRadio podcast, there was a discussion of what needed fixing in the Linux desktop. Someone suggested that it was already there. I spluttered and fumed as I listened, thinking, where is OCR?! No longer. With working OCR, the next level of tools, such as OCR-to-PDF and other neat stuff, will come along. Great.