Friday, September 28, 2007
Some decent and lasting rainfall today. It hasn't rained significantly since June. Today is cool, 10C and very wet.
Still plugging away on handling scans. Every roadblock that I hit seems to have been solved by someone. Interestingly, it seems the real challenge is defining the problem space.
Where am I? I'm about to test the Otsu threshold code. I'm thinking that all I really need to do is basic cleanup: define the image segments, pass them to OCR, check whether the results are reliable, and if they aren't, do more extensive (and time-consuming) cleanup. The only areas with real issues seem to be text on a shaded background, which scans into a mess of blots, and places where the text size changes on the same horizontal line. Other than that, the OCR seems reliable, so why waste time preprocessing other areas?
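For reference, a minimal pure-Python sketch of Otsu's method over a 256-bin grayscale histogram. The function name and the structure are mine, not taken from the scanning code; it just picks the threshold that maximizes between-class variance:

```python
def otsu_threshold(hist):
    """Return the bin index t that maximizes between-class variance.

    hist: list of pixel counts per intensity level (e.g. 256 bins).
    Pixels at levels <= t are "background", levels > t are "foreground".
    """
    total = sum(hist)
    sum_total = sum(i * h for i, h in enumerate(hist))  # total intensity mass
    best_t, best_var = 0, -1.0
    w0 = 0      # background pixel count so far
    sum0 = 0.0  # background intensity mass so far
    for t in range(len(hist)):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty; no valid split here
        mu0 = sum0 / w0
        mu1 = (sum_total - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

On a cleanly bimodal histogram this lands between the two modes; the trouble described below starts when one class dwarfs the other.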
Coincidentally, we were just recently working on an Otsu thresholding implementation for the eye tracking system at work. It turned out not to work because the foreground was very small compared to the background. So instead we implemented Kittler's thresholding method, which is an improvement over Otsu's, and it worked extremely well. Something to look into if you have problems with it. I can send you our implementation if you want to take a look at it.
Interestingly, I implemented and tested it, and hit the same issue: the very large background counts skew the threshold. I'm also looking at a particular case of gray background with black text. At the pixel level it ends up as black pixels making up the text, with gray blobs at even intervals; where they intersect, the letters are deformed with bumps, and the image-wide threshold renders those bumps as black deformities attached to the text. I'll look at Kittler's method to see if it makes more sense. I may need two or three thresholds with some contextual logic.
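For comparison, here is a sketch of the Kittler-Illingworth minimum-error threshold mentioned in the comment. It models the two classes as Gaussians and minimizes a classification-error criterion, which weights class spread rather than just raw counts, so a huge background mode dominates less. The code and histogram below are my own illustration, not the commenter's implementation:

```python
import math

def kittler_threshold(hist):
    """Kittler-Illingworth minimum-error threshold over a histogram.

    Treats levels <= t and > t as two Gaussian classes and returns the
    t minimizing the criterion J(t). Splits where either class is empty
    or has zero variance are skipped.
    """
    total = sum(hist)
    best_t, best_j = 0, float("inf")
    for t in range(len(hist)):
        h0, h1 = hist[:t + 1], hist[t + 1:]
        n0, n1 = sum(h0), sum(h1)
        if n0 == 0 or n1 == 0:
            continue
        w0, w1 = n0 / total, n1 / total          # class priors
        mu0 = sum(i * h for i, h in enumerate(h0)) / n0
        mu1 = sum((i + t + 1) * h for i, h in enumerate(h1)) / n1
        var0 = sum(h * (i - mu0) ** 2 for i, h in enumerate(h0)) / n0
        var1 = sum(h * (i + t + 1 - mu1) ** 2 for i, h in enumerate(h1)) / n1
        if var0 <= 0 or var1 <= 0:
            continue  # degenerate Gaussian; criterion undefined
        j = (1
             + 2 * (w0 * math.log(math.sqrt(var0)) + w1 * math.log(math.sqrt(var1)))
             - 2 * (w0 * math.log(w0) + w1 * math.log(w1)))
        if j < best_j:
            best_j, best_t = j, t
    return best_t
```

With a small dark text mode and a background mode twenty times its size, this still puts the threshold in the valley between them, which is the failure case reported above for the global Otsu threshold.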