[Mailman-Developers] Re: Opening up a few can o' worms here...
Jerry Stratton
jerry@sandiego.edu
Tue, 30 Jul 2002 14:05:46 -0700
> > Have you seen what the off the shelf OCR systems like OmniPage do these
> > days?
>
>Yes -- the performance is awful. And that's on ordinary printed text
>that's supposed to be readable, not on text that has been intentionally
>obfuscated.
My experience is otherwise; I use OmniPage 7.0--an old version on our
Macs here at the school--to OCR out-of-copyright texts for placement
on-line. All of these books are old, and many are dirty, ripped,
and/or faded. Some are in strange fonts, tiny font sizes, and multiple
styles. Sometimes I can even see the text on the other side of the
paper. OmniPage not only gets the correct text (sometimes text that I
wasn't even sure about until I saw OmniPage's "guess"), but it also
keeps the italicization, bolding, subscripts, and superscripts. It
recognizes columns, and even detects and automatically reorients the
page when I accidentally put the book in upside down. And this is
technology from 1997 or earlier!
Scanning has come a *long* way from the old Kurzweil washing-machine
that we used to scan Freud's text back in the eighties.
Jerry
--
jerry@sandiego.edu
http://www.sandiego.edu/~jerry/
Serra 188B/x8773
--
The more restrictions there are, the poorer the people become. The
greater the government's power, the more chaotic the nation would
become. The more the ruler imposes laws and prohibitions on his
people, the more frequently evil deeds would occur.
--The Silence of the Wise: The Sayings of Lao Zi