[spambayes-dev] gocr is definitely improving...

Luigi Pugnetti pl at symbolic.it
Tue Feb 6 09:39:14 CET 2007

On Mon, 2007-02-05 at 20:07 -0600, skip at pobox.com wrote:

> Without any massaging ocrad doesn't find any text.  You have to give the
> --invert flag.  Seems like it should automatically try to invert the image
> if its first attempt to extract text completely fails.

You could use a simple check to determine whether the --invert flag is
needed:

    from PIL import ImageStat

    means = ImageStat.Stat(image).mean   # per-band means; assumes RGB
    # the --invert flag is needed when this is true
    needs_invert = sum(means[:3]) >= 128 * 3

This is a very simple check that can sometimes fail: I have seen cases
where inversion is needed but the condition is false, though never the
opposite. Checking whether two of the three means are greater than 128
would probably suffice, especially when one of them is very large (> 190).
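
As a rough sketch of the two checks side by side (the function names are
hypothetical, and the functions take the three channel means as plain
numbers so the example runs without any imaging library):

```python
def needs_invert_sum(means, threshold=128):
    """Sum check: the three channel means add up to at least 3 * threshold."""
    r, g, b = means[:3]
    return r + g + b >= threshold * 3


def needs_invert_two_of_three(means, threshold=128):
    """Variant: at least two of the three channel means exceed the threshold."""
    return sum(1 for m in means[:3] if m > threshold) >= 2


# Illustrative (made-up) means: two large channels, one small one.
means = [200.0, 140.0, 30.0]
print(needs_invert_sum(means))           # 370 is below 384
print(needs_invert_two_of_three(means))  # 200 and 140 both exceed 128
```

With a real image the means would come from ImageStat.Stat(image).mean,
assuming the image is in RGB mode. The made-up values show the kind of
case where the sum check says no but the two-of-three variant says yes.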

> At any rate, gocr looks much better than it did.  I'm going to install it
> and give your patch a try for a couple days.  It looks fine based on a
> simple skim of the changes.  Go ahead and check it in so more people can
> play with it.
> Skip
Luigi Pugnetti

Symbolic S.p.A.
V.le Mentana, 29
I-43100 Parma

Tel: +39 0521 708811
Fax: +39 0521 776190
