[Spambayes] Unwanted stock solicitations

Vibe Grevsen grevsen at gmail.com
Fri Nov 3 02:10:50 CET 2006


Hi again,

good news - I fiddled a bit more and got it working under Windows :) :) :)


>> ocr = os.popen( ( "ocrad -s %s -c %s -x %s < %s 2>" + os.path.devnull ) %
>>                                (scale, charset, orf, pnmfile)) 

> or better use os.popen3 and discard stderr output.

os.popen3() does not seem to support the read()-method?
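
For what it's worth, os.popen3() returns a tuple of three file objects (child_stdin, child_stdout, child_stderr) rather than a single pipe, so read() has to be called on the stdout member and not on the return value itself. Below is a minimal sketch of the "capture stdout, discard stderr" idea using the subprocess module in Python 3 syntax. The child command is a stand-in Python one-liner, not ocrad, and the message strings are made up for illustration.

```python
import subprocess
import sys

# Stand-in child process: writes one line to stdout and one to stderr.
# In the real code this would be the ocrad command line.
child = [sys.executable, "-c",
         "import sys; print('recognized text'); sys.stderr.write('noise\\n')"]

# Capture stdout and throw stderr away (the "discard stderr" suggestion).
result = subprocess.run(child, stdout=subprocess.PIPE,
                        stderr=subprocess.DEVNULL, universal_newlines=True)
text = result.stdout
print(text.strip())
```

This avoids the shell-level "2>" redirection, so os.path.devnull does not need to be spliced into the command string.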


> On Windows you have to put quotes around pnmfile to protect against spaces
> in the path (on Linux you should have them too, but it's unlikely you get a
> path with a space).

Oh, YES, you're absolutely right.
Thank you for this suggestion.


> On Windows there is also another caveat.
> You should also put quotes around the ocrad path, but if you do that you have
> to quote everything.
> To explain, the command should be:

> ocr_cmd = r'""ocrad_path" -s %s -c %s  "%s""'%(scale, charset, pnmfile)
> fin, fout, ferrr = os.popen3(ocr_cmd)

I tested your suggestion, but the command did not seem to resolve correctly in the interpreter.
Also, I could not call read() on what popen3() returns, so I changed it a bit:

               # u: unicode support, r: raw string
               ocr_cmd = ur'ocrad -s %s -c %s "%s"' % (scale, charset, pnmfile)
               ocr = os.popen( ocr_cmd )

I also tested this variant, which reads the image from stdin and sends stderr to the null device:

               # u: unicode support, r: raw string
               ocr_cmd = ( ur'ocrad -s %s -c %s < "%s" 2>' + os.path.devnull ) % (scale, charset, pnmfile)
               ocr = os.popen( ocr_cmd )

Both work under Windows, so Skip can pick whichever he likes best ;)
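
A possible third option, sketched here under assumptions, is to skip hand-written quoting entirely and pass the arguments as a list to subprocess, which takes care of spaces in paths. build_ocrad_args is a hypothetical helper name, the "ascii" charset value is just an example, and the child process is a Python one-liner standing in for ocrad so the sketch runs without it.

```python
import os
import shutil
import subprocess
import sys
import tempfile

def build_ocrad_args(scale, charset, pnmfile, program="ocrad"):
    """Hypothetical helper: an argument list instead of a quoted command string."""
    return [program, "-s", str(scale), "-c", charset, pnmfile]

# Create a file whose directory and file name both contain a space.
tmpdir = tempfile.mkdtemp(prefix="with space ")
pnmfile = os.path.join(tmpdir, "logo image.pnm")
with open(pnmfile, "w") as f:
    f.write("dummy")

# Stand-in for ocrad: a child that prints the last argument it received.
echo_last_arg = [sys.executable, "-c", "import sys; print(sys.argv[-1])"]
args = echo_last_arg + build_ocrad_args(2, "ascii", pnmfile)[1:]
seen = subprocess.check_output(args, universal_newlines=True).strip()
print(seen == pnmfile)

shutil.rmtree(tmpdir)
```

The path reaches the child as a single argument even though no quotes were added by hand.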


>> Maybe you could hint on other parts of the sources I should check for the next lead?

With the above change I only had to do one more thing:
comment out the check for ocrad. After that, OCR works (assuming ocrad 0.16 is in the path).

This means that we should probably work on testing the find_program and is_executable procedures.
As soon as they are finished I could probably start on a new exe-installer-version.
I think I figured out how to include PIL in the exe as well.
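
As a starting point for that testing, here is a rough sketch of what a Windows-aware lookup could look like. This is not the SpamBayes code: only the two function names come from the existing source, the bodies are assumptions, and the PATHEXT default list is a guess for the case where the variable is unset.

```python
import os
import sys

def is_executable(path):
    """True if path is an existing regular file that the OS lets us execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)

def find_program(name, path=None):
    """Search the PATH directories for name; on Windows also try PATHEXT suffixes."""
    if path is None:
        path = os.environ.get("PATH", os.defpath)
    candidates = [name]
    if sys.platform == "win32":
        # "ocrad" alone will not match "ocrad.exe", so try each known extension.
        exts = os.environ.get("PATHEXT", ".COM;.EXE;.BAT;.CMD").split(";")
        candidates += [name + ext for ext in exts if ext]
    for directory in path.split(os.pathsep):
        for candidate in candidates:
            full = os.path.join(directory, candidate)
            if is_executable(full):
                return full
    return None

print(find_program("ocrad"))
```

The call returns the full path when ocrad is found and None otherwise. Newer Python versions (3.3 and later) ship shutil.which(), which covers the same ground.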


>> ocrad -s4 -x out.txt >ocr.txt logo.pgm
>> did produce an ocr.txt but no out.txt for this image http://www.unlockaarhus.dk/dev/logo.pgm.

> using -s (and other flags as well) disables -x.

Hmm, bug, no, better undocumented feature? :)
(At least it's not explained in the ocrad readme as far as I can see...)


> The orf file is never used. It has probably been there from the start, before Skip
> introduced the scale parameter.

Actually, I think he tries to count the number of lines in orf:

                for line in open(orf):
                ...

But this could of course be done directly on ocr.read().
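
A small sketch of counting directly on the text returned by ocr.read(), with no intermediate file. The function names and the sample string are made up for illustration, and whether blank lines should count is an assumption left open here, so both variants are shown.

```python
def count_lines(ocr_output):
    """Count all lines in OCR output held in a string."""
    return len(ocr_output.splitlines())

def count_nonblank_lines(ocr_output):
    """Count only lines that contain something other than whitespace."""
    return sum(1 for line in ocr_output.splitlines() if line.strip())

# Made-up sample standing in for the result of ocr.read().
sample = "BUY NOW\n\nHOT STOCK\n"
print(count_lines(sample), count_nonblank_lines(sample))
```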


Happy coding :)
 
Vibe

