[Spambayes] Unwanted stock solicitations
pl at symbolic.it
Thu Nov 2 15:40:02 CET 2006
On Thu, 2006-11-02 at 15:23 +0100, Vibe Grevsen wrote:
> Hi friends,
> as promised I'm continuing my tests on implementing OCR under Windows.
> FYI I'm running from sources recently downloaded through CVS.
> > >> ocr = os.popen("ocrad -s %s -c %s -x %s -f %s 2>/dev/null" %
> > Vibe> What is the meaning of the last '2' in the os.popen()-call?
> > It's a Unix-ism that will probably not work on Windows. It sends error
> > messages to the bit bucket.
> Ok, I did a little read-up on this.
> 2> is supported by WinNT, 2k and XP; I just never saw it used before.
> 2> is not supported in Win9x and ME.
> However /dev/null is - of course - not found in Windows. Equivalent is nul (case insensitive).
> Better to use os.path.devnull as shown here. Parentheses are required for the string formatting!
> ocr = os.popen( ( "ocrad -s %s -c %s -x %s < %s 2>" + os.path.devnull ) %
> (scale, charset, orf, pnmfile))
Or better, use os.popen3 and discard the stderr output.
On Windows you have to put quotes around pnmfile to protect against spaces
in the path (on Linux you should quote it too, but you are unlikely to get a
path with a space there).
On Windows there is another caveat:
you should also put quotes around the ocrad path, but if you do that you have
to quote the whole command as well.
To illustrate, the command should be:
ocr_cmd = r'""ocrad_path" -s %s -c %s "%s""'%(scale, charset, pnmfile)
fin, fout, ferr = os.popen3(ocr_cmd)
but that doesn't work on Linux. If you quote only one of ocrad_path and pnmfile,
you don't need the quotes around the command as a whole.
You can work around this (as you have done) by putting ocrad on the PATH
and not quoting it. In that case you only need to quote pnmfile, and it
works on both Linux and Windows.
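
As a rough sketch of another way around the quoting problem, and not what the SpamBayes sources do: with a current Python 3 (where os.popen3 no longer exists) the subprocess module accepts the command as an argument list, so no shell quoting is needed on either platform, and stderr can be discarded without spelling out /dev/null or nul. The helper names build_ocr_command and run_ocr are made up for illustration, and the sketch assumes ocrad is on the PATH and takes the image file as a plain argument, as in the command above.

```python
import subprocess


def build_ocr_command(scale, charset, pnmfile, ocrad="ocrad"):
    """Return the ocrad invocation as an argument list.

    Each list item is passed to the program as one argument, so a
    pnmfile (or ocrad path) containing spaces needs no extra quotes.
    """
    return [ocrad, "-s", str(scale), "-c", str(charset), pnmfile]


def run_ocr(scale, charset, pnmfile, ocrad="ocrad"):
    """Run ocrad and return its standard output as text (hypothetical helper)."""
    result = subprocess.run(
        build_ocr_command(scale, charset, pnmfile, ocrad),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # portable stand-in for 2>/dev/null
        universal_newlines=True,    # decode the output to str
    )
    return result.stdout
```

For example, build_ocr_command(2, "iso-8859-15", r"C:\My Docs\logo.pnm") keeps the path with the space as a single list item, and the same call works unchanged on Linux.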
> Now the surprise is that this executes 100% correctly from the interpreter, but it does not when spambayes runs.
> I still need to check up on exactly what is going on in Spambayes here.
> Maybe you could hint on other parts of the sources I should check for the next lead?
> Finally I was surprised to find that
> ocrad -s4 -x out.txt >ocr.txt logo.pgm
> did produce an ocr.txt but no out.txt for this image http://www.unlockaarhus.dk/dev/logo.pgm.
> Maybe it's only a problem with small images? Could you please test if this is the case under Unix as well?
Using -s (and other flags as well) disables -x.
The orf file is never used. It has probably been there from the start, before Skip
introduced the scale parameter.
> Happy coding :)
> SpamBayes at python.org
> Check the FAQ before asking: http://spambayes.sf.net/faq.html
V.le Mentana, 29
Tel: +39 0521 708811
Fax: +39 0521 776190